Today marks the release of Bioconductor 3.7 (official announcement). Congratulations to the hundreds of developers who have collectively developed more than 1500 software packages, not to mention the annotation, experiment data, and workflow packages.
A huge thank you to the core team for the incredible work they do, especially around release time!
Much of my recent development work on Bioconductor has been leveraging and extending the DelayedArray framework developed by Hervé Pagès (Bioconductor Core Team).
Over the Australian summer, I flew home from Baltimore to attend the wedding of two of my dearest friends, Kirsty and Kelly. I had the added excitement of helping officiate their ceremony and could not have been more thrilled. Theirs was to be the first legal same-sex marriage in Tasmania, the southernmost state of Australia. Needless to say, it was a bloody incredible day, as recorded in several media articles, including The Advocate, PEDESTRIAN.
A couple of weeks back I was in Boston for BioC2017, the annual Bioconductor meeting. This is my favourite conference because I get to hear from and meet the awesome community that develop and use R/Bioconductor packages to enable research in high-throughput biology. The agenda and slides for the 3 days are available from https://www.bioconductor.org/help/course-materials/2017/BioC2017/. I’m drawing on these notes that I scrawled during Developer Day, the first day of the meeting.
I started this post as a straight review of the Australian Statistical Conference 2014, but it turned into something else about large conferences and what makes me excited about statistics (spoiler alert, the answer is data science).
This year’s conference was in Sydney and was held jointly with the annual meeting of the Institute of Mathematical Statistics. Being held jointly with the IMS had two main effects:
This was by far the biggest statistics conference I’ve been to, with some 500-600 delegates and up to 8 parallel sessions.
Earlier this year this equation was doing the rounds of the geek community. When plotted it supposedly drew the Batman symbol.
Simple, no? See here for a thorough discussion of how all the pieces fit together.
It would seem that you have to work pretty hard for Batman to appear in your graphs. But today I stumbled across Batman in a real scientific paper. Observe…
The plot comes from Supplementary Figure 11 of Li, Y.
I attended a talk by Dr Stephen Turner, the founder and Chief Technology Officer of Pacific Biosciences, promoting PacBio’s SMRT (Single Molecule Real Time) sequencing platform. While I’d heard of the “next-next-generation” of sequencing technologies at least 18 months ago, this was the first time I’d paid much attention to them. What sets the “next-next-gen” apart from the “next-gen” platforms (why won’t this terminology die already!) is that rather than sequencing a cluster (Illumina) or a bead (SOLiD) of amplified and identical molecules, we sequence a single molecule.
Rick Tankard, who is a research assistant in the Bahlo Lab at WEHI, gave a seminar discussing the quality scores produced by the Illumina sequencing machines. He also discussed the analysis pipeline he has built, in conjunction with other members of the Bahlo lab, for the detection of rare variants in MPS experiments.
The quality scores produced off the sequencer are meant to give an indication of the quality of the base-call (for SOLiD data the qualities measure the quality of the colour- or dinucleotide-call).
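A minimal sketch of what such a score encodes, assuming the usual Phred scaling (Q = -10 log10 of the estimated error probability) and FASTQ-style ASCII encoding. The function names are hypothetical, and the offset of 33 is an assumption: older Illumina pipelines used 64, so check which applies to your data.

```python
def decode_quality_string(qual, offset=33):
    """Decode an ASCII-encoded quality string into integer Phred scores.

    The offset of 33 is an assumption; some older Illumina output uses 64.
    """
    return [ord(ch) - offset for ch in qual]


def phred_to_error_prob(q):
    """Convert a Phred-scaled score Q into an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Hypothetical quality string for a five-base read.
scores = decode_quality_string("II5+!")
probs = [phred_to_error_prob(q) for q in scores]

for q, p in zip(scores, probs):
    print(f"Q{q}: estimated error probability {p:g}")
```

A higher score means a more confident call: each increase of 10 corresponds to a tenfold drop in the estimated chance that the call is wrong.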
Update (21/02/2014): The voom method has now been published in Genome Biology.
Original post: Gordon Smyth is well known for his development of the limma Bioconductor package for the analysis of differential gene expression using microarrays. More recently his group has led the way in the development of software for the statistical analysis of gene expression using RNA-seq with the edgeR Bioconductor package and the voom() method in limma. Today, Gordon spoke about modelling the variance in RNA-seq data for studying gene expression, in particular using the voom() method, and contrasted this approach with that taken by edgeR (and other Poisson/Negative binomial-like methods).
Neochromosomes: Neochromosomes (NCs) are “extra” chromosomes that are found in around 3% of tumour genomes. They are a hallmark of liposarcoma, which is a cancer of the fat cells studied by the Papenfuss lab at WEHI in conjunction with colleagues at the Peter MacCallum Cancer Centre. Circular chromosomes and giant rod chromosomes are both examples of NCs and are composed of multiple donor segments from other regions of the genome that are frequently highly amplified.
Dr Alicia Oshlack, head of Bioinformatics at the Murdoch Children’s Research Institute and formerly of WEHI Bioinformatics, gave today’s Bioinformatics seminar at WEHI. Her topic was “Analysing Human Infinium 450k Methylation Arrays”, in particular the normalisation and quality control issues associated with them. I’ve fleshed out the notes I made during the seminar, below.
Alicia credited Jovana Maksimovic and Lavinia Gordon (members of her lab) with most of the work she was to present today.