[Bioc-sig-seq] "pooled" dispersion estimation in edgeR

Gordon K Smyth Fri, 15 Jul 2011 19:30:04 -0700

Hi Sean,

On Fri, 15 Jul 2011, Sean Ruddy wrote:

Hi Gordon,
Thanks for the response. One of my data sets has 8 conditions and no replicates and so I wanted to emulate DESeq's way of pooling the samples and also use an offset matrix. I was hoping to avoid doing it manually so that I don't mess it up. I could do this all in edgeR and pool the samples but I'm not sure how well this would work under edgeR vs. DESeq.

edgeR has a very flexible interface, so there was no need to explicitly introduce a "pooled" method. Instead, this sort of thing can be handled by the usual functions in the usual way. Suppose you have a data object y, which includes an offset matrix:


   y$offset <- your.matrix   # your tag-by-sample offset matrix

Then you can estimate the "pooled" dispersion simply by:

   y <- estimateGLMCommonDisp(y)

The fact that you don't supply a design matrix means that the samples are automatically treated as one group, i.e., pooled. You can estimate trended or tagwise dispersions in the same way. Then


   fit <- glmFit(y, design)   # etc.

will do any analysis you want using dispersions estimated when the samples were pooled.
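
Putting the steps above together, a minimal sketch might look like the following. The simulated counts, the offset values, and the names counts, condition and adj are assumptions made for illustration; they are not from the original exchange. The offsets are assumed to be on the natural-log scale, as edgeR expects.

```r
# Sketch only: simulated data; the offset values are placeholders.
library(edgeR)

set.seed(1)
ntags <- 500
nsamples <- 8

# Raw (unadjusted) counts: one sample per condition, no replicates.
counts <- matrix(rnbinom(ntags * nsamples, mu = 50, size = 10),
                 nrow = ntags, ncol = nsamples)
rownames(counts) <- paste0("tag", seq_len(ntags))
condition <- factor(paste0("c", seq_len(nsamples)))

y <- DGEList(counts = counts, group = condition)

# Hypothetical tag-by-sample offsets on the natural-log scale:
# log library size plus a small tag-specific adjustment.
adj <- matrix(rnorm(ntags * nsamples, sd = 0.1), ntags, nsamples)
y$offset <- sweep(adj, 2, log(y$samples$lib.size), "+")

# No design matrix supplied: all samples are treated as one group ("pooled").
y <- estimateGLMCommonDisp(y)

# The full design is used only when fitting and testing.
design <- model.matrix(~condition)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)   # condition c2 versus c1
topTags(lrt)
```

Trended or tagwise dispersions could be estimated the same way by calling estimateGLMTrendedDisp(y) or estimateGLMTagwiseDisp(y), again without a design matrix, before glmFit().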

I and the other edgeR authors are anxious to get feedback, so write again if this doesn't turn out to be clear.

I am curious though what sounds off to you in my previous email. I don't feel entirely comfortable doing this manually but hopefully it's just because I left out some details. I was trying to follow the DESeq method and the only difference I saw was in the size factor calculations which I changed for my own needs by using the offset values for each tag and sample.

Even if you could estimate the variances yourself, I don't see any manual way that you could perform valid statistical tests, while correctly accounting for the offsets. The whole negative binomial methodology requires genuine counts rather than adjusted counts. So handling the offsets needs to be built-in.
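
As a rough single-tag illustration of this point, the sketch below contrasts a negative binomial GLM that carries the offsets inside the model with a manual adjustment of the counts. It is not how edgeR is implemented; it uses glm() with the negative.binomial() family from MASS as a stand-in, and every number in it (offsets, abundance, dispersion) is made up.

```r
# Sketch for a single tag; all numbers are made up for illustration.
library(MASS)  # for negative.binomial(); ships with standard R installs

set.seed(2)
n <- 8
group <- factor(rep(c("a", "b"), each = n / 2))

# Hypothetical log-scale offsets, one per sample.
off <- log(c(1e6, 4e6, 2e6, 8e6, 1e6, 4e6, 2e6, 8e6))

# Same underlying abundance in every sample; genuine counts with
# dispersion 1/size = 0.1.
mu <- exp(off + log(5e-5))
counts <- rnbinom(n, mu = mu, size = 10)

# Offsets built in: the raw counts are modelled directly, with
# log(mean) = offset + group effect.
fit <- glm(counts ~ group + offset(off),
           family = negative.binomial(theta = 10))
summary(fit)$coefficients

# Manual alternative: dividing out the offsets gives values that are
# no longer counts, so the negative binomial model does not apply to them.
adjusted <- counts / exp(off - mean(off))
all(adjusted == round(adjusted))
```

In the built-in version the model sees the raw counts together with their offsets, which is what the negative binomial mean-variance relationship is defined on.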


Best wishes
Gordon

I appreciate the help!

Best,
Sean

On Fri, Jul 15, 2011 at 12:02 AM, Gordon K Smyth <sm...@wehi.edu.au> wrote:
Hi Sean,

I'm curious to know why not use edgeR, since edgeR does what you want and
DESeq doesn't?

I might be wrong, but the manual analysis that you describe doesn't sound
right.

Best wishes
Gordon

Date: Thu, 14 Jul 2011 12:54:49 -0700
From: Sean Ruddy <srudd...@gmail.com>
To: bioc-sig-sequencing@r-project.org
Subject: [Bioc-sig-seq] Supplying own variance functions and adjusted
       counts  to a DESeq dataset

Hi,
I have an RNA-Seq count data set that requires separate offset values for each tag and sample. DESeq does not appear to take a matrix of offset values (unlike edgeR) in any of its functions so I've carried out the analysis manually, i.e., calculating a size factor for each tag of each sample, adjusting the counts, then proceeding to calculate means and variances of the adjusted counts, and finally fitting a curve for each condition to the mean-var plot using locfit().
Essentially, I'd like to put these variance functions (or at least all the predicted variances) and adjusted counts inside a DESeq object so that I can take advantage of the other functions DESeq offers: tests, plots, etc.
Thanks for the help!

Sean



_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
