Hi Sean,

On Fri, 15 Jul 2011, Sean Ruddy wrote:

Hi Gordon,

Thanks for the response. One of my data sets has 8 conditions and no replicates and so I wanted to emulate DESeq's way of pooling the samples and also use an offset matrix. I was hoping to avoid doing it manually so that I don't mess it up. I could do this all in edgeR and pool the samples but I'm not sure how well this would work under edgeR vs. DESeq.

edgeR has a very flexible interface, so there was no need to explicitly introduce a "pooled" method. Instead, this sort of thing can be handled by the usual functions in the usual way. Suppose you have a data object y, which includes an offset matrix:

   y$offset <- your matrix

Then you can estimate the "pooled" dispersion simply by:

   y <- estimateGLMCommonDisp(y)

The fact that you don't supply a design matrix means that the samples are automatically treated as one group, i.e., pooled. You can estimate trended or tagwise dispersions in the same way. Then

   fit <- glmFit(y,design)  etc

will do any analysis you want using dispersions estimated when the samples were pooled.
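
As a rough end-to-end sketch of the steps above (not part of the original message): the counts and the offset matrix below are simulated purely for illustration, the names `counts`, `offset` and `group` are hypothetical, and the offsets are assumed to be on the natural-log scale, which is what edgeR's GLM functions expect.

```r
library(edgeR)

set.seed(1)
ngenes <- 200
nsamples <- 8

# Simulated raw counts (illustrative only, not data from this thread).
counts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 10),
                 nrow = ngenes, ncol = nsamples,
                 dimnames = list(paste0("tag", 1:ngenes),
                                 paste0("s", 1:nsamples)))

# Hypothetical tag-by-sample offsets on the natural-log scale:
# log library size plus a small tag- and sample-specific adjustment.
offset <- matrix(log(colSums(counts)), ngenes, nsamples, byrow = TRUE) +
  matrix(rnorm(ngenes * nsamples, sd = 0.1), ngenes, nsamples)

# Eight conditions, one sample each (no replicates).
group <- factor(1:nsamples)
y <- DGEList(counts = counts, group = group)
y$offset <- offset

# No design matrix supplied, so all samples are treated as one group.
y <- estimateGLMCommonDisp(y)

# Fit the real design using the dispersion estimated from the pooled samples.
design <- model.matrix(~ group)
fit <- glmFit(y, design)

# Example test: condition 2 versus condition 1.
lrt <- glmLRT(fit, coef = 2)
topTags(lrt)
```

With no replicates the design is saturated, so the dispersion cannot come from the fitted model itself. That is why it is estimated first without a design and then carried into `glmFit()` through the data object.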

I and the other edgeR authors are anxious to get feedback, so write again if this doesn't turn out to be clear.

I am curious though what sounds off to you in my previous email. I don't feel entirely comfortable doing this manually but hopefully it's just because I left out some details. I was trying to follow the DESeq method and the only difference I saw was in the size factor calculations which I changed for my own needs by using the offset values for each tag and sample.

Even if you could estimate the variances yourself, I don't see any manual way that you could perform valid statistical tests, while correctly accounting for the offsets. The whole negative binomial methodology requires genuine counts rather than adjusted counts. So handling the offsets needs to be built-in.
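
To illustrate the point about genuine versus adjusted counts, here is a minimal base-R sketch for a single tag. It is an assumption-laden toy: a Poisson GLM stands in for the negative binomial model, and the counts and scaling factors (`counts`, `size`) are made-up numbers, not data from this thread.

```r
# One tag, six samples, two groups; per-sample scaling factors differ.
group  <- factor(rep(c("A", "B"), each = 3))
size   <- c(0.5, 1, 2, 0.5, 1, 2)          # hypothetical scaling factors
counts <- c(48, 105, 190, 110, 195, 410)   # made-up raw counts

# Built-in offsets: the response stays as raw counts and log(size)
# enters the linear predictor with its coefficient fixed at 1.
fit_offset <- glm(counts ~ group, family = poisson, offset = log(size))

# Manual route: divide the scaling out first, then model the adjusted
# values as if they were counts. glm() warns about non-integer
# responses here, which is suppressed for the illustration.
adjusted <- counts / size
fit_adjusted <- suppressWarnings(glm(adjusted ~ group, family = poisson))

# Standard errors of the group effect under each approach.
se_offset   <- summary(fit_offset)$coefficients["groupB", "Std. Error"]
se_adjusted <- summary(fit_adjusted)$coefficients["groupB", "Std. Error"]
c(offset = se_offset, adjusted = se_adjusted)
```

The two fits disagree on both the estimated group effect and its standard error, because a count model ties the variance to the scale of the raw counts. Rescaling the counts beforehand changes how much information each sample appears to carry, whereas an offset leaves the counts untouched.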

Best wishes
Gordon

I appreciate the help!

Best,
Sean

On Fri, Jul 15, 2011 at 12:02 AM, Gordon K Smyth <sm...@wehi.edu.au> wrote:

Hi Sean,

I'm curious to know why not use edgeR, since edgeR does what you want and
DESeq doesn't?

I might be wrong, but the manual analysis that you describe doesn't sound
right.

Best wishes
Gordon

Date: Thu, 14 Jul 2011 12:54:49 -0700
From: Sean Ruddy <srudd...@gmail.com>
To: bioc-sig-sequencing@r-project.org
Subject: [Bioc-sig-seq] Supplying own variance functions and adjusted counts to a DESeq dataset

Hi,

I have an RNA-Seq count data set that requires separate offset values for each tag and sample. DESeq does not appear to take a matrix of offset values (unlike edgeR) in any of its functions, so I've carried out the analysis manually, i.e., calculating a size factor for each tag of each sample, adjusting the counts, then calculating means and variances of the adjusted counts, and finally fitting a curve for each condition to the mean-variance plot using locfit().

Essentially, I'd like to put these variance functions (or at least all the predicted variances) and adjusted counts inside a DESeq object so that I can take advantage of the other functions DESeq offers (tests, plots, etc.).

Thanks for the help!

Sean

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
