Hi Sean,

On Fri, 15 Jul 2011, Sean Ruddy wrote:

> Hi Gordon,
>
> Thanks for the response. One of my data sets has 8 conditions and no
> replicates and so I wanted to emulate DESeq's way of pooling the samples
> and also use an offset matrix. I was hoping to avoid doing it manually
> so that I don't mess it up. I could do this all in edgeR and pool the
> samples but I'm not sure how well this would work under edgeR vs. DESeq.

edgeR has a very flexible interface, so there was no need to explicitly
introduce a "pooled" method. Instead, this sort of thing can be handled
by the usual functions in the usual way. Suppose you have a data object
y, which includes an offset matrix:

    y$offset <- offset.matrix   # your tag-by-sample matrix of offsets

Then you can estimate the "pooled" dispersion simply by:

    y <- estimateGLMCommonDisp(y)

The fact that you don't supply a design matrix means that the samples are
automatically treated as one group, i.e., pooled. You can estimate
trended or tagwise dispersions in the same way. Then

    fit <- glmFit(y, design)   # etc.

will do any analysis you want using dispersions estimated when the samples
were pooled.
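
For concreteness, the steps above might look like the following sketch.
The simulated counts, the offset values, and the names counts,
offset.matrix, condition and pooled are all made up for illustration.
The intercept-only design is passed explicitly here so that the pooling
does not rely on the default, which may differ between edgeR versions.

```r
library(edgeR)

set.seed(1)
ntags <- 500
nsamp <- 8   # 8 conditions, no replicates

# Simulated counts and a hypothetical tag-by-sample offset matrix (log scale)
counts <- matrix(rnbinom(ntags * nsamp, mu = 50, size = 10), ntags, nsamp)
offset.matrix <- matrix(log(1e6) + rnorm(ntags * nsamp, sd = 0.1), ntags, nsamp)

condition <- factor(paste0("c", seq_len(nsamp)))
y <- DGEList(counts = counts, group = condition)
y$offset <- offset.matrix

# Intercept-only design: all samples are treated as one group ("pooled")
pooled <- matrix(1, nsamp, 1)
y <- estimateGLMCommonDisp(y, design = pooled)
y <- estimateGLMTagwiseDisp(y, design = pooled)

# Fit the model of interest using the pooled dispersion estimates
design <- model.matrix(~condition)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = 2)   # condition c2 versus c1
topTags(lrt)
```

With one sample per condition the full design leaves no residual degrees
of freedom, which is why the dispersions have to come from the pooled fit.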
I and the other edgeR authors are anxious to get feedback, so write again
if this doesn't turn out to be clear.

> I am curious though what sounds off to you in my previous email. I don't
> feel entirely comfortable doing this manually but hopefully it's just
> because I left out some details. I was trying to follow the DESeq method
> and the only difference I saw was in the size factor calculations which
> I changed for my own needs by using the offset values for each tag and
> sample.

Even if you could estimate the variances yourself, I don't see any manual
way that you could perform valid statistical tests, while correctly
accounting for the offsets. The whole negative binomial methodology
requires genuine counts rather than adjusted counts. So handling the
offsets needs to be built-in.
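
A small base-R sketch of the difference between built-in offsets and
adjusted counts. It uses a Poisson glm() as a simpler stand-in for the
negative binomial model, and the simulated data and all variable names
are hypothetical. The first fit keeps the genuine counts and passes the
offsets to the model; the second divides the counts by the offsets first.

```r
set.seed(42)
n <- 2000
group <- factor(rep(c("A", "B"), each = n / 2))

# Hypothetical per-observation offsets on the log scale
log_off <- rnorm(n, mean = 3, sd = 0.5)

# Genuine counts, with a true log fold change of 0.3 between groups
mu <- exp(log_off + 0.3 * (group == "B"))
counts <- rpois(n, mu)

# Offsets built into the model: the response stays a genuine count
fit_offset <- glm(counts ~ group, family = poisson, offset = log_off)

# Manual alternative: the adjusted values are no longer counts
# (glm() warns about non-integer responses, suppressed here)
adjusted <- counts / exp(log_off)
fit_adjusted <- suppressWarnings(glm(adjusted ~ group, family = poisson))

# Compare the model-based standard errors of the group effect
se_offset <- coef(summary(fit_offset))["groupB", "Std. Error"]
se_adjusted <- coef(summary(fit_adjusted))["groupB", "Std. Error"]
print(c(offset = se_offset, adjusted = se_adjusted))
```

The two fits report different standard errors for the same effect,
because the second one treats the adjusted values as if they were counts.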
Best wishes
Gordon

> I appreciate the help!
>
> Best,
> Sean
>
> On Fri, Jul 15, 2011 at 12:02 AM, Gordon K Smyth <sm...@wehi.edu.au> wrote:
>
>> Hi Sean,
>>
>> I'm curious to know why not use edgeR, since edgeR does what you want and
>> DESeq doesn't?
>>
>> I might be wrong, but the manual analysis that you describe doesn't sound
>> right.
>>
>> Best wishes
>> Gordon
>>
>>> Date: Thu, 14 Jul 2011 12:54:49 -0700
>>> From: Sean Ruddy <srudd...@gmail.com>
>>> To: bioc-sig-sequencing@r-project.org
>>> Subject: [Bioc-sig-seq] Supplying own variance functions and adjusted
>>>     counts to a DESeq dataset
>>>
>>> Hi,
>>>
>>> I have an RNA-Seq count data set that requires separate offset values
>>> for each tag and sample. DESeq does not appear to take a matrix of
>>> offset values (unlike edgeR) in any of its functions so I've carried
>>> out the analysis manually, i.e., calculating a size factor for each tag
>>> of each sample, adjusting the counts, then proceeding to calculate
>>> means and variances of the adjusted counts, and finally fitting a
>>> curve for each condition to the mean-var plot using locfit().
>>>
>>> Essentially, I'd like to put these variance functions (or at least all
>>> the predicted variances) and adjusted counts inside a DESeq object so
>>> that I can take advantage of the other functions DESeq offers (tests,
>>> plots, etc.).
>>>
>>> Thanks for the help!
>>>
>>> Sean