Hi Gordon,

I seem to learn something new about edgeR every time I use it.

Thanks for the help!
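
For anyone reading this later, here is a minimal base-R sketch of what dispersions of this size imply. It assumes the usual negative binomial mean-variance relation, var = mu + dispersion * mu^2, with dispersion = 0 giving the Poisson. The mean counts are made up for illustration and the helper nb_var() is hypothetical, not an edgeR function.

```r
# Negative binomial variance under the usual parameterisation:
# var = mu + dispersion * mu^2 (dispersion = 0 is the Poisson case).
nb_var <- function(mu, dispersion) mu + dispersion * mu^2

mu   <- c(10, 100, 1000)     # hypothetical mean counts
disp <- c(0, 1e-4, 1e-3)     # Poisson, then the two lower limits from the thread

# Ratio of NB variance to Poisson variance, i.e. 1 + dispersion * mu
ratio <- outer(mu, disp, function(m, d) nb_var(m, d) / m)
dimnames(ratio) <- list(mu = as.character(mu),
                        dispersion = as.character(disp))
print(ratio)

# Coefficient of variation implied by the dispersion alone
cv <- sqrt(disp)
print(cv)
```

The square root of the dispersion is the coefficient of variation on top of Poisson noise, so the two limits correspond to roughly 1% and 3% extra variation.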


On Fri, Sep 16, 2011 at 5:47 PM, Gordon K Smyth <sm...@wehi.edu.au> wrote:

> Dear Sean,
> The dispersion estimation functions in edgeR have a lower limit for the
> dispersions that they will estimate.  For estimateCommonDisp(), the lower
> limit is just above 0.0001.  For estimateTagwiseDisp() the lower limit is
> just above 0.001.  For your data, the ideal dispersion estimate appears to
> be zero, so the functions are simply returning to you the pre-set lower
> limits.
> I agree that it was a bit sloppy of us (the edgeR authors) to let the lower
> limits be inconsistent between the functions.  The reason for
> estimateTagwiseDisp() having a higher limit is that it does a grid search,
> so we wanted to limit the number of grid points for computational
> efficiency.
> The new glm functions in edgeR, estimateGLMCommonDisp() etc., have somewhat
> less restrictive lower limits than the classic functions that you are using.
> The bottom line is that with technical data such as the yeast data, we do
> not view the differences between dispersion estimates of 1e-3 or 1e-4 as
> scientifically meaningful.  We would simply observe that the dispersion
> appears to be at the lower boundary, showing that the data has essentially
> no biological variability.  We would set the dispersions to be zero.
> Best wishes
> Gordon
>  Date: Thu, 15 Sep 2011 18:03:28 -0700
>> From: Sean Ruddy <srudd...@gmail.com>
>> To: bioc-sig-sequencing@r-project.org
>> Subject: [Bioc-sig-seq] edgeR tagwise estimates not converging to
>>        common estimate with large prior.n value
>> Hi,
>> Thanks in advance for any help. I have the latest R software (2.13.1) and
>> edgeR software (2.8.4). I'm running into a problem where I estimate a common
>> dispersion parameter of 0.0001 and when I subsequently estimate tagwise
>> dispersions using the default prior.n = 10, the summary statistics are
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>   0.001   0.001   0.001   0.001   0.001   0.022
>> i.e., all estimates are 10 times larger than the common dispersion estimate.
>> Since the method is supposed to shrink toward the common value, this seems a
>> little surprising. When I increase prior.n to a large number I expect the
>> tagwise estimates to all converge to the common dispersion, but as you might
>> guess from the table above they converge to 0.001 = 10*common.
>> The data comes from the Bioconductor package "yeastRNASeq" and it appears
>> from the description of the data that the two samples in each group are
>> actually from sequencing the same extraction of mRNA, i.e., not biological and
>> not even really technical replicates. So the common dispersion should be
>> zero, as the counts should follow the Poisson distribution.
>> I cannot explain the behavior of the estimates, but I'm afraid it might be
>> something in the code, so I'll include that below.
>> library(edgeR)
>> library(yeastRNASeq)
>> data( geneLevelData )
>> d <- DGEList( geneLevelData , group = c( rep( "Mutant" , 2 ) , rep( "Wild" , 2 ) ) )
>> d <- calcNormFactors( d )
>> d <- d[rowSums(d$counts) >= 5, ]
>> d <- estimateCommonDisp( d )
>> d$common.dispersion
>> [1] 0.000101
>> d <- estimateTagwiseDisp( d , prior.n = 10 )
>> summary( d$tagwise.dispersion )
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>   0.001   0.001   0.001   0.001   0.001   0.022
>> d <- estimateTagwiseDisp( d , prior.n = 1000 )
>> summary( d$tagwise.dispersion )
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>   0.001   0.001   0.001   0.001   0.001   0.001
>> It could just be an oddity of the data set itself but I don't have enough
>> experience using edgeR across different RNA-Seq experiments to know how
>> these methods should behave.
>> Thanks,
>> Sean
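
The boundary effect Gordon describes can be reproduced with a toy grid search in base R. This is only a sketch under stated assumptions and not edgeR's actual implementation: the counts, the grid, and the loglik() helper are all hypothetical, and the 0.001 floor is simply hard-coded to mirror the limit mentioned for estimateTagwiseDisp().

```r
# Toy illustration only: this is NOT how edgeR is implemented.
y  <- c(100, 100, 100, 100)  # hypothetical counts with no extra-Poisson spread
mu <- mean(y)

# NB log-likelihood of the counts at a given dispersion (size = 1 / dispersion)
loglik <- function(dispersion) {
  sum(dnbinom(y, size = 1 / dispersion, mu = mu, log = TRUE))
}

# Log-spaced grid that starts at a hard lower limit of 0.001
grid <- 10^seq(-3, 0, length.out = 50)
ll   <- vapply(grid, loglik, numeric(1))

# The grid search can only return a grid point, so it stops at the floor
estimate <- grid[which.max(ll)]
print(estimate)

# The likelihood is still increasing below the floor
print(loglik(1e-4) > loglik(1e-3))
```

Because the search never evaluates anything below its smallest grid point, no amount of shrinkage toward a common value of 0.0001 can pull the estimate under 0.001 in this toy setup.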
