Hi Davis/Gordon,
I am posting my question here again in the hope that you will see it.
I tried edgeR and ran into a problem with the total pseudocounts for each 
library after normalization, which I expected to be close to each other. The 
edgeR user's guide states several times that "the total counts in each 
library of the pseudocounts agrees well with the common library size" 
(pages 27 and 44), but my totals differ considerably between treatments, 
although the replicates within a treatment have very similar totals. I could 
not get the totals close to common.lib.size for every treatment with any of 
the normalization methods I tried (TMM, RLE and quantile).
1) Did I miss a step when running edgeR? How can I check that the 
normalization worked properly?

2) Does it matter if the normalized library sizes of the conditions differ 
from the common.lib.size?


3) Are the results still meaningful even if the library sizes of the 
pseudocounts differ?


4) What could cause the library sizes of the pseudocounts to differ so 
much?


5) Should I remove the tags with low counts, as some other people do? 
After I removed the tags with low counts (<=40 in >=6 out of 14 samples), 
the normalized library sizes became very similar.
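
The filter described in 5) can be sketched in base R as follows. This is only an illustration on a made-up matrix: the object names (`counts`, `low`, `filtered`), the tag names and the toy values are assumptions, not my real data, and it reads the rule as "drop a tag if its count is <=40 in 6 or more of the 14 samples".

```r
# Toy count matrix: 4 hypothetical tags (rows) x 14 samples (columns).
samples <- c("Zygote1", "Zygote2", "Octant1", "Octant2", "Globular1",
             "Globular2", "Heart1", "Heart2", "Torpedo1", "Torpedo2",
             "Bent1", "Bent2", "Mature1", "Mature2")
counts <- rbind(
  tagA = rep(100, 14),                # never low: kept
  tagB = c(rep(10, 6), rep(100, 8)),  # low in 6 samples: removed
  tagC = c(rep(10, 5), rep(100, 9)),  # low in 5 samples: kept
  tagD = rep(0, 14)                   # low everywhere: removed
)
colnames(counts) <- samples

# A tag is "low" when its count is <= 40 in at least 6 of the 14 samples.
low <- rowSums(counts <= 40) >= 6

# Keep only the tags that are not low; drop = FALSE keeps the matrix shape.
filtered <- counts[!low, , drop = FALSE]
print(rownames(filtered))
```

On real data the same logical index would be applied to the count matrix before normalization, so that library sizes are recomputed from the retained tags.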


I realise that my mathematical background for this package is limited. Part 
of my session output is attached below.

---------------------------------------------------------------------
d$samples$lib.size
Zygote1    21012147
Zygote2    19924212
Octant1     9660245
Octant2    26002900
Globular1  17139388
Globular2   7649319
Heart1     16430105
Heart2     20101956
Torpedo1   12920266
Torpedo2    6306742
Bent1      44241095
Bent2      20094409
Mature1    15166090
Mature2    23203758

d$common.lib.size                   
[1] 16554344.47

colSums(d$pseudo.alt)
Zygote1    21523774.62
Zygote2    21638415.63
Octant1    14533481.82
Octant2    12046955.46
Globular1  18920316.62
Globular2  18439528.30
Heart1     11754608.30
Heart2     12759230.11
Torpedo1   11248245.52
Torpedo2   11410667.92
Bent1      16101723.65
Bent2      17980670.24
Mature1    26785396.02
Mature2    27067289.80

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] ALL_1.4.7      Biobase_2.12.1 limma_3.8.2    edgeR_2.2.5   
loaded via a namespace (and not attached):
[1] tools_2.13.0
---------------------------------------------------------------------
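
To make the discrepancy easier to see, the pseudocount column sums above can be divided by common.lib.size. The following is a minimal base R sketch that uses only the numbers posted above; the object names (`pseudo.totals`, `ratio`, `stage`, `stage.ratio`) are illustrative assumptions and not part of edgeR.

```r
# Values copied from the session output above.
common.lib.size <- 16554344.47
pseudo.totals <- c(
  Zygote1   = 21523774.62, Zygote2   = 21638415.63,
  Octant1   = 14533481.82, Octant2   = 12046955.46,
  Globular1 = 18920316.62, Globular2 = 18439528.30,
  Heart1    = 11754608.30, Heart2    = 12759230.11,
  Torpedo1  = 11248245.52, Torpedo2  = 11410667.92,
  Bent1     = 16101723.65, Bent2     = 17980670.24,
  Mature1   = 26785396.02, Mature2   = 27067289.80
)

# Ratio of each pseudocount column total to the common library size.
ratio <- pseudo.totals / common.lib.size

# Stage label = sample name with the trailing replicate digit removed.
stage <- sub("[0-9]+$", "", names(pseudo.totals))

# Mean ratio per stage, keeping the stages in their original order.
stage.ratio <- tapply(ratio, factor(stage, levels = unique(stage)), mean)

print(round(ratio, 2))
print(round(stage.ratio, 2))
```

A ratio near 1 for every sample is what the statement in the user's guide would lead one to expect; with these numbers the ratios are broadly similar within a stage but differ between stages.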

Yifang Tan
                                          

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
