[aroma.affymetrix] Re: Reproducing RMA with Gene ST data (Was: Re: [aroma.affymetrix] Re: How do you analyze Gene ST Data?)

Mark Robinson Wed, 26 Nov 2008 16:34:19 -0800

Hi all.

Just to follow up on these comments here.


'fitPLM' with default parameters in the affyPLM package should give  
practically identical results to the 'standard' pipeline (RMA bg corr  
+ quantile + fit) within aroma.affymetrix, assuming the underlying  
annotation is the same.  This was an easy comparison back in the day  
of 3' IVT arrays.  Now, its a little more difficult.

If anyone is willing, I'd be keen to know if these two actually do  
give the same results on the newer chips i.e. is the underlying  
annotation the same?  I seem to recall that because these newer chips  
occasionally have probes that are shared amongst different probesets,  
that the older style affy package would not be able to handle it.  For  
example, Jim MacDonald's post:

http://article.gmane.org/gmane.science.biology.informatics.conductor/19184

Jim says there that you "don't want to use affy" for these chips (not  
100% sure why).  He suggests the whole pdInfoBuilder/oligo thing which  
at one time had some bugs, but is probably more stable now.  I haven't  
dug deeper as to whether the annotation that 'fitPLM' uses by default  
('hugene10st.db' presumably?) matches the annotation that would be  
used by aroma.affymetrix (the converted-to-binary 'unsupported' CDF  
file).  I know Mark Cowley did find some inconsistencies:

http://article.gmane.org/gmane.science.biology.informatics.conductor/19738

This makes me think that we may want to alternatively construct the  
XXGene 1.0 CDFs (XX=Hu,Mo,Ra) directly from the PGF/CLF files instead  
of from the unsupported CDF.  I suspect that there will only be minor  
changes, so I haven't looked too deeply into it.

Anyone want to check?

In addition to what Henrik says about flavour="affyPLM" ... for a lot  
of my work, there is definitely additional value in using the  
auxiliary information from the PLMs (i.e. weights, residuals) ... you  
don't get this directly with oligo/median polish.


Few more specific notes ...

>> library(affy)
>> TestData <- ReadAffy()
>> TestEset <- rma(TestData)
>>
>> If you plot(AromaEset[,1],TestEset[,1]) you can visualize how
>> different the data is.


I assume you ensured the probesets and samples are in the same order?   
(Or, is this somehow covered by the plot method ...)  I can't tell  
from this sequence of commands.  I don't know what this plot looks  
like, so I don't whether to be alarmed or not.


>> However, I noticed that my GeneST normalized data is quite different
>> from the data that I produce using the Affy package. When looking at
>> the controls on the array I see that the Aroma normalized data is
>> between 5-10% lower than that produced by the Bioconductor Affy
>> packages. However, for some probes this difference is can be quite
>> large (values are averaged across 16 samples):
>>
>> Probe                    Bionconductor            Aroma
>> 10341096 (neg_control)   6079                     852
>> 10341735 (neg_control)   25                       87
>> 10340969 (pos_control)   3953                     1758
>> 10338477 (pos_control)   293                      611
>>
>> Ultimately, when my downstream analysis looks for differential
>> expression the differences between the two analysis approaches become
>> minimal, but I did notice that the Aroma package seems to call more
>> control probes and probes with no known gene as being differentially
>> expressed (i.e. it looks noisier). These probes were all pretty close
>> to my 2 fold cutoff, and at 3 fold or greater the data looked the
>> same.


Tough to really tell.  You'd probably want to average on the log2  
scale, not the linear scale.  This could be a difference in median- 
polish versus robust PLM, or could be a differnce in annotation.   
Requires some digging.

Also, you may want to remove control probesets before doing DE  
analysis --- for one thing, you pay a slightly lesser penalty for  
multiple testing.

Cheers,
Mark


------------------------------
Mark Robinson
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: [EMAIL PROTECTED]
e: [EMAIL PROTECTED]
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
------------------------------





--~--~---------~--~----~------------~-------~--~----~
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/aroma-affymetrix?hl=en
-~----------~----~----~----~------~----~------~--~---

[aroma.affymetrix] Re: Reproducing RMA with Gene ST data (Was: Re: [aroma.affymetrix] Re: How do you analyze Gene ST Data?)

Reply via email to