Re: [aroma.affymetrix] Re: Aroma.affymetrix vs. Affymetrix Powertools

Henrik Bengtsson Tue, 07 Dec 2010 20:14:31 -0800

Hi Sebastian,

2010/12/6 Sebastian Schönherr <ratsc...@gmail.com>:
> Hi Henrik,
> thanks for your quick reply.
> Our goal is to reproduce the results of Affymetrix Powertools by using
> our own implementation (written in Java) for normalization and median
> polish:
> Until now we can reproduce the normalization and we did it as follows:
> (Affymetrix 500k)
> Normalization:
> 1) We executed the Affymetrix Powertools program "apt-cel-
> transformer":
> apt-cel-transformer -d Mapping250K_Nsp.cdf -c quant-norm.sketch=0 --
> out-dir outputFolder inputFiles (all values are used for
> normalization, not only pm)
> 2) Then, we executed the normalization step as well using
> aroma.affymetrix:
> cdf <- AffymetrixCdfFile$byChipType("Mapping250K_Nsp");
> cs <- AffymetrixCelSet$byName("Project1", cdf=cdf);
> qn <- QuantileNormalization(cs);


Make sure you are aware of the different options for QN, cf. page
'Empirical probe-signal densities and rank-based quantile
normalization':

  http://aroma-project.org/node/141

> csN <- process(qn);
> 3) After that we extracted several, random chosen SNPs from the binary
> CEL files and compared the results, which were identical for all three
> approaches (ours, affymetrix powertools and aroma.affymetrix).

Note that in R *identical* and *equal* have different meanings, cf.
identical() and all.equal().  Identical is really identical whereas
equal allows for some numeric discrepancies.  I suspect you mean
"equal" when you compare APT and aroma.affymetrix.  How do you
compare?


> So far so good.
> ---------------------
> The next step is the implementation of median polish where we run into
> trouble:
> Median Polish: (apt-cel-transformer is included in this script "apt-
> probeset-summarize")
> 1) apt-probeset-summarize --cdf-file Mapping250K_Nsp.cdf --out-dir
> outDirectory  --cel-files inputFiles -a quant-norm.sketch=0,med-
> polish,expr.genotype=true
> 2)aroma:
>  plm <- RmaPlm(csN, flavor="oligo");
>  fit(plm);

You may wanna try with RmaSnpPlm(..., mergeStrands=TRUE), because now
you use RmaPlm(...) which is the same as RmaSnpPlm(...,
mergeStrands=FALSE).  You are now fitting the log-additive model for
each *unit group* separately, and there are typically four unit groups
per SNP in this chip type, two on the sense strand and two on the
antisense strand.   Note that different SNP chips have different
setups, and even mixed where some SNPs are interrogated on one strand,
another on the other and yet some on both.

> ces <- getChipEffectSet(plm);
> unitNames <- getUnitNames(cdf);
> theta <- extractTheta(ces,groups=1:2);

...which is why have to subset by groups=1:2 (because you do estimate
chip effects for 4 group).

> theta <- log2(theta);
> dimnames(theta)[[1]] <- unitNames;
> 3)
> Unfortunately, we could not reproduce neither the values from the
> powertools nor from aroma.affymetrix (using oligo) with our
> implementation. All are going in the same direction but we didn't get
> mathematical exact results (Some SNPs are pretty close)

Probably explained by the above.

> So the point where we stuck at the moment is if (1) we are missing a
> step out what has to be done between normalization and median polish
> (for example the log2 transformation is done at the powertools before
> median polish and in aroma after median polish) or (2) we maybe have
> an error in reasoning while reading out the values (extractTheta).

The log-additive model in RMA is fitted on the log-scale.  All data in
the aroma framework are by convention stored on the intensity scale,
which means that we unlog the RMA chip-effect estimates before storing
them.

>
> So thanks again for aroma.affymetrix, it helps a lot to interpret our
> results.

Finally a comment about implement RMA etc in Java: If you are going to
support more than one chiptype, that is, other chip types than the one
your are describing, make sure you understand the CDF structure and
the different configurations the units can have in the different SNP
arrays as well as expression arrays etc.  In order to support all, you
will find that you either have to come up with one generic solution
(the path we took for aroma.affymetrix) or one solution/implementation
for each specific chip type.  It's lots of work.  I don't want to
discourage you - I just want to say that this will take more time than
one first anticipates.

Cheers,

/Henrik

> Cheers Sebastian
>
>
>
> On Dec 4, 3:32 am, Henrik Bengtsson <henrik.bengts...@aroma-
> project.org> wrote:
>> Hi Sebastian.
>>
>> 2010/12/3 Sebastian Schönherr <ratsc...@gmail.com>:
>>
>> > Hi there,
>> > I've a question regarding to RMA in aroma:
>> > In a first step we've compared the normalization method used in aroma
>> > with the method used in the Affymetrix Powertools which gave us
>> > identical results. (500K chip) ("QuantileNormalization(cs)" vs "apt-
>> > cel-transformer")
>>
>> Thanks for this information.  It is really useful.  If you are not
>> aware of it already, we're collecting results of this kind at:
>>
>> http://aroma-project.org/replication/
>>
>> Would you mind writing down/sharing the details on exactly how you did
>> the comparison, e.g. one R/aroma script and one APT script?  How do
>> you conclude they give identical results, e.g. by correlation,
>> graphical, ...?
>>
>> > So in a next step we wanted to test median polish as well (standard
>> > flavor) on this data and compared it again to Affymetrix (using "apt-
>> > probeset-summarize" with med-polish,expr.genotype=true). We tried hard
>> > and tested a lot of configurations but we've never got the same
>> > results. (Same trend, but slightly different in the digits after the
>> > comma).
>> > R-script, quite straight forward:
>> > cdf <- AffymetrixCdfFile$byChipType("Mapping250K_Nsp");
>> > cs <- AffymetrixCelSet$byName("testset", cdf=cdf);
>> > qn <- QuantileNormalization(cs);
>> > csN <- process(qn);
>> > plm <- RmaPlm(csN);
>> >  fit(plm);
>>
>> Atually, the default for the RmaPlm class is actually
>> flavor="affyPLM", which uses the robust linear model M-estimator as
>> implemented by preprocessCore/affyPLM.  
>> Inhttp://aroma-project.org/replication/RMAwe show that aroma.affymetrix
>> can reproduce the affyPLM estimates quiet well.  To use the median
>> polish estimator, use flavor="oligo".  
>> Inhttp://aroma-project.org/replication/gcRMAwe use that to show that
>> aroma.affymetrix can reproduce gcrrma too.  Likewise, justSNPRMA() for
>> AffymetrixCelSet uses flavor="oligo" to reproduce SNPRMA pipeline of
>> the oligo package, cf.http://aroma-project.org/replication/SNPRMA
>>
>> Hope this helps
>>
>> /Henrik
>>
>> > The background correction was omitted in both scripts
>> > Any ideas?
>> > Thanks for your help
>> > Sebastian
>>
>> > --
>> > When reporting problems on aroma.affymetrix, make sure 1) to run the 
>> > latest version of the package, 2) to report the output of sessionInfo() 
>> > and traceback(), and 3) to post a complete code example.
>>
>> > You received this message because you are subscribed to the Google Groups 
>> > "aroma.affymetrix" group with websitehttp://www.aroma-project.org/.
>> > To post to this group, send email to aroma-affymetrix@googlegroups.com
>> > To unsubscribe and other options, go tohttp://www.aroma-project.org/forum/
>
> On Dec 4, 3:32 am, Henrik Bengtsson <henrik.bengts...@aroma-
> project.org> wrote:
>> Hi Sebastian.
>>
>> 2010/12/3 Sebastian Schönherr <ratsc...@gmail.com>:
>>
>> > Hi there,
>> > I've a question regarding to RMA in aroma:
>> > In a first step we've compared the normalization method used in aroma
>> > with the method used in the Affymetrix Powertools which gave us
>> > identical results. (500K chip) ("QuantileNormalization(cs)" vs "apt-
>> > cel-transformer")
>>
>> Thanks for this information.  It is really useful.  If you are not
>> aware of it already, we're collecting results of this kind at:
>>
>> http://aroma-project.org/replication/
>>
>> Would you mind writing down/sharing the details on exactly how you did
>> the comparison, e.g. one R/aroma script and one APT script?  How do
>> you conclude they give identical results, e.g. by correlation,
>> graphical, ...?
>>
>> > So in a next step we wanted to test median polish as well (standard
>> > flavor) on this data and compared it again to Affymetrix (using "apt-
>> > probeset-summarize" with med-polish,expr.genotype=true). We tried hard
>> > and tested a lot of configurations but we've never got the same
>> > results. (Same trend, but slightly different in the digits after the
>> > comma).
>> > R-script, quite straight forward:
>> > cdf <- AffymetrixCdfFile$byChipType("Mapping250K_Nsp");
>> > cs <- AffymetrixCelSet$byName("testset", cdf=cdf);
>> > qn <- QuantileNormalization(cs);
>> > csN <- process(qn);
>> > plm <- RmaPlm(csN);
>> >  fit(plm);
>>
>> Atually, the default for the RmaPlm class is actually
>> flavor="affyPLM", which uses the robust linear model M-estimator as
>> implemented by preprocessCore/affyPLM.  
>> Inhttp://aroma-project.org/replication/RMAwe show that aroma.affymetrix
>> can reproduce the affyPLM estimates quiet well.  To use the median
>> polish estimator, use flavor="oligo".  
>> Inhttp://aroma-project.org/replication/gcRMAwe use that to show that
>> aroma.affymetrix can reproduce gcrrma too.  Likewise, justSNPRMA() for
>> AffymetrixCelSet uses flavor="oligo" to reproduce SNPRMA pipeline of
>> the oligo package, cf.http://aroma-project.org/replication/SNPRMA
>>
>> Hope this helps
>>
>> /Henrik
>>
>> > The background correction was omitted in both scripts
>> > Any ideas?
>> > Thanks for your help
>> > Sebastian
>>
>> > --
>> > When reporting problems on aroma.affymetrix, make sure 1) to run the 
>> > latest version of the package, 2) to report the output of sessionInfo() 
>> > and traceback(), and 3) to post a complete code example.
>>
>> > You received this message because you are subscribed to the Google Groups 
>> > "aroma.affymetrix" group with websitehttp://www.aroma-project.org/.
>> > To post to this group, send email to aroma-affymetrix@googlegroups.com
>> > To unsubscribe and other options, go tohttp://www.aroma-project.org/forum/
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

Re: [aroma.affymetrix] Re: Aroma.affymetrix vs. Affymetrix Powertools

Reply via email to