Hi Steve.

I don't know how common this is, but here is one example.  A colleague
found a gene that was strongly differentially expressed (DE) when
analyzed with the standard Affymetrix probeset definitions, but found
virtually nothing when using the custom CDF that bundles all the probes
for a gene together.  The reason was simple.  Several probesets were
designed for this gene, and presumably they measure different isoforms.
The probes in the DE probeset showed the difference, but the probes in
the other probesets did not.  When you fit a robust linear model, as RMA
does, outliers get downweighted.  Because the DE probes made up only a
small proportion of the probes (I think there were 3 or 4 other
probesets at this locus), their effect got washed out.
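
A toy simulation makes the washout concrete.  This is only a sketch on
made-up data, not the colleague's gene: the probe counts, noise level and
2-unit shift are assumptions, and median polish (the summarization step of
classic RMA) stands in for whichever robust fit your pipeline uses.

```r
# Simulated locus: 4 probesets x 11 probes, 3 control + 3 treated samples.
# Only probeset 1 responds to treatment (all numbers are made up).
set.seed(1)
nProbesets <- 4
nProbes <- 11
group <- rep(c("control", "treated"), each = 3)
probeset <- rep(seq_len(nProbesets), each = nProbes)

# log2 intensity = probe affinity + small noise (affinity recycles per column)
affinity <- rnorm(nProbesets * nProbes, mean = 8, sd = 1)
noise <- matrix(rnorm(length(affinity) * length(group), sd = 0.1),
                ncol = length(group))
y <- affinity + noise

# Shift probeset 1 by 2 log2 units in the treated samples only
isDE <- probeset == 1
isTreated <- group == "treated"
y[isDE, isTreated] <- y[isDE, isTreated] + 2

# Median polish: overall + column effects act as per-sample summaries
summarize <- function(m) {
  fit <- medpolish(m, trace.iter = FALSE)
  fit$overall + fit$col
}

# Difference in mean summary value, treated minus control
logFC <- function(chipEffects) {
  mean(chipEffects[isTreated]) - mean(chipEffects[!isTreated])
}

fcProbeset1 <- logFC(summarize(y[isDE, ]))  # probeset 1 on its own
fcGeneLevel <- logFC(summarize(y))          # all 44 probes lumped together

print(c(probeset1 = fcProbeset1, geneLevel = fcGeneLevel))
```

Summarized alone, probeset 1 should recover roughly the shift that was put
in; lumped with the three unchanged probesets, the estimate shrinks toward
zero because the 11 DE probes are treated as outliers.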

So, it's a tradeoff.  Sometimes (perhaps most of the time) you gain by
lumping them all together: more information, more power to detect
changes.  But sometimes (perhaps rarely) it can mislead.  I'm sure
I'm not the only one to observe such things.  The probe-level data
(usually?) doesn't lie.  And since you are comparing across
platforms, you will undoubtedly run into this as you go along.  Different
microarray designs often measure slightly different things.
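
One possible sanity check, sketched here on hypothetical numbers, is to
look at how much the probeset-level log fold changes disagree within each
gene before trusting the gene-level summary.  The data frame, gene names
and the cutoff of 1 are all assumptions for illustration.

```r
# Hypothetical probeset-level results with a probeset-to-gene mapping
res <- data.frame(
  probeset = c("ps1", "ps2", "ps3", "ps4", "ps5", "ps6"),
  gene     = c("geneA", "geneA", "geneA", "geneA", "geneB", "geneB"),
  logFC    = c(2.1, 0.1, -0.05, 0.0, 1.2, 1.0)
)

# Spread (max - min) of probeset-level log fold changes within each gene
spread <- tapply(res$logFC, res$gene, function(x) diff(range(x)))

# Genes whose probesets disagree by more than an arbitrary cutoff
cutoff <- 1
discordant <- names(spread)[spread > cutoff]
print(discordant)
```

Genes flagged this way are the ones where it is worth going back to the
probeset-level (or probe-level) data.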

One other thing.  If your CDF is not already in binary format, be sure to
convert it using affxparser's convertCdf().  Having this info stored in
binary format will make the processing much faster.  I think the MBNI
custom CDFs are text.
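
A minimal sketch of the conversion, assuming affxparser is installed.  The
two file names are hypothetical placeholders; substitute the custom CDF you
actually downloaded and whatever output name your setup expects.

```r
# Hypothetical file names (assumptions, not real MBNI file names)
textCdf   <- "customCdf_text.cdf"
binaryCdf <- "customCdf_binary.cdf"

# Convert only if affxparser is available, the text CDF exists,
# and the binary copy has not been written already
if (requireNamespace("affxparser", quietly = TRUE) &&
    file.exists(textCdf) && !file.exists(binaryCdf)) {
  affxparser::convertCdf(textCdf, binaryCdf)
}
```

The output has to be a different file from the input, so keep the text
version around until you have checked that the binary one loads.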

Cheers,
Mark


On 20/06/2009, at 6:55 AM, Steve P wrote:

> Mark,
>
> Thanks for the information. That is very helpful.
>
> I want to do the latter, which is to "combine probesets such that all
> probes for a given gene (by some definition -- RefSeq, Ensembl, etc)
> are used to arrive at the summarized value."
>
> I was able to obtain a custom CDF for the U133-A array. So I will try
> that approach. But part of the reason I want to do this is to be able
> to compare values across platforms, so I may need to find/build a
> custom CDF for the other platform.
>
> I would appreciate any cautionary advice you have about summarizing at
> the gene level.
>
> Regards,
> -Steve
>
> On Jun 17, 9:56 am, Steve Piccolo <steve.picc...@gmail.com> wrote:
>> Yesterday I posted this question to the list, but the spam blocker didn't
>> let it through. Below my question is a response from Mark Robinson.
>>
>> ----------------------------------------------------------------------
>>
>> Following the example provided at
>> http://groups.google.com/group/aroma-affymetrix/web/gene-1-0-st-array...,
>> I am running the following code:
>>
>> chipType <- "HT_HG-U133A"
>> dataSet <- "myData"
>>
>> library(aroma.affymetrix)
>> verbose <- Arguments$getVerbose(-8, timestamp=TRUE)
>>
>> # Set up the annotation (CDF) and the raw CEL data set
>> cdf <- AffymetrixCdfFile$byChipType(chipType)
>> cs <- AffymetrixCelSet$byName(dataSet, cdf=cdf)
>>
>> # RMA background correction, then quantile normalization
>> bc <- RmaBackgroundCorrection(cs)
>> csBC <- process(bc, verbose=verbose)
>> qn <- QuantileNormalization(csBC)
>> csN <- process(qn, verbose=verbose)
>>
>> # Fit the RMA probe-level model
>> plm <- RmaPlm(csN)
>> fit(plm, verbose=verbose)
>>
>> # Extract the chip effects as a data frame, one row per probeset
>> ces <- getChipEffectSet(plm)
>> gExprs <- extractDataFrame(ces, units=NULL, addNames=TRUE)
>>
>> This seems to be working beautifully.
>>
>> However, I'm doing an analysis that requires my expression values to
>> be summarized at the gene level rather than the probeset level.
>>
>> In the gExprs object that results from the above analysis, I get a
>> data.frame object in which each row contains expression values for a
>> given probeset across all samples. What I would love to see in each
>> row is an expression value for a given gene. I believe RMA has the
>> ability to do this, but I'm not sure how to do it via
>> aroma.affymetrix.
>>
>> Any suggestions? I'm happy to provide any more details that would be
>> helpful.
>>
>> Regards,
>> -Steve
>>
>> ----------------------------------------------------------------------
>>
>> Hi Steve.
>>
>> As to your question, it depends on what you need.  When you say you want
>> every row to be a gene, do you just want to know the gene name that goes
>> with the probeset identifier, or do you want to combine probesets such that
>> all probes for a given gene (by some definition -- RefSeq, Ensembl, etc) are
>> used to arrive at the summarized value (a la the MBNI CustomCDF)?
>>
>> If the former, then there are annotation packages within R.
>>
>> If the latter, I have a few cautionary tales of doing this, since the
>> different probesets for a given locus can be measuring different variants.
>> But if you still want to do this, we need to make a CDF file specific to
>> the annotation you want.  For the standard HG-U133 arrays, I know the MBNI
>> guys made the CDFs and we could use those within aroma.affymetrix.  I don't
>> know if they build custom CDFs for the HT- arrays.
>>
>> Hope that gets you started.
>>
>> Cheers,
>> Mark
>> ------------------------------
>> Mark Robinson, PhD (Melb)
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robin...@garvan.org.au
>> e: mrobin...@wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>> ------------------------------
