Hi Randy.
From that error message, it looks like there was a mix of CDF files
being used (my guess is 54675 corresponds to the number of Affymetrix
probesets, whereas 30625 corresponds to the Refseq reorganization of
probesets). Can you post the code you ran?
Cheers,
Mark
On 16-Jan-10, at 11:41 AM, Randy Gobbel wrote:
I'm also trying to get gene-level expression values, using
HG-U133_Plus_2 data. I downloaded the custom CDF that combines probes
into probesets that correspond to RefSeq genes, linked from the
aroma.affymetrix group page for this chip type (Hs133P_Hs_REFSEQ.cdf),
and ran the same set of commands. It works up to the point of trying
to extract expression values, then dies with:
Exception: Range of argument 'indices' is out of range [1,30625]: [1,54675]
At this point, I'm not sure what to do next. Suggestions? It looks
like you were the creator of the CDF--is it the right one for this?
-Randy
On Jun 19 2009, 10:08 pm, Mark Robinson <mrobin...@wehi.edu.au> wrote:
Hi Steve.
I don't know how common this is. Basically, a colleague found a gene
that was very differentially expressed when analyzing using the
Affymetrix probeset definitions and found virtually nothing when using
the custom CDF that bundles all the probes for a gene together. The
reason was simple. There were several probesets designed for this
gene and presumably they measure different isoforms. The probes for
the DE probeset showed the difference, but all the other probesets
didn't. When you use a robust linear model like RMA, outliers get
downweighted. Because the DE probes accounted for a small proportion
of the probes (I think there were 3 or 4 other probesets at this
locus), their effect got washed out.
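
To make the wash-out effect concrete, here is a toy simulation in base R. Everything in it is an assumption for illustration: the data are simulated (not real array measurements), the gene has four hypothetical probesets of 11 probes each, only the first probeset responds to treatment, and median polish (stats::medpolish) stands in for the robust fit that aroma.affymetrix actually performs.

```r
# Toy simulation (hypothetical numbers): one gene, four probesets,
# only probeset 1 is differentially expressed (log2 fold change of 2).
set.seed(1)
group <- rep(c(0, 1), each = 3)          # 3 control, 3 treated samples
n_samples <- length(group)

make_probeset <- function(effect, n_probes = 11) {
  affinity <- rnorm(n_probes, sd = 0.5)                 # probe affinities
  signal <- outer(affinity, 8 + effect * group, "+")    # probes x samples
  signal + matrix(rnorm(n_probes * n_samples, sd = 0.1), n_probes)
}

probesets <- list(make_probeset(2), make_probeset(0),
                  make_probeset(0), make_probeset(0))

# Robust summary per sample: overall level plus column (sample) effect
summarize <- function(m) {
  fit <- medpolish(m, trace.iter = FALSE)
  fit$overall + fit$col
}

# Log2 fold change between treated and control
lfc <- function(x) mean(x[group == 1]) - mean(x[group == 0])

lfc_probeset1 <- lfc(summarize(probesets[[1]]))          # DE probeset alone
lfc_gene <- lfc(summarize(do.call(rbind, probesets)))    # all 44 probes pooled

print(c(probeset1 = lfc_probeset1, gene = lfc_gene))
```

In this sketch the probeset-level summary recovers most of the simulated change, while the pooled gene-level summary is pulled toward zero because the 11 responding probes are a minority among 44 and the robust summary treats them as outliers.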
So, it's a tradeoff. Sometimes (perhaps most of the time) you gain by
lumping them all together ... more information, more power to detect
changes. But, sometimes (perhaps rarely) it can mislead. I'm sure
I'm not the only one to observe such things. The probe-level data
(usually?) doesn't lie. But, since you are comparing across
platforms, you will undoubtedly find this as you go along. Different
microarray designs often measure slightly different things.
One other thing. If your CDF is not already in binary format, be sure
to convert it using affxparser's convertCdf(). Having this info stored
in binary format will make the processing much faster. I think the MBNI
custom CDFs are text.
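
A minimal sketch of the conversion, assuming the affxparser package is installed. The two file names are hypothetical placeholders, so substitute the actual location of your text CDF and the name you want for the binary copy. The call is guarded so that nothing happens if the package or the input file is missing.

```r
# Hypothetical file names, for illustration only
text_cdf   <- "custom_text.cdf"     # ASCII (text) CDF, e.g. as downloaded
binary_cdf <- "custom_binary.cdf"   # binary CDF to be written

if (requireNamespace("affxparser", quietly = TRUE) && file.exists(text_cdf)) {
  # Reads the text CDF and writes a binary copy
  affxparser::convertCdf(text_cdf, binary_cdf, verbose = TRUE)
}
```

The binary file is the one to place in the annotation data directory that aroma.affymetrix reads the CDF from.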
Cheers,
Mark
On 20/06/2009, at 6:55 AM, Steve P wrote:
Mark,
Thanks for the information. That is very helpful.
I want to do the latter, which is to "combine probesets such that all
probes for a given gene (by some definition -- RefSeq, Ensembl, etc)
are used to arrive at the summarized value."
I was able to obtain a custom CDF for the U133-A array. So I will try
that approach. But part of the reason I want to do this is to be able
to compare values across platforms, so I may need to find/build a
custom CDF for the other platform.
I would appreciate any cautionary advice you have about summarizing at
the gene level.
Regards,
-Steve
On Jun 17, 9:56 am, Steve Piccolo <steve.picc...@gmail.com> wrote:
Yesterday I posted this question to the list, but the spam blocker
didn't let it through. Below my question is a response from Mark Robinson.
---------------------------------------------------------------------------
Following the example provided at
http://groups.google.com/group/aroma-affymetrix/web/gene-1-0-st-array...,
I am running the following code:
# Chip type and data set name
chipType <- "HT_HG-U133A"
dataSet <- "myData"

library(aroma.affymetrix)
verbose <- Arguments$getVerbose(-8, timestamp=TRUE)

# Set up the CDF and the CEL set
cdf <- AffymetrixCdfFile$byChipType(chipType)
cs <- AffymetrixCelSet$byName(dataSet, cdf=cdf)

# RMA background correction
bc <- RmaBackgroundCorrection(cs)
csBC <- process(bc, verbose=verbose)

# Quantile normalization
qn <- QuantileNormalization(csBC)
csN <- process(qn, verbose=verbose)

# Fit the RMA probe-level model and extract probeset-level chip effects
plm <- RmaPlm(csN)
fit(plm, verbose=verbose)
ces <- getChipEffectSet(plm)
gExprs <- extractDataFrame(ces, units=NULL, addNames=TRUE)
This seems to be working beautifully.
However, I'm doing an analysis that requires my expression values to
be summarized at the gene level rather than the probeset level.
In the gExprs object that results from the above analysis, I get a
data.frame object in which each row contains expression values for a
given probeset across all samples. What I would love to see in each
row is an expression value for a given gene. I believe RMA has the
ability to do this, but I'm not sure how to do it via
aroma.affymetrix.
Any suggestions? I'm happy to provide any more details that would be
helpful.
Regards,
-Steve
---------------------------------------------------------------------------
Hi Steve.
As to your question, it depends on what you need. When you say you want
every row to be a gene, do you just want to know the gene name that goes
with the probeset identifier, or do you want to combine probesets such that
all probes for a given gene (by some definition -- RefSeq, Ensembl, etc) are
used to arrive at the summarized value (a la the MBNI CustomCDF)?
If the former, then there are annotation packages within R.
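
As a rough sketch of the "former" option, the snippet below attaches gene symbols to a probeset-level table with a plain merge() in base R. The expression values are made up, the three probeset-to-symbol pairs are only illustrative, and the unitName column is an assumption about how the extracted data frame is laid out. In practice the mapping table would come from the annotation package for your chip type rather than being typed in by hand.

```r
# Made-up probeset-level values standing in for the extracted data frame
gExprs <- data.frame(
  unitName = c("1007_s_at", "1053_at", "117_at"),
  sampleA  = c(10.2, 7.9, 6.1),
  sampleB  = c(10.4, 8.1, 6.0),
  stringsAsFactors = FALSE
)

# Illustrative probeset-to-gene mapping (normally taken from an
# annotation package, not written out manually)
probe2gene <- data.frame(
  unitName = c("1007_s_at", "1053_at", "117_at"),
  symbol   = c("DDR1", "RFC2", "HSPA6"),
  stringsAsFactors = FALSE
)

# Keep every probeset row, adding the gene symbol where one is known
annotated <- merge(probe2gene, gExprs, by = "unitName", all.y = TRUE)
print(annotated)
```

Note that this only labels each probeset with a gene; rows are still probesets, so a gene with several probesets appears several times.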
If the latter, I have a few cautionary tales of doing this, since the
different probesets for a given locus can be measuring different variants.
But if you still want to do this, we need to make a CDF file specific to
the annotation you want. For the standard HG-U133 arrays, I know the MBNI
guys made the CDFs and we could use those within aroma.affymetrix. I don't
know if they build custom CDFs for the HT- arrays.
Hope that gets you started.
Cheers,
Mark
------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robin...@garvan.org.au
e: mrobin...@wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
------------------------------