Thanks for the insight, Mark. I have included the gene symbols for the
183 probesets (units) with more than 100 cells per unit. It turns out
that 147 of them are for the same gene, RT1-C113. This gene has the
following summary at NCBI
(http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=24151):
Summary
DISCONTINUED: This record has been withdrawn by RGD
What does this mean?
And is there a way to change the aroma.affymetrix options so that the
probesets with more than 100 cells per unit are skipped? Or would I
have to remove them manually?
Thanks again,
Sebastien
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"C1r"
"Mphosph10"
"Ruvbl1"
"Tmem34"
"Pim2"
"RGD1561513"
"Fstl1"
"Areg"
"RT1-C113"
"Id3"
"RT1-C113"
"RGD1308106"
"Cops5"
"Vil1"
"Sppl3"
"Mobkl1a"
"Zfp143"
"RGD1565370"
"Qprt"
"Rangap1"
"Pde6d"
"Tmem199"
"Alcam"
"RGD1311493"
"Cpa6"
"RGD1624210"
"Cops5"
"RT1-C113"
"Pygo2"
"RGD1562372"
"Rb1cc1"
"Kcne3"
"Porcn"
"Rras"
"Cuta"
"Zfp313"
"Olr1333-ps"
"RGD1562284"
"Epb4.1l5"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
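
A tally like the one above (147 of the 183 symbols being RT1-C113) can be reproduced with base R's table(). This is only a minimal sketch: the short `symbols` vector is a made-up stand-in for the full list, not the real data.

```r
# Toy stand-in for the vector of gene symbols (hypothetical, not the real list)
symbols <- c("RT1-C113", "RT1-C113", "C1r", "RT1-C113", "Cops5", "Cops5")

# Count how often each symbol occurs, most frequent first
counts <- sort(table(symbols), decreasing = TRUE)
print(counts)

# Symbols that are attached to more than one probeset
repeated <- names(counts)[as.vector(counts) > 1]
print(repeated)
```

With the real vector of 183 symbols in place of the toy one, the first entry of `counts` would show how dominant a single gene is among the large units.
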
Mark Robinson wrote:
> Hi Sebastien.
>
> Interesting observation, I hadn't noticed that.
>
> Since the major difference is in the fitting stage, my guess would be
> that there are simply more large units (although fewer units in total)
> in the RaGene CDF. This is certainly true for, say, the number of
> probesets with more than 100 probes:
>
>
> library(aroma.affymetrix)
> # Load both CDFs and count the cells (probes) in each unit
> cdf1 <- AffymetrixCdfFile$fromChipType("RaGene-1_0-st-v1",tags="r3")
> cdf2 <- AffymetrixCdfFile$fromChipType("HuGene-1_0-st-v1",tags="r3")
> cpu1 <- nbrOfCellsPerUnit(cdf1)
> cpu2 <- nbrOfCellsPerUnit(cdf2)
>
> > sum(cpu1 > 100)
> [1] 183
> > sum(cpu2 > 100)
> [1] 70
>
> I haven't looked in close detail, but it may be worth removing some of
> the large probesets in the interest of speed. Sometimes these are
> just controls anyway. aroma.affymetrix already treats the very large
> probesets specially by default (it switches to median polish instead
> of a robust linear model).
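
One way to act on this suggestion, sketched under assumptions: work out which units have at most 100 cells and fit only those. The `cpu` vector below is toy data standing in for the result of nbrOfCellsPerUnit(), and `maxCells`, `keep` and `large` are illustrative names. Restricting the fit through a `units` argument to fit() is an assumption about the aroma.affymetrix interface and should be checked against the package documentation.

```r
# Toy stand-in for cpu1 <- nbrOfCellsPerUnit(cdf1) (hypothetical values)
cpu <- c(26L, 340L, 18L, 101L, 100L, 7L)

maxCells <- 100L                  # cut-off discussed in this thread
keep  <- which(cpu <= maxCells)   # indices of units small enough to fit
large <- which(cpu > maxCells)    # indices of the large units to leave out

print(keep)
print(length(large))

# Assumption: with a real probe-level model `plm`, the fit could then be
# restricted to the kept units, e.g. fit(plm, units = keep)
```
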
>
> > options()$aroma.affymetrix.settings$models$RmaPlm
> $medianPolishThreshold
> [1] 500 6
>
> $skipThreshold
> [1] 5000 1
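
One reading of these two settings (an assumption, not verified against the package source) is that each is a pair of (number of probes, number of arrays), and that a unit is affected only when both values are exceeded. A minimal sketch of that decision logic, using a hypothetical helper name `chooseFit`:

```r
# Hypothetical helper illustrating the assumed threshold logic; this is
# not a function from aroma.affymetrix
chooseFit <- function(nProbes, nArrays,
                      medianPolishThreshold = c(500, 6),
                      skipThreshold = c(5000, 1)) {
  if (nProbes > skipThreshold[1] && nArrays > skipThreshold[2]) {
    "skip"
  } else if (nProbes > medianPolishThreshold[1] &&
             nArrays > medianPolishThreshold[2]) {
    "median polish"
  } else {
    "robust linear model"
  }
}

print(chooseFit(120, 6))     # unit with 120 probes, 6 arrays
print(chooseFit(800, 12))    # unit with 800 probes, 12 arrays
print(chooseFit(6000, 12))   # unit with 6000 probes, 12 arrays
```

Under this reading, units with only a few hundred probes would still go through the robust linear model, which would be consistent with the long fitting time reported for the RaGene arrays.
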
>
>
> Hope that helps.
> Mark
>
>
>
> On 27/02/2009, at 5:57 PM, Sebastien Gerega wrote:
>
>
>> Hi,
>> I have been playing around with the Aroma package and using sample
>> data from the Affymetrix site. I've noticed that normalising
>> ragene10st arrays takes about 10 times longer than it does for
>> hugene10st. For example:
>> ragene10st:
>>
>> Total time for complete data set: 20.31min = 0.34h
>> Fraction of time spent on different tasks: Fitting: 96.5%, Reading:
>> 0.9%, Writing: 2.6% (of which 60.78% is for encoding/writing
>> chip-effects), Explicit garbage collection: 0.0%
>>
>>
>> hugene10st:
>> Total time for complete data set: 2.38min = 0.04h
>> Fraction of time spent on different tasks: Fitting: 69.6%, Reading:
>> 6.5%, Writing: 23.5% (of which 61.68% is for encoding/writing
>> chip-effects), Explicit garbage collection: 0.4%
>>
>> Both analyses are being run with 6 CEL files. Is this expected, and
>> if so, what is the reason for the difference?
>> thanks,
>> Sebastien
>>
>>
>>
>
> ------------------------------
> Mark Robinson
> Epigenetics Laboratory, Garvan
> Bioinformatics Division, WEHI
> e: [email protected]
> e: [email protected]
> p: +61 (0)3 9345 2628
> f: +61 (0)3 9347 0852
> ------------------------------