Thanks for the insight, Mark. I have included below the gene symbols for the
183 units (probesets) with more than 100 cells per unit. It turns out that
147 of them map to the same gene, RT1-C113. This gene has the following
summary at NCBI
(http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=24151):

Summary
    DISCONTINUED: This record has been withdrawn by RGD

What is the meaning of all this?

And is there a way to change the options for aroma so that the units
with more than 100 cells per unit are skipped? Or would I have to
remove them manually?
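
If manual removal turns out to be the way to go, a minimal sketch of picking the units might look like the following. It uses base R only and assumes a vector of per-unit cell counts like the one nbrOfCellsPerUnit(cdf1) returns in Mark's message; the numbers here are made up for illustration, and how the resulting indices are handed to the fitting step is not shown.

```r
# Stand-in for cpu1 <- nbrOfCellsPerUnit(cdf1); these values are invented
cpu <- c(26, 30, 1195, 24, 150, 101, 100)

# Indices of units at or below the 100-cell cutoff, and of those above it
keep <- which(cpu <= 100)
skip <- which(cpu > 100)

print(keep)
print(skip)
```

The cutoff of 100 is just the figure used in this thread, not a value taken from the package settings.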
Thanks again,
Sebastien



"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"C1r"
"Mphosph10"
"Ruvbl1"
"Tmem34"
"Pim2"
"RGD1561513"
"Fstl1"
"Areg"
"RT1-C113"
"Id3"
"RT1-C113"
"RGD1308106"
"Cops5"
"Vil1"
"Sppl3"
"Mobkl1a"
"Zfp143"
"RGD1565370"
"Qprt"
"Rangap1"
"Pde6d"
"Tmem199"
"Alcam"
"RGD1311493"
"Cpa6"
"RGD1624210"
"Cops5"
"RT1-C113"
"Pygo2"
"RGD1562372"
"Rb1cc1"
"Kcne3"
"Porcn"
"Rras"
"Cuta"
"Zfp313"
"Olr1333-ps"
"RGD1562284"
"Epb4.1l5"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
"RT1-C113"
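
For reference, the tally above can be reproduced in base R. This is only a sketch: the vector is rebuilt by hand from the list above (147 copies of RT1-C113 plus the 36 other entries, in no particular order) rather than read from an annotation file.

```r
# Gene symbols for the 183 large units, rebuilt by hand from the list above
symbols <- c(
  rep("RT1-C113", 147),
  "C1r", "Mphosph10", "Ruvbl1", "Tmem34", "Pim2", "RGD1561513", "Fstl1",
  "Areg", "Id3", "RGD1308106", "Cops5", "Vil1", "Sppl3", "Mobkl1a",
  "Zfp143", "RGD1565370", "Qprt", "Rangap1", "Pde6d", "Tmem199", "Alcam",
  "RGD1311493", "Cpa6", "RGD1624210", "Cops5", "Pygo2", "RGD1562372",
  "Rb1cc1", "Kcne3", "Porcn", "Rras", "Cuta", "Zfp313", "Olr1333-ps",
  "RGD1562284", "Epb4.1l5"
)

# Count units per symbol, most frequent first
counts <- sort(table(symbols), decreasing = TRUE)
print(head(counts, 3))
```

Apart from RT1-C113, only Cops5 appears more than once in the list.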




Mark Robinson wrote:
> Hi Sebastien.
>
> Interesting observation, I hadn't noticed that.
>
> Since the major difference is in the fitting stage, my guess would be
> that there are simply more large units (although fewer units in total)
> in the RaGene CDF.  This is certainly true for, say, the number of
> probesets with more than 100 probes:
>
>
> cdf1 <- AffymetrixCdfFile$fromChipType("RaGene-1_0-st-v1",tags="r3")
> cdf2 <- AffymetrixCdfFile$fromChipType("HuGene-1_0-st-v1",tags="r3")
> cpu1 <- nbrOfCellsPerUnit(cdf1)
> cpu2 <- nbrOfCellsPerUnit(cdf2)
>
>  > sum(cpu1 > 100)
> [1] 183
>  > sum(cpu2 > 100)
> [1] 70
>
> I haven't looked in close detail, but it may be worth removing some of
> the large probesets in the interest of speed.  Sometimes these are
> just controls anyway.  aroma.affymetrix already handles the very large
> probesets differently by default (it switches to median polish instead
> of a robust linear model).
>
>  > options()$aroma.affymetrix.settings$models$RmaPlm
> $medianPolishThreshold
> [1] 500   6
>
> $skipThreshold
> [1] 5000    1
>
>
> Hope that helps.
> Mark
>
>
>
> On 27/02/2009, at 5:57 PM, Sebastien Gerega wrote:
>
>   
>> Hi,
>> I have been playing around with the Aroma package and using sample data
>> from the Affymetrix site. I've noticed that normalising ragene10st
>> arrays takes about 10 times longer than it does for hugene10st. For
>> example:
>> ragene10st:
>>
>> Total time for complete data set: 20.31min = 0.34h
>> Fraction of time spent on different tasks: Fitting: 96.5%, Reading:
>> 0.9%, Writing: 2.6% (of which 60.78% is for encoding/writing
>> chip-effects), Explicit garbage collection: 0.0%
>>
>>
>> hugene10st:
>> Total time for complete data set: 2.38min = 0.04h
>> Fraction of time spent on different tasks: Fitting: 69.6%, Reading:
>> 6.5%, Writing: 23.5% (of which 61.68% is for encoding/writing
>> chip-effects), Explicit garbage collection: 0.4%
>>
>> Both analyses are being run with 6 CEL files. Is this expected and, if
>> so, what is the reason for the difference?
>> thanks,
>> Sebastien
>>
>>
>>     
>
> ------------------------------
> Mark Robinson
> Epigenetics Laboratory, Garvan
> Bioinformatics Division, WEHI
> e: [email protected]
> e: [email protected]
> p: +61 (0)3 9345 2628
> f: +61 (0)3 9347 0852
> ------------------------------

