[aroma.affymetrix] Re: benchmarking CN analysis with aroma.affymetrix (with CBS)

Henrik Bengtsson Tue, 25 Aug 2009 18:28:44 -0700

Hi.

On Thu, Aug 20, 2009 at 11:25 PM, ssv<ssv....@gmail.com> wrote:
>
> Hi Henrik
>
> I was benchmarking aroma.affymetrix performance on 72 affymetrix 250k
> sty files (36 normal, 36 tumor- paired analysis). Here is what I have
> so far:
>
> ==================================
> 1) setting up CEL sets and locating the CDF file
>
>    user  system elapsed
>    7.44    1.28    8.78
>
> 2) Allelic cross-talk calibration
>    user  system elapsed
> 1871.61  107.77 6341.33
>
> 3) Probe-level modelling test (for CN analysis)
>
> user  system elapsed
> 3008.95  393.22 8989.35
>
> 4) Fragment-length normalization test
> user  system elapsed
> 291.53   10.53  359.18
>
> 5) Setup a paired CBS model
>
> For pair creation:
> user  system elapsed
>   1.32    0.09    1.42
>
> For CBS model:
>  user  system elapsed
>   0.14    0.02    0.15
>
> 6) Link the ChromosomeExplorer to the segmentation model
>
> user  system elapsed
>      0       0       0
>
> 7) Fit the model for a few chromosomes (2,19) for the first two arrays
>
>  user  system elapsed
>  28.45    2.24   50.04
>
> ==============================
>
> What surprised me was the time taken for the CBS model (cns <-
> CbsModel(sets$tumor, sets$normal)). It was much less than anticipated.
> As you can see, allelic cross-talk calibration and PLM took most of
> the computation time. I wonder if this is normal.


Thanks for this report.

CBS is indeed fast, and as I wrote in another message, the new DNAcopy
v1.19.2 (and above) is even faster (way faster for the newer chip
types).  However, please note that setting up the segmentation model,
cns <- CbsModel(...), without then calling process(cns) does nothing.
If you don't do the latter, the segmentation will be done when you call
process() on the ChromosomeExplorer object, so the time spent there
might include doing CBS [hard to tell when you don't post the complete
script].  If CBS is not being run, most of the time ChromosomeExplorer
spends should go to generating PNG images.
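
To illustrate why timing only the model set-up says little, here is a
minimal toy sketch in base R.  It does not use the aroma.affymetrix
API: setupModel() and fitModel() are hypothetical stand-ins, and a
running median (stats::runmed) stands in for the segmentation step.

```r
# Hypothetical stand-ins for illustration; NOT the aroma.affymetrix API.
setupModel <- function(tumor, normal) {
  # Only records the inputs; no computation on the data happens here.
  list(tumor = tumor, normal = normal, fit = NULL)
}

fitModel <- function(model) {
  # The actual work: a running median stands in for segmentation.
  logRatio <- log2(model$tumor / model$normal)
  model$fit <- stats::runmed(logRatio, k = 101)
  model
}

set.seed(1)
n <- 2e5
normal <- rlnorm(n)   # simulated positive signals
tumor <- rlnorm(n)

# Time the two steps separately, as in the benchmark above.
tSetup <- system.time(model <- setupModel(tumor, normal))
tFit <- system.time(model <- fitModel(model))

cat("setup elapsed:", tSetup[["elapsed"]], "\n")
cat("fit elapsed:  ", tFit[["elapsed"]], "\n")
```

The set-up call returns almost immediately because nothing has been
computed yet; the cost only shows up in whichever later call triggers
the fit.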

It is known that PLM takes time, mainly because it is designed to work
with any chip type and CDF, meaning it cannot assume anything about
the structure.  However, I know of one possible way to make it a fair
bit faster, but that will take a major redesign which I just don't
find the time to do.

Was this the first time you processed Mapping250K_Sty CEL files on
this machine?  The first time aroma.affymetrix runs a new chip type,
there is a significant overhead from setting up internal annotation
data structures, which are cached on the file system.  All following
R sessions will detect this and load the cached results.  Thus, you
should expect much faster processing after the first round.  Allelic
cross-talk calibration identifies the six allele-pair groups from the
probe-sequence files (or the CDF) in this way, that is, once and then
cached, which may explain what you observe.
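
The compute-once-then-load-from-disk pattern described above can be
sketched in base R as follows.  This is only an illustration under
stated assumptions: cachedAnnotation() is a hypothetical helper, the
Sys.sleep() call stands in for the slow set-up, and the real package
uses its own caching mechanism and file locations.

```r
# Hypothetical helper for illustration; NOT how aroma.affymetrix caches.
cachedAnnotation <- function(chipType, cacheDir) {
  dir.create(cacheDir, showWarnings = FALSE, recursive = TRUE)
  pathname <- file.path(cacheDir, paste0(chipType, ".rds"))

  # Later calls (and later R sessions): just load the cached result.
  if (file.exists(pathname)) {
    return(readRDS(pathname))
  }

  # First call: stand-in for the slow set-up of annotation structures.
  Sys.sleep(0.2)
  res <- list(chipType = chipType, built = Sys.time())
  saveRDS(res, pathname)
  res
}

cacheDir <- tempfile("annotationCache")  # fresh, empty cache location

t1 <- system.time(a1 <- cachedAnnotation("Mapping250K_Sty", cacheDir))
t2 <- system.time(a2 <- cachedAnnotation("Mapping250K_Sty", cacheDir))

cat("first call elapsed: ", t1[["elapsed"]], "\n")
cat("second call elapsed:", t2[["elapsed"]], "\n")
```

Only the first call pays for the set-up; the second one reads the
stored file, which is why benchmarks taken on a first run of a new
chip type tend to overstate the steady-state cost.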

>
> Sorry for troubling you with all the logs (output) below. But
> following are the logs.

Email space is basically free; more is better than less.

/Henrik

>
> suresh
>
>
> ==============================================
>
>
>> traceback()
> No traceback available
>
>
>
>> sessionInfo()
> R version 2.9.1 (2009-06-26)
> i386-pc-mingw32
>
> locale:
> English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
>  [1] GLAD_2.0.0             Cairo_1.4-5
> RColorBrewer_1.0-2     DNAcopy_1.18.0         aroma.affymetrix_1.1.2
> aroma.apd_0.1.6        affxparser_1.16.0
>  [8] R.huge_0.1.8           aroma.core_1.1.4
> aroma.light_1.12.2     matrixStats_0.1.7      R.rsp_0.3.5
> R.filesets_0.5.3       digest_0.3.1
> [15] R.cache_0.1.8          R.utils_1.1.7
> R.oo_1.4.9             R.methodsS3_1.0.3
>
> loaded via a namespace (and not attached):
> [1] tools_2.9.1
>
>> sets$normal
> $Mapping250K_Sty
> CnChipEffectSet:
> Name: adeno_carcinoma
> Tags: ACC,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/adeno_carcinoma,ACC,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 36
> Names: 88224, 88256, ..., 67950
> Time period: 2009-08-20 18:23:46 -- 2009-08-20 18:23:52
> Total file size: 313.14MB
> RAM: 0.04MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
>
>> sets$tumor
> $Mapping250K_Sty
> CnChipEffectSet:
> Name: adeno_carcinoma
> Tags: ACC,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/adeno_carcinoma,ACC,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 36
> Names: 88240, 88272, ..., 67966
> Time period: 2009-08-20 18:23:46 -- 2009-08-20 18:23:52
> Total file size: 313.14MB
> RAM: 0.04MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
>
>> cns
> CbsModel:
> Name: adeno_carcinoma
> Tags: ACC,-XY,RMA,+300,A+B,FLN,-XY,paired
> Chip type (virtual): Mapping250K_Sty
> Path: cbsData/adeno_carcinoma,ACC,-XY,RMA,+300,A+B,FLN,-XY,paired/
> Mapping250K_Sty
> Number of chip types: 1
> Chip-effect set & reference file pairs:
> Chip type #1 of 1 ('Mapping250K_Sty'):
> Chip-effect set:
> CnChipEffectSet:
> Name: adeno_carcinoma
> Tags: ACC,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/adeno_carcinoma,ACC,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 36
> Names: 88240, 88272, ..., 67966
> Time period: 2009-08-20 18:23:46 -- 2009-08-20 18:23:52
> Total file size: 313.14MB
> RAM: 0.04MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
> Reference file:
> CnChipEffectSet:
> Name: adeno_carcinoma
> Tags: ACC,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/adeno_carcinoma,ACC,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 36
> Names: 88224, 88256, ..., 67950
> Time period: 2009-08-20 18:23:46 -- 2009-08-20 18:23:52
> Total file size: 313.14MB
> RAM: 0.04MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
> RAM: 0.00MB
>
> >
>

