[aroma.affymetrix] Re: Total Copy Number Analysis - Hardware requirements

Henrik Bengtsson Wed, 14 Jan 2009 18:41:51 -0800

Hi.

On Wed, Jan 14, 2009 at 7:02 AM, connys <cornelia.sche...@googlemail.com> wrote:
>
> HI,
> I have read the documentation of this tool, but could not find
> anything regarding the hardware requirements (computational need). I
> want to analyze about 1000 samples affy 6.0  and I would also be
> interested to know of how this requirement scales up, let's say to the
> 10,000 samples.
> Does anyone have any idea/suggestions?


The short answer is that you can get by with ~1.5GB of RAM on any
operating system (this is pointed out on the front page).  Since
aroma.affymetrix accesses the file system quite a bit, you will want
a fast file system.  It is faster to work with data on a local HDD
than on a network-based file system.  A fast local drive will help; I
don't know of any benchmarking, but I guess the greater the HDD's
cache is, the better.  Of course, with more RAM, things will go a bit
faster (up to a certain level).  You might also want to get a 64-bit
OS for the future, although this is not (at all) required by
aroma.affymetrix.  See also the page 'Improving processing time' in the
User Guide:

 http://groups.google.com/group/aroma-affymetrix/web/improving-processing-time


More details: Currently, (I believe) every method implemented in
aroma.affymetrix is bounded in memory, regardless of chip type and
number of samples.  In other words, whatever method you apply, your
memory usage will stay below a certain upper limit.  People have
reported that they've used aroma.affymetrix to process 5000+
Affymetrix gene expression arrays without problems.  Note that we do
this not by using approximate algorithms, but by more clever algorithm
designs, so you will still get the same results as if everything were
kept in memory.  The only exception to the latter is our recent
CrlmmModel for estimating genotypes according to CRLMM; the CRLMM
algorithm is hierarchical by nature and it is therefore hard to design
an exact estimator that is bounded in memory.  To achieve
bounded-memory runs, SNPs are processed in chunks.  The more arrays
that are modeled, the fewer SNPs we have per chunk, in order to keep
the overall memory usage bounded.
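
To make the chunking trade-off concrete, here is a minimal Python
sketch.  It is not aroma.affymetrix code (the package is written in R),
and the memory budget, the 8 bytes per value, and the function names
are all assumptions made for illustration; the package picks its own
chunk sizes internally.

```python
def snps_per_chunk(n_arrays, budget_bytes=512 * 1024**2, bytes_per_value=8):
    """Largest number of SNPs whose values for all arrays fit in the budget.

    budget_bytes and bytes_per_value are illustrative assumptions.
    """
    if n_arrays < 1:
        raise ValueError("n_arrays must be at least 1")
    # One value per (SNP, array) pair; always process at least one SNP.
    return max(1, budget_bytes // (n_arrays * bytes_per_value))


def chunk_ranges(n_snps, n_arrays, **kwargs):
    """Yield (start, stop) index ranges that together cover all SNPs."""
    size = snps_per_chunk(n_arrays, **kwargs)
    for start in range(0, n_snps, size):
        yield start, min(start + size, n_snps)


if __name__ == "__main__":
    for n in (1000, 10000):
        print(n, "arrays ->", snps_per_chunk(n), "SNPs per chunk")
```

With a fixed budget, ten times as many arrays means roughly a tenth as
many SNPs per chunk, so there are more chunks to process but the peak
memory stays the same.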

Not sure what kind of GWS6 analysis you are planning, but the CRMA v2
method for estimating full-resolution raw CNs is a single-array method
by design, so its memory usage does not depend on how many arrays you
process and it will scale to any number of arrays.  Even multi-array
methods such as CRMA (v1) should scale quite well, because of the
memory-bounded algorithms used.
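
As a rough illustration of why a single-array method scales this way,
here is a minimal Python sketch.  It is not CRMA v2 and not
aroma.affymetrix code; load, estimate and save are hypothetical
callbacks standing in for reading one array, processing it, and
writing the result to disk.

```python
def process_each(array_ids, load, estimate, save):
    """Apply a single-array estimator to every array, one at a time."""
    for array_id in array_ids:
        signals = load(array_id)            # only this array is held in memory
        save(array_id, estimate(signals))   # result is stored before the next load


if __name__ == "__main__":
    store = {}
    process_each(
        ["a1", "a2", "a3"],
        load=lambda array_id: [1.0, 2.0, 4.0],              # stand-in for reading a file
        estimate=lambda signals: [s / 2 for s in signals],  # stand-in for the estimator
        save=store.__setitem__,                             # stand-in for writing to disk
    )
    print(store)
```

Since no step needs data from any other array, adding arrays only adds
iterations to the loop; the memory needed per iteration stays the same.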

Hope this helps

/Henrik

