Hi.

On Wed, Jan 14, 2009 at 7:02 AM, connys <cornelia.sche...@googlemail.com> wrote:
>
> HI,
> I have read the documentation of this tool, but could not find
> anything regarding the hardware requirements (computational need). I
> want to analyze about 1000 samples affy 6.0 and I would also be
> interested to know of how this requirement scales up, let's say to the
> 10,000 samples.
> Does anyone have any idea/suggestions?

The short answer is that you can get by with ~1.5GB of RAM on any operating system (this is pointed out on the front page). Since aroma.affymetrix accesses the file system quite a bit, you want a fast file system. It is faster to work with data on a local HDD than on a network-based file system. I don't know of any benchmarks, but I would guess that the larger the HDD's cache is, the better. Of course, with more RAM, things will go a bit faster (up to a certain level). You might also want to get a 64-bit OS for the future, although this is not (at all) required by aroma.affymetrix.

See also the page 'Improving processing time' in the User Guide:
http://groups.google.com/group/aroma-affymetrix/web/improving-processing-time

More details: Currently, (I believe) every method implemented in aroma.affymetrix is bounded in memory, regardless of chip type and number of samples. In other words, whatever method you apply, your memory usage will stay below a certain upper limit. People have reported that they have used aroma.affymetrix to process 5000+ Affymetrix gene expression arrays without problems. Note that we achieve this not with approximate algorithms but with more clever algorithm designs, so you still get the same results as if everything were kept in memory. The only exception to the latter is our recent CrlmmModel for estimating genotypes according to CRLMM; the CRLMM algorithm is hierarchical by nature, and it is therefore hard to design an exact estimator that is bounded in memory. To achieve bounded-memory runs, SNPs are processed in chunks. The more arrays that are modeled, the fewer SNPs we have per chunk, in order to keep the overall memory usage bounded.

Not sure what kind of GWS6 analysis you are planning, but the CRMA v2 method for estimating full-resolution raw CNs is a single-array method by design, so it will scale to any number of arrays.
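To make the chunking idea above a bit more concrete, here is a rough sketch in Python. This is not the actual CrlmmModel code; the function names, the 512 MB budget, the 8 bytes per value, and the SNP count of 900,000 are all assumptions chosen for illustration only. It shows how a fixed memory budget forces the chunk size down as the number of arrays goes up:

```python
def snps_per_chunk(n_arrays, budget_bytes, bytes_per_value=8):
    """Largest number of SNPs whose values for all arrays fit in the budget.

    Hypothetical helper: assumes one value per SNP per array.
    """
    bytes_per_snp = n_arrays * bytes_per_value
    # Always process at least one SNP, even if the budget is too small.
    return max(1, budget_bytes // bytes_per_snp)


def iter_chunks(n_snps, chunk_size):
    """Yield (start, end) index pairs, end exclusive, covering all SNPs."""
    for start in range(0, n_snps, chunk_size):
        yield start, min(start + chunk_size, n_snps)


if __name__ == "__main__":
    budget = 512 * 1024**2   # assumed memory budget in bytes
    n_snps = 900_000         # illustrative SNP count
    for n_arrays in (100, 1000, 10000):
        size = snps_per_chunk(n_arrays, budget)
        n_chunks = sum(1 for _ in iter_chunks(n_snps, size))
        print(f"{n_arrays} arrays: {size} SNPs per chunk, {n_chunks} chunks")
```

With ten times as many arrays, each chunk holds about a tenth as many SNPs, so the peak memory stays roughly the same while the number of chunks (and hence the run time) grows.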
Even multi-array methods such as CRMA (v1) should scale quite well, because of the bounded algorithms used.

Hope this helps

/Henrik