Re: [aroma.affymetrix] Parallel Processing

Greg Wall Wed, 22 Dec 2010 09:43:56 -0800

Thank you.  It's working nicely.



On Tue, Dec 21, 2010 at 10:16 AM, Henrik Bengtsson <henrik.bengts...@gmail.com> wrote:

> Hi.
>
> On Fri, Dec 17, 2010 at 1:33 PM, Gregory W <greg.d.w...@gmail.com> wrote:
> > Hello,
> >
> > Thanks for aroma.affymetrix and this helpful site.
> >
> > I was hoping to get some general advice. I know CRMAv2 is a
> > single-array method and thus makes processing different arrays in
> > parallel possible.
> >
> > I was wondering how you would set up the rawData and annotationData
> > directories when doing a multi-stage analysis.
> >
> > For instance, for my project I have 20 patients.  I will be getting
> > the data for the first 10 patients immediately, and then the data from
> > the remaining 10 patients a few weeks later.  All experiments will be
> > performed on the same chipType.
> >
> > I was thinking of this structure:
> >
> > --rawData/
> >         --patient01/GenomeWideSNP_6/
> >         --patient02/GenomeWideSNP_6/
> >         --patient03/GenomeWideSNP_6/
> >         ...
> >         --patient20/GenomeWideSNP_6/
>
> Even if they come singly, I would treat those samples as being part of
> the same data set, e.g.
>
>  rawData/MyDataSet/GenomeWideSNP_6/
>
> A rule of thumb: set up the data set the way you would if you were to
> redo the same analysis in the future.
>
> >
> > And then preprocessing each patient separately in an R session with:
> >        dataSet <- "patient01";
> >        chipType <- "GenomeWideSNP_6";
> >        cdf <- AffymetrixCdfFile$byChipType(chipType, tags="Full");
> >        dsList <- doCRMAv2(dataSet, cdf=cdf, combineAlleles=FALSE,
> >                           verbose=verbose);
>
> So, if you add them all to the same data set as they come in, just
> rerun the above; already processed arrays will be detected and
> "skipped".  The 'dsList' at the end will contain all arrays that
> currently exist in the data set.
>
> FYI, there is doASCRMAv2(), so that you do not have to specify
> 'combineAlleles', i.e.
>
> dsList <- doASCRMAv2(dataSet, cdf=cdf, verbose=verbose);
>
> Note also that you can do:
>
> csR <- AffymetrixCelSet$byName("MyDataSet", cdf=cdf);
> dsList <- doASCRMAv2(csR, verbose=verbose);
>
> which is the same but more explicit, and you have the option to subset
> 'csR'.  That is, if you want to process different arrays on different
> machines (which is what the subject of your message indicates), then
> you can do for instance:
>
> subset <- c(5,6,7);
> csR <- AffymetrixCelSet$byName("MyDataSet", cdf=cdf);
> csR <- extract(csR, subset);
> dsList <- doASCRMAv2(csR, verbose=verbose);
>
> If you batch process this, you can pass command line arguments to your
> script and use commandArgs() to get them, i.e. you can set 'subset'
> this way.
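
A minimal sketch of the commandArgs() approach, in base R only. The "--subset=5,6,7" argument format, the helper name parseSubset(), and the script name crmav2.R are hypothetical, chosen for illustration; none of them are part of aroma.affymetrix.

```r
# Hypothetical helper: extract array indices from an argument such as
# "--subset=5,6,7".  The "--subset=" format is an assumption, not an
# aroma.affymetrix convention.
parseSubset <- function(args) {
  hit <- grep("^--subset=", args, value = TRUE)
  if (length(hit) == 0L) return(NULL)  # no subset given
  idx <- strsplit(sub("^--subset=", "", hit[1L]), ",", fixed = TRUE)[[1L]]
  idx <- suppressWarnings(as.integer(idx))  # non-numeric entries become NA
  if (length(idx) == 0L || anyNA(idx) || any(idx < 1L)) {
    stop("--subset must be a comma-separated list of positive integers")
  }
  idx
}

# trailingOnly = TRUE keeps only the user-supplied arguments, e.g. from
#   Rscript crmav2.R --subset=5,6,7    (crmav2.R is a hypothetical name)
subset <- parseSubset(commandArgs(trailingOnly = TRUE))
```

The resulting 'subset' could then be passed to extract(csR, subset) as in the snippet above. When no --subset argument is given the helper returns NULL, in which case the script might simply skip the extract() step and process the whole data set.
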
>
> >
> > Once I get all the arrays and have preprocessed them I would like to
> > segment the data using CBS.  The first 10 patients are normal and the
> > last 10 diseased -- i.e. tumor-normal arrays for each sample.
> >
> > However, since I processed each array individually, each would have
> > its own AromaUnitTotalCnBinarySet.  Would I just read each in
> > individually, and then manipulate it in order to create the necessary
> > matching of normal over tumor needed for the CBS algorithm?
>
> So, with the above suggestion of mine, this will not be an issue.
>
> (FYI, one can use append() to merge data sets).
>
> >
> > If down the road we get another 20 arrays, again with tumor-normal
> > samples, how would I integrate these new arrays with my previous
> > arrays?
>
> As above.
>
> /Henrik
>
> > Just create additional directories:
> >
> > --rawData/
> >         --patient21/GenomeWideSNP_6/
> >         --patient22/GenomeWideSNP_6/
> >         --patient23/GenomeWideSNP_6/
> >         ...
> >         --patient40/GenomeWideSNP_6/
> >
> >
> > I hope I explained my question reasonably clearly.
> >
> > Thanks, Greg
> >
> >
> > --
> > When reporting problems on aroma.affymetrix, make sure 1) to run the
> > latest version of the package, 2) to report the output of sessionInfo()
> > and traceback(), and 3) to post a complete code example.
> >
> >
> > You received this message because you are subscribed to the Google Groups
> > "aroma.affymetrix" group with website http://www.aroma-project.org/.
> > To post to this group, send email to aroma-affymetrix@googlegroups.com
> > To unsubscribe and other options, go to http://www.aroma-project.org/forum/
> >
