Hi.

On Thu, Dec 2, 2010 at 10:24 AM, Kai <wangz...@gmail.com> wrote:
> Hi Henrik,
>
> I was trying to run segmentation on a large set of ~300 SNP genotyping
> array profiles. Currently I am loading all the profiles into aroma and
> run the segmenter on them one by one, which takes a really long time.
> I was wondering whether there is a way to break the computation into
> parts and run all the parts simultaneously. Specifically:
>
> 1) Is there a way to load only a subset of profiles in a project into
> a "AromaUnitTotalCnBinarySet"?

It is true that ds <- AromaUnitTotalCnBinarySet$byName(...) will set up
a set containing all data files.  However, after that you can always
subset it using extract(), e.g. ds <- extract(ds, 1:5);

> 2) Is there a way to run segmentation on single, or a subset of
> profiles in a "AromaUnitTotalCnBinarySet"?

Yes, either by subsetting from the start as above, or simply by
specifying the 'arrays' argument to fit()/process() [below].  Note
that the latter is the more convenient approach, for two reasons.
First, if the reference used to calculate CN ratios is the robust
average across all samples, then the model still sees the complete
data set, so you do not have to worry about getting the reference
correct.  (This is not a problem in your particular case because you
use the cbs$.calculateRatios <- FALSE feature.)  Second, the final
ChromosomeExplorer HTML page correctly lists all samples.  If you
instead subset immediately after setting up the data set (as above),
then you basically have to make sure to process() one
ChromosomeExplorer on all arrays at the end.

> 3) Assuming that one can generate the CBS segmentation model on single
> profiles, is there a way to load them together back in to aroma and
> pass to the "ChromosomeExplorer"?

Yes. The simplest way is to run what you are doing below once
at the very end.  Already segmented and processed samples will be
skipped, as will already generated image files etc.

>
> The current process I have (which performs segmentation one sample at
> a time) is implemented as follows:
>
> # segment by CBS model
> ds = AromaUnitTotalCnBinarySet$byName("dataset,paired",chipType="HumanOmni1-Quad");
>
> cbs = CbsModel(ds);
> cbs$.calculateRatios = FALSE;
>
> fit(cbs, chromosomes=c(1:23), min.width=5, undo.splits="sdundo",
> undo.SD=1, verbose=2);
>
> # display data and segmentation in ChromosomeExplorer
> ce = ChromosomeExplorer(cbs);
> process(ce,chromosomes=c(1:23));
>
> Any other suggestion you may provide is also highly appreciated. Thank
> you very much.

library("aroma.core");
verbose <- Arguments$getVerbose(-4, timestamp=TRUE);

# Arrays that your host should process
arrays <- 5:8;  # Change for different hosts

# Setup the complete data set
ds <- AromaUnitTotalCnBinarySet$byName("dataset,paired", chipType="HumanOmni1-Quad");

# Setup the "complete" segmentation model
cbs <- CbsModel(ds, min.width=5, undo.splits="sdundo", undo.SD=1);
cbs$.calculateRatios <- FALSE;

# Fit a subset of the arrays
fit(cbs, arrays=arrays, chromosomes=c(1:23), verbose=verbose);

# Setup the "complete" ChromosomeExplorer, ...
ce <- ChromosomeExplorer(cbs);

# ...but generate PNG image files only for a subset
process(ce, arrays=arrays, chromosomes=c(1:23), verbose=verbose);

Note that you actually do not have to call fit() explicitly, because
process() will do it implicitly (with the same arguments).
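
If you have many hosts, it may be easier to derive 'arrays' from a host
index than to edit the arrays <- ... line by hand on each one.  Below is a
minimal base-R sketch of that idea.  Note that chunkForHost() and its
arguments (nbrOfArrays, nbrOfHosts, hostIdx) are hypothetical names made up
for this illustration; they are not part of any aroma.* package.

```r
# Hypothetical helper (not part of aroma.*): return the array indices
# that host 'hostIdx' out of 'nbrOfHosts' hosts should process.
chunkForHost <- function(nbrOfArrays, nbrOfHosts, hostIdx) {
  stopifnot(nbrOfArrays >= 1, nbrOfHosts >= 1,
            hostIdx >= 1, hostIdx <= nbrOfHosts);
  idxs <- seq_len(nbrOfArrays);
  # Assign each array to a host in contiguous, near-equal blocks
  host <- ceiling(idxs * nbrOfHosts / nbrOfArrays);
  idxs[host == hostIdx];
}

# Example: 300 arrays spread over 4 hosts; this is host 2
arrays <- chunkForHost(300, 4, 2);
```

The resulting 'arrays' vector can then be passed to fit()/process() exactly
as in the script above.  Every array ends up on exactly one host, so the
final run across all arrays should find everything already processed.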

If you want to call the above script from the command line, then
replace the arrays <- ... line with

arrays <- commandArgs(asValue=TRUE)$arrays;
arrays <- eval(parse(text=arrays));
arrays <- Arguments$getIndices(arrays);

and then you can run the above by:

R --no-save --args --arrays=1:5 < script.R
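
As an aside, eval(parse(text=...)) evaluates whatever R expression is given
on the command line.  If you would rather accept only plain index
specifications, a stricter parser could look something like the sketch
below.  It uses base R only; parseArrays() is a hypothetical helper written
for this example, not a function in aroma.* or R.utils, and the accepted
syntax ("1:5", "3", "1:5,9,12:14") is an assumption of the sketch.

```r
# Hypothetical helper (not part of aroma.* or R.utils): parse an index
# specification such as "1:5", "3" or "1:5,9,12:14" into an integer vector.
parseArrays <- function(spec) {
  parts <- strsplit(spec, ",", fixed=TRUE)[[1]];
  idxs <- lapply(parts, function(part) {
    bounds <- suppressWarnings(as.integer(strsplit(part, ":", fixed=TRUE)[[1]]));
    # Each part must be a single index or a from:to pair of integers
    stopifnot(length(bounds) %in% 1:2, !anyNA(bounds));
    if (length(bounds) == 1) bounds else seq(bounds[1], bounds[2]);
  });
  unlist(idxs);
}

# Pick up "--arrays=<spec>" from whatever follows --args on the command line
args <- commandArgs(trailingOnly=TRUE);
arg <- grep("^--arrays=", args, value=TRUE);
if (length(arg) == 1) {
  arrays <- parseArrays(sub("^--arrays=", "", arg));
}
```

Anything that is not a comma-separated list of integers or from:to ranges
gives an error instead of being evaluated.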

Hope this helps

Henrik

>
> Best,
> Kai
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

