[aroma.affymetrix] Re: Clarification of Raw CN estimates and CBS mean

2011-12-12 Thread Greg Wall
CORRECTION:

ref.theta <- extractMatrix(Normal)
theta <- extractMatrix(Tumor)
C <- 2 * theta/ref.theta


On Mon, Dec 12, 2011 at 1:09 PM, Gregory W  wrote:

> Hello,
>
> Many thanks for the site and quick feedback to the discussion board.
>
> I was hoping to get a little clarification about the difference
> between these two statistics: Raw CN and CBS mean.
>
> After running CBS I get the following data frame:
>
> cbs  <- CbsModel(tumor, normal, min.width=5, alpha = .01)
> fit(cbs, min.width=5, verbose=0, force=TRUE)
> regions <- getRegions(cbs)
> regions <- regions[[1]][,1:5]
> head(regions)
>
>   chromosome    start     stop    mean count
> 1          1    51599 25455929  0.0870 14665
> 2          1 25465716 25519534 -1.2750    29 ***
> 3          1 25519574 26308518  0.0994   369
> 4          1 26313794 27104892  0.4272   336
> 5          1 27108799 38042779  0.1022  6161
> 6          1 38044082 38063269 -0.6952     8
>
>
> Where column "mean" is the result from DNAcopy segmentation for the
> particular region -- which is just the mean log-ratio-value of all
> probes in the region.  And I'm assuming that since I ran doCRMAv2 on
> the data, the CBS code above is performed on normalized "tumor" and
> "normal" data.
>
>
> Focusing on region 2, shown above with trailing ***, if I run:
>
> ref.theta <- extractMatrix(ref.average)
> theta <- extractMatrix(file.group)
> C  <- 2 * theta/ref.theta
>
> And then calculate the mean raw CN using matrix C for the same probes
> in region 2 used to calculate the CBS mean of -1.2750: does this CN
> mean relate in any way to the CBS mean?  I realize the log ratio is
> logged and centered around 0 for the segmented data, whereas CN is
> centered around 2, but aside from simple transformations can you help me
> understand the difference between these two statistics?
>
> Does the mean CN for probes included in a particular CBS-segmented
> region have any value?
>
> Many thanks!
> Greg
>
>
>
>
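
A minimal sketch of how the two statistics in the question relate, assuming the segment mean is the average of log2(theta/ref.theta) over the probes in the region. The signals below are simulated for illustration (they are not data from this thread), and names such as meanM and cnFromMeanM are made up for the example. Under that assumption, back-transforming the CBS mean gives the geometric mean of C, which can never exceed the arithmetic mean of C over the same probes.

```r
# Simulated example only: 29 probes with log2 ratios scattered
# around -1.275, mimicking the size of region 2 in the question.
set.seed(1)
nProbes   <- 29
ref.theta <- rep(1000, nProbes)                 # hypothetical reference signals
M         <- rnorm(nProbes, mean = -1.275, sd = 0.3)
theta     <- ref.theta * 2^M                    # hypothetical tumor signals

# Raw copy numbers, as in the message above
C <- 2 * theta / ref.theta

# Mean of the log2 ratios (what a segment mean is assumed to be here)
meanM <- mean(log2(theta / ref.theta))

# Arithmetic mean of the raw CNs for the same probes
meanC <- mean(C)

# Back-transforming the mean log2 ratio gives the geometric mean of C
cnFromMeanM <- 2 * 2^meanM

print(c(meanM = meanM, meanC = meanC, cnFromMeanM = cnFromMeanM))
```

The gap between meanC and cnFromMeanM grows with the spread of the probe-level log ratios, so the two summaries agree closely only when the probes in the segment are tightly clustered.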

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/


Re: [aroma.affymetrix] Parallel Processing

2010-12-22 Thread Greg Wall
Thank you.  It's working nicely.



On Tue, Dec 21, 2010 at 10:16 AM, Henrik Bengtsson <
henrik.bengts...@gmail.com> wrote:

> Hi.
>
> On Fri, Dec 17, 2010 at 1:33 PM, Gregory W  wrote:
> > Hello,
> >
> > Thanks for aroma.affymetrix and this helpful site.
> >
> > I was hoping to get some general advice. I know CRMAv2 is a single
> > array method and thus makes processing different arrays in parallel
> > possible.
> >
> > I was wondering how you would setup the rawData and annotationData
> > directories when doing multi-staged analysis.
> >
> > For instance, for my project I have 20 patients.  I will be getting
> > the data for the first 10 patients immediately, and then the data from
> > the remaining 10 patients a few weeks later.  All experiments will be
> > performed on the same chipType.
> >
> > I was thinking on this structure:
> >
> > --rawData/
> >   --patient01/GenomeWideSNP_6/
> >   --patient02/GenomeWideSNP_6/
> >   --patient03/GenomeWideSNP_6/
> >   ...
> >   --patient20/GenomeWideSNP_6/
>
> Even if they come singly, I would treat those samples as being part of
> the same data set, e.g.
>
>  rawData/MyDataSet/GenomeWideSNP_6/
>
> A rule of thumb: set it up the way you would if you were to redo the
> same analysis in the future.
>
> >
> > And then preprocessing each patient separately in an R session with:
> >   dataSet <- "patient01";
> >   chipType <- "GenomeWideSNP_6";
> >   cdf <- AffymetrixCdfFile$byChipType(chipType, tags="Full");
> >   dsList <- doCRMAv2(dataSet, cdf=cdf, combineAlleles=FALSE, verbose=verbose);
>
> So, if you add them all to the same data set as they come in, just
> rerun the above; already processed arrays will be detected and
> "skipped".  The 'dsList' at the end will contain all arrays that
> currently exist in the data set.
>
> FYI, there is doASCRMAv2(), so that you do not have to specify
> 'combineAlleles', i.e.
>
> dsList <- doASCRMAv2(dataSet, cdf=cdf, verbose=verbose);
>
> Note also that you can do:
>
> csR <- AffymetrixCelSet$byName("MyDataSet", cdf=cdf);
> dsList <- doASCRMAv2(csR, verbose=verbose);
>
> which is the same but more explicit, and you have the option to subset
> 'csR'.  That is, if you want to process different arrays on different
> machines (which is what the subject of your message indicates), then
> you can do for instance:
>
> subset <- c(5,6,7);
> csR <- AffymetrixCelSet$byName("MyDataSet", cdf=cdf);
> csR <- extract(csR, subset);
> dsList <- doASCRMAv2(csR, verbose=verbose);
>
> If you batch process this, you can pass command line arguments to your
> script and use commandArgs() to get them, i.e. you can set 'subset'
> this way.
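
A minimal sketch of the commandArgs() approach described above, using base R only. The `--subset=` argument format and the parseSubset() helper are assumptions made for this illustration; they are not part of aroma.affymetrix.

```r
# Hypothetical helper: read array indices from an argument such as
# "--subset=5,6,7".  Returns NULL when no such argument was given.
parseSubset <- function(args = commandArgs(trailingOnly = TRUE)) {
  arg <- grep("^--subset=", args, value = TRUE)
  if (length(arg) == 0L) return(NULL)
  # Drop the prefix and split the comma-separated indices
  values <- sub("^--subset=", "", arg[1L])
  as.integer(strsplit(values, ",", fixed = TRUE)[[1L]])
}

subset <- parseSubset()

# The indices would then be used as in the message above
# (requires aroma.affymetrix and an existing 'cdf' object):
#   csR <- AffymetrixCelSet$byName("MyDataSet", cdf=cdf);
#   if (!is.null(subset)) csR <- extract(csR, subset);
#   dsList <- doASCRMAv2(csR, verbose=verbose);
```

Saved as, say, process.R (a made-up file name), this could be launched on each machine with a different index list, e.g. `Rscript process.R --subset=5,6,7`.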
>
> >
> > Once I get all the arrays and have preprocessed them I would like to
> > segment the data using CBS.  The first 10 patients are normal and the
> > last 10 diseased -- i.e. tumor-normal arrays for each sample.
> >
> > However, since I processed each array individually each would have
> > their own AromaUnitTotalCnBinarySet.  Would I just read each in
> > individually, and then manipulate it in order to create the necessary
> > matching of normal over tumor needed for the CBS algorithm?
>
> So, with the above suggestion of mine, this will not be an issue.
>
> (FYI, one can use append() to merge data sets).
>
> >
> > If down the road we get another 20 arrays, again with tumor-normal
> > samples, how would I integrate these new arrays with my previous
> > arrays?
>
> As above.
>
> /Henrik
>
> > Just create additional directories:
> >
> > --rawData/
> >   --patient21/GenomeWideSNP_6/
> >   --patient22/GenomeWideSNP_6/
> >   --patient23/GenomeWideSNP_6/
> >   ...
> >   --patient40/GenomeWideSNP_6/
> >
> >
> > I hope I explained my question reasonably clearly.
> >
> > Thanks, Greg
> >
> >
