Thank you.  Much appreciated.  /Henrik

On Fri, Dec 2, 2011 at 1:23 PM, Steven McKinney wrote:
> This indeed does help get me started.
> I have one follow-up question:
> How do I nominate you for Sainthood?
> Thanks very much for your excellent package
> and valuable guidance.
> Steven McKinney
>> -----Original Message-----
>> From: Henrik Bengtsson
>> Sent: December-02-11 12:49 PM
>> Subject: Re: [aroma.affymetrix] Combining data from multiple chip types
>> Hi.
>> Yes, the Aroma framework can handle this.
>> On Fri, Dec 2, 2011 at 12:19 PM, Steven McKinney wrote:
>> > Hi all,
>> >
>> > I am running an analysis on Affymetrix SNP6, 250K Nsp and 250K Sty chip types.
>> > For various reasons, patient samples were assessed either on SNP6 chips or
>> > on 500K chipsets (250K Nsp and 250K Sty).  To further complicate things,
>> > an occasional 250K Nsp chip processing failed, so some patients have data
>> > only on a 250K Sty chip.
>> Ok, so each sample is processed on one of:
>> 1. GenomeWideSNP_6
>> 2. Mapping250K_Sty only (when the Nsp chip failed)
>> 3. Mapping250K_Sty & Mapping250K_Nsp
>> >
>> > I see on the web page
>> >
>> >
>> >
>> > the description
>> >
>> > * Paired & non-paired copy-number analysis: All generations, i.e. 10K,
>> >   100K, 500K, 5.0 & 6.0. CBS & GLAD * segmentation methods.  Combine data
>> >   from multiple chip types.
>> >
>> >
>> > My question is, at what point can data from multiple chip types be combined?
>> >
>> > As I start my aroma.affymetrix analytic pipeline (shown below), I first process the
>> > GenomeWideSNP_6 chips, then the 250K Nsp, then the 250K Sty.  Is this appropriate,
>> > or is there a way to combine processing of all chip types from the start?
>> >
>> > If not from the start, at what step can I combine data?
>> You can safely preprocess the different chip types independently.  For
>> simplicity, use doCRMAv2();
>> Note the argument 'plm'.  Also, as mentioned, if you are interested in
>> allele-specific analysis (e.g. LOH), use doASCRMAv2() in place of
>> doCRMAv2().
>> It is only at the segmentation step that you need to care about merging
>> chip types.  The segmentation model classes of the Aroma framework (e.g.
>> CbsModel) take care of the merging by simply interweaving the
>> loci/total CN estimates from multiple chip types (if such are
>> available for the sample currently being segmented).  Using
>> do[AS]CRMAv2(), you will basically get an AromaUnitTotalCnBinarySet
>> for each chip type.  If you place those in an R list, e.g.
>> dsList <- list()
>> dsList[["GenomeWideSNP_6"]] <- doCRMAv2(..., chipType="GenomeWideSNP_6")
>> dsList[["Mapping250K_Nsp"]] <- doCRMAv2(..., chipType="Mapping250K_Nsp",
>>                                         plm="RmaPlm")
>> dsList[["Mapping250K_Sty"]] <- doCRMAv2(..., chipType="Mapping250K_Sty",
>>                                         plm="RmaPlm")
>> you can simply do
>> sm <- CbsModel(dsList);
>> and proceed as illustrated in vignette 'Total copy-number segmentation
>> (non-paired CBS)'.
>> This idea of merging chip types is also used in the vignette 'Total
>> copy number analysis using CRMA v1 (10K, 100K, 500K)'.
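
To make the "interweaving" idea above concrete, here is a minimal base-R sketch. It is not how CbsModel is implemented internally; it only illustrates the concept of pooling loci from several chip types for one sample and ordering them by genomic position before segmentation. The positions, CN values and the object name cnList are made up for illustration.

```r
# Toy total CN estimates for ONE sample on ONE chromosome, per chip type.
# All positions and values are invented for illustration only.
cnList <- list(
  GenomeWideSNP_6 = data.frame(position = c(100, 400, 700), cn = c(2.1, 1.9, 3.0)),
  Mapping250K_Nsp = data.frame(position = c(250, 550),      cn = c(2.0, 2.9)),
  Mapping250K_Sty = data.frame(position = c(300, 650),      cn = c(1.8, 3.1))
)

# Stack the loci from all chip types, keeping track of their origin ...
merged <- do.call(rbind, lapply(names(cnList), function(chipType) {
  data.frame(chipType = chipType, cnList[[chipType]], stringsAsFactors = FALSE)
}))

# ... and interweave them by genomic position.
merged <- merged[order(merged$position), ]
rownames(merged) <- NULL
print(merged)
```

The merged table is what a segmentation method would then see: a denser set of loci along the chromosome than any single chip type provides.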
>> What you need to be careful about is how your array files are named,
>> because that is key for CbsModel to be able to identify which array
>> files map to the same sample/individual.  This is also mentioned in the
>> "CRMAv1" vignette.  Note that you do not physically have to rename
>> your array/CEL files.  Instead you can use so-called fullname
>> translators, cf. how-to page 'How to: Use fullname translators to
>> rename data files'.  These can
>> be applied after doing preprocessing (e.g. CRMAv2), so you don't have
>> to worry about that until segmentation.
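
As a rough sketch of what such a translator could look like, the base-R function below maps chip-specific file names onto a common sample name. The file naming scheme (a "_SNP6", "_Nsp" or "_Sty" suffix on the patient ID) and the names stripChipSuffix and sampleNames are assumptions for illustration; adapt the regular expression to your own file names. The only aroma convention relied on is that the sample name is the part of the fullname before the first comma, with tags following it.

```r
# Hypothetical CEL file fullnames (name before the first comma, tags after).
fullnamesNsp <- c("PT003_Nsp,batchB", "PT004_Nsp,batchB")
fullnamesSty <- c("PT003_Sty,batchB", "PT004_Sty,batchB", "PT005_Sty,batchC")

# Translator: drop the chip-specific suffix from the name part, keep the tags.
stripChipSuffix <- function(names, ...) {
  sub("_(SNP6|Nsp|Sty)(,|$)", "\\2", names)
}

# Helper: the sample name is everything before the first comma.
sampleNames <- function(fullnames) sub(",.*$", "", fullnames)

# After translation, arrays of the same patient share one sample name.
namesNsp <- sampleNames(stripChipSuffix(fullnamesNsp))
namesSty <- sampleNames(stripChipSuffix(fullnamesSty))
print(intersect(namesNsp, namesSty))  # patients with both Nsp and Sty data
print(setdiff(namesSty, namesNsp))    # patients with Sty data only

# In aroma.affymetrix the function would then be attached to each data set,
# e.g. something like setFullNamesTranslator(ds, stripChipSuffix); see the
# how-to page mentioned above for the exact usage.
```

Checking the intersect/setdiff output against your sample sheet before segmentation is a cheap way to confirm that the arrays will be paired up as intended.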
>> Potential problems: In the merging step, there is nothing specific
>> that is done to make sure that the CN estimates from the different
>> chip types to be merged are on the same scale, i.e. same observed CN
>> mean levels for the same underlying/true CN level.  It simply assumes
>> that this has been taken care of by the preprocessing method.  I'd
>> say, small discrepancies are alright because merging will still
>> increase the power to detect change points, which is the number one
>> objective of segmentation methods such as CBS.  If there are large
>> discrepancies (which I doubt you'll see), you may have to normalize CN
>> estimates to be on the same linear scale, cf. vignette 'MSCN:
>> Multi-source copy-number normalization'.  As you can see in the MSCN
>> paper (Bengtsson et al. 2009),
>> bringing estimates onto the same scale improves the power to detect
>> change points compared to not doing so before merging.
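
One simple way to see whether two chip types are on roughly the same scale is to compare the typical CN level they report for the same sample in the same region. The sketch below does this on synthetic log2 ratios with a plain median comparison. This is only a crude diagnostic, not the MSCN method, and the data, the 0.7 compression factor and the object names are all invented for illustration.

```r
# Synthetic log2 CN ratios for the same sample and the same region, assumed
# to be a one-copy gain (true log2 ratio = log2(3/2)).  The second chip type
# is deliberately "compressed" by a factor 0.7 to mimic a scale discrepancy.
truth <- log2(3 / 2)
noise <- function(n) 0.2 * qnorm(ppoints(n))  # deterministic, symmetric noise
m <- list(
  GenomeWideSNP_6 = truth       + noise(200),
  Mapping250K_Sty = 0.7 * truth + noise(101)
)

# Robust location estimate per chip type.
meds <- sapply(m, median)
print(meds)

# Ratio of observed levels; a value close to 1 means the scales agree.
ratio <- meds[["Mapping250K_Sty"]] / meds[["GenomeWideSNP_6"]]
print(ratio)
```

With real data the same comparison would be made on regions where a copy-number change is known or strongly suspected; if the ratio is far from 1, that is the situation where a proper normalization such as MSCN becomes worth the effort.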
>> Hope this helps get you started
>> Henrik
>> >
>> > Any advice, or pointers to documentation on this issue of combining data
>> > from multiple chip types that I have not yet found, would be appreciated.
>> >
>> > Best
>> >
>> > Steve
>> >
>> >
>> > require("aroma.affymetrix")
>> >
>> > log <- verbose <- Arguments$getVerbose(-9, timestamp=TRUE)
>> > ## Don't display too many decimals.
>> > options(digits=5)
>> >
>> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
>> > print(cdf)
>> >
>> > gi <- getGenomeInformation(cdf)
>> > print(gi)
>> >
>> > si <- getSnpInformation(cdf)
>> > print(si)
>> >
>> > acs <- AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
>> > print(acs)
>> >
>> > csR <- AffymetrixCelSet$byName("Primary", cdf = cdf)
>> > print(csR)
>> >
>> > cs <- csR
>> >
>> > par(mar = c(4, 4, 4, 1) + 0.1)
>> > plotDensity(cs, lwd = 2, ylim = c(-0.1, 0.80))
>> > stext(side = 3, pos = 0, getFullName(cs))
>> > filename <- sprintf("%s,%s,plotDensity.pdf", getFullName(cs), getChipType(cs))
>> > dev.print(pdf, file = filename, width = 7, height = 5)
>> >
>> > ### 500K
>> >
>> >
>> > cdf5N <- AffymetrixCdfFile$byChipType("Mapping250K_Nsp")
>> > print(cdf5N)
>> >
>> > gi5N <- getGenomeInformation(cdf5N)
>> > print(gi5N)
>> >
>> > si5N <- getSnpInformation(cdf5N)
>> > print(si5N)
>> >
>> > acs5N <- AromaCellSequenceFile$byChipType(getChipType(cdf5N, fullname = FALSE))
>> > print(acs5N)
>> >
>> > csR5N <- AffymetrixCelSet$byName("Primary", cdf = cdf5N)
>> > print(csR5N)
>> >
>> > cs5N <- csR5N
>> >
>> > par(mar = c(4, 4, 4, 1) + 0.1)
>> > plotDensity(cs5N, lwd = 2, ylim = c(-0.1, 0.80))
>> > stext(side = 3, pos = 0, getFullName(cs5N))
>> > filename5N <- sprintf("%s,%s,plotDensity.pdf", getFullName(cs5N), getChipType(cs5N))
>> > dev.print(pdf, file = filename5N, width = 7, height = 5)
>> >
>> > . etc.
>> >
>> >
>> >
>> > Steven McKinney, Ph.D.
>> >
>> > Statistician
>> > Molecular Oncology and Breast Cancer Program
>> > British Columbia Cancer Research Centre
>> >
>> > email: smckinney +at+ bccrc +dot+ ca
>> >
>> >
>> > BCCRC
>> > Molecular Oncology
>> > 675 West 10th Ave, Floor 4
>> > Vancouver B.C.
>> > V5Z 1L3
>> > Canada
>> >
>> > --
>> > When reporting problems on aroma.affymetrix, make sure 1) to run the
>> latest version of the package, 2) to report the output of sessionInfo() and
>> traceback(), and 3) to post a complete code example.
>> >
>> >
>> > You received this message because you are subscribed to the Google Groups
>> "aroma.affymetrix" group with website
>> > To post to this group, send email to
>> > To unsubscribe and other options, go to http://www.aroma-
