Thank you. Much appreciated.

/Henrik

On Fri, Dec 2, 2011 at 1:23 PM, Steven McKinney <smckin...@bccrc.ca> wrote:
> This indeed does help get me started.
>
> I have one follow-up question:
> How do I nominate you for Sainthood?
>
> Thanks very much for your excellent package
> and valuable guidance.
>
>
> Steven McKinney
>
>
>> -----Original Message-----
>> From: aroma-affymetrix@googlegroups.com [mailto:aroma-affymet...@googlegroups.com] On Behalf Of Henrik Bengtsson
>> Sent: December-02-11 12:49 PM
>> To: aroma-affymetrix@googlegroups.com
>> Subject: Re: [aroma.affymetrix] Combining data from multiple chip types
>>
>> Hi.
>>
>> Yes, the Aroma framework can handle this.
>>
>> On Fri, Dec 2, 2011 at 12:19 PM, Steven McKinney <smckin...@bccrc.ca> wrote:
>> > Hi all,
>> >
>> > I am running an analysis on Affymetrix SNP6, 250K Nsp and 250K Sty chip types.
>> > For various reasons, patient samples were assessed either on SNP6 chips or
>> > on 500K chipsets (250K Nsp and 250K Sty). To further complicate things,
>> > an occasional 250K Nsp chip processing failed, so some patients have data
>> > only on a 250K Sty chip.
>>
>> Ok, so each sample is processed on one of:
>>
>> 1. GenomeWideSNP_6
>> 2. Mapping250K_Sty
>> 3. Mapping250K_Sty & Mapping250K_Nsp
>>
>> >
>> > I see on the web page
>> >
>> > http://www.aroma-project.org/features
>> >
>> > the description
>> >
>> > COPY-NUMBER ANALYSIS:
>> > * Paired & non-paired copy-number analysis: All generations, i.e. 10K, 100K, 500K, 5.0 & 6.0. CBS & GLAD * segmentation methods. Combine data from multiple chip types.
>> >
>> >
>> > My question is, at what point can data from multiple chip types be combined?
>> >
>> > As I start my aroma.affymetrix analytic pipeline (shown below), I first process the
>> > GenomeWideSNP_6 chips, then the 250K Nsp, then the 250K Sty. Is this appropriate,
>> > or is there a way to combine processing of all chip types from the start?
>> >
>> > If not from the start, at what step can I combine data?
>>
>> You can safely preprocess the different chip types independently. For
>> simplicity, use doCRMAv2():
>>
>> http://aroma-project.org/blocks/doCRMAv2
>>
>> Note argument 'plm'. Also, as mentioned, if you are interested in
>> allele-specific analysis (e.g. LOH), use doASCRMAv2() in place of
>> doCRMAv2().
>>
>> It is at the segmentation step that you need to care about merging chip
>> types. The segmentation model classes of the Aroma framework (e.g.
>> CbsModel) will take care of the merging by simply interweaving the
>> loci/total CN estimates from multiple chip types (if such are
>> available for the sample currently being segmented). Using
>> do[AS]CRMAv2(), you will basically get an AromaUnitTotalCnBinarySet
>> for each chip type. If you place those in an R list, e.g.
>>
>> dsList <- list();
>> dsList[["GenomeWideSNP_6"]] <- doCRMAv2(..., chipType="GenomeWideSNP_6");
>> dsList[["Mapping250K_Nsp"]] <- doCRMAv2(..., chipType="Mapping250K_Nsp", plm="RmaPlm");
>> dsList[["Mapping250K_Sty"]] <- doCRMAv2(..., chipType="Mapping250K_Sty", plm="RmaPlm");
>>
>> you can simply do
>>
>> sm <- CbsModel(dsList);
>>
>> and proceed as illustrated in vignette 'Total copy-number segmentation
>> (non-paired CBS)' [http://aroma-project.org/vignettes/NonPairedCBS].
>> This idea of merging chip types is also used in vignette 'Vignette:
>> Total copy number analysis using CRMA v1 (10K, 100K, 500K)'
>> [http://aroma-project.org/vignettes/CRMAv1].
>>
>> What you need to be careful about is how your array files are named,
>> because that is key for CbsModel to be able to identify which array
>> files map to the same sample/individual. This is also mentioned in the
>> "CRMAv1" vignette. Note that you do not physically have to rename
>> your array/CEL files. Instead you can utilize so-called full-name
>> translators, cf.
>> how-to page 'How to: Use fullname translators to rename data files'
>> [http://aroma-project.org/howtos/setFullNamesTranslator]. These can
>> be applied after doing preprocessing (e.g. CRMAv2), so you don't have
>> to worry about that until segmentation.
>>
>>
>> Potential problems: In the merging step, nothing specific is done to
>> make sure that the CN estimates from the different chip types to be
>> merged are on the same scale, i.e. have the same observed CN mean
>> levels for the same underlying/true CN level. It simply assumes
>> that this has been taken care of by the preprocessing method. I'd
>> say small discrepancies are alright, because merging will still
>> increase the power to detect change points, which is the number one
>> objective of segmentation methods such as CBS. If there are large
>> discrepancies (which I doubt you'll see), you may have to normalize CN
>> estimates to be on the same linear scale, cf. vignette 'MSCN:
>> Multi-source copy-number normalization'
>> [http://aroma-project.org/vignettes/MSCN]. As you can see in the MSCN
>> paper (Bengtsson et al. 2009; http://aroma-project.org/publications/),
>> bringing estimates onto the same scale improves the power to detect
>> change points compared to not doing so before merging.
>>
>> Hope this helps get you started
>>
>> Henrik
>>
>> >
>> > Any advice, or pointers to documentation on this issue of combining data from multiple chip types that
>> > I have not yet found, would be appreciated.
>> >
>> > Best
>> >
>> > Steve
>> >
>> >
>> > require("aroma.affymetrix")
>> >
>> > log <- verbose <- Arguments$getVerbose(-9, timestamp=TRUE)
>> > ## Don't display too many decimals.
>> > options(digits=5)
>> >
>> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
>> > print(cdf)
>> >
>> > gi <- getGenomeInformation(cdf)
>> > print(gi)
>> >
>> > si <- getSnpInformation(cdf)
>> > print(si)
>> >
>> > acs <- AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
>> > print(acs)
>> >
>> > csR <- AffymetrixCelSet$byName("Primary", cdf = cdf)
>> > print(csR)
>> >
>> > cs <- csR
>> >
>> > par(mar = c(4, 4, 4, 1) + 0.1)
>> > plotDensity(cs, lwd = 2, ylim = c(-0.1, 0.80))
>> > stext(side = 3, pos = 0, getFullName(cs))
>> > filename <- sprintf("%s,%s,plotDensity.pdf", getFullName(cs), getChipType(cs))
>> > dev.print(pdf, file = filename, width = 7, height = 5)
>> >
>> > ### 500K
>> >
>> >
>> > cdf5N <- AffymetrixCdfFile$byChipType("Mapping250K_Nsp")
>> > print(cdf5N)
>> >
>> > gi5N <- getGenomeInformation(cdf5N)
>> > print(gi5N)
>> >
>> > si5N <- getSnpInformation(cdf5N)
>> > print(si5N)
>> >
>> > acs5N <- AromaCellSequenceFile$byChipType(getChipType(cdf5N, fullname = FALSE))
>> > print(acs5N)
>> >
>> > csR5N <- AffymetrixCelSet$byName("Primary", cdf = cdf5N)
>> > print(csR5N)
>> >
>> > cs5N <- csR5N
>> >
>> > par(mar = c(4, 4, 4, 1) + 0.1)
>> > plotDensity(cs5N, lwd = 2, ylim = c(-0.1, 0.80))
>> > stext(side = 3, pos = 0, getFullName(cs5N))
>> > filename5N <- sprintf("%s,%s,plotDensity.pdf", getFullName(cs5N), getChipType(cs5N))
>> > dev.print(pdf, file = filename5N, width = 7, height = 5)
>> >
>> > . etc.
>> >
>> >
>> >
>> > Steven McKinney, Ph.D.
>> >
>> > Statistician
>> > Molecular Oncology and Breast Cancer Program
>> > British Columbia Cancer Research Centre
>> >
>> > email: smckinney +at+ bccrc +dot+ ca
>> >
>> >
>> > BCCRC
>> > Molecular Oncology
>> > 675 West 10th Ave, Floor 4
>> > Vancouver B.C.
>> > V5Z 1L3
>> > Canada
>> >
>> > --
>> > When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example.
>> >
>> >
>> > You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/.
>> > To post to this group, send email to aroma-affymetrix@googlegroups.com
>> > To unsubscribe and other options, go to http://www.aroma-project.org/forum/
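
Editor's note on the merging step discussed in this thread: the reply says that the segmentation models "interweave" loci from the chip types available for each sample, matched by sample name. The toy sketch below illustrates that idea in plain base R only. It is not aroma code and does not reproduce what CbsModel does internally. The function name interweaveLoci(), the data layout, the sample names P1-P3, and all numbers are made up for illustration.

```r
## Toy illustration only (hypothetical data and helper, not the aroma API):
## per-chip-type total CN estimates, keyed by sample name.
cnByChipType <- list(
  GenomeWideSNP_6 = data.frame(
    sample = c("P1", "P1"), position = c(100, 300), cn = c(2.1, 1.9),
    stringsAsFactors = FALSE
  ),
  Mapping250K_Nsp = data.frame(
    sample = c("P2", "P2"), position = c(200, 400), cn = c(2.0, 3.1),
    stringsAsFactors = FALSE
  ),
  Mapping250K_Sty = data.frame(
    sample = c("P2", "P2", "P3"), position = c(150, 350, 250),
    cn = c(2.2, 2.9, 1.0),
    stringsAsFactors = FALSE
  )
)

## Collect the loci of one sample from every chip type that has data for it,
## then order them by genomic position so they are interleaved.
interweaveLoci <- function(cnByChipType, sampleName) {
  parts <- lapply(names(cnByChipType), function(chipType) {
    df <- cnByChipType[[chipType]]
    keep <- df$sample == sampleName
    data.frame(
      chipType = rep(chipType, sum(keep)),
      position = df$position[keep],
      cn = df$cn[keep],
      stringsAsFactors = FALSE
    )
  })
  merged <- do.call(rbind, parts)
  merged <- merged[order(merged$position), , drop = FALSE]
  rownames(merged) <- NULL
  merged
}

## P2 was run on both 250K arrays, so its loci come from two chip types.
print(interweaveLoci(cnByChipType, "P2"))
## P3 only has Sty data (the failed-Nsp case from the question).
print(interweaveLoci(cnByChipType, "P3"))
```

The sketch also shows why consistent sample names matter: a sample whose name differs between chip types (say "P2" versus "P2_Sty") would not be matched, which is the problem the fullname translators mentioned above are meant to address.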