On Sun, Jan 15, 2012 at 11:20 AM, Henrik Bengtsson <henrik.bengts...@aroma-project.org> wrote: > On Sat, Jan 14, 2012 at 10:02 PM, Hoon Kim <wise...@gmail.com> wrote: >> Dear Henrik, >> >> Thank you very much for your time to reply to my questions, and your >> suggestions worked very well. By following your advice, I successfully >> evaluated LOHs for all samples. >> >> I wonder if I can ask you two more questions, one related to your previous >> email and the other related to the output of PSCBS. >> >> 1. A question related to your previous email: >> ------------------A part of your previous email-------- >> # Estimate the LOH *threshold* from autosomal chromosomes alone >> DeltaLOH <- estimateDeltaLOH(extractChromosomes(fit, 1:22)); >> # Call LOH >> fitC <- callLOH(fit, delta=DeltaLOH); >> ----------------------------------------------------------------- >> In this case, do I also have to estimate the Alleleic Balance *threshold* >> (deltaAB) from autosomal chromosomes alone in order to do "callAB"? > > You could do that, but since the AB caller relies on weaker > assumptions it is more likely to work well anyway. The advantage of > estimating the DeltaAB and DeltaLOH threshold from only the autosomal > chromosomes is that (ceteris paribus) the estimates will be the same > regardless of the germline being XX or XY; include X and Y in those > estimates will give a tiny (and negligible) bias between XXs and XYs. > >> >> 2. Please, find the attached file that contains the PSCBS output for two >> patients. In "Patient 1," most lohCalls are "NA" rather than "False," while >> only a few of NAs are shown in "Patient 2." I would greatly appreciate it >> if you let me know why Patient 1 has many NAs rather than "False." >> Unfortunately, I couldn't find a reference where I can find answer myself, >> but I might have missed. > > See argument 'xorCalls' in callLOH(): "If @TRUE, a region already > called AB, will not be called LOH." The basic idea is that if a > segment is already called AB, it cannot be LOH. Thus, if AB = TRUE > then LOH = FALSE, i.e. (AB,LOH)=(TRUE,FALSE). However, we still run > the LOH caller and gets the calls. Now, to distinguish the cases > where the AB and LOH agree from when they disagree, we report either > (AB,LOH)=(TRUE,FALSE) or (TRUE,NA). The former is reported when they > agree and the latter when they disagree. You can always override this > yourself by the logic: LOH[AB] <- FALSE.
Forgot to add that (AB,LOH)=(TRUE,NA) call also be reported when the LOH caller itself could not decide on either TRUE or FALSE, and hence return NA, e.g. when there is too little evidence/data. For your Patient #1, it is likely that the estimate of the DeltaLOH threshold is not correct/sensible, which would explain your (TRUE,FALSE) and (TRUE,NA) calls, as well as the (FALSE,FALSE) and (FALSE,TRUE) calls. /Henrik > > In the next release help("callLOH", package="PSCBS") will clarify this better. > > Cheers, > > Henrik > >> >> Thank you very much. >> >> Best, >> >> Hoon >> >> On Fri, Jan 13, 2012 at 4:31 PM, Henrik Bengtsson >> <henrik.bengts...@aroma-project.org> wrote: >>> >>> Hi, >>> >>> I could reproduce this using the PairedPSCBS object that you sent me; >>> >>> library("PSCBS"); >>> >>> # Load Paired PSCBS results with AB calls already done >>> fit <- loadObject("HoonKim_20120110.Rbin"); >>> fitC <- callLOH(fit); >>> ## Error: muC1atNonAB < muC1atAB is not TRUE >>> >>> # The CN data (figure attached) looks good >>> toPNG("HoonKim_20120110", tags="PSCBS", width=840, aspectRatio=0.8, { >>> plotTracks(fit); }) >>> >>> What causing the problem is basically that nothing is going on in the >>> tumor, that is, it behaves as the germline. No change points were >>> detected on any chromomosome (Chr 1-22, 23 (X), 24 (Y), 25 (M)), >>> resulting in one segment per chromosome. Each such segment/chromosome >>> was then correctly called to be in allelic balance (AB) by callAB() >>> that you did earlier. The exception is ChrY, which is non-AB. The >>> total CNs and the BAFs for Chr24 are probably "garbage", because this >>> sample is an XX person, meaning we only observe noise on ChrY. >>> >>> One obvious solution is to exclude ChrY, either already when running >>> the segmentation, or immediately afterward, e.g. >>> >>> df <- subset(df, chromosome %in% 1:23); >>> fit <- segmentByPairedPSCBS(df, verbose=verbose); >>> fit <- callAB(fit); >>> fit <- callLOH(fit); >>> >>> Alternatively, if you wish to keep ChrY, say for consistency with >>> other samples, you can do as follows: >>> >>> # Estimate the LOH *threshold* from autosomal chromosomes alone >>> DeltaLOH <- estimateDeltaLOH(extractChromosomes(fit, 1:22)); >>> # Call LOH >>> fitC <- callLOH(fit, delta=DeltaLOH); >>> >>> In both cases you will get a warning that DeltaLOH was set to -Inf, >>> which is alright. See below. >>> >>> >>> DETAILS: >>> When calling LOH, we compare (a bootstrap quantile estimate of) the >>> minor CN (C1) of each segment toward a threshold DeltaLOH. See Eqn >>> (7) in Olshen et al. (2011) [http://aroma-project.org/publications/] >>> for details. The main problem in calling LOH is to choose a good >>> DeltaLOH. As explained in the paper, this threshold is sample >>> specific and requires rather strong assumptions and ad hoc estimators. >>> In the Section S1.2 of the Supplementary Materials, we propose on >>> approach to estimate DeltaLOH from data. The idea is to choose >>> DeltaLOH as the midpoint between where we believe the true CN=1 and >>> true CN=0 is. We estimate the true CN=1 (mu1) from all segment that >>> are in AB and with TCN near two. That estimate is not that debatable >>> and works for data sets. However, estimating where the true CN=0 >>> (mu0) is, requires that we have some segments that have a zero minor >>> CN. The method for estimating DeltaLOH is estimateDeltaLOH(). Use it >>> and you will get: >>> >>> DeltaLOH <- estimateDeltaLOH(fit); >>> ## Error: muC1atNonAB < muC1atAB is not TRUE >>> >>> which confirms that it's there the error occurs. BTW, the error is a >>> sanity check asserting that our assumptions are correct and that we >>> can safely go ahead and estimate DeltaLOH. >>> >>> First, when *all* segments are in AB, then all minor CNs are 1 (and so >>> the major CNs). In that case we cannot estimate mu0 and hence not >>> DeltaLOH. It is a non-identifiable setup. However, when *all* >>> segments are AB, we know that none are in LOH. Thus, in that case we >>> can use DeltaLOH = -Inf as a threshold (as in the above workarounds). >>> >>> However, in your case, ChrY is not called AB and hence it will try to >>> estimate DeltaLOH as outlined in the paper. It gets a good estimate >>> of mu1, but not mu0. It has to estimate mu0 based on the single >>> non-AB segment, which is the data on ChrY. This is what it looks like >>> if you turn on the verbose messages: >>> >>> DeltaLOH <- estimateDeltaLOH(fit, verbose=-10); >>> ... >>> Weighted median of (corrected) C1 in allelic balance: 0.993 >>> Smallest C1 among segments not in allelic balance: 1.15 >>> There are 1 segments with in total 295 heterozygous SNPs with this level. >>> ... >>> >>> From this you see that the mu1 = 0.993 and mu0=1.15. However, true >>> CN=0 (mu0) should always be less than true CN=1 (mu1), which is why >>> the sanity check fails. >>> >>> I will discuss with my PSCBS coauthors if it possible to automate the >>> above, e.g. by detecting that ChrY is just noise with signal so that >>> it can be excluded. However, doing that introduces new assumptions, >>> e.g. assuming it is a human genome that is segmented and so on. >>> >>> Thanks for the report >>> >>> Henrik >>> >>> On Tue, Jan 10, 2012 at 6:39 PM, Henrik Bengtsson >>> <henrik.bengts...@aroma-project.org> wrote: >>> > Hi, >>> > >>> > when you say "bad pair", what do you really mean? >>> > >>> > Please send me (offline; not to the mailing list) an example of the >>> > 'fit' object where you get that error message, e.g. >>> > >>> > saveObject(fit, "HoonKim_20120110.Rbin"); >>> > >>> > so that I can try to reproduce this. >>> > >>> > /Henrik >>> > >>> > On Mon, Jan 9, 2012 at 11:37 AM, wisekh6 <wise...@gmail.com> wrote: >>> >> Dear Henrik, >>> >> I have been doing paired PSCBS analysis on a Genome Wide SNP array 6.0 >>> >> data set in order to estimate LOH for 20 pairs of samples. >>> >> The code on the aroma web site successfully estimated LOH for most >>> >> pairs, while it fails to estimate LOH for several bad pairs. >>> >> An error message I got from a bad pair is : >>> >>>fit <- callLOH(fit, verbose=verbose); >>> >> Error: muC1atNonAB < muC1atAB is not TRUE >>> >> >>> >> The code, which is not exactly the same as my actual one, that I use >>> >> is as follows: >>> >> =============================================== >>> >> library("aroma.affymetrix"); >>> >> verbose <- Arguments$getVerbose(-10, timestamp=TRUE); >>> >> library("PSCBS"); >>> >> >>> >> res <- doASCRMAv2(csR, verbose=verbose); >>> >> data <- extractPSCNArray(res$total); >>> >> # Total CNs for the tumor relative to the matched normal >>> >> CT <- 2 * (data[,"total","T"] / data[,"total","N"]); >>> >> # Allele B fractions for the tumor >>> >> betaT <- data[,"fracB","T"]; >>> >> # Allele B fractions for the normal >>> >> betaN <- data[,"fracB","N"]; >>> >> # Get (chromosome, position) annotation data >>> >> ugp <- getAromaUgpFile(res$total); >>> >> chromosome <- ugp[,1,drop=TRUE]; >>> >> x <- ugp[,2,drop=TRUE]; >>> >> >>> >> # Setup data structure for Paired PSCBS >>> >> df <- data.frame(chromosome=chromosome, x=x, CT=CT, betaT=betaT, >>> >> betaN=betaN); >>> >> >>> >> # Run PSCBS >>> >> fit <- segmentByPairedPSCBS(df, verbose=verbose); >>> >> print("...segmentByPairedPSCBS") >>> >> >>> >> fit <- callAB(fit, verbose=verbose); >>> >> print("...callAB") >>> >> >>> >> fit <- callLOH(fit, verbose=verbose); >>> >> print("...callLOH") >>> >> =============================================== >>> >> >>> >> I wonder if you provide me with any suggestion regarding this error >>> >> message. >>> >> >>> >> Thank you, >>> >> >>> >> Best, >>> >> >>> >> Hoon >>> >> >>> >> sessionInfo() is as follows: >>> >> =============================================== >>> >>> sessionInfo() >>> >> >>> >> R version 2.13.0 (2011-04-13) >>> >> Platform: x86_64-unknown-linux-gnu (64-bit) >>> >> >>> >> locale: >>> >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>> >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>> >> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>> >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>> >> [9] LC_ADDRESS=C LC_TELEPHONE=C >>> >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>> >> >>> >> attached base packages: >>> >> [1] splines stats graphics grDevices utils datasets >>> >> methods >>> >> [8] base >>> >> >>> >> other attached packages: >>> >> [1] Hmisc_3.8-3 survival_2.36-5 DNAcopy_1.26.0 >>> >> [4] PSCBS_0.17.1 aroma.affymetrix_2.3.0 affxparser_1.24.0 >>> >> [7] aroma.apd_0.2.0 R.huge_0.3.0 aroma.core_2.3.2 >>> >> [10] aroma.light_1.22.0 matrixStats_0.4.3 R.rsp_0.7.0 >>> >> [13] R.cache_0.5.2 R.filesets_1.1.3 digest_0.5.1 >>> >> [16] R.utils_1.9.6 R.oo_1.8.3 R.methodsS3_1.2.1 >>> >> >>> >> loaded via a namespace (and not attached): >>> >> [1] cluster_1.13.3 grid_2.13.0 lattice_0.19-23 >>> >> =============================================== >>> >> >>> >> -- >>> >> When reporting problems on aroma.affymetrix, make sure 1) to run the >>> >> latest version of the package, 2) to report the output of sessionInfo() >>> >> and >>> >> traceback(), and 3) to post a complete code example. >>> >> >>> >> >>> >> You received this message because you are subscribed to the Google >>> >> Groups "aroma.affymetrix" group with website >>> >> http://www.aroma-project.org/. >>> >> To post to this group, send email to aroma-affymetrix@googlegroups.com >>> >> To unsubscribe and other options, go to >>> >> http://www.aroma-project.org/forum/ >>> >>> -- >>> When reporting problems on aroma.affymetrix, make sure 1) to run the >>> latest version of the package, 2) to report the output of sessionInfo() and >>> traceback(), and 3) to post a complete code example. >>> >>> >>> You received this message because you are subscribed to the Google Groups >>> "aroma.affymetrix" group with website http://www.aroma-project.org/. >>> To post to this group, send email to aroma-affymetrix@googlegroups.com >>> To unsubscribe and other options, go to >>> http://www.aroma-project.org/forum/ >> >> >> -- >> When reporting problems on aroma.affymetrix, make sure 1) to run the latest >> version of the package, 2) to report the output of sessionInfo() and >> traceback(), and 3) to post a complete code example. >> >> >> You received this message because you are subscribed to the Google Groups >> "aroma.affymetrix" group with website http://www.aroma-project.org/. >> To post to this group, send email to aroma-affymetrix@googlegroups.com >> To unsubscribe and other options, go to http://www.aroma-project.org/forum/ -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/