Hi, On Wed, Aug 27, 2008 at 9:44 PM, Roger <[EMAIL PROTECTED]> wrote: > > Hi, > > It took me some time (and finding a colaborator to help me) to write > an R package wrapping my copy number algorithm GADA since I have been > busy with so many other things. > > I used the answer you kindly provided some time ago to add my > algorithm in aroma.affymetrix: > http://groups.google.com/group/aroma-affymetrix/browse_thread/thread/614c7cdc1cbb8cf5/655efedf35aa6a3c?lnk=gst&q=theta+copy+number#655efedf35aa6a3c > I think it was very detailed and I figured out the gaps by looking how > the CBS and GLAD cases > were handled. I really appreciate it. > > I think we still have to do some more checks to make sure everything > is working well. > So far, I'm quite happy that it looks like it goes much faster than > CBS method.
Good to hear that you got it working and managed to wrap it up for use in aroma.affymetrix. Would you mind sharing this experience with others by writing up the steps, say, in a Google Doc on this. I can have a look at it and then we you like we can put it online as a vignette. Also, did you create a stand-alone low-level 'GADA' package that does not depend on aroma.affymetrix? I suggest/ed this because then it is easier for you to maintain the code and others have a chance to utilize GADA without using aroma.affymetrix. Larger user base for you and higher quality for everyone. When you have all in place and it looks stable, we can make the GadaModel wrapper class part of the aroma.affymetrix package, if you want to. Alternatively, you can create a aroma.affymetrix.GADA add-on package. > > While implementing, I noticed that when I extract the copy number > intensities, you also estimate > something that looks like the standard deviation sdM... > >> data <- getRawCnData(gada, ceList=ceList, refList=refList, chromosome=22, >> verbose=-10); >> str(data) > num [1:74, 1:6] 15685581 16604328 16958128 17506879 18008688 ... > - attr(*, "dimnames")=List of 2 > ..$ : NULL > ..$ : chr [1:6] "x" "M" "sdTheta" "sdM" ... > > Is this correct? If that is the case I would be interested in knowing > how are these > SD calculated... I may be able to use them in my algorithm. Wow, I looked at the code and I didn't even remember I wrote that myself, but I did. If you look at getRawCnData() of CopyNumberChromosomalModel, you'll find # Estimate the std dev of the raw log2(CN). # [only if ref is average across arrays] [...] # Use Gauss' approximation (since mu and sigma are on the # intensity scale) sdM <- log2(exp(1)) * sqrt(1+1/n) * data$sdTheta / data$theta; where 'data' is the reference data. Thus, 'sdM' is an estimate of the standard deviation of the log-ratios across all arrays, *iff the reference is calculate from all arrays*. So, instead of calculating 'M' for each array and then calling sd()/mad() across arrays, this tries to infer that from the average array (the reference), because that hold the mean and the stdvs of all theta:s across arrays. I'd say that 'sdM' is a feature very much in alpha version that you should use. If you start to rely on it, there will just be confusions, especially if other kind of references are used. Maybe I should remove it; as you see, you can calculate it easily yourself given the average reference file. > > Finally, one last question is that I get a few probes with the M value > NA. I guess it is fine, > but I just wonder how they arise. See the CRMA paper: H. Bengtsson; R. Irizarry; B. Carvalho; T. Speed, Estimation and assessment of raw copy numbers at the single locus level, Bioinformatics, 2008. [pmid: 18204055] [doi: 10.1093/bioinformatics/btn016] You didn't say how you got your log-ratios in the first place and on what platform, but the short answer is that allelic crosstalk corrects for offset in data and thereby it introduces non-positive signals which might carry through all the way and when taking the logarithm they become NAs. This is an inherit problem of all "models/steps" taking the log. Cheers Henrik > > I really appreciate your help, > > Roger, > > > > --~--~---------~--~----~------------~-------~--~----~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~----------~----~----~----~------~----~------~--~---