[aroma.affymetrix] Re: incorporating other copy number algorithms in aroma.affymetrix

Henrik Bengtsson Fri, 29 Aug 2008 12:06:07 -0700

Hi,

On Wed, Aug 27, 2008 at 9:44 PM, Roger <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> It took me some time (and finding a colaborator to help me) to write
> an R package wrapping my copy number algorithm GADA since I have been
> busy with so many other things.
>
> I used the answer you kindly provided some time ago to add my
> algorithm in aroma.affymetrix:
> http://groups.google.com/group/aroma-affymetrix/browse_thread/thread/614c7cdc1cbb8cf5/655efedf35aa6a3c?lnk=gst&q=theta+copy+number#655efedf35aa6a3c
> I think it was very detailed and I figured out the gaps by looking how
> the CBS and GLAD cases
> were handled. I really appreciate it.
>
> I think we still have to do some more checks to make sure everything
> is working well.
> So far,  I'm quite happy that  it looks like it goes much faster than
> CBS method.


Good to hear that you got it working and managed to wrap it up for use
in aroma.affymetrix.  Would you mind sharing this experience with
others by writing up the steps, say, in a Google Doc on this.  I can
have a look at it and then we you like we can put it online as a
vignette.

Also, did you create a stand-alone low-level 'GADA' package that does
not depend on aroma.affymetrix?  I suggest/ed this because then it is
easier for you to maintain the code and others have a chance to
utilize GADA without using aroma.affymetrix.  Larger user base for you
and higher quality for everyone.

When you have all in place and it looks stable, we can make the
GadaModel wrapper class part of the aroma.affymetrix package, if you
want to.  Alternatively, you can create a aroma.affymetrix.GADA add-on
package.

>
> While implementing, I noticed that when I extract the copy number
> intensities, you also estimate
> something that looks like the standard deviation sdM...
>
>> data <- getRawCnData(gada, ceList=ceList, refList=refList, chromosome=22, 
>> verbose=-10);
>> str(data)
>  num [1:74, 1:6] 15685581 16604328 16958128 17506879 18008688 ...
>  - attr(*, "dimnames")=List of 2
>  ..$ : NULL
>  ..$ : chr [1:6] "x" "M" "sdTheta" "sdM" ...
>
> Is this correct? If that is the case I would be interested in knowing
> how are these
> SD calculated... I may be able to use them in my algorithm.

Wow, I looked at the code and I didn't even remember I wrote that
myself, but I did.  If you look at getRawCnData() of
CopyNumberChromosomalModel, you'll find

  # Estimate the std dev of the raw log2(CN).
  # [only if ref is average across arrays]
  [...]
  # Use Gauss' approximation (since mu and sigma are on the
  # intensity scale)
  sdM <- log2(exp(1)) * sqrt(1+1/n) * data$sdTheta / data$theta;

where 'data' is the reference data.  Thus, 'sdM' is an estimate of the
standard deviation of the log-ratios across all arrays, *iff the
reference is calculate from all arrays*.  So, instead of calculating
'M' for each array and then calling sd()/mad() across arrays, this
tries to infer that from the average array (the reference), because
that hold the mean and the stdvs of all theta:s across arrays.

I'd say that 'sdM' is a feature very much in alpha version that you
should use.  If you start to rely on it, there will just be
confusions, especially if other kind of references are used.   Maybe I
should remove it; as you see, you can calculate it easily yourself
given the average reference file.

>
> Finally, one last question is that I get a few probes with the M value
> NA. I guess it is fine,
> but I just wonder how they arise.

See the CRMA paper:

H. Bengtsson; R. Irizarry; B. Carvalho; T. Speed, Estimation and
assessment of raw copy numbers at the single locus level,
Bioinformatics, 2008. [pmid: 18204055] [doi:
10.1093/bioinformatics/btn016]

You didn't say how you got your log-ratios in the first place and on
what platform, but the short answer is that allelic crosstalk corrects
for offset in data and thereby it introduces non-positive signals
which might carry through all the way and when taking the logarithm
they become NAs.  This is an inherit problem of all "models/steps"
taking the log.

Cheers

Henrik

>
> I really appreciate your help,
>
> Roger,
>
> >
>

--~--~---------~--~----~------------~-------~--~----~
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/aroma-affymetrix?hl=en
-~----------~----~----~----~------~----~------~--~---

[aroma.affymetrix] Re: incorporating other copy number algorithms in aroma.affymetrix

Reply via email to