[aroma.affymetrix] Re: Reference data for CN analysis

Henrik Bengtsson Thu, 25 Jun 2009 22:35:20 -0700

Hi.

2009/6/25 Lavinia Gordon <lavinia.gor...@mcri.edu.au>:
>
> Dear all
>
> A query re samples to use for a reference pool - the bigger the
> better?  I am not so certain.  I have looked at a number of similar
> samples repeatedly, and as these samples have arrived over a period of
> time, so my pool of normals has grown.  I have re-run a few analyses
> altering the pool size, and the results are not always cleaner.  My
> pool samples have all been run at the same place, however they are
> very heterogeneous.  Does anyone have any thoughts/opinions on this?
> Any suggested QC/plots to try and select the best reference samples
> from my pool of ~100?  [N.B. these are 6.0 chips].


When you say heterogeneous, do you mean that they contain a lot of CN
aberrations or do you mean that they have different noise levels?

If "not too many" things go on in your reference samples, then you
should expect to get an improvement when you calculate the reference
channel as the average over a larger and larger pool of samples.  A
few years ago I checked this on 500K data and I found a dramatic
improvement in SNRs when increasing from 5 to 10 to 20 reference samples and then
it flattened out.
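
A minimal sketch of why the gain from a larger pool levels off, under
idealized assumptions that are mine and not from the 500K check mentioned
above: every sample is copy neutral, each locus has a probe effect shared
by all samples, and the remaining noise is independent Gaussian with the
same standard deviation `sigma` on the log2 scale.  The function name and
all parameter values are hypothetical.  Under these assumptions the noise
of the log-ratio is sigma * sqrt(1 + 1/n), which is already within a few
percent of its floor at n = 20.

```python
import math
import random
import statistics

def log_ratio_noise(n_refs, n_loci=2000, sigma=0.2, seed=0):
    """Std. dev. across loci of log2(test) - log2(reference) for a
    copy-neutral test sample, where the reference is the per-locus
    mean of n_refs simulated normal samples."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_loci):
        # Probe effect shared by all samples; it cancels in the log-ratio.
        locus_effect = rng.gauss(0.0, 0.5)
        test = locus_effect + rng.gauss(0.0, sigma)
        refs = [locus_effect + rng.gauss(0.0, sigma) for _ in range(n_refs)]
        ratios.append(test - statistics.mean(refs))
    return statistics.pstdev(ratios)

# Simulated noise next to the theoretical value sigma * sqrt(1 + 1/n).
for n in (1, 5, 10, 20, 100):
    theory = 0.2 * math.sqrt(1 + 1 / n)
    print(n, round(log_ratio_noise(n), 3), round(theory, 3))
```

With heterogeneous references (aberrations, or very different noise
levels) these assumptions break down, and adding samples need not help.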

However, if you look at Nannya et al. (2005) [1], their CNAG method tries
to identify a subset of reference samples that gives the best SNRs.  This
is to say that you believe there is a set of reference samples that are
more "normal" than others.  An alternative argument, which may make
even more sense, is that there will always be some systematic effects
remaining in the estimates, and if you can locate a set of reference
samples that have similar remaining effects as your test sample, they
will cancel out better than if other reference samples were used.
Note that this strategy uses different pools of references for each
test sample.   I know that the Broad Institute (Gaddy Getz, Scott
Carter et al.) are doing something similar in the TCGA project and
they say they get better SNRs.
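
A rough sketch of the per-test-sample idea, to make it concrete.  This is
not CNAG's algorithm and not what the Broad Institute does; it is only an
illustration under my own assumptions: signals are on the log2 scale, one
value per locus, and "similar remaining effects" is scored by how small
the robust spread (MAD) of test minus reference is.  All function names
and the toy data are hypothetical.

```python
import statistics

def mad(values):
    """Median absolute deviation: a spread estimate that is fairly
    robust to a minority of loci with true copy-number changes."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

def select_references(test, refs, k):
    """Names of the k references whose log-ratio against `test`
    has the smallest spread."""
    scores = {
        name: mad([t - r for t, r in zip(test, ref)])
        for name, ref in refs.items()
    }
    return sorted(scores, key=scores.get)[:k]

def pooled_reference(refs, names):
    """Per-locus mean over the selected reference samples."""
    return [statistics.mean(vals) for vals in zip(*(refs[n] for n in names))]

# Toy data: A and B share the test sample's systematic "wave"; C and D do not.
test = [0.10, -0.10, 0.10, -0.10, 0.10, -0.10]
refs = {
    "A": [0.11, -0.09, 0.09, -0.11, 0.10, -0.10],
    "B": [0.09, -0.11, 0.11, -0.09, 0.10, -0.10],
    "C": [-0.10, 0.10, -0.10, 0.10, -0.10, 0.10],
    "D": [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
}
chosen = select_references(test, refs, k=2)
reference = pooled_reference(refs, chosen)
log_ratios = [t - r for t, r in zip(test, reference)]
print(sorted(chosen), mad(log_ratios))
```

One caveat with any such selection: choosing references because they
resemble the test sample can also absorb real copy-number signal if the
test and the chosen references share an aberration.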

This is something I have wanted to look into for quite a while, but there
hasn't been, and still isn't, any time for me to do this.  I think it
is worth investigating how to obtain better reference signals from a
pool of samples.  Yet another useful project if someone has the time.

At least this should be a start.

Henrik

REFERENCES:

[1]  Nannya, Y.; Sanada, M.; Nakazaki, K.; Hosoya, N.; Wang, L.;
Hangaishi, A.; Kurokawa, M.; Chiba, S.; Bailey, D. K.; Kennedy, G. C.
& Ogawa, S. A robust algorithm for copy number detection using
high-density oligonucleotide single nucleotide polymorphism genotyping
arrays. Cancer Res, 2005, 65, 6071-6079. PMID: 16024607.

>
> with thanks,
>
> Lavinia Gordon.
