[aroma.affymetrix] Re: X chromosome CNV analysis

Henrik Bengtsson Tue, 28 Oct 2008 15:16:28 -0700

Hi.

On Tue, Oct 28, 2008 at 3:13 AM, marco <[EMAIL PROTECTED]> wrote:
>
> Dear Henrik,
>
>  when I run the SEX corrected variant, I find different results
> (different CNVs) for the chromosomes 1-22.
> I was expecting to see changes only in the X chromosome CNVs. Does this
> make sense, or did I make a mistake somewhere?


You are right to expect that: the correction should only affect the
chromosomes whose ploidy differs from the default CN=2.  I'll have to
look into this, e.g. create a redundancy test and check the code.  That
will take some time.
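
A minimal sketch of what a redundancy test along those lines could look
like, in plain R.  It assumes that, for both setups, you have one
reference signal per unit plus the chromosome each unit maps to.  The
helper compareAutosomalReferences() and the toy vectors are hypothetical
and not part of aroma.affymetrix; how to extract the real signals is
left open here.

```r
# Hypothetical helper (not an aroma.affymetrix function): return the indices
# of autosomal units whose two reference signals differ by more than 'tol'.
compareAutosomalReferences <- function(refA, refB, chromosome,
                                       autosomes = 1:22, tol = 1e-8) {
  stopifnot(length(refA) == length(refB),
            length(refA) == length(chromosome))
  isAutosomal <- chromosome %in% autosomes
  differs <- abs(refA - refB) > tol
  which(isAutosomal & differs)
}

# Made-up toy data: six units on chromosomes 1, 2, 22 and 23 (X)
chromosome <- c(1, 1, 2, 22, 23, 23)
refDefault <- c(1000, 1200, 900, 1100, 800, 850)    # e.g. median of all arrays
refSexCorr <- c(1000, 1200, 900, 1100, 1050, 1120)  # e.g. ploidy-aware baseline

bad <- compareAutosomalReferences(refDefault, refSexCorr, chromosome)
print(bad)  # an empty result means the references agree on chromosomes 1-22
```

If the two references already differ on the autosomes, the segmentation
is expected to differ there as well, which would narrow the problem down
to how the baseline is calculated.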

> BTW: the reports file produced has a tag "Paired", is this expected?

Yes, that is because when you pass 'ceRef' as the reference, the
CbsModel assumes a paired setup (which is the most common case).
There is currently no public API for changing this, but you can
override it as follows (use with care, since this may change):

> print(getTags(cbs, collapse=","))
[1] "ACC,-XY,RMA,+300,A+B,FLN,-XY,paired"
> cbs$.paired <- FALSE   # <==
> print(getTags(cbs, collapse=","))
[1] "ACC,-XY,RMA,+300,A+B,FLN,-XY"

/Henrik

>
> This is my piece of code:
> ##########
> cdf      <- AffymetrixCdfFile$fromChipType("GenomeWideSNP_6",
> tags="Full")
> print(cdf)
> gi       <- getGenomeInformation(cdf)
> print(gi)
> si       <- getSnpInformation(cdf)
> print(si)
> cs       <- AffymetrixCelSet$fromName("ESC_IBD", cdf=cdf,verbose=-20)
> print(cs)
> acc      <- AllelicCrosstalkCalibration(cs)
> print(acc)
> csC      <- process(acc, verbose=verbose)
> print(csC)
> plm      <- AvgCnPlm(csC, mergeStrands=TRUE, combineAlleles=TRUE,
> shift=+300)
> print(plm)
> fit(plm, verbose=verbose)
> ces      <- getChipEffectSet(plm)
> print(ces)
> fln      <- FragmentLengthNormalization(ces)
> print(fln)
> cesN     <- process(fln, verbose=verbose)
> print(cesN)
> cf       <- getFile(cs, indexOf(cs, "MD1"));
> attrs    <- getAttributes(cf);
> str(attrs);
> #
> nXY           <- t(sapply(cs, function(cf) getAttributes(cf)[c("n23",
> "n24")]));
> rownames(nXY) <- getNames(cs);
> print(nXY);
> ceRef         <- calculateBaseline(cesN, chromosomes=1:23,
> ploidy=2,defaultPloidy=2, verbose=verbose);
> print(ceRef)
> cbs     <- CbsModel(cesN, ceRef)
> print(cbs)
> ce      <- ChromosomeExplorer(cbs)
> print(ce)
> process(ce,arrays=c(21:26), chromosomes=c(1:23), verbose=verbose)
>
> ########### for non sex corrected
> # This calculates CNVs using as reference the robust median estimate
> # from all the arrays
> cbs      <- CbsModel(cesN)
> print(cbs)
> ce       <- ChromosomeExplorer(cbs)
> print(ce)
> process(ce,arrays=c(21:26), chromosomes=c(1:23), verbose=verbose)
> ###########
>
> Cheers
>
> Marco
>
>
>
>
> On Oct 25, 5:52 pm, marco <[EMAIL PROTECTED]> wrote:
>> Dear Henrik,
>>
>>  the trick is to have an empty line between each array in the saf
>> file!
>>
>> This would work!
>> #
>> name:MD10
>> tags:XY
>>
>> name:MD11
>> tags:XY
>>
>> name:MD12
>> tags:XY
>>
>> name:MD13
>> ....
>> ...
>> ...
>>
>> Cheers
>>
>> Marco
>>
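
To illustrate why the empty lines matter, here is a minimal sketch in
plain R of a parser for "key:value" records separated by blank lines.
The helper parseRecords() is hypothetical and is NOT the parser used by
aroma.affymetrix; it only shows that, under this assumed convention, a
file without blank lines collapses into a single record.

```r
# Hypothetical helper (NOT the aroma.affymetrix parser): split "key:value"
# lines into records, treating each blank line as a record separator.
parseRecords <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[!grepl("^#", lines)]      # drop comment lines
  isBlank <- (lines == "")
  recordId <- cumsum(isBlank)[!isBlank]    # a blank line starts a new record
  fields <- lines[!isBlank]
  keys <- sub(":.*$", "", fields)          # text before the first colon
  values <- sub("^[^:]*:", "", fields)     # text after the first colon
  unname(split(setNames(values, keys), recordId))
}

withBlanks <- c("name:MD10", "tags:XY", "", "name:MD11", "tags:XY")
noBlanks   <- c("name:MD10", "tags:XY", "name:MD11", "tags:XY")

length(parseRecords(withBlanks))  # 2: one record per array
length(parseRecords(noBlanks))    # 1: all fields end up in one record
```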
>> On Oct 9, 11:15 pm, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote:
>>
>> > Hi.
>>
>> > On Tue, Oct 7, 2008 at 6:33 AM, marco <[EMAIL PROTECTED]> wrote:
>>
>> > > Dear Henrik,
>>
>> > >  it seems that the file is found, but I still have the same problem. I
>> > > wonder whether the format might be confusing the software?
>>
>> > Yes, it looks like it finds the annotationData/samples/ploidy.saf
>> > file.  As you say, it might be that the format of your file is
>> > incorrect.  See what you get if you do:
>>
>> > # next release: byName()
>> > sas <- SampleAnnotationSet$fromPath("annotationData/samples");
>> > saf <- getFile(sas, indexOf(sas, "ploidy"));
>> > data <- readDataFrame(saf);
>> > print(data);
>>
>> > You should get a data frame where each row is a sample and the columns
>> > are all possible attributes.  For instance, using the HapMap270.saf
>> > file I referred to in an earlier message you get:
>>
>> > # Warning to people reading this (now and in the future):
>> > # This is still part of the internal API and hence not documented.
>> > sas <- SampleAnnotationSet$fromPath("annotationData/samples");
>> > saf <- getFile(sas, indexOf(sas, "HapMap270"));
>> > data <- readDataFrame(saf);
>>
>> >      name      familyID individualID fatherID motherID gender   population tags
>> > [1,] "NA12003" "1420"   "9"          "NA"     "NA"     "male"   "CEU"      "XY"
>> > [2,] "NA12004" "1420"   "10"         "NA"     "NA"     "female" "CEU"      "XX"
>> > [3,] "NA10838" "1420"   "1"          "9"      "10"     "male"   "CEU"      "XY"
>>
>> > > Actually the important thing is that the arrays are processed with the
>> > > right number of X/Y chromosomes.
>> > > Is there any other way to check it?
>>
>> > No, other than checking the attributes (and the highly detailed
>> > verbose output).
>>
>> > Cheers
>>
>> > /Henrik
>>
>> > > Cheers
>> > > Marco
>>
>> > >> cs       <- AffymetrixCelSet$fromName("ESC_IBD", cdf=cdf,verbose=-20)
>> > > Defining AffymetrixCelSet from files...
>> > >  Defining an AffymetrixCelSet object from files...
>> > >  Path: rawData/ESC_IBD/GenomeWideSNP_6
>> > >  Pattern: [.](c|C)(e|E)(l|L)$
>> > >  File class: AffymetrixCelFile
>> > >  Scanning directory for files...
>> > >   Found 26 files.
>> > >  Scanning directory for files...done
>> > >  Defining 26 files...
>> > > 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
>> > > 21, 22, 23, 24, 25, 26,
>> > >  Defining 26 files...done
>> > >  Allocating a new AffymetrixCelSet instance...
>> > >   Arguments:
>> > >   Number of files:26
>> > >    list()
>> > >  Allocating a new AffymetrixCelSet instance...done
>> > >  Updating newly allocated AffymetrixCelSet...
>> > >   Updating AffymetrixCelSet...
>> > >    Scanning for and applying sample annotation files...
>> > >     Defining 1 files...
>> > > 1,
>> > >     Defining 1 files...done
>> > >     SampleAnnotationSet:
>> > >     Name: annotationData
>> > >     Tags:
>> > >     Full name: annotationData
>> > >     Number of files: 1
>> > >     Names: ploidy
>> > >     Path (to the first file): annotationData/samples
>> > >     Total file size: 0.00MB
>> > >     RAM: 0.00MB
>> > >    Scanning for and applying sample annotation files...done
>> > >   Updating AffymetrixCelSet...done
>> > >  Updating newly allocated AffymetrixCelSet...done
>> > >  Defining an AffymetrixCelSet object from files...done
>> > >  Retrieved files: 26...
>> > >  The chip type according to the path is: GenomeWideSNP_6
>> > >  Since 'checkChipType=FALSE', then the chip type specified by the
>> > > directory name is used: GenomeWideSNP_6
>> > >  Using prespecified CDF: GenomeWideSNP_6,Full
>> > >  Updating the CDF for all files...
>> > >  Updating the CDF for all files...done
>> > >  Updating AffymetrixCelSet...
>> > >   Scanning for and applying sample annotation files...
>> > >    Defining 1 files...
>> > > 1,
>> > >    Defining 1 files...done
>> > >    SampleAnnotationSet:
>> > >    Name: annotationData
>> > >    Tags:
>> > >    Full name: annotationData
>> > >    Number of files: 1
>> > >    Names: ploidy
>> > >    Path (to the first file): annotationData/samples
>> > >    Total file size: 0.00MB
>> > >    RAM: 0.00MB
>> > >   Scanning for and applying sample annotation files...done
>> > >  Updating AffymetrixCelSet...done
>> > >  Retrieved files: 26...done
>> > > Defining AffymetrixCelSet from files...done
>> > >> cf    <- getFile(cs, indexOf(cs, "MD1"));
>> > >> attrs <- getAttributes(cf);
>> > > Error in order(names(attrs)) : argument 1 is not a vector
>>
>> > > On Oct 5, 1:54 am, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote:
>> > >> Hi.
>>
>> > >> On Thu, Oct 2, 2008 at 2:39 AM, marco <[EMAIL PROTECTED]> wrote:
>>
>> > >> > Dear Henrik,
>>
>> > >> >  I tried the X chromosome variant but I am not sure I can get it to
>> > >> > work.
>> > >> > I made a *.saf file and placed it into annotationData/samples/
>> > >> > File looks like this:
>>
>> > >> > name:MD10
>> > >> > tags:XY
>> > >> > name:MD11
>> > >> > tags:XY
>> > >> > name:MD12
>> > >> > tags:XY
>> > >> > name:MD13
>> > >> > ...
>> > >> > ...
>>
>> > >> This looks correct to me.  Maybe you should try to add an empty line
>> > >> between the entries.  What is the full filename of this file?
>>
>> > >> > Anyway I cannot get the function getAttributes to work, so I am unsure
>> > >> > if the *.saf is read correctly.
>> > >> > Below is the output:
>> > >> >> cdf      <- AffymetrixCdfFile$fromChipType("GenomeWideSNP_6", 
>> > >> >> tags="Full")
>> > >> >> print(cdf)
>> > >> > AffymetrixCdfFile:
>> > >> > Path: annotationData/chipTypes/GenomeWideSNP_6
>> > >> > Filename: GenomeWideSNP_6,Full.cdf
>> > >> > Filesize: 470.44MB
>> > >> > Chip type: GenomeWideSNP_6,Full
>> > >> > RAM: 0.00MB
>> > >> > File format: v4 (binary; XDA)
>> > >> > Dimension: 2572x2680
>> > >> > Number of cells: 6892960
>> > >> > Number of units: 1881415
>> > >> > Cells per unit: 3.66
>> > >> > Number of QC units: 4
>> > >> >> gi       <- getGenomeInformation(cdf)
>> > >> >> print(gi)
>> > >> > UgpGenomeInformation:
>> > >> > Name: GenomeWideSNP_6
>> > >> > Tags: Full,na24,HB20080214
>> > >> > Pathname: annotationData/chipTypes/GenomeWideSNP_6/
>> > >> > GenomeWideSNP_6,Full,na24,HB20080214.ugp
>> > >> > File size: 8.97MB
>> > >> > RAM: 0.00MB
>> > >> > Chip type: GenomeWideSNP_6,Full
>> > >> >> si       <- getSnpInformation(cdf)
>> > >> >> print(si)
>> > >> > UflSnpInformation:
>> > >> > Name: GenomeWideSNP_6
>> > >> > Tags: Full,na24,HB20080214
>> > >> > Pathname: annotationData/chipTypes/GenomeWideSNP_6/
>> > >> > GenomeWideSNP_6,Full,na24,HB20080214.ufl
>> > >> > File size: 7.18MB
>> > >> > RAM: 0.00MB
>> > >> > Chip type: GenomeWideSNP_6,Full
>> > >> > Number of enzymes: 2
>> > >> >> cs       <- AffymetrixCelSet$fromName("ESC", cdf=cdf)
>>
>> > >> If you do
>>
>> > >> cs <- AffymetrixCelSet$fromName("ESC", cdf=cdf, verbose=-20)
>>
>> > >> You should see from the output showing what SAF files are located and
>> > >> that they are read, e.g.
>>
>> > >> cdf <- AffymetrixCdfFile$byChipType("Mapping50K_Hind240");
>> > >> csR <- AffymetrixCelSet$byName("HapMap270,100K,CEU,testSet", cdf=cdf,
>> > >> verbose=-20);
>>
>> > >> ...
>> > >> 20081004 16:47:49|  Allocating a new AffymetrixCelSet instance...done
>> > >> 20081004 16:47:49|  Updating newly allocated AffymetrixCelSet...
>> > >> 20081004 16:47:49|   Updating AffymetrixCelSet...
>> > >> 20081004 16:47:49|    Scanning for and applying sample annotation files...
>> > >>      SampleAnnotationSet:
>> > >>      Name: annotationData
>> > >>      Tags:
>> > >>      Full name: annotationData
>> > >>      Number of files: 7
>> > >>      Names: 000.default, AGRF_2007a, ..., HapMap270
>> > >>      Path (to the first file): annotationData/samples
>> > >>      Total file size: 0.03MB
>> > >>      RAM: 0.00MB
>> > >> 20081004 16:47:50|    Scanning for and applying sample annotation files...done
>> > >> 20081004 16:47:50|   Updating AffymetrixCelSet...done
>> > >> 20081004 16:47:50|  Updating newly allocated AffymetrixCelSet...done
>> > >> 20081004 16:47:50| Defining an AffymetrixCelSet object from files...done
>> > >> ...
>>
>> > >> >> print(cs)
>> > >> > AffymetrixCelSet:
>> > >> > Name: ESC
>> > >> > Tags:
>> > >> > Path: rawData/ESC/GenomeWideSNP_6
>> > >> > Platform: Affymetrix
>> > >> > Chip type: GenomeWideSNP_6,Full
>> > >> > Number of arrays: 26
>> > >> > Names: MD10, MD11, ..., VT06_TER2102EP
>> > >> > Time period: 2008-07-11 11:09:02 -- 2008-09-03 14:47:43
>> > >> > Total file size: 1712.75MB
>> > >> > RAM: 0.04MB
>> > >> >> cf    <- getFile(cs, indexOf(cs, "MD10"));
>> > >> > AffymetrixCelFile:
>> > >> > Name: MD10
>> > >> > Tags:
>> > >> > Pathname: rawData/ESC/GenomeWideSNP_6/MD10.CEL
>> > >> > File size: 65.88MB
>> > >> > RAM: 0.01MB
>> > >> > File format: v1 (binary; CC)
>> > >> > Platform: Affymetrix
>> > >> > Chip type: GenomeWideSNP_6,Full
>> > >> > Timestamp: 2008-07-17 19:31:03
>> > >> >> attrs <- getAttributes(cf);
>> > >> > Error in order(names(attrs)) : argument 1 is not a vector
>>
>> > >> This does indeed indicate that there were no attributes set, i.e. it
>> > >> looks like the *.saf file was not located.  (In the next release, this
>> > >> will return NULL instead of giving an error.)
>>
>> > >> Did the above help?
>>
>> > >> /Henrik
>>
>> > >> >> sessionInfo()
>>
>> > >> > R version 2.7.2 (2008-08-25)
>> > >> > x86_64-unknown-linux-gnu
>>
>> > >> > locale:
>> > >> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> > >> > attached base packages:
>> > >> > [1] stats     graphics  grDevices datasets  utils     methods
>> > >> > base
>>
>> > >> > other attached packages:
>> > >> >  [1] aroma.affymetrix_0.9.4 aroma.apd_0.1.3
>> > >> > R.huge_0.1.6
>> > >> >  [4] affxparser_1.12.2      aroma.core_0.9.4
>> > >> > sfit_0.1.5
>> > >> >  [7] aroma.light_1.8.1      digest_0.3.1
>>
>> ...
>>
> >
>

