Hi.

On Tue, Oct 7, 2008 at 6:33 AM, marco <[EMAIL PROTECTED]> wrote:
>
> Dear Henrik,
>
> It seems that the file is found, but I still have the same problem. I
> wonder if the format might be confusing for the software?

Yes, it looks like it finds the annotationData/samples/ploidy.saf file.
As you say, it might be that the format of your file is incorrect. See
what you get if you do:

sas <- SampleAnnotationSet$fromPath("annotationData/samples"); # next rel: byName()
saf <- getFile(sas, indexOf(sas, "ploidy"));
data <- readDataFrame(saf);
print(data);

You should get a data frame where each row is a sample and the columns
are all possible attributes. For instance, using the HapMap270.saf file
I referred to in an earlier message you get:

# Warning to people reading this (now and in the future):
# This is still part of the internal API and hence not documented.
sas <- SampleAnnotationSet$fromPath("annotationData/samples");
saf <- getFile(sas, indexOf(sas, "HapMap270"));
data <- readDataFrame(saf);

     name      familyID individualID fatherID motherID gender   population tags
[1,] "NA12003" "1420"   "9"          "NA"     "NA"     "male"   "CEU"      "XY"
[2,] "NA12004" "1420"   "10"         "NA"     "NA"     "female" "CEU"      "XX"
[3,] "NA10838" "1420"   "1"          "9"      "10"     "male"   "CEU"      "XY"

> Actually the important thing is that the arrays are processed with the
> right number of X/Y chromosomes.
> Is there any other way to check it?

No, not other than by checking the attributes (and the highly detailed
verbose output).

Cheers

/Henrik

>
> Cheers
> Marco
>
>> cs <- AffymetrixCelSet$fromName("ESC_IBD", cdf=cdf,verbose=-20)
> Defining AffymetrixCelSet from files...
> Defining an AffymetrixCelSet object from files...
> Path: rawData/ESC_IBD/GenomeWideSNP_6
> Pattern: [.](c|C)(e|E)(l|L)$
> File class: AffymetrixCelFile
> Scanning directory for files...
> Found 26 files.
> Scanning directory for files...done
> Defining 26 files...
> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
> 21, 22, 23, 24, 25, 26,
> Defining 26 files...done
> Allocating a new AffymetrixCelSet instance...
> Arguments:
> Number of files:26
> list()
> Allocating a new AffymetrixCelSet instance...done
> Updating newly allocated AffymetrixCelSet...
> Updating AffymetrixCelSet...
> Scanning for and applying sample annotation files...
> Defining 1 files...
> 1,
> Defining 1 files...done
> SampleAnnotationSet:
> Name: annotationData
> Tags:
> Full name: annotationData
> Number of files: 1
> Names: ploidy
> Path (to the first file): annotationData/samples
> Total file size: 0.00MB
> RAM: 0.00MB
> Scanning for and applying sample annotation files...done
> Updating AffymetrixCelSet...done
> Updating newly allocated AffymetrixCelSet...done
> Defining an AffymetrixCelSet object from files...done
> Retrieved files: 26...
> The chip type according to the path is: GenomeWideSNP_6
> Since 'checkChipType=FALSE', then the chip type specified by the
> directory name is used: GenomeWideSNP_6
> Using prespecified CDF: GenomeWideSNP_6,Full
> Updating the CDF for all files...
> Updating the CDF for all files...done
> Updating AffymetrixCelSet...
> Scanning for and applying sample annotation files...
> Defining 1 files...
> 1,
> Defining 1 files...done
> SampleAnnotationSet:
> Name: annotationData
> Tags:
> Full name: annotationData
> Number of files: 1
> Names: ploidy
> Path (to the first file): annotationData/samples
> Total file size: 0.00MB
> RAM: 0.00MB
> Scanning for and applying sample annotation files...done
> Updating AffymetrixCelSet...done
> Retrieved files: 26...done
> Defining AffymetrixCelSet from files...done
>> cf <- getFile(cs, indexOf(cs, "MD1"));
>> attrs <- getAttributes(cf);
> Error in order(names(attrs)) : argument 1 is not a vector
>>
>
> On Oct 5, 1:54 am, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote:
>> Hi.
>>
>> On Thu, Oct 2, 2008 at 2:39 AM, marco <[EMAIL PROTECTED]> wrote:
>>
>> > Dear Henrik,
>>
>> > I tried the X chromosome variant but I am not sure I can get it to
>> > work.
>> > I made a *.saf file and placed it into annotationData/samples/
>> > The file looks like this:
>>
>> > name:MD10
>> > tags:XY
>> > name:MD11
>> > tags:XY
>> > name:MD12
>> > tags:XY
>> > name:MD13
>> > ...
>> > ...
>>
>> This looks correct to me.
>> Maybe you should try to add an empty line between the entries.
>> What is the full filename of this file?
>>
>> > Anyway I cannot get the function getAttributes to work, so I am unsure
>> > if the *.saf is read correctly.
>> > Below is the output:
>>
>> >> cdf <- AffymetrixCdfFile$fromChipType("GenomeWideSNP_6", tags="Full")
>> >> print(cdf)
>> > AffymetrixCdfFile:
>> > Path: annotationData/chipTypes/GenomeWideSNP_6
>> > Filename: GenomeWideSNP_6,Full.cdf
>> > Filesize: 470.44MB
>> > Chip type: GenomeWideSNP_6,Full
>> > RAM: 0.00MB
>> > File format: v4 (binary; XDA)
>> > Dimension: 2572x2680
>> > Number of cells: 6892960
>> > Number of units: 1881415
>> > Cells per unit: 3.66
>> > Number of QC units: 4
>> >> gi <- getGenomeInformation(cdf)
>> >> print(gi)
>> > UgpGenomeInformation:
>> > Name: GenomeWideSNP_6
>> > Tags: Full,na24,HB20080214
>> > Pathname: annotationData/chipTypes/GenomeWideSNP_6/GenomeWideSNP_6,Full,na24,HB20080214.ugp
>> > File size: 8.97MB
>> > RAM: 0.00MB
>> > Chip type: GenomeWideSNP_6,Full
>> >> si <- getSnpInformation(cdf)
>> >> print(si)
>> > UflSnpInformation:
>> > Name: GenomeWideSNP_6
>> > Tags: Full,na24,HB20080214
>> > Pathname: annotationData/chipTypes/GenomeWideSNP_6/GenomeWideSNP_6,Full,na24,HB20080214.ufl
>> > File size: 7.18MB
>> > RAM: 0.00MB
>> > Chip type: GenomeWideSNP_6,Full
>> > Number of enzymes: 2
>> >> cs <- AffymetrixCelSet$fromName("ESC", cdf=cdf)
>>
>> If you do
>>
>> cs <- AffymetrixCelSet$fromName("ESC", cdf=cdf, verbose=-20)
>>
>> you should see from the output which SAF files are located and
>> that they are read, e.g.
>>
>> cdf <- AffymetrixCdfFile$byChipType("Mapping50K_Hind240");
>> csR <- AffymetrixCelSet$byName("HapMap270,100K,CEU,testSet", cdf=cdf,
>> verbose=-20);
>>
>> ...
>> 20081004 16:47:49| Allocating a new AffymetrixCelSet instance...done
>> 20081004 16:47:49| Updating newly allocated AffymetrixCelSet...
>> 20081004 16:47:49| Updating AffymetrixCelSet...
>> 20081004 16:47:49| Scanning for and applying sample annotation files...
>> SampleAnnotationSet:
>> Name: annotationData
>> Tags:
>> Full name: annotationData
>> Number of files: 7
>> Names: 000.default, AGRF_2007a, ..., HapMap270
>> Path (to the first file): annotationData/samples
>> Total file size: 0.03MB
>> RAM: 0.00MB
>> 20081004 16:47:50| Scanning for and applying sample annotation files...done
>> 20081004 16:47:50| Updating AffymetrixCelSet...done
>> 20081004 16:47:50| Updating newly allocated AffymetrixCelSet...done
>> 20081004 16:47:50| Defining an AffymetrixCelSet object from files...done
>> ...
>>
>> >> print(cs)
>> > AffymetrixCelSet:
>> > Name: ESC
>> > Tags:
>> > Path: rawData/ESC/GenomeWideSNP_6
>> > Platform: Affymetrix
>> > Chip type: GenomeWideSNP_6,Full
>> > Number of arrays: 26
>> > Names: MD10, MD11, ..., VT06_TER2102EP
>> > Time period: 2008-07-11 11:09:02 -- 2008-09-03 14:47:43
>> > Total file size: 1712.75MB
>> > RAM: 0.04MB
>> >> cf <- getFile(cs, indexOf(cs, "MD10"));
>> > AffymetrixCelFile:
>> > Name: MD10
>> > Tags:
>> > Pathname: rawData/ESC/GenomeWideSNP_6/MD10.CEL
>> > File size: 65.88MB
>> > RAM: 0.01MB
>> > File format: v1 (binary; CC)
>> > Platform: Affymetrix
>> > Chip type: GenomeWideSNP_6,Full
>> > Timestamp: 2008-07-17 19:31:03
>> >> attrs <- getAttributes(cf);
>> > Error in order(names(attrs)) : argument 1 is not a vector
>>
>> This does indeed indicate that there were no attributes set, i.e. it
>> looks like the *.saf file was not located. (In next release, this
>> will return NULL instead of giving an error).
>>
>> Did the above help?
>>
>> /Henrik
>>
>> >> sessionInfo()
>>
>> > R version 2.7.2 (2008-08-25)
>> > x86_64-unknown-linux-gnu
>>
>> > locale:
>> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>
>> > attached base packages:
>> > [1] stats graphics grDevices datasets utils methods base
>>
>> > other attached packages:
>> >  [1] aroma.affymetrix_0.9.4 aroma.apd_0.1.3    R.huge_0.1.6
>> >  [4] affxparser_1.12.2      aroma.core_0.9.4   sfit_0.1.5
>> >  [7] aroma.light_1.8.1      digest_0.3.1       matrixStats_0.1.3
>> > [10] R.rsp_0.3.4            R.cache_0.1.7      R.utils_1.0.4
>> > [13] R.oo_1.4.6             R.methodsS3_1.0.3
>>
>> > Best Regards
>>
>> > Marco
>>
>> > On Sep 19, 10:10 pm, "Henrik Bengtsson" <[EMAIL PROTECTED]>
>> > wrote:
>> >> Hi.
>>
>> >> On Fri, Sep 19, 2008 at 4:35 AM, marco <[EMAIL PROTECTED]> wrote:
>>
>> >> > Dear List,
>>
>> >> > I wonder about how X chromosome is treated in aroma.affymetrix.
>> >> > Is the mix of male/female samples somehow taken into account, or X is
>> >> > processed as any other chromosome?
>>
>> >> Good idea ;) Yes, the CRMA model does take into account the fact
>> >> that different samples have different ploidies on ChrX when using the
>> >> pool of arrays as a reference. The idea is to calculate the robust
>> >> average across all arrays and correct for the bias that
>> >> non-copy-neutral samples introduce. See Section '3.2.7 Reference
>> >> signals' in:
>>
>> >> H. Bengtsson; R. Irizarry; B. Carvalho; T. Speed, Estimation and
>> >> assessment of raw copy numbers at the single locus level,
>> >> Bioinformatics, 2008. [pmid: 18204055] [doi:
>> >> 10.1093/bioinformatics/btn016]
>>
>> >> for more details. The model/method requires that at least one sample
>> >> is copy neutral, i.e. you need at least one "female" in order to
>> >> estimate a diploid reference on ChrX.
>> >> The same bias-correction method can also be used when some of the
>> >> samples have, say, trisomy 21. For ChrY, our current model cannot
>> >> give you a *diploid* ChrY reference, but a *copy neutral* one,
>> >> i.e. CN=1 (requires at least one "male").
>> >> To the best of my understanding, none of the other methods out there
>> >> use this, but instead it is common to see that only female samples
>> >> are used for the ChrX reference. See the above CRMA paper to see how
>> >> much the ChrX CN estimates are improved when you use the above
>> >> bias-corrected method instead.
>>
>> >> > In this case, are the female and male test arrays supposed to have
>> >> > log2 values that are on average above and below zero?
>>
>> >> I'm not sure what you mean by this, but maybe the above answered this
>> >> question too.
>>
>> >> So, how do you do this in aroma.affymetrix? I have on purpose avoided
>> >> giving the details on this until someone asks for it, because it
>> >> involves the use of a new kind of non-finalized sample annotation
>> >> files (SAFs). I don't want to bother people with alpha versions in
>> >> case the API/format changes. It is unlikely that it will change much
>> >> but if you can accept that it might change, I created a new vignette
>> >> explaining how to do it:
>>
>> >> Vignette 'Sex-chromosome bias-corrected reference signals from pooled
>> >> average' http://groups.google.com/group/aroma-affymetrix/web/sex-chromosome-bi...
>>
>> >> It also illustrates a lot of other things so it might be useful for
>> >> others too.
>>
>> >> Hope this helps
>>
>> >> Henrik
>>
>> >> > Regards
>>
>> >> > Marco