Hi All, I'm interested in doing some basic quality control with Affy SNP6.0 chips. In particular, I'd like to get summary statistics and plot densities of the intensity values for each array both before and after normalization (allelic crosstalk calibration and base position normalization). Plotting the densities pre/post-normalization is no problem. Moreover, I can use extractAffyBatch() to get the intensity values and summary statistics before normalizing. However, I get an error about a corrupted CEL file when I try to use extractAffyBatch() post-normalization.
Is there a way to extract the intensity values without using extractAffyBatch()? My code, traceback, and sessionInfo are below. Thanks in advance, Vonn > #Load aroma > library(aroma.affymetrix) Loading required package: R.utils Loading required package: R.oo Loading required package: R.methodsS3 R.methodsS3 v1.2.0 (2010-03-13) successfully loaded. See ?R.methodsS3 for help. R.oo v1.7.3 (2010-06-04) successfully loaded. See ?R.oo for help. R.utils v1.5.0 (2010-08-04) successfully loaded. See ?R.utils for help. Loading required package: R.filesets Loading required package: digest R.filesets v0.8.3 (2010-07-06) successfully loaded. See ?R.filesets for help. Loading required package: aroma.core Loading required package: R.cache R.cache v0.3.0 (2010-03-13) successfully loaded. See ?R.cache for help. Loading required package: R.rsp R.rsp v0.3.6 (2009-09-16) successfully loaded. See ?R.rsp for help. Type browseRsp() to open the RSP main menu in your browser. Loading required package: matrixStats matrixStats v0.2.1 (2010-04-05) successfully loaded. See ?matrixStats for help. Loading required package: aroma.light aroma.light v1.16.1 (2010-06-23) successfully loaded. See ?aroma.light for help. aroma.core v1.7.0 (2010-07-26) successfully loaded. See ?aroma.core for help. Loading required package: aroma.apd Loading required package: R.huge R.huge v0.2.0 (2009-10-16) successfully loaded. See ?R.huge for help. Loading required package: affxparser aroma.apd v0.1.7 (2009-10-16) successfully loaded. See ?aroma.apd for help. aroma.affymetrix v1.7.0 (2010-07-26) successfully loaded. See ? aroma.affymetrix for help. > log = verbose = Arguments$getVerbose(-8, timestamp = TRUE) > options(digits = 4) > > #Test to make sure things are working > cdf = AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full") > print(cdf) AffymetrixCdfFile: Path: annotationData/chipTypes/GenomeWideSNP_6 Filename: GenomeWideSNP_6,Full.cdf Filesize: 470.44MB Chip type: GenomeWideSNP_6,Full RAM: 0.00MB File format: v4 (binary; XDA) Dimension: 2572x2680 Number of cells: 6892960 Number of units: 1881415 Cells per unit: 3.66 Number of QC units: 4 > > gi = getGenomeInformation(cdf) > print(gi) UgpGenomeInformation: Name: GenomeWideSNP_6 Tags: Full,na30,hg18,HB20100215 Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215 Pathname: annotationData/chipTypes/GenomeWideSNP_6/ GenomeWideSNP_6,Full,na30,hg18,HB20100215.ugp File size: 8.97 MB (9407867 bytes) RAM: 0.00 MB Chip type: GenomeWideSNP_6,Full > > si = getSnpInformation(cdf) > print(si) UflSnpInformation: Name: GenomeWideSNP_6 Tags: Full,na30,hg18,HB20100215 Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215 Pathname: annotationData/chipTypes/GenomeWideSNP_6/ GenomeWideSNP_6,Full,na30,hg18,HB20100215.ufl File size: 7.18 MB (7526452 bytes) RAM: 0.00 MB Chip type: GenomeWideSNP_6,Full Number of enzymes: 2 > > acs = AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE)) > print(acs) AromaCellSequenceFile: Name: GenomeWideSNP_6 Tags: HB20080710 Full name: GenomeWideSNP_6,HB20080710 Pathname: annotationData/chipTypes/GenomeWideSNP_6/ GenomeWideSNP_6,HB20080710.acs File size: 170.92 MB (179217531 bytes) RAM: 0.00 MB Number of data rows: 6892960 File format: v1 Dimensions: 6892960x26 Column classes: raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw Number of bytes per column: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 Footer: <createdOn>20080710 22:47:02 PDT</ createdOn><platform>Affymetrix</platform><chipType>GenomeWideSNP_6</ chipType><srcFile><filename>GenomeWideSNP_6.probe_tab</ filename><filesize>341479928</ filesize><checksum>2037c033c09fd8f7c06bd042a77aef15</checksum></ srcFile><srcFile2><filename>GenomeWideSNP_6.CN_probe_tab</ filename><filesize>96968290</ filesize><checksum>3dc2d3178f5eafdbea9c8b6eca88a89c</checksum></ srcFile2> Chip type: GenomeWideSNP_6 Platform: Affymetrix > > #Load CEL files in the folder "testing." Note that inside the working > directory we have the > #path rawData//testing//GenomeWideSNP_6, which contains the appropriate CEL > files. > csR = AffymetrixCelSet$byName("smalltesting1", cdf = cdf) > print(csR) AffymetrixCelSet: Name: smalltesting1 Tags: Path: rawData/smalltesting1/GenomeWideSNP_6 Platform: Affymetrix Chip type: GenomeWideSNP_6,Full Number of arrays: 9 Names: sample1, sample10, ..., sample9 Time period: 2010-04-20 13:23:11 -- 2010-04-20 18:43:03 Total file size: 592.90MB RAM: 0.01MB > #Create AffyBatch object for csR > ab = extractAffyBatch(csR) Loading required package: affy Loading required package: Biobase Welcome to Bioconductor Vignettes contain introductory material. To view, type 'openVignette()'. To cite Bioconductor, see 'citation("Biobase")' and for packages 'citation(pkgname)'. Attaching package: 'Biobase' The following object(s) are masked from 'package:matrixStats': anyMissing, rowMedians Attaching package: 'affy' The following object(s) are masked from 'package:aroma.light': plotDensity The following object(s) are masked _by_ package:Biobase : .Depends The following object(s) are masked _by_ package:aroma.affymetrix : .Depends The following object(s) are masked _by_ package:aroma.apd : .Depends The following object(s) are masked _by_ package:R.huge : .Depends The following object(s) are masked _by_ package:aroma.core : .Depends The following object(s) are masked _by_ package:aroma.light : .Depends plotDensity The following object(s) are masked _by_ package:matrixStats : .Depends The following object(s) are masked _by_ package:R.rsp : .Depends The following object(s) are masked _by_ package:R.cache : .Depends The following object(s) are masked _by_ package:R.filesets : .Depends The following object(s) are masked _by_ package:R.utils : .Depends The following object(s) are masked _by_ package:R.oo : .Depends Loading required package: genomewidesnp6,fullcdf Warning messages: 1: In fcn(...) : Packages reordered in search path: package:affy 2: In extractAffyBatch.AffymetrixCelSet(csR) : CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The 'affy' package will later try to download it from Bioconductor and install it. > > ######################### > ######################### > #####Now for normalization. As per CMRAv2.0, we perform allelic > #####crosstalk calibration and normalization for nucleotide-position > #####probe sequence effects. > ######################### > ######################### > acc = AllelicCrosstalkCalibration(csR, model = "CRMAv2") > csC = process(acc, verbose = verbose) 20100903 13:35:10|Calibrating data set for allelic cross talk... 20100903 13:35:11| Already calibrated 20100903 13:35:11|Calibrating data set for allelic cross talk...done > bpn = BasePositionNormalization(csC, target = "zero") > csN = process(bpn, verbose = verbose) 20100903 13:35:11|Normalization data set for probe-sequence effects... 20100903 13:35:11| Already normalized 20100903 13:35:11|Normalization data set for probe-sequence effects...done > > #Now repeat the above QC steps based on the normalized data > abN = extractAffyBatch(csN) Loading required package: genomewidesnp6,fullcdf Error in read.affybatch(filenames = l$filenames, phenoData = l $phenoData, : It appears that the file probeData/smalltesting1,ACC,ra,-XY,BPN,-XY/ GenomeWideSNP_6/sample1.CEL is corrupted. In addition: Warning message: In extractAffyBatch.AffymetrixCelSet(csN) : CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The 'affy' package will later try to download it from Bioconductor and install it. > > traceback() 5: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra, ref.cdfName, dim.intensity, verbose, PACKAGE = "affyio") 4: read.affybatch(filenames = l$filenames, phenoData = l$phenoData, description = l$description, notes = notes, compress = compress, rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra, verbose = verbose, sd = sd, cdfname = cdfname) 3: ReadAffy(filenames = filenames, sampleNames = sampleNames, ..., verbose = as.logical(verbose)) 2: extractAffyBatch.AffymetrixCelSet(csN) 1: extractAffyBatch(csN) > sessionInfo() R version 2.11.1 (2010-05-31) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Biobase_2.6.1 aroma.affymetrix_1.7.0 aroma.apd_0.1.7 [4] affxparser_1.18.0 R.huge_0.2.0 aroma.core_1.7.0 [7] aroma.light_1.16.1 matrixStats_0.2.1 R.rsp_0.3.6 [10] R.cache_0.3.0 R.filesets_0.8.3 digest_0.4.2 [13] R.utils_1.5.0 R.oo_1.7.3 affy_1.24.2 [16] R.methodsS3_1.2.0 loaded via a namespace (and not attached): [1] affyio_1.14.0 preprocessCore_1.8.0 tools_2.11.1 > -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/