Hello,

I'm analyzing data from *Affymetrix HuGene-1_0-st-v1* chips using both the RMA and GCRMA methods. For this purpose I'm using the binary CDF file (HuGene-1_0-st-v1,r3.cdf.gz) provided at http://aroma-project.org/chipTypes/HuGene-1_0-st-v1. (Please note that the link on that page is wrong and points to the Ensembl version. The file I'm using was downloaded directly from the directory: http://aroma-project.org/data/annotationData/chipTypes/HuGene-1_0-st-v1/ )

The background adjustment (both RMA and GCRMA) completes successfully, but during the process I get the following message for each file:

> Cannot create CEL file of version 4
> (probeData/partProjectGene,rma/HuGene-1_0-st-v1/1Z-1_(HuGene-1_0-st-v1).CEL.tmp).
>
> Template CEL file is of version 1:
> rawData/partProjectGene/HuGene-1_0-st-v1/1Z-1_(HuGene-1_0-st-v1).CEL

I guess it's because the versions of the CEL files and the .cdf differ, but I have no idea whether it affects the analysis output. Should I be worried?

Since I'm quite new to microarray normalization and aroma, the mechanism of GCRMA normalization is partially unclear to me. From what I've learned reading other topics/lists, because the HuGene array is PM-only, I should use the 'affinities' model and point to control probes. The probe.tab file for the HuGene chip is quite different from the probe.tab for other chips (e.g. HG-U133_Plus_2):

> head(HuGene-1_0-st-v1, 2)
  Probe.ID Transcript.Cluster.ID probe.x probe.y          assembly seqname start  stop strand
1   438514               7896736     663     417 build-GRCh37/hg19    chr1 54904 54928      +
2   685482               7896736     881     652 build-GRCh37/hg19    chr1 54906 54930      +
             probe.sequence target.strandedness category
1 AATGGCTTGTCCCTGTATTCTCAGC               Sense     main
2 GCAATGGCTTGTCCCTGTATTCTCA               Sense     main

> head(HG-U133_Plus_2, 2)
  Probe.Set.Name Probe.X Probe.Y Probe.Interrogation.Position
1      1007_s_at     718     317                         3330
2      1007_s_at    1105     483                         3443
             Probe.Sequence Target.Strandedness
1 CACCCAGCTGGTCCTGTGGATGGGA           Antisense
2 GCCCCACTGGACAACACTGATTCCT           Antisense

Because of that, I modified the probe.tab file following this guide: http://compbio.sysbiol.cam.ac.uk/Resources/GeneST/ and it now looks as shown below. The former *"Probe.ID"* column is now used as *"Probe.Interrogation.Position"*. This of course no longer makes sense as a position, but can it affect the GCRMA background adjustment? Is this column even used by the script?

   Probe Set Name  Probe X  Probe Y  Probe Interrogation Position  Probe Sequence             Target Strandedness
1  7896736         663      417      438514                        AATGGCTTGTCCCTGTATTCTCAGC  Sense
2  7896736         881      652      685482                        GCAATGGCTTGTCCCTGTATTCTCA  Sense

Another concern is about the probes I'm using to compute affinities.

> table(HuGene-1_0-st-v1$category)

            control->affx control->bgp->antigenomic                      main
                     4649                     16943                    818005
           normgene->exon          normgene->intron  rescue->FLmRNA->unmapped
                     4517                     10990                      6389

At the moment I'm using the antigenomic probes only, running the following commands:

ctrlAntiIndex <- which(HuGene-1_0-st-v1$category == 'control->bgp->antigenomic')
bcGcA <- GcRmaBackgroundCorrection(cs, tags=c('gcrma','affinities'), type='affinities',
                                   indicesNegativeControl=ctrlAntiIndex)

However, I'm not sure whether I should use the 'control->affx' probes too (or maybe instead). If not, should I filter them out, especially knowing that some of them are shorter than 25 nt and so could inappropriately affect the background correction?

Also, I'm not entirely sure which index I should provide for the 'indicesNegativeControl' parameter. Currently it's the row index of the entry in the probe.tab file, but I don't know whether it should also match the .cdf file, since the two come from different sources.

I'd be happy if you could help me with any of those issues.
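
To make the index question concrete, here is a minimal sketch using only the two probe.tab rows shown in this post. It checks whether Probe.ID equals a one-based cell index computed from the zero-based (x, y) coordinates, and then selects control probes by category and sequence length. The 1050-column grid width and the helper name `selectControlCells` are assumptions made for illustration, not taken from the aroma.affymetrix documentation, and whether 'indicesNegativeControl' expects such cell indices rather than probe.tab row numbers is not confirmed here.

```r
# Two rows copied from the HuGene-1_0-st-v1 probe.tab excerpt in this post.
probes <- data.frame(
  Probe.ID       = c(438514L, 685482L),
  probe.x        = c(663L, 881L),
  probe.y        = c(417L, 652L),
  probe.sequence = c("AATGGCTTGTCCCTGTATTCTCAGC", "GCAATGGCTTGTCCCTGTATTCTCA"),
  category       = c("main", "main"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: the chip grid is 1050 columns wide. In a real analysis this
# value should be read from the CDF header instead of being hard-coded.
nbrOfColumns <- 1050L

# One-based cell index computed from zero-based (x, y) coordinates.
cellIndex <- probes$probe.y * nbrOfColumns + probes$probe.x + 1L

# Compare the computed cell index with the Probe.ID column, row by row.
print(cellIndex == probes$Probe.ID)

# Hypothetical helper: return the Probe.ID values of all probes in one
# category whose sequence has exactly `probeLength` bases.
selectControlCells <- function(tab, category, probeLength = 25L) {
  keep <- tab$category == category &
    nchar(tab$probe.sequence) == probeLength
  tab$Probe.ID[keep]
}
```

On the full table, a call such as `selectControlCells(probeTab, 'control->bgp->antigenomic')` (with `probeTab` standing in for the loaded probe.tab data frame) would return Probe.ID values rather than row numbers, and would drop any probe that is not a 25-mer. The two differ whenever the probe.tab rows are not in cell order, which is already the case for the two rows above.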

Best regards,
Marcin Kaminski

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250
[3] LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C
[5] LC_TIME=Polish_Poland.1250

attached base packages:
[1] grid parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] VennDiagram_1.6.7 dendextend_0.17.1 bioDist_1.36.0 KernSmooth_2.23-12 RColorBrewer_1.0-5
 [6] limma_3.20.8 simpleaffy_2.40.0 genefilter_1.46.1 preprocessCore_1.26.1 aroma.light_2.0.0
[11] matrixStats_0.10.1 aroma.affymetrix_2.12.4 aroma.core_2.12.4 R.devices_2.9.2 R.filesets_2.5.9
[16] R.utils_1.32.6 R.oo_1.18.2 affyPLM_1.40.1 R.methodsS3_1.6.2 affxparser_1.36.0
[21] hugene10stv1gcrmacdf_1.40.0 AnnotationForge_1.6.1 org.Hs.eg.db_2.14.0 RSQLite_0.11.4 DBI_0.2-7
[26] gcrma_2.36.0 affy_1.42.3 AnnotationDbi_1.26.0 GenomeInfoDb_1.0.2 Biobase_2.24.0
[31] BiocGenerics_0.10.0 makecdfenv_1.40.0 affyio_1.32.0 BiocInstaller_1.14.2 rj_2.0.2-1

loaded via a namespace (and not attached):
 [1] annotate_1.42.1 aroma.apd_0.5.0 base64enc_0.1-2 Biostrings_2.32.1 digest_0.6.4 DNAcopy_1.38.1 IRanges_1.22.10 magrittr_1.0.1
 [9] PSCBS_0.43.0 R.cache_0.10.0 R.huge_0.8.0 R.rsp_0.19.3 rj.gd_2.0.0-1 splines_3.1.1 stats4_3.1.1 survival_2.37-7
[17] tools_3.1.1 whisker_0.3-2 XML_3.98-1.1 xtable_1.7-4 XVector_0.4.0 zlibbioc_1.10.0