[aroma.affymetrix] Re: Reproducing RMA with Gene ST data (Was: Re: [aroma.affymetrix] Re: How do you analyze Gene ST Data?)
Hi all. Just to follow up on these comments here. 'fitPLM' with default parameters in the affyPLM package should give practically identical results to the 'standard' pipeline (RMA bg corr + quantile + fit) within aroma.affymetrix, assuming the underlying annotation is the same. This was an easy comparison back in the day of 3' IVT arrays. Now, its a little more difficult. If anyone is willing, I'd be keen to know if these two actually do give the same results on the newer chips i.e. is the underlying annotation the same? I seem to recall that because these newer chips occasionally have probes that are shared amongst different probesets, that the older style affy package would not be able to handle it. For example, Jim MacDonald's post: http://article.gmane.org/gmane.science.biology.informatics.conductor/19184 Jim says there that you "don't want to use affy" for these chips (not 100% sure why). He suggests the whole pdInfoBuilder/oligo thing which at one time had some bugs, but is probably more stable now. I haven't dug deeper as to whether the annotation that 'fitPLM' uses by default ('hugene10st.db' presumably?) matches the annotation that would be used by aroma.affymetrix (the converted-to-binary 'unsupported' CDF file). I know Mark Cowley did find some inconsistencies: http://article.gmane.org/gmane.science.biology.informatics.conductor/19738 This makes me think that we may want to alternatively construct the XXGene 1.0 CDFs (XX=Hu,Mo,Ra) directly from the PGF/CLF files instead of from the unsupported CDF. I suspect that there will only be minor changes, so I haven't looked too deeply into it. Anyone want to check? In addition to what Henrik says about flavour="affyPLM" ... for a lot of my work, there is definitely additional value in using the auxiliary information from the PLMs (i.e. weights, residuals) ... you don't get this directly with oligo/median polish. Few more specific notes ... >> library(affy) >> TestData <- ReadAffy() >> TestEset <- rma(TestData) >> >> If you plot(AromaEset[,1],TestEset[,1]) you can visualize how >> different the data is. I assume you ensured the probesets and samples are in the same order? (Or, is this somehow covered by the plot method ...) I can't tell from this sequence of commands. I don't know what this plot looks like, so I don't whether to be alarmed or not. >> However, I noticed that my GeneST normalized data is quite different >> from the data that I produce using the Affy package. When looking at >> the controls on the array I see that the Aroma normalized data is >> between 5-10% lower than that produced by the Bioconductor Affy >> packages. However, for some probes this difference is can be quite >> large (values are averaged across 16 samples): >> >> ProbeBionconductorAroma >> 10341096 (neg_control) 6079 852 >> 10341735 (neg_control) 25 87 >> 10340969 (pos_control) 3953 1758 >> 10338477 (pos_control) 293 611 >> >> Ultimately, when my downstream analysis looks for differential >> expression the differences between the two analysis approaches become >> minimal, but I did notice that the Aroma package seems to call more >> control probes and probes with no known gene as being differentially >> expressed (i.e. it looks noisier). These probes were all pretty close >> to my 2 fold cutoff, and at 3 fold or greater the data looked the >> same. Tough to really tell. You'd probably want to average on the log2 scale, not the linear scale. This could be a difference in median- polish versus robust PLM, or could be a differnce in annotation. Requires some digging. Also, you may want to remove control probesets before doing DE analysis --- for one thing, you pay a slightly lesser penalty for multiple testing. Cheers, Mark -- Mark Robinson Epigenetics Laboratory, Garvan Bioinformatics Division, WEHI e: [EMAIL PROTECTED] e: [EMAIL PROTECTED] p: +61 (0)3 9345 2628 f: +61 (0)3 9347 0852 -- --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: Generation of annotation for custom/unknown cel/cdfs
Hi. On Wed, Nov 26, 2008 at 2:08 PM, David Rosenberg <[EMAIL PROTECTED]> wrote: > Thank you much for your help. A couple of follow-ups: > 1. In terms of 'x', 'y' and 'other' chromosomes (and positions), should I > just assign 'dummy' values? Should any be left NA? For the human genome, I map ChrX=23 and ChrY=24. This seems to be the de facto standard. For mitochondrial DNA, I've assigned it ChrM=25. I don't know of a standard for this. It is ok to have NAs, that is, for those units you cannot map estimates to the genome. They will not show up in the ChromosomeExplorer and they will not be part of any segmentation either and so. > 2. Lacking a NetAffx csv file, how might I determine fragment lengths? I have no aces for this. If you know the restriction enzymes used, you can find out at what locations they digest the DNA. This information will define the lengths of all digested fragments. Then you have to map the units in your CDF to these fragments. I haven't done this myself and you will have to turn to external resources for identifying the locations etc. You could also try to contact Affymetrix to see if they have this information for your custom CDF. Cheers /Henrik > Thanks again. > > On Nov 26, 2008 1:12 AM, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: > > > Hi. > > On Tue, Nov 25, 2008 at 10:56 AM, David Rosenberg > <[EMAIL PROTECTED]> wrote: > > I apolog... > > The UGP and UFL files are handled by the AromaUgpFile and the > AromaUflFile classes which are both subclasses of the > AromaTabularBinaryFile which always defines the underlying binary file > format (see > http://groups.google.com/group/aroma-affymetrix/web/aromatabularbinaryfile). > > See Page 'Aroma Unit Genome Position (UGP) files' > (http://groups.google.com/group/aroma-affymetrix/web/unit-genome-position-ugp-files) > for examples how to import NetAffx CSV data or other data to UGP files > or manually assign the values. Note that the records of an UGP file > (and also an UFL file) are indexed by units, that is, each row maps to > the corresponding unit (with the same index) in the CDF. Since your > unit names (we avoid the term probeset because a unit can contain > multiple groups/sets of probes) have format > -- you can infer the (unit, > chromosome, position) from the unit names alone, in case you don't > have an NetAffx CSV file. For example: > > unitNames <- getUnitNames(cdf); > pattern <- "(.*)-(.*)-(.*)"; > chr <- gsub(pattern, "\\2", unitNames); > pos <- gsub(pattern, "\\3", unitNames); > > chr <- as.integer(chr); # If there are 'X' and 'Y' you have to deal > with those too. > pos <- as.integer(pos); > > Then, allocate an empty UGP file and assign (chr, pos) to it: > > ugp <- AromaUgpFile$allocateFromCdf(cdf, tags="mm33,DM20081126"); > ugp[,1] <- chr; > ugp[,2] <- pos; > > That should do. > > You can do similar things for your UFL file. See Page 'Unit Fragment > Length (UFL) files' > (http://groups.google.com/group/aroma-affymetrix/web/unit-fragment-length-ufl-files) > for further details. > > Hope this helps > > Henrik > >> > > > > > --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: Generation of annotation for custom/unknown cel/cdfs
Thank you much for your help. A couple of follow-ups: 1. In terms of 'x', 'y' and 'other' chromosomes (and positions), should I just assign 'dummy' values? Should any be left NA? 2. Lacking a NetAffx csv file, how might I determine fragment lengths? Thanks again. On Nov 26, 2008 1:12 AM, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: Hi. On Tue, Nov 25, 2008 at 10:56 AM, David Rosenberg < [EMAIL PROTECTED]> wrote: > > I apolog... The UGP and UFL files are handled by the AromaUgpFile and the AromaUflFile classes which are both subclasses of the AromaTabularBinaryFile which always defines the underlying binary file format (see http://groups.google.com/group/aroma-affymetrix/web/aromatabularbinaryfile). See Page 'Aroma Unit Genome Position (UGP) files' ( http://groups.google.com/group/aroma-affymetrix/web/unit-genome-position-ugp-files ) for examples how to import NetAffx CSV data or other data to UGP files or manually assign the values. Note that the records of an UGP file (and also an UFL file) are indexed by units, that is, each row maps to the corresponding unit (with the same index) in the CDF. Since your unit names (we avoid the term probeset because a unit can contain multiple groups/sets of probes) have format -- you can infer the (unit, chromosome, position) from the unit names alone, in case you don't have an NetAffx CSV file. For example: unitNames <- getUnitNames(cdf); pattern <- "(.*)-(.*)-(.*)"; chr <- gsub(pattern, "\\2", unitNames); pos <- gsub(pattern, "\\3", unitNames); chr <- as.integer(chr); # If there are 'X' and 'Y' you have to deal with those too. pos <- as.integer(pos); Then, allocate an empty UGP file and assign (chr, pos) to it: ugp <- AromaUgpFile$allocateFromCdf(cdf, tags="mm33,DM20081126"); ugp[,1] <- chr; ugp[,2] <- pos; That should do. You can do similar things for your UFL file. See Page 'Unit Fragment Length (UFL) files' ( http://groups.google.com/group/aroma-affymetrix/web/unit-fragment-length-ufl-files ) for further details. Hope this helps Henrik > > > > --~--~-~--~~~---~--~~ When reporting problems on aroma.af... --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: annotationData over network using Windows Shortcuts *.lnk (Was: Re: [aroma.affymetrix] Re: How do you analyze Gene ST Data?)
Hi, so this is a bit confusing. I've been trying to reproduce it myself over a local Windows network, but I cannot. I did however locate a potential problem where R incorrectly believes it does not have the permission to read the file, which might be what you are experience. Could you please try the following? If you get an error, please report what traceback() gives. library("aroma.affymetrix"); # This should be the absolute pathname to the CDF file pathname <- "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/chipTypes/MoGene-1_0-st-v1/MoGene-1_0-st-v1,r3.cdf" # Assert that the path is correct stopifnot(isDirectory(dirname(pathname))); # Get some file info print(as.list(file.info(pathname))); # Check file permissions print(file.access(pathname, mode=0)); print(file.access(pathname, mode=1)); print(file.access(pathname, mode=2)); print(file.access(pathname, mode=4)); # Try to read the CDF reader using affxparser hdr <- readCdfHeader(pathname); str(hdr); cdf <- AffymetrixCdfFile(pathname); print(cdf); pathname2 <- AffymetrixCdfFile$findByChipType("MoGene-1_0-st-v1", tags="r3", verbose=-100); print(pathname2); print(as.list(file.info(pathname2))); cdf <- AffymetrixCdfFile$byChipType("MoGene-1_0-st-v1", tags="r3", verbose=-100); print(pathname2); /Henrik On Wed, Nov 26, 2008 at 8:46 AM, pwhiteusa <[EMAIL PROTECTED]> wrote: > > Hi Henrik, > > This was the output this time around: > >> path2 <- filePath(path, "chipTypes", "MoGene-1_0-st-v1") >> print(path2); > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > annotationData/chipTypes/MoGene-1_0-st-v1" >> print(file.info(path2)); > > size > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v10 > > isdir > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 TRUE > > mode > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 777 > > mtime > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 2008-11-20 12:45:36 > > ctime > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 2008-11-20 12:45:36 > > atime > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 2008-11-25 18:07:38 > > exe > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > chipTypes/MoGene-1_0-st-v1 no > >> print(list.files(path=path2)); > [1] "MoGene-1_0-st-v1,r3.cdf" > > Cheers, > > Peter > > > On Nov 24, 2:59 pm, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: >> Hi, >> >> one typo by me, please retry with: >> >> path2 <- filePath(path, "chipTypes", "MoGene-1_0-st-v1"); >> >> [I forgot 'chipTypes'] >> >> /Henrik >> >> On Mon, Nov 24, 2008 at 11:56 AM, pwhiteusa <[EMAIL PROTECTED]> wrote: >> >> > The answers are dispersed below >> >> >> > I just wanted to update you on the issue of creating Windows shortcuts >> >> > to point to the correct directory. >> >> >> > So the CDF files I want to use is on a mapped network drive - it is >> >> > mapped to my "P Drive": >> >> >> > P:\ANNOTATION\aromaAffymetrix\annotationData\chipTypes\HuGene-1_0-st- >> >> > v1\HuGene-1_0-st-v1,r3.cdf >> >> >> > I created the following shortcut in my working experiment directory, >> >> > "P:\EXPERIMENTS\NCHRI\TEST\CEL\annotationData.lnk", with the following >> >> > in the Target box: >> >> >> > P:\ANNOTATION\aromaAffymetrix\annotationData >> >> >> Did you create that by "drag'n'drop' by a right clicking or did you >> >> edit the "Target box" manually? Always do the drag'n'drop, because >> >> the property box of a Windows Shortcut does not show all internal >> >> fields or the shortcut and editing it is not safe. >> >> > Yes, I used drag and drop. I just checked the Target box to confirm. >> >> >> > When I type in the command >> >> >> > cdf <- AffymetrixCdfFile$fromChipType("MoGene-1_0-st-v1",tags="r3") >> >> >> > it throws an error because it is unable to find the file. Displaying >> >> > the path reveals: >> >> >> >> Arguments$getReadablePath("annotationData") >> >> > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ >> >> > annotationData" >> >> >> What happens if you do: >> >> >> path <- Arguments$getReadablePath("annotationData"); >> >> print(list.files(path=path)); >> >> > $ path >> > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ >> > annotationData" >> > $ print(list.files(path=path)) >> > [1] "chipTypes" >> >> >> path2 <- filePath(path, "MoGene-1_0-st-v1"); >> >> print(path2); >> >> print(file.info(path2)); >> >> print(list.files(path=path2)); >> >> > $ print(path2); >> > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ >> > annotationData/MoGene-1_0-st-v1" >> > $ print(file.info(path2)); >> >> > size >> > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ >> > MoGene-1_0-st-v1 NA >> >> > isdir >> > //R
[aroma.affymetrix] Re: Reproducing RMA with Gene ST data (Was: Re: [aroma.affymetrix] Re: How do you analyze Gene ST Data?)
Hi. a comment on RmaPlm and argument 'flavor': The RmaPlm class is only summarizing the probe signals - normalization etc are done before RmaPlm. The summarization model is the log-additive model with probe affinities and chip effects. The 'flavor' argument specifies which implementation of the fitting algorithm to use. The default is "affyPLM", which indicates that it uses the implementation in 'affyPLM', which now has moved to the 'preprocessCore' package. Both 'affyPLM' and 'preprocessCore' are developed and maintained by Ben Bolstad. The "oligo" flavor was using the fitting algorithm in the 'oligo' package, which I think in turn originates from the 'affy' package, which in turn was an early version of Ben Bolstad's code. I haven't tried this in awhile, and it appears that 'oligo' has been updated and the internal function for fitting the model is no longer available. This is why you get the error message. Since BB maintain preprocessCore and there has been a lot of improvements in the algorithm (e.g. it supports probe-specific weights and much more), I recommend that you use the "affyPLM" flavor. Since this is default, you do not have to specify the argument 'flavor' at all. For reproducibility of the RMA pipeline in aroma.affymetrix when compared with 'affyPLM', see Page 'Reproducibility of other implementations': http://groups.google.com/group/aroma-affymetrix/web/redundancy-tests As you see, the results are remarkable similar. FYI, for each update of the package this comparison is part of the redundancy testing. Note that only PMs are quantile normalized, this might be a reason for you observing differences. There is more meat on how QuantileNormalization and plotDensity() behaves on Page 'Empirical probe-signal densities and rank-based quantile normalization': http://groups.google.com/group/aroma-affymetrix/web/empirical-probe-signal-densities-and-rank-based-quantile-normalization I am also not sure how similar affyPLM and Affy is, but I trust affyPLM more than Affy, because I see affyPLM as a more up to date version of Affy. Please feel free to contribute with code for comparing toward Affy. If you do, please use the same data set. Cheers Henrik On Wed, Nov 26, 2008 at 12:44 PM, pwhiteusa <[EMAIL PROTECTED]> wrote: > > Hi Manasa, > > So I get exactly the same results with either of the following: > > plm <- RmaPlm(csN, flavor="affyPLM") #according to the documentation > this is the default > plm <- ExonRmaPlm(csN, flavor = "affyPLM", mergeGroups=TRUE) #seems to > take longer to run > > However, I noticed that my GeneST normalized data is quite different > from the data that I produce using the Affy package. When looking at > the controls on the array I see that the Aroma normalized data is > between 5-10% lower than that produced by the Bioconductor Affy > packages. However, for some probes this difference is can be quite > large (values are averaged across 16 samples): > > ProbeBionconductorAroma > 10341096 (neg_control) 6079 852 > 10341735 (neg_control) 25 87 > 10340969 (pos_control) 3953 1758 > 10338477 (pos_control) 293 611 > > Ultimately, when my downstream analysis looks for differential > expression the differences between the two analysis approaches become > minimal, but I did notice that the Aroma package seems to call more > control probes and probes with no known gene as being differentially > expressed (i.e. it looks noisier). These probes were all pretty close > to my 2 fold cutoff, and at 3 fold or greater the data looked the > same. > > Do you know where the packages diverge if their probe summation > approach or is it a difference in background correction? I did attempt > to use the flavor="oligo" (as described in the ?RmaPlm help file) but > this returned the following error: > >> plm2 <- RmaPlm(csN, flavor="oligo") >> fit(plm2) > Exception: The fit function for requested RMA PLM flavor failed: oligo > > Thanks, > > Peter > > P.S. Here is a summary of the code I used for the Bioconductor > approach: > > library(affy) > TestData <- ReadAffy() > TestEset <- rma(TestData) > > If you plot(AromaEset[,1],TestEset[,1]) you can visualize how > different the data is. > > On Nov 24, 11:42 pm, ManasaR <[EMAIL PROTECTED]> wrote: >> Hi, >> >> Just to say that i've been working with some GeneST1.0 data following >> the tips above and it has been a breeze thanks to you guys. I noticed >> something and thought i should mention it - though it might be trivial >> and most users might have already discovered it. Just in case there >> are people like me out there :-) >> >> With regards to the first reply Mark made above and the following >> point, >> >> >1. Process Gene 1.0 ST data much the same as Exon array data, except >> >that you'll need to replace 'ExonRmaPlm' with 'RmaPlm' >> >> RmaPlm does not have "mergeGroups" as an argument. Im assuming t
[aroma.affymetrix] Re: any way to get the name of the SNP as rs#?
Hi, On Wed, Nov 26, 2008 at 7:46 AM, mortiz <[EMAIL PROTECTED]> wrote: > > hi everyone, > > i was wondering if there is any function which gives as a result the > names of the SNPs in their "rs#" form? this mapping information is available in the Affymetrix NetAffx CSV files, which you can download from the Affymetrix support page for each chip type. You find links from the chip type pages on the aroma.affymetrix site. An NetAffx CSV file is a large tab-delimited file. Put in under to corresponding annotationData/ directory and aroma.affymetrix will be able to find it. Here is an example how to read such files in aroma.affymetrix: The pathname of the NetAffx CSV file is: annotationData/chipTypes/Mapping250K_Nsp/Affymetrix/Mapping250K_Nsp.na26.annot.csv library("aroma.affymetrix"); chipType <- "Mapping250K_Nsp"; csv <- AffymetrixNetAffxCsvFile$byChipType(chipType, tags=".na26"); print(csv); AffymetrixNetAffxCsvFile: Name: Mapping250K_Nsp.na26.annot Tags: Pathname: annotationData/chipTypes/Mapping250K_Nsp/Affymetrix/Mapping250K_Nsp.na26.annot.csv File size: 426.48MB RAM: 0.03MB Number of data rows: NA Columns [26]: 'probeSetID', 'affySNPID', 'dbSNPRSID', 'chromosome', 'physicalPos ition', 'strand', 'chrXPseudo-autosomalRegion1', 'cytoband', 'flank', 'alleleA', 'alleleB', 'associatedGene', 'geneticMap', 'microsatellite', 'fragmentEnzymeTyp eLengthStartStop', 'alleleFrequencies', 'heterozygousAlleleFrequencies', 'number Ofindividuals/NumberOfChromosomes', 'inHapmap', 'strandVersusdbSNP', 'copyNumber Variation', 'probeCount', 'chrXPseudo-autosomalRegion2', 'inFinalList', 'minorAl lele', 'minorAlleleFrequency' Number of text lines: NA # We only want to read two of the columns. We use regular expression to identify # them by their names, and we read them as "character" strings. Everything else # will be ignored (the corresponding 'colClasses' will be set to "NULL"). colClassPatterns <- c("(probeSetID|dbSNPRSID)"="character"); df <- readDataFrame(csv, colClassPatterns=colClassPatterns, rows=10:20); print(df); probeSetID dbSNPRSID 10 SNP_A-1780611 rs7660291 11 SNP_A-1780610 rs1340013 12 SNP_A-1780415 rs7730126 13 SNP_A-1780576 rs6675133 14 SNP_A-1780413 rs10834942 15 SNP_A-1780412 rs17419765 16 SNP_A-1780574 rs12233450 17 SNP_A-1780379 rs7995987 18 SNP_A-1780572 rs10091369 19 SNP_A-1780378 rs7828844 20 SNP_A-1780377 rs11040883 To read all entries, just drop the 'rows' argument. Hope this helps Henrik > > thanks! > > maria > > > --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: How do you analyze Gene ST Data?
Hi Manasa, So I get exactly the same results with either of the following: plm <- RmaPlm(csN, flavor="affyPLM") #according to the documentation this is the default plm <- ExonRmaPlm(csN, flavor = "affyPLM", mergeGroups=TRUE) #seems to take longer to run However, I noticed that my GeneST normalized data is quite different from the data that I produce using the Affy package. When looking at the controls on the array I see that the Aroma normalized data is between 5-10% lower than that produced by the Bioconductor Affy packages. However, for some probes this difference is can be quite large (values are averaged across 16 samples): ProbeBionconductorAroma 10341096 (neg_control) 6079 852 10341735 (neg_control) 25 87 10340969 (pos_control) 3953 1758 10338477 (pos_control) 293 611 Ultimately, when my downstream analysis looks for differential expression the differences between the two analysis approaches become minimal, but I did notice that the Aroma package seems to call more control probes and probes with no known gene as being differentially expressed (i.e. it looks noisier). These probes were all pretty close to my 2 fold cutoff, and at 3 fold or greater the data looked the same. Do you know where the packages diverge if their probe summation approach or is it a difference in background correction? I did attempt to use the flavor="oligo" (as described in the ?RmaPlm help file) but this returned the following error: > plm2 <- RmaPlm(csN, flavor="oligo") > fit(plm2) Exception: The fit function for requested RMA PLM flavor failed: oligo Thanks, Peter P.S. Here is a summary of the code I used for the Bioconductor approach: library(affy) TestData <- ReadAffy() TestEset <- rma(TestData) If you plot(AromaEset[,1],TestEset[,1]) you can visualize how different the data is. On Nov 24, 11:42 pm, ManasaR <[EMAIL PROTECTED]> wrote: > Hi, > > Just to say that i've been working with some GeneST1.0 data following > the tips above and it has been a breeze thanks to you guys. I noticed > something and thought i should mention it - though it might be trivial > and most users might have already discovered it. Just in case there > are people like me out there :-) > > With regards to the first reply Mark made above and the following > point, > > >1. Process Gene 1.0 ST data much the same as Exon array data, except > >that you'll need to replace 'ExonRmaPlm' with 'RmaPlm' > > RmaPlm does not have "mergeGroups" as an argument. Im assuming this > might be a default but do correct me if im wrong since i do want > values for each transcript separately. So if you have error messages > at this point, > > >plmTr <- RmaPlm(csN, mergeGroups=TRUE) > >print(plmTr) > > this is probably why > > Also, i declared flavour = "affyPLM" since that's what i was using to > analyse this data outside of aroma.affymetrix and i wasnt sure what > the default was. > > Cheers, > > Manasa > > On Nov 25, 6:59 am, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: > > > Hi, > > > one typo by me, please retry with: > > > path2 <- filePath(path, "chipTypes", "MoGene-1_0-st-v1"); > > > [I forgot 'chipTypes'] > > > /Henrik > > > On Mon, Nov 24, 2008 at 11:56 AM, pwhiteusa <[EMAIL PROTECTED]> wrote: > > > > The answers are dispersed below > > > >> > I just wanted to update you on the issue of creating Windows shortcuts > > >> > to point to the correct directory. > > > >> > So the CDF files I want to use is on a mapped network drive - it is > > >> > mapped to my "P Drive": > > > >> > P:\ANNOTATION\aromaAffymetrix\annotationData\chipTypes\HuGene-1_0-st- > > >> > v1\HuGene-1_0-st-v1,r3.cdf > > > >> > I created the following shortcut in my working experiment directory, > > >> > "P:\EXPERIMENTS\NCHRI\TEST\CEL\annotationData.lnk", with the following > > >> > in the Target box: > > > >> > P:\ANNOTATION\aromaAffymetrix\annotationData > > > >> Did you create that by "drag'n'drop' by a right clicking or did you > > >> edit the "Target box" manually? Always do the drag'n'drop, because > > >> the property box of a Windows Shortcut does not show all internal > > >> fields or the shortcut and editing it is not safe. > > > > Yes, I used drag and drop. I just checked the Target box to confirm. > > > >> > When I type in the command > > > >> > cdf <- AffymetrixCdfFile$fromChipType("MoGene-1_0-st-v1",tags="r3") > > > >> > it throws an error because it is unable to find the file. Displaying > > >> > the path reveals: > > > >> >> Arguments$getReadablePath("annotationData") > > >> > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > > >> > annotationData" > > > >> What happens if you do: > > > >> path <- Arguments$getReadablePath("annotationData"); > > >> print(list.files(path=path)); > > > > $ path > > > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > > > annotationData" > > > $ print(list.files(path=path)) > > > [1] "c
[aroma.affymetrix] Update on reported issues with generating PNGs using the built-in cairo device
Hi, previously a few people have reported problems when generating PNGs using the new built-in cairo device. From the NEWS file of R v2.8.0 *patched* [http://cran.r-project.org/bin/windows/base/NEWS.R-2.8.0pat] their is one bug fix that might fix this problem: BUG FIXES ... o bmp(), jpeg(), and png() on unix with type="cairo" would segfault on closure if the output file could not be opened. So, in case you experience segfaults/core dumps when running the ChromosomeExplorer or ArrayExplorer, try to update to a newer version of R v2.8.0 *patched*. If anyone notices that this solves the problem, please let us know. Cheers Henrik --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: Discussion on affymetrix-defined-transcript-clusters
Thanks Henrik and Mark, I had found the comment at the dChip forum that you found, Henrik, but wasn't convinced ... I'll try contacting Affymetrix and try doing some analysis, and post back here if that clears this up. ~Joe On Nov 25, 6:19 pm, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: > I found this comment by Cheng Li at the dChip forum: "You can rename > "HuEx-1_0-st-v2.pgf" to "HuEx-1_0-st-v1.pgf". I understand the two > should be for the same physical array, but just different annotation > of probes into probeset." > [http://dchip.forum5.com/viewtopic.php?t=214&mforum=dchip] > > More a long the same lines from Affymetrix: "changed the chiptype > setting in the CEL files from HuEx-1_0-st-v1 to HuEx-1_0-st-v2: This > should mitigate chiptype mismatches when using 3rd party software with > the unsupported CDF file." > [http://www.affymetrix.com/analysis/netaffx/exon/announcements.affx] > > Don't take it is a fact, but as a start. I'd say, send an email to > Affymetrix support (see their webpage) for a definite answer. If you > find out more about this, please let us know. > > Also, at some stage there seems to have existed a 'HuEx-1_0-st-ta1' > chip type as well, > cf.http://www.affymetrix.com/support/developer/powertools/changelog/file... > > Cheers > > Henrik > > On Tue, Nov 25, 2008 at 5:36 PM, Mark Robinson <[EMAIL PROTECTED]> wrote: > > > Hi Joe. > > > While I've analyzed several Exon array datasets, I have never seen any > > HuEx-1_0-st-v1 CEL files. Did you get a very early version of these > > chips? From what I can tell from the Affymetrix web site, there is no > > annotation or anything available for this. But, typically the > > annotation from Affy is versioned separately. For example, from: > > >http://www.affymetrix.com/products_services/arrays/specific/exon.affx... > > > you can see that the 'Annotations CSV' has version na21, na22, etc. > > This seems to be quite separate from the v1/v2 notation. > > > I know Ken had analyzed some data from an earlier version of the Exon > > chip, but that was not made commerically available. > > > All this to say: I don't know. :) > > > Cheers, > > Mark > > > On 26/11/2008, at 12:22 PM, Henrik Bengtsson wrote: > > >> I'll let the experts on exon arrays answer this, but for clarification > >> this question is related to page: > > >>http://groups.google.com/group/aroma-affymetrix/web/affymetrix-define... > > >> /Henrik > > >> On Tue, Nov 25, 2008 at 4:57 PM, Joe Fass <[EMAIL PROTECTED]> > >> wrote: > > >>> This may not be the best place to ask this, but can I ask for a quick > >>> clarification about the v1 / v2 notation? Does this simply refer to > >>> more updated annotations? Is it valid to use the v2 cdf for v1 cel > >>> files? > > > -- > > Mark Robinson > > Epigenetics Laboratory, Garvan > > Bioinformatics Division, WEHI > > e: [EMAIL PROTECTED] > > e: [EMAIL PROTECTED] > > p: +61 (0)3 9345 2628 > > f: +61 (0)3 9347 0852 > > -- --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---
[aroma.affymetrix] Re: How do you analyze Gene ST Data?
Hi Henrik, This was the output this time around: > path2 <- filePath(path, "chipTypes", "MoGene-1_0-st-v1") > print(path2); [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ annotationData/chipTypes/MoGene-1_0-st-v1" > print(file.info(path2)); size //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v10 isdir //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 TRUE mode //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 777 mtime //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 2008-11-20 12:45:36 ctime //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 2008-11-20 12:45:36 atime //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 2008-11-25 18:07:38 exe //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ chipTypes/MoGene-1_0-st-v1 no > print(list.files(path=path2)); [1] "MoGene-1_0-st-v1,r3.cdf" Cheers, Peter On Nov 24, 2:59 pm, "Henrik Bengtsson" <[EMAIL PROTECTED]> wrote: > Hi, > > one typo by me, please retry with: > > path2 <- filePath(path, "chipTypes", "MoGene-1_0-st-v1"); > > [I forgot 'chipTypes'] > > /Henrik > > On Mon, Nov 24, 2008 at 11:56 AM, pwhiteusa <[EMAIL PROTECTED]> wrote: > > > The answers are dispersed below > > >> > I just wanted to update you on the issue of creating Windows shortcuts > >> > to point to the correct directory. > > >> > So the CDF files I want to use is on a mapped network drive - it is > >> > mapped to my "P Drive": > > >> > P:\ANNOTATION\aromaAffymetrix\annotationData\chipTypes\HuGene-1_0-st- > >> > v1\HuGene-1_0-st-v1,r3.cdf > > >> > I created the following shortcut in my working experiment directory, > >> > "P:\EXPERIMENTS\NCHRI\TEST\CEL\annotationData.lnk", with the following > >> > in the Target box: > > >> > P:\ANNOTATION\aromaAffymetrix\annotationData > > >> Did you create that by "drag'n'drop' by a right clicking or did you > >> edit the "Target box" manually? Always do the drag'n'drop, because > >> the property box of a Windows Shortcut does not show all internal > >> fields or the shortcut and editing it is not safe. > > > Yes, I used drag and drop. I just checked the Target box to confirm. > > >> > When I type in the command > > >> > cdf <- AffymetrixCdfFile$fromChipType("MoGene-1_0-st-v1",tags="r3") > > >> > it throws an error because it is unable to find the file. Displaying > >> > the path reveals: > > >> >> Arguments$getReadablePath("annotationData") > >> > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > >> > annotationData" > > >> What happens if you do: > > >> path <- Arguments$getReadablePath("annotationData"); > >> print(list.files(path=path)); > > > $ path > > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > > annotationData" > > $ print(list.files(path=path)) > > [1] "chipTypes" > > >> path2 <- filePath(path, "MoGene-1_0-st-v1"); > >> print(path2); > >> print(file.info(path2)); > >> print(list.files(path=path2)); > > > $ print(path2); > > [1] "//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > > annotationData/MoGene-1_0-st-v1" > > $ print(file.info(path2)); > > > size > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 NA > > > isdir > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 NA > > > mode > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 > > > mtime > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 > > > ctime > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 > > > atime > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 > > > exe > > //RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/annotationData/ > > MoGene-1_0-st-v1 > > > $ print(list.files(path=path2)); > > character(0) > > Warning message: > > In list.files(path = path2) : > > list.files: '//RES2K3FS01/RESAffymetrix/ANNOTATION/aromaAffymetrix/ > > annotationData/MoGene-1_0-st-v1' is not a readable directory > > >> > So it looks as though the issue is that "P:\" is being replaced by R > >> > with the actual name of the Network drive. If I copy the > >> > annotationData directory to my C:\ and create the shortcut to point > >> > there it works just fine, but the arguments command returns "C:\": > > >> >> Arguments$getReadablePath("annotationData") > >> > [1] "C:/annotationData" > > >> So, if I understand you correctly, you are saying that "shortcuting" > >> to another location on the local C: drive works, but when you try to > >> do the same for a drive unit (P:) mounted to a network drive, then you > >> do not get the drive unit path, but th
[aroma.affymetrix] any way to get the name of the SNP as rs#?
hi everyone, i was wondering if there is any function which gives as a result the names of the SNPs in their "rs#" form? thanks! maria --~--~-~--~~~---~--~~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~--~~~~--~~--~--~---