Hi, Mark! Thank you so much! After an intensive study of FIRMA and aroma.affymetrix, I think I sort of figure out a routine for my exon study. And now I am more into R programming instead of C or Matlab :) One side question I still have is the probe affinity effect. I never used it and just wonder what is it for? Maybe I still don't fully understand FIRMA model very well :(
Thanks! Sabrina On Dec 10, 5:34 pm, Mark Robinson <[EMAIL PROTECTED]> wrote: > > I assume that you referred me to the probeset annotation csv file , > > not the transcript annotation csv file. :) > > > I did get the exon coordinates! it seemed to me that the probeset_id > > in the probeset annotation file is equivalent to groupName of > > exFirma. > > Yes! The transcript CSV file may be useful for other things, but > indeed not for exon coordinates. > > In aroma.affymetrix terminology for exon arrays, > unit=transcript_cluster_id, group=probeset_id and cell=probe. > > > One question I have is in the exon array plot from GenomeGraphs, the > > unrData is the probe level data, in other words, in my case, I should > > use the normalized data, csN, is that correct? > > If you look closely at the Purdom paper, there are a few things you > may want to plot in the context of Ensembl annotation. The normalized > data is the obvious one, which you have linked to an object 'csN'. > > In addition, you may want to plot the residuals: > > rs <- calculateResiduals(plmTr) # plmTr is the fitted ExonRmaPlm with > mergeGroups=TRUE > d <- extractMatrix(rs,cells=...) > > ... or for example you may want to plot the raw data, adjusted by the > probe effects. For example, RMA fits the model > > Y_{ij} = a_i + b_j > > (Y_{ij} = normalized data, a_i = chip effects, b_j = probe effects), > you may be interested in plotting: > > Y_{ij} - \hat{b}_j > > ... i.e. raw data, adjusted by the estimated probe effects. > > > another question is graph related though. Since I have two groups > > among my 9 samples, when I plot exon array, is there any way to use > > different color coding for these samples on the same graph? Thanks! > > Short answer: > > library(GenomeGraphs) > ?"ExonArray-class" > > Long answer: create your ExonArray object and use DisplayPars slot to > specify colour and line width: > > ea1<-new("ExonArray", intensity = d, probeStart = ..., probeEnd=..., > probeId = ..., nProbes = ..., dp = DisplayPars(color = > col, lwd=lwd, > mapColor = "dodgerblue2",plotMap=TRUE)) > > where 'col' and 'lwd' are either length 1 (in which case all lines get > the same width and colour) or vectors of the length of the number of > samples ... > > Thanks for all these questions Sabrina. This will help make a nice > vignette ... when I get time :) > > Unless you'd be interested in summarizing your approach! :) > > Cheers, > Mark > > > Sabrina > > > On Dec 9, 5:02 pm, Mark Robinson <[EMAIL PROTECTED]> wrote: > >> Hi Sabrina. > > >> Great work! See below. > > >> On 10/12/2008, at 5:27 AM, sabrina wrote: > > >>> Hi, Mark! > >>> Thank you so much! I think I pretty much figured out how to get the > >>> gene level, exon level expressions and comparison done. I checked > >>> the > >>> GenomeGraphs as you suggested. I tried the code in the exon array > >>> section, and it worked fine. So here is the question related to my > >>> data set. > > >>> In order to plot exon data, I used the following code to get exon > >>> summary (intensities) which is pbsetSummLog2: > > >>> plmNoMerge<- ExonRmaPlm(csN, mergeGroups=FALSE) > >>> fit(plmNoMerge) > >>> readUnits(plmNoMerge,units=1) > > >>> chpNoMerge<-getChipEffectSet(plmNoMerge) > > >>> pbsetSumm<-extractMatrix(chpNoMerge,returnUgcMap=TRUE) > >>> pbsetSummLog2<-log2(pbsetSumm); > >>> pbsetNames<-readCdfGroupNames(getPathname(getCdf(chpNoMerge)), > >>> unit=unique(attr(pbsetSumm,"unitGroupCellMap")[,"unit"])) > >>> rownames(pbsetSumm)<-unlist(pbsetNames) > >>> rownames(pbsetSummLog2)<-unlist(pbsetNames) > > >>> Because each significant exon that was detected by FIRMA is > >>> associated > >>> with one gene, and that gene has several exons, therefore, I used > >>> the > >>> following to find all exons associated to that gene: > >>> (x is the result from topTable) > > >>> temp<-grep(exFirma[x$ID,1][1],exFirma[,1]); > >>> gp_temp<-exFirma[temp,2] > > >>> gene1<-pbsetSummLog2[temp,] > > >>> which gives me: > >>> array1 array 2 ... > >>> 4308385 6.338784 6.896304 > >>> 4376965 1.973171 2.272406 > > >>> I know that "430835" is the groupName (aka exon id), but I am not > >>> sure > >>> where to find the start and end position of these individual exons, > >>> can you show me how to do it? > > >>> The reason I asked this is because in makeExonArray, I need the > >>> probeStart and probeEnd positions. Thanks!!! > > >> Well, you can get the probeset start and end positions from the > >> 'NetAffx Annotation Files' from Affymetrix. For human, this would be > >> at: > > >>http://www.affymetrix.com/products_services/arrays/specific/ > >> exon.affx... > > >> I think you said you were using mouse, so you can find this at: > > >>http://www.affymetrix.com/products_services/arrays/specific/ > >> mouse_exo... > > >> Find the 'Current NetAffx Annotation Files' section and download the > >> csv.zip file. Just to have a quick peak at a few columns of this > >> file, if I run a unix tool called awk on the CSV file you get: > > >> -------- > >> awk '{FS=","; print $7,$1,$2,$3,$4,$5,$16}' HuEx-1_0-st- > >> v2.na25.hg18.probeset.csv | grep -v "^ " | more > >> -------- > > >> "transcript_cluster_id" "probeset_id" "seqname" "strand" "start" > >> "stop" "level" > >> ... > >> "2315373" "2315374" "chr1" "+" "742655" "742719" "core" > >> "2315373" "2315375" "chr1" "+" "742869" "743231" "core" > >> "2315373" "2315376" "chr1" "+" "743293" "743434" "core" > >> "2315373" "2315377" "chr1" "+" "744094" "744979" "core" > >> "2315380" "2315381" "chr1" "+" "752897" "752974" "extended" > >> "2315380" "2315382" "chr1" "+" "754246" "754274" "extended" > >> ... > > >> so there you have everything you need -- identifiers and chromosome > >> locations. You can just use 'read.csv' to read it into R and pull > >> off > >> the coordinates you need. > > >> Cheers, > >> Mark > > >>> Sabrina > >>> On Dec 8, 4:54 pm, Mark Robinson <[EMAIL PROTECTED]> wrote: > >>>> Hi Sabrina. > > >>>> On 08/12/2008, at 3:25 PM, sabrina wrote: > > >>>>> Hi, Mark: > >>>>> Thanks for the help! I think now I have better understanding of > >>>>> LIMMA > >>>>> now after I read the related chapter of LIMMA user's guide :) > > >>>>> On Dec 4, 4:10 pm, "Mark Robinson" <[EMAIL PROTECTED]> wrote: > >>>>>> Sabrina. > > >>>>>> When asking questions like this, please post your full set of > >>>>>> commands. > >>>>>> Its hard to decode what variables have been set without this. :) > > >>>>>> I'm guessing that affyAnnotation is what came from the 'read.csv' > >>>>>> call and > >>>>>> x2 is the output of topTable? Have you set the rownames of > >>>>>> exFirma? > > >>>>> x2 is the output of topTable > >>>>> I do not know how to set rownames of exFirma, when I check exFirma > >>>>> [1:3,1:6], it gave me the following: > > >>>>> unitName groupName unit group cell Array1 > >>>>> 1 6838637 4304927 1 1 1 1.0372756 > >>>>> 2 6838637 4330595 1 2 2 -1.1679380 > >>>>> 3 6838637 4356771 1 3 3 1.2217411 > > >>>>> Can you show me how to set the rownames? where can I find the row > >>>>> names? Is it correct that groupName is the exon id ? > > >>>> A good place to start: > > >>>> ?rownames > > >>>>> I used the following code to import affymetrix annotation file: > > >>>>> affyAnnotation<- read.csv(file="MoEx-1_0-st- > >>>>> v1.na27.mm9.transcript.csv",head=TRUE,sep=",",skip=20); > > >>>>> and > > >>>>>> Is this just a typo then? It should be > >>>>>> affyAnnotation$transcript_cluster_id instead of > >>>>>> affyAnnotation$transcript_cluster_Id. > > >>>>>> What does exFirma[x2$ID,1] actually give you? If the > >>>>>> rownames(exFirma) > >>>>>> are not set, this could mislead you. > > >>>>> I used the suggestion you gave to me to redo the lmFit, and here > >>>>> is > >>>>> what I used : > > >>>>> mm <- model.matrix(~cls) > >>>>> fit <- lmFit(exFirma[,6:ncol(exFirma)], mm) > >>>>> fit<-eBayes(fit) > >>>>> x<-topTable(fit,coef=2,adj="fdr") > > >>>>> Then I used exFirma[x$ID,1], it gave me: > >>>>> [1] "6996667" "6965348" "6982181" "6917790" "6899766" "7005797" > >>>>> "6880577" "6962054" "6897349" > >>>>> [10] "7018774" > > >>>>> when I printed x, it gave me: > > >>>>> ID logFC t P.Value adj.P.Val B > >>>>> 11253 11253 -4.856783 -8.524568 2.223806e-06 0.4982258 -4.323317 > >>>>> 183160 183160 4.058556 7.676406 6.413325e-06 0.7184271 -4.332955 > >>>>> 32235 32235 -3.692429 -6.555475 2.963704e-05 1.0000000 -4.350176 > >>>>> 132610 132610 2.315536 6.508698 3.170265e-05 1.0000000 -4.351036 > >>>>> 165279 165279 3.269404 6.311540 4.225007e-05 1.0000000 -4.354809 > >>>>> 171777 171777 3.845620 6.174738 5.172731e-05 1.0000000 -4.357574 > >>>>> 45380 45380 1.971388 6.088251 5.886544e-05 1.0000000 -4.359388 > >>>>> 132873 132873 -4.569042 -6.021868 6.505114e-05 1.0000000 -4.360815 > >>>>> 86037 86037 3.206732 5.890372 7.943255e-05 1.0000000 -4.363738 > >>>>> 27761 27761 -2.651559 -5.825434 8.774484e-05 1.0000000 -4.365229 > > >>>>> when I typed exFirma[11253,], it gave me: > > >>>>> exFirma[11253,] > >>>>> unitName groupName unit group cell array1, ... > >>>>> 11253 6996667 5092691 535 6 11253 2.718783 ... > > >>>>> so it appeared to me that x$ID matches the index of exFirma entry. > > >>>>> Then I used; > >>>>> index<-match(exFirma[x$ID,1], affyAnnotation > >>>>> $transcript_cluster_id) > >>>>> to find the index of these entries (significant different between > >>>>> two > >>>>> groups) in Affy annotation, then I checked the entries in Affy > >>>>> Annotation file: > >>>>> affyAnnotation[index, 1:5] > > >>>>> transcript_cluster_id probeset_id seqname strand start > >>>>> 71605 6996667 6996667 chr9 - > >>>>> 70631116 > >>>>> 62004 6965348 6965348 chr7 + > >>>>> 151009070 > >>>>> 67351 6982181 6982181 chr8 - > >>>>> 47660950 > >>>>> ... > > >>>> This all looks good to me. > > >>>>> But then I don't know how to check whether > > ... > > read more » --~--~---------~--~----~------------~-------~--~----~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~----------~----~----~----~------~----~------~--~---