Waverley @ Palo Alto wrote:
> Hi,
>
> I have a list of IPI gene IDs. I want to find out whether there is a
> package which can map the gene ontology to these IPIs, and plot the
> pie chart to demonstrate the molecular function distributions.
>
> The input is like the following gene IPI IDs:
> IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN
> IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807|H-INV:HIT000028861|VEGA:OTTHUMP00000078377
> Tax_Id=9606 Gene_Symbol=ZN
> IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP00000076687
> Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
> IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037|REFS
> IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010|H-INV:HIT000039466|VEGA:OTTHUMP00000019944
> IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP00000306279|RE
> IPI:IPI00000634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP00000263102|REFS
>
> I want to plot the distribution of these genes across the GO
> molecular function categories as a pie chart. An example is shown in the
> following link http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
>
>
> Can someone help?

Not sure that it is this easy. The IPI are protein identifiers. GO
categories classify genes. Neither the mapping from protein to gene nor
the mapping from gene to GO category is 1:1. GO categories form a
hierarchy. So there are significant decisions to be made in representing
IPI identifiers in a pie chart of GO terms.

Bioconductor maintains 'org' and 'GO' database packages that provide the
necessary link between IPI protein ids and GO gene ontology categories,
via ENTREZ gene ids. Code might look like

  ## once only, to install packages
  source('http://bioconductor.org/biocLite.R')
  biocLite(c('org.Hs.eg.db', 'GO.db'))

  ## from IPI to ENTREZ id, not 1:1
  library(org.Hs.eg.db)
  ipi2eg = revmap(eapply(org.Hs.eg.db, names)) ## NOT 1:1 map
  ## Assume ipiIds is, e.g., c('IPI00008860', 'IPI00019922')
  egIds = revmap(ipi2eg[ipiIds])

  ## get GO terms, also not 1:1
  goIds = eapply(org.Hs.egGO[names(egIds)], names)

You're still left with the problem of resolving multiple mappings and
the hierarchical relationship between GO terms. Asking on the
Bioconductor mailing list

  http://bioconductor.org/docs/mailList.html

is likely to lead to helpful answers.

Martin

> Thanks much in advance.
>
> Merry Christmas!!
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
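
One way to handle the "not 1:1" problem when tallying terms for a pie chart is to let each protein contribute a total weight of 1, split evenly over its distinct GO terms. The sketch below uses base R only. The names headers, ipi2go and goWeights are hypothetical, and the GO assignments are invented placeholders (termA, termB, termC), not real annotations; in practice ipi2go would come from a lookup like the one above, restricted to molecular function terms.

```r
## Toy sketch: the IPI -> GO assignments below are made up for illustration.

## 1. Reduce header lines to bare IPI accessions (version suffix dropped)
headers <- c("IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2",
             "IPI:IPI00291878.4|SWISS-PROT:P35247|REFSEQ:NP_003010")
ipiIds <- sub("^IPI:(IPI[0-9]+).*$", "\\1", headers)

## 2. Hypothetical IPI -> GO mapping (placeholder term names)
ipi2go <- list(IPI00008860 = c("termA", "termB"),
               IPI00291878 = "termA",
               IPI00019922 = c("termB", "termC", "termC"))

## 3. Each protein contributes 1/k to each of its k distinct terms,
##    so the weights sum to the number of annotated proteins
goWeights <- function(mapping) {
  perProtein <- lapply(mapping, function(terms) {
    terms <- unique(terms)
    setNames(rep(1 / length(terms), length(terms)), terms)
  })
  w <- unlist(unname(perProtein))
  tapply(w, names(w), sum)
}

weights <- goWeights(ipi2go)

## 4. Pie chart of the weighted term distribution
pie(weights, main = "GO terms (toy data)")
```

This only addresses the multiple-mapping side. The hierarchy still has to be dealt with separately, for example by rolling each term up to a chosen ancestor level before tallying.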