It doesn't seem like it - on my installation, org.Hs.eg.db is still... monkeying around.
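
In the meantime, one possible stopgap (a sketch, not an official fix) is to drop any Ensembl ID that lacks the human gene prefix after running the lookup. The data frame below is a hypothetical stand-in that holds only the two rows reported in the quoted message; `res`, `is_human` and `human_only` are illustrative names, and the pattern assumes unversioned human gene IDs of the form "ENSG" plus 11 digits.

```r
# Hypothetical stand-in for the result of
# select(org.Hs.eg.db, keys="GCG", keytype="SYMBOL", columns="ENSEMBL");
# the two rows are the ones reported in this thread.
res <- data.frame(
    SYMBOL  = c("GCG", "GCG"),
    ENSEMBL = c("ENSPTRG00000000777", "ENSG00000115263"),
    stringsAsFactors = FALSE
)

# Assumption: human Ensembl gene IDs are "ENSG" followed by 11 digits, while
# other species carry a species code after "ENS" (e.g. "ENSPTRG" for chimp).
is_human <- grepl("^ENSG[0-9]{11}$", res$ENSEMBL)

# Keep only the rows with a human-looking identifier.
human_only <- res[is_human, , drop = FALSE]
print(human_only)
```

This only hides the stray rows on the user side; the underlying tables stay polluted until the annotation packages are rebuilt.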
On Thu, Apr 25, 2019 at 9:17 AM Vincent Carey <st...@channing.harvard.edu> wrote:

> Has this situation been rectified?
>
> On Tue, Apr 23, 2019 at 11:40 AM Van Twisk, Daniel <
> daniel.vantw...@roswellpark.org> wrote:
>
>> We've made some changes to our annotation generation scripts this release
>> and it seems these may have introduced some errors. Thank you for
>> identifying this issue and I will try to have some fixes out asap.
>>
>> ________________________________
>> From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of James
>> W. MacDonald <jmac...@uw.edu>
>> Sent: Tuesday, April 23, 2019 11:03:02 AM
>> To: Aaron Lun
>> Cc: Bioc-devel
>> Subject: Re: [Bioc-devel] Weird monkey identifiers in org.Hs.eg.db
>>
>> Looks like the ensembl table of the human.db0 package got polluted with
>> *Pan troglodytes* genes:
>>
>> > con <- dbConnect(SQLite(), "/R-devel/lib64/R/library/human.db0/extdata/chipsrc_human.sqlite")
>> > dbGetQuery(con, "select count(*) from ensembl where ensid like 'ENSPTR%';")
>>   count(*)
>> 1    16207
>> > dbGetQuery(con, "select count(*) from ensembl where ensid like 'ENSG%';")
>>   count(*)
>> 1    28973
>>
>> On Mon, Apr 22, 2019 at 11:54 PM Aaron Lun <
>> infinite.monkeys.with.keyboa...@gmail.com> wrote:
>>
>> > Playing around with org.Hs.eg.db 3.8.0. What on earth is ENSPTRG0000...?
>> >
>> > > library(org.Hs.eg.db)
>> > > mapIds(org.Hs.eg.db, key="GCG", keytype="SYMBOL", column="ENSEMBL")
>> > 'select()' returned 1:many mapping between keys and columns
>> >                  GCG
>> > "ENSPTRG00000000777"
>> >
>> > Well, at least it still recovers the right identifier... eventually.
>> >
>> > > select(org.Hs.eg.db, key="GCG", keytype="SYMBOL", columns="ENSEMBL")
>> > 'select()' returned 1:many mapping between keys and columns
>> >   SYMBOL            ENSEMBL
>> > 1    GCG ENSPTRG00000000777
>> > 2    GCG    ENSG00000115263
>> >
>> > The SYMBOL->Entrez ID relational table seems to be okay:
>> >
>> > > Y <- toTable(org.Hs.egSYMBOL)
>> > > Y[which(Y[,2]=="GCG"),]
>> >      gene_id symbol
>> > 2152    2641    GCG
>> >
>> > So the cause is the Ensembl->Entrez mappings:
>> >
>> > > Z <- toTable(org.Hs.egENSEMBL2EG)
>> > > Z[Z[,1]==2641,]
>> >      gene_id         ensembl_id
>> > 3028    2641 ENSPTRG00000000777
>> > 3029    2641    ENSG00000115263
>> >
>> > Googling suggests that ENSPTRG00000000777 is an identifier for some
>> > other gene in one of the other monkeys. Hardly "Hs" stuff.
>> >
>> > Session info (not technically R 3.6, but I didn't think that would have
>> > been the cause):
>> >
>> > > R Under development (unstable) (2019-04-11 r76379)
>> > > Platform: x86_64-pc-linux-gnu (64-bit)
>> > > Running under: Ubuntu 18.04.2 LTS
>> > >
>> > > Matrix products: default
>> > > BLAS:   /home/luna/Software/R/trunk/lib/libRblas.so
>> > > LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so
>> > >
>> > > locale:
>> > >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>> > >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>> > >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>> > >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>> > >  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> > > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> > >
>> > > attached base packages:
>> > > [1] parallel  stats4    stats     graphics  grDevices utils     datasets
>> > > [8] methods   base
>> > >
>> > > other attached packages:
>> > > [1] org.Hs.eg.db_3.8.0   AnnotationDbi_1.45.1 IRanges_2.17.5
>> > > [4] S4Vectors_0.21.23    Biobase_2.43.1       BiocGenerics_0.29.2
>> > >
>> > > loaded via a namespace (and not attached):
>> > > [1] Rcpp_1.0.1      digest_0.6.18   DBI_1.0.0       RSQLite_2.1.1
>> > > [5] blob_1.1.1      bit64_0.9-7     bit_1.1-14      compiler_3.7.0
>> > > [9] pkgconfig_2.0.2 memoise_1.1.0
>> >
>> > _______________________________________________
>> > Bioc-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099