Playing around with org.Hs.eg.db 3.8.0. What on earth is ENSPTRG0000...?

> library(org.Hs.eg.db)
> mapIds(org.Hs.eg.db, key="GCG", keytype="SYMBOL", column="ENSEMBL")
'select()' returned 1:many mapping between keys and columns
                 GCG
"ENSPTRG00000000777"

Well, at least it still recovers the right identifier... eventually.

> select(org.Hs.eg.db, key="GCG", keytype="SYMBOL", columns="ENSEMBL")
'select()' returned 1:many mapping between keys and columns
  SYMBOL            ENSEMBL
1    GCG ENSPTRG00000000777
2    GCG    ENSG00000115263

The SYMBOL->Entrez ID relational table seems to be okay:

> Y <- toTable(org.Hs.egSYMBOL)
> Y[which(Y[,2]=="GCG"),]
     gene_id symbol
2152    2641    GCG

So the cause is the Ensembl->Entrez mappings:

> Z <- toTable(org.Hs.egENSEMBL2EG)
> Z[Z[,1]==2641,]
     gene_id         ensembl_id
3028    2641 ENSPTRG00000000777
3029    2641    ENSG00000115263

Googling suggests that ENSPTRG00000000777 is an identifier for some other gene in chimpanzee (ENSPTRG is the Ensembl prefix for Pan troglodytes genes). Hardly "Hs" stuff.
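
As a stopgap, the stray entries can be spotted and screened out by their prefix. A minimal sketch, assuming that human Ensembl gene IDs always have the form "ENSG" followed only by digits, and using a toy copy of the two rows above (not the real package) so that it runs on its own:

```r
# Toy copy of the toTable(org.Hs.egENSEMBL2EG) rows shown above.
Z <- data.frame(
    gene_id    = c("2641", "2641"),
    ensembl_id = c("ENSPTRG00000000777", "ENSG00000115263"),
    stringsAsFactors = FALSE
)

# Strip the trailing digits to leave the species/feature prefix, then tally,
# which shows how many IDs of each kind are present.
prefix <- sub("[0-9]+$", "", Z$ensembl_id)
prefix_counts <- table(prefix)
print(prefix_counts)

# Keep only human gene IDs: "ENSG" followed by digits and nothing else
# (assumption: anything with an extra species code is not human).
human_only <- Z[grepl("^ENSG[0-9]+$", Z$ensembl_id), , drop = FALSE]
print(human_only)
```

Against the real package, replacing the toy data frame with Z <- toTable(org.Hs.egENSEMBL2EG) should give a tally of every prefix in the mapping table, which would show whether GCG is an isolated case.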

Session info (R-devel rather than R 3.6 proper, but I doubt that is the cause):

R Under development (unstable) (2019-04-11 r76379)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/trunk/lib/libRblas.so
LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] org.Hs.eg.db_3.8.0   AnnotationDbi_1.45.1 IRanges_2.17.5
[4] S4Vectors_0.21.23    Biobase_2.43.1       BiocGenerics_0.29.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1      digest_0.6.18   DBI_1.0.0       RSQLite_2.1.1
 [5] blob_1.1.1      bit64_0.9-7     bit_1.1-14      compiler_3.7.0
 [9] pkgconfig_2.0.2 memoise_1.1.0

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
