[Bioc-devel] Gene annotation: TxDb vs ENSEMBL/NCBI inconsistency

Ludwig Geistlinger Wed, 03 Jun 2015 04:05:32 -0700

Dear Bioc annotation team,

Querying TxDb.Hsapiens.UCSC.hg38.knownGene for gene coordinates, e.g. for


BRCA1; ENSG00000012048; entrez:672

via

> genes(TxDb.Hsapiens.UCSC.hg38.knownGene, vals=list(gene_id="672"))

gives me:

GRanges object with 1 range and 1 metadata column:
      seqnames               ranges strand |     gene_id
         <Rle>            <IRanges>  <Rle> | <character>
  672    chr17 [43044295, 43170403]      - |         672
  -------
  seqinfo: 455 sequences (1 circular) from hg38 genome


However, querying Ensembl and NCBI Gene
http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000012048
http://www.ncbi.nlm.nih.gov/gene/672

the gene is located at (note the difference in the end position)

Chromosome 17: 43,044,295-43,125,483 reverse strand


How is the inconsistency explained and how to extract an ENSEMBL/NCBI
conform annotation from the TxDb object?
(I am aware of biomaRt, but I want to explicitely use the Bioc annotation
functionality).

Thanks!
Ludwig


-- 
Dipl.-Bioinf. Ludwig Geistlinger

Lehr- und Forschungseinheit für Bioinformatik
Institut für Informatik
Ludwig-Maximilians-Universität München
Amalienstrasse 17, 2. Stock, Büro A201
80333 München

Tel.: 089-2180-4067
eMail: ludwig.geistlin...@bio.ifi.lmu.de

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

[Bioc-devel] Gene annotation: TxDb vs ENSEMBL/NCBI inconsistency

Reply via email to