Good Morning:
I believe the examples you mention exist in both hg18 and hg19.
This shell procedure obtains results from each database:
for G in ENSG00000026103 ENSG00000030110 ENSG00000104725 ENSG00000104774
do
  # run the same query against each database in turn
  for DB in hg19 hg18
  do
    echo -n "${DB} ${G}: "
    hgsql -N -e "select X.*, G.* from ensGene as G, knownToEnsembl as KE,
      kgXref as X where G.name=KE.value and KE.name=X.kgID and
      G.name2=\"${G}\" limit 1;" "${DB}"
  done
done
However, please keep in mind that the Ensembl gene track has more
annotations than the UCSC gene track.
Not all Ensembl gene annotations have a corresponding UCSC gene.
Not all UCSC genes have a corresponding Ensembl gene.
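To see which Ensembl transcripts of a gene have no UCSC counterpart, a LEFT JOIN keeps the ensGene rows even when knownToEnsembl has no match. The following is only a sketch: the helper name ens_to_ucsc_query is made up for illustration, and it assumes the same table and column names used in the query above. It runs the query only if hgsql is on your PATH; otherwise it just prints the SQL.

```shell
# Hypothetical helper: build a LEFT JOIN query so that Ensembl transcripts
# without a UCSC gene still appear (kgID and geneSymbol come back as NULL).
ens_to_ucsc_query() {
  gene="$1"
  echo "select G.name, G.name2, KE.name as kgID, X.geneSymbol" \
       "from ensGene as G" \
       "left join knownToEnsembl as KE on G.name=KE.value" \
       "left join kgXref as X on KE.name=X.kgID" \
       "where G.name2='${gene}';"
}

for DB in hg19 hg18
do
  Q=$(ens_to_ucsc_query ENSG00000026103)
  if command -v hgsql >/dev/null 2>&1; then
    # hgsql is available: run the query against this database
    hgsql -N -e "${Q}" "${DB}"
  else
    # no hgsql here: show the SQL that would have been run
    echo "${DB}: ${Q}"
  fi
done
```

Rows with NULL in the kgID column would be Ensembl transcripts that knownToEnsembl does not map to any UCSC gene.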
The counts are:
hg18: Ensembl genes (v54, May 2009): 63,280; UCSC genes (Aug 2009): 66,803; knownToEnsembl: 60,456
hg19: Ensembl genes (v63, Jun 2011): 173,742; UCSC genes (Oct 2009): 77,614; knownToEnsembl: 75,160
The knownToEnsembl counts are the number of UCSC genes that correspond to
an Ensembl transcript ID. A single UCSC gene can correspond to a number of
different Ensembl transcript IDs. The number of unique Ensembl transcript
IDs in the knownToEnsembl tables is: hg18: 30,209, hg19: 46,319,
and the number of Ensembl transcripts in the ensPep table is: hg18: 47,509,
hg19: 90,720.
Therefore, the coverage of Ensembl transcripts via knownToEnsembl is:
hg18: 30209/47509 = 63.6%, hg19: 46319/90720 = 51.1%
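The percentages above can be rechecked with a couple of lines of shell. This is a sketch only: the function name coverage_pct is made up here, the inputs are the counts quoted above, and the commented hgsql lines assume that the Ensembl transcript ID is in the value column of knownToEnsembl (as in the join above) and in the name column of ensPep.

```shell
# Hypothetical helper: percentage of ensPep transcripts that also appear
# in knownToEnsembl, printed with one decimal place.
coverage_pct() {
  awk -v mapped="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * mapped / total }'
}

echo "hg18: $(coverage_pct 30209 47509)%"
echo "hg19: $(coverage_pct 46319 90720)%"

# With database access, the two inputs could be fetched along these lines
# (column names are assumptions, see the note above):
#   hgsql -N -e "select count(distinct value) from knownToEnsembl;" hg19
#   hgsql -N -e "select count(distinct name) from ensPep;" hg19
```

This prints 63.6% for hg18 and 51.1% for hg19.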
You will not always find UCSC genes for Ensembl transcripts.
--Hiram
Rispoli Rossella wrote:
> Hi,
> We have a local installation of UCSC that was updated yesterday, and I
> have problem querying the mysql DB and I would like to know if you can
> help me.
>
> I want use the mysql tables to retrieve informations starting from a
> list of ensembl gene ID. To do this I use the tables: ensGene,
> knownToEnsembl, kgXref with the following query:
>
> >>select QUERY.name,QUERY.name2,QUERY.geneSymbol,QUERY.refseq from
> (select X.*, G.* from ensGene as G, knownToEnsembl as KE, kgXref as
> X where G.name=KE.value
> and KE.name=X.kgID and G.name2='ensID') as QUERY;
>
> For some of this ensemblId if I query the hg19 DB I don't get any
> results, instead of if I query hg18 DB they can be found.
>
> But when I search the same ensemblID through the web interface I see
> that they are present in the ensembl gene track.
> although in the visualization in hg19 the results are titled
> EnsemblGene, in the hg18 EnsGene (I don't know if this may be relevant).
>
> Is there anything missing in my query?
>
> here are some of the ensemblID for which I see this problem:
> ENSG00000026103
> ENSG00000030110
> ENSG00000104725
> ENSG00000104774
>
> Thanks in advances, best regard
>
> Rossella
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome