Good Morning:

The examples you mention I believe exist both in hg18 and hg19.
This shell procedure obtains results from each database:

for G in ENSG00000026103 ENSG00000030110 ENSG00000104725 ENSG00000104774
do
echo -n "hg19 ${G}: "
hgsql -N -e "select X.*, G.* from ensGene as G, knownToEnsembl as KE,
         kgXref as X where  G.name=KE.value and KE.name=X.kgID and
         G.name2=\"${G}\" limit 1;" hg19
echo -n "hg18 ${G}: "
hgsql -N -e "select X.*, G.* from ensGene as G, knownToEnsembl as KE,
         kgXref as X where  G.name=KE.value and KE.name=X.kgID and
         G.name2=\"${G}\" limit 1;" hg18
done

However, please keep in mind.  The Ensembl gene track has more
annotations than in the UCSC gene track.
Not all Ensembl gene annotations have a corresponding UCSC gene.
Not all UCSC genes have a corresponding Ensembl gene.

The counts are:

hg18  Ensembl genes v54 May 2009: 63,280, UCSC genes Aug 2009: 66,803, 
knownToEnsembl: 60,456
hg19  Ensembl genes v63 Jun 2011: 173,742, UCSC genes Oct 2009: 77,614, 
knownToEnsembl: 75,160

The knownToEnsembl counts are the number of UCSC genes that correspond to
an Ensembl transcript ID.  A single UCSC gene can correspond to a number of
different Ensembl transcript IDs.  The counts of the unique number of Ensembl 
transcript
IDs in the knownToEnsembl tables are:  hg18: 30,209, hg19: 46,319
and the number of Ensembl transcripts in the table ensPep are: hg18: 47,509, 
hg19: 90,720
Therefore, the coverage of Ensembl genes via knownToEnsembl is:
hg18: 30209/47509 == %63, hg19: 46319/90720 == %51

You will not always find UCSC genes for Ensembl transcripts.

--Hiram

Rispoli Rossella wrote:
> Hi,
> We have a local installation of UCSC that was updated yesterday, and I  
> have problem querying the mysql DB and I would like to know if you can  
> help me.
> 
> I want use the mysql tables to retrieve informations starting from a  
> list of ensembl gene ID. To do this I use the tables: ensGene,  
> knownToEnsembl, kgXref with the following query:
> 
>  >>select QUERY.name,QUERY.name2,QUERY.geneSymbol,QUERY.refseq from   
> (select X.*,  G.* from ensGene as G,  knownToEnsembl as KE,  kgXref as  
> X where  G.name=KE.value
>      and KE.name=X.kgID and G.name2='ensID')  as QUERY;
> 
> For some of this ensemblId if I query the hg19 DB I don't get any  
> results, instead of if I query hg18 DB they can be found.
> 
> But when I search the same ensemblID through the web interface I see  
> that they are present in the ensembl gene track.
> although in the visualization in hg19 the results are titled   
> EnsemblGene, in the hg18 EnsGene (I don't know if this may be relevant).
> 
> Is there anything missing in my query?
> 
> here are some of the ensemblID for which I see this problem:
> ENSG00000026103
> ENSG00000030110
> ENSG00000104725
> ENSG00000104774
> 
> Thanks in advances, best regard
> 
> Rossella
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome

Reply via email to