Dear All,

I used the Table Browser to download all Ensembl genes from the Mar. 2006 assembly in GTF format. This results in a list with 1,040,440 entries, which I suppose could be correct. But on each line the transcript ID and gene ID are set to the same value (the transcript ID):
head ensGene.gtf
chr1 hg18_ensGene CDS 67052401 67052451 0.000000 - 0 gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene exon 67051162 67052451 0.000000 - . gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene CDS 67060632 67060788 0.000000 - 1 gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene exon 67060632 67060788 0.000000 - . gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene CDS 67065091 67065317 0.000000 - 0 gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene exon 67065091 67065317 0.000000 - . gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene CDS 67066083 67066181 0.000000 - 0 gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene exon 67066083 67066181 0.000000 - . gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene CDS 67071856 67071977 0.000000 - 2 gene_id "ENST00000371026"; transcript_id "ENST00000371026";
chr1 hg18_ensGene exon 67071856 67071977 0.000000 - . gene_id "ENST00000371026"; transcript_id "ENST00000371026";

Is this a bug, or am I making some mistake here? If I am, how can I retrieve the correct file?

Thank you,
Boel

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
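The excerpt above covers only the first ten lines of the file. A minimal sketch for checking whether gene_id equals transcript_id on every line of the download is given below. It is an illustration only: the function names `parse_attributes` and `count_id_matches` are hypothetical, and it assumes that the first eight columns contain no embedded whitespace and that attributes follow the `key "value";` pattern seen in the excerpt.

```python
import re
from collections import Counter

# Matches attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)";')


def parse_attributes(attr_field):
    """Turn the GTF attribute column into a dict of key -> value."""
    return dict(ATTR_RE.findall(attr_field))


def count_id_matches(lines):
    """Count lines where gene_id and transcript_id are the same or different.

    Assumes the first eight columns contain no whitespace, so the
    attribute column is everything after the eighth split.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 8)
        if len(fields) < 9:
            counts["malformed"] += 1
            continue
        attrs = parse_attributes(fields[8])
        gene_id = attrs.get("gene_id")
        transcript_id = attrs.get("transcript_id")
        if gene_id is None or transcript_id is None:
            counts["missing_id"] += 1
        elif gene_id == transcript_id:
            counts["same"] += 1
        else:
            counts["different"] += 1
    return counts


# Example usage (file name taken from the message above):
# with open("ensGene.gtf") as handle:
#     print(count_id_matches(handle))
```

If the "different" count is zero for the whole file, the identical IDs are a property of the export as a whole and not of this one transcript.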
