Hi Li,

One of our engineers suggests the following, using the kent source tree 
(http://genome.ucsc.edu/FAQ/FAQlicense.html#license3):

> They should try this kent source tree command:
>
> $ genePredToGtf hg19 knownGene stdout | sort -k1,1 -k4,4 | gzip -c > hg19.knownGene.gtf.gz
>
> With a file in their home directory called .hg.conf with three lines:
>
> db.host=genome-mysql.cse.ucsc.edu
> db.user=genomep
> db.password=password
>
> It does give a much better GTF output than the table browser.
>
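For convenience, the two steps quoted above could be wrapped in a small shell script. This is only a minimal sketch: the function names `write_hg_conf` and `make_sorted_gtf` are hypothetical, the `chmod 600` step assumes the kent tools want a config file that other users cannot read, and the sort uses `-k4,4n` (numeric) where the quoted command uses `-k4,4`, which compares start positions as text.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of the kent tree.

# Write the three-line config to the given path (default: ~/.hg.conf).
write_hg_conf() {
    conf="${1:-$HOME/.hg.conf}"
    # Never clobber a config that already exists.
    if [ -e "$conf" ]; then
        echo "refusing to overwrite $conf" >&2
        return 1
    fi
    cat > "$conf" <<'EOF'
db.host=genome-mysql.cse.ucsc.edu
db.user=genomep
db.password=password
EOF
    # Assumption: the kent tools reject a config readable by other users.
    chmod 600 "$conf"
}

# Dump a genePred table as GTF, sort by chromosome and start, and compress.
# Requires genePredToGtf from the kent source tree on the PATH.
make_sorted_gtf() {
    db="$1"
    table="$2"
    # -k4,4n sorts the start column numerically (the quoted command
    # uses -k4,4, which sorts it as text).
    genePredToGtf "$db" "$table" stdout \
        | sort -k1,1 -k4,4n \
        | gzip -c > "$db.$table.gtf.gz"
}
```

With the functions loaded, `write_hg_conf && make_sorted_gtf hg19 knownGene` would write `hg19.knownGene.gtf.gz` in the current directory.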

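To see whether a given GTF file has the issue raised earlier in this thread (transcript_id identical to gene_id), something like the following could be used. It is a rough sketch that assumes tab-separated GTF with the attributes in column 9; `count_same_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count GTF records whose gene_id equals their transcript_id.
# Reads GTF text on stdin; count_same_ids is a hypothetical helper name.
count_same_ids() {
    awk -F '\t' '
        /^#/ { next }   # skip comment lines
        {
            gene = ""; tx = ""
            # Column 9 holds the attributes: gene_id "X"; transcript_id "Y";
            if (match($9, /gene_id "[^"]*"/))
                gene = substr($9, RSTART + 9, RLENGTH - 10)
            if (match($9, /transcript_id "[^"]*"/))
                tx = substr($9, RSTART + 15, RLENGTH - 16)
            total++
            if (gene != "" && gene == tx) same++
        }
        END {
            printf "%d of %d records have gene_id == transcript_id\n", same, total
        }
    '
}
```

For a compressed file, the input can be piped in: `gzip -dc hg19.knownGene.gtf.gz | count_same_ids`.
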
Please let us know if you have any additional questions: [email protected]

-
Greg Roe
UCSC Genome Bioinformatics Group




On 7/12/11 1:51 PM, Jia, Li (NIH/NCI) [C] wrote:
> Hi Greg,
>
> Thanks for the response. The FAQ on GTF format doesn't answer my question.
> As you suggested, if I select "All fields from selected table", the output
> format is only in txt, not GTF. I really need GTF format with both
> Transcript_ID and Gene_name there. I combined the two outputs from the UCSC
> refseq table and the refFlat table; the result includes all the information
> I need and looks like this:
>
> chr1 protein_coding CDS 67162933  67163102  0.000000 - 0  gene_id
> "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
> chr1 protein_coding start_codon 67163100  67163102  0.000000 - .  gene_id
> "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
> chr1 protein_coding exon 67162933 67163158  0.000000 - .  gene_id
> "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
> chr1 protein_coding stop_codon 58719225   58719227  0.000000 - .  gene_id
> "NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";
> chr1 protein_coding CDS 58719228  58719434  0.000000 - 0  gene_id
> "NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";
>
>
> Unfortunately, it didn't work when I tried to use it in the analysis.
>
> Do you have any other suggestions?
>
> Thanks,
> Li
>
> On 7/11/11 6:35 PM, "Greg Roe"<[email protected]>  wrote:
>
>> Hi Li,
>>
>> Please see this section of our help describing the GTF file format:
>> http://genome.ucsc.edu/FAQ/FAQformat.html#format4.
>>
>> If you want to generate the data exactly as in the table schema, select
>> "All fields from selected table" as the output format in the Table
>> Browser.
>>
>> Please let us know if you have any additional questions:
>> [email protected]
>>
>> -
>> Greg Roe
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>> On 7/11/11 1:13 PM, Jia, Li (NIH/NCI) [C] wrote:
>>> Hi,
>>>
>>> I am using the Table Browser to generate an annotation file in GTF format.
>>> After selecting the assembly of interest, I select:
>>>
>>> group: Genes and Gene Prediction Tracks
>>> track: refSeq Gene
>>> table: refFlat
>>> output format: "GTF"--Gene transfer format
>>>
>>> then give the name and output the GTF file.
>>>
>>> My question is that my output refFlat.GTF is not exactly the same as the
>>> described table schema. In the table schema, the output format is as follows:
>>>
>>> geneName     LOC100288778
>>> Name         NR_028269
>>> chrom        chr1
>>> strand       -
>>> txStart      4224
>>> txEnd        7502
>>> cdsStart     7502
>>> cdsEnd       7502
>>> exonCount    7
>>> exonStarts   4224,4832,5658,6469,6719,70...
>>> exonEnds     4692,4901,5810,6631,6918,72...
>>>
>>> but my output file is:
>>> chr1    hg18_refFlat    exon    14601    14754    0.000000    -    .
>>> gene_id "WASH7P"; transcript_id "WASH7P";
>>> chr1    hg18_refFlat    exon    19184    19233    0.000000    -    .
>>> gene_id "WASH7P"; transcript_id "WASH7P";
>>> chr1    hg18_refFlat    exon    24474    25037    0.000000    -    .
>>> gene_id "FAM138A"; transcript_id "FAM138A";
>>> chr1    hg18_refFlat    exon    25140    25344    0.000000    -    .
>>> gene_id "FAM138A"; transcript_id "FAM138A";
>>>
>>> It has the gene name (gene_id), but there is no transcript_id (in the
>>> output, it is the same as gene_id). In the example schema, should Name be
>>> the transcript_id?
>>>
>>> How do I generate the table exactly like the table schema?
>>>
>>> Thanks,
>>> Li
>>> _______________________________________________
>>> Genome maillist  -  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
