My goal is to have both wgEncodeGencodeManualV4 and 
wgEncodeGencodeAutoV4 in GenePred format.

I tried to download the wgEncodeGencodeManualV4 table from the test 
browser. The download stalls partway through and the file is cut off 
after exactly 409600 bytes (400 KB): chromosome 16 is only partially 
present and chromosomes 17-22 are missing completely. This happens 
reproducibly across multiple computers, networks, and operating 
systems. The wgEncodeGencodeAutoV4 table, which I also need, appears 
to download fine.
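
For anyone who wants to check their own download for the same truncation, here is a minimal sketch that prints the file size and the number of rows per chromosome. The function name and the default chromosome column are my own assumptions: plain genePred has the chromosome in column 2, but a table dump with a leading "bin" column would need 3.

```shell
# Summarize a downloaded genePred-style table: byte size and rows per chromosome.
# chrom_col is an assumption: 2 for plain genePred, 3 if the dump starts with "bin".
summarize_table() {
    file=$1
    chrom_col=${2:-2}
    # Exact size in bytes; 409600 would match the truncation described above.
    echo "bytes: $(wc -c < "$file" | tr -d ' ')"
    # Skip header lines starting with '#', then count rows per chromosome.
    awk -F '\t' -v c="$chrom_col" \
        '!/^#/ { n[$c]++ } END { for (k in n) print k "\t" n[k] }' "$file" | sort
}
```

Usage would be something like `summarize_table wgEncodeGencodeManualV4.txt` (the file name is hypothetical); chromosomes that are absent from the output never made it into the file.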

I have run into this sort of problem before: downloads from the table 
browser would get stuck, and I am not sure what causes it. Is there a URL 
from which the data can always be downloaded as flat files?

I can also download the gtf version of these files from:

http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencode/wgEncodeGencode{Auto,Manual}V4.gtf.gz
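
The {Auto,Manual} part of that URL is shorthand for two separate files. A small sketch that spells out both URLs and, optionally, downloads one and checks that the gzip stream is complete (a truncated download fails `gzip -t`). The function names are mine, and I am assuming `curl` and `gzip` are installed.

```shell
# Print the two GTF URLs that the {Auto,Manual} shorthand stands for.
gencode_gtf_urls() {
    base=http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencode
    for kind in Auto Manual; do
        echo "$base/wgEncodeGencode${kind}V4.gtf.gz"
    done
}

# Download one URL into the current directory and verify the gzip stream.
# Returns non-zero if the transfer fails or the file is truncated.
fetch_and_check() {
    url=$1
    out=$(basename "$url")
    curl -fsS -o "$out" "$url" && gzip -t "$out"
}
```

For example, `gencode_gtf_urls | while read -r u; do fetch_and_check "$u" || echo "failed: $u"; done` would fetch both files and report any that did not arrive intact.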

But when I try to convert the AutoV4 file I get several errors:

gunzip -c wgEncodeGencodeAutoV4.gtf.gz | gtfToGenePred -allErrors -genePredExt /dev/stdin /dev/stdout

... [snip]
no exons defined for 93876
no exons defined for 93875
no exons defined for 93874
no exons defined for 115098
no exons defined for 27940
no exons defined for 29602
no exons defined for 29603
no exons defined for 10879
622 errors

These genes are missing from the final output, although they are 
present in the wgEncodeGencodeAutoV4 table I downloaded from the test browser.
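
One way to narrow this down is to list the ids in the GTF that never appear on an "exon" line, which is what the error message seems to be complaining about. This is only a sketch: I am assuming that the ids in the errors are transcript_id values and that the attributes use the usual quoted form (transcript_id "..."), and the function name is my own.

```shell
# Read GTF on stdin and print every transcript_id that has no "exon" feature line.
transcripts_without_exons() {
    awk -F '\t' '
        !/^#/ && match($9, /transcript_id "[^"]+"/) {
            # Drop the 15-character prefix (transcript_id, space, quote) and the closing quote.
            id = substr($9, RSTART + 15, RLENGTH - 16)
            seen[id] = 1
            if ($3 == "exon") has_exon[id] = 1
        }
        END {
            for (id in seen) if (!(id in has_exon)) print id
        }
    ' | sort
}
```

It can be run as `gunzip -c wgEncodeGencodeAutoV4.gtf.gz | transcripts_without_exons`. If the ids it prints match the ones in the error list, the GTF itself lacks exon records for those transcripts, rather than the converter dropping them.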

What command was actually used to convert the GTF files above to 
GenePred? Also, can the table browser be repaired?

Right now I am using wgEncodeGencodeAutoV4 from the table browser and 
wgEncodeGencodeManualV4 converted with gtfToGenePred, but it would be 
nice to have a more consistent way to set it all up.

Thanks,
Pouya
