My goal is to have both wgEncodeGencodeManualV4 and wgEncodeGencodeAutoV4 in GenePred format.
I tried to download the wgEncodeGencodeManualV4 table from the test browser. For some reason the download stalls partway through and the file is cut off after exactly 409600 bytes (400 KB): chromosomes 17-22 are completely missing and chromosome 16 is only partially there. This happens reproducibly across multiple computers, networks and operating systems. I also want wgEncodeGencodeAutoV4, which appears to download OK.

I have had this sort of problem before (downloads from the table browser getting stuck), and I am not sure what causes it. Is there a URL from which the data can always be downloaded as flat files?

I can also download the GTF versions of these files from:

http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencode/wgEncodeGencode{Auto,Manual}V4.gtf.gz

But when I try to convert the AutoV4 file I get several errors:

    gunzip -c wgEncodeGencodeAutoV4.gtf.gz | gtfToGenePred -allErrors -genePredExt /dev/stdin /dev/stdout
    ... [snip]
    no exons defined for 93876
    no exons defined for 93875
    no exons defined for 93874
    no exons defined for 115098
    no exons defined for 27940
    no exons defined for 29602
    no exons defined for 29603
    no exons defined for 10879
    622 errors

These genes are missing from the final output, although they are present in the wgEncodeGencodeAutoV4 table I downloaded from the test browser. What command was actually used to convert the GTF files above to GenePred? Also, can the table browser be repaired?

Right now I am using wgEncodeGencodeAutoV4 from the table browser and wgEncodeGencodeManualV4 converted with gtfToGenePred, but it would be nice to have a more consistent way to set it all up.

Thanks,
Pouya
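P.S. In case it helps with narrowing down the "no exons defined" errors, here is a rough Python sketch of how one might list the transcripts in a GTF file that have no exon records at all. It assumes the standard nine-column, tab-separated GTF layout with a transcript_id "..." attribute in the last column; the function names are my own and not part of any UCSC tool. I also have not confirmed that the IDs gtfToGenePred prints correspond to transcript_id values, so treat this as a starting point only.

```python
import gzip
import re
import sys
from collections import defaultdict

# Assumes attributes are written as: transcript_id "some_id";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def features_by_transcript(lines):
    """Map each transcript_id to the set of feature types (column 3) seen for it."""
    features = defaultdict(set)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(cols[8])
        if match:
            features[match.group(1)].add(cols[2])
    return features


def transcripts_without_exons(lines):
    """Return sorted transcript IDs that never appear on an 'exon' line."""
    features = features_by_transcript(lines)
    return sorted(tid for tid, kinds in features.items() if "exon" not in kinds)


if __name__ == "__main__" and len(sys.argv) == 2:
    with open_text(sys.argv[1]) as handle:
        for transcript_id in transcripts_without_exons(handle):
            print(transcript_id)
```

Saved under a hypothetical name such as find_exonless.py, it could be run as "python find_exonless.py wgEncodeGencodeAutoV4.gtf.gz" and the resulting list compared against the IDs reported by gtfToGenePred.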
