Hi,

I am looking at the fastq.gz files for the mouse ENCODE data on the
UCSC DCC website, and it looks like all the datasets coming from
Caltech are compressed with some format other than gzip, despite the
.gz extension. Can you tell me which one?

For example, for any of the files *not* from Caltech, I can do gunzip:

avilella@magneto:~/00x$ wget -qO- \
  ftp://hgdownload.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeLicrHistone/wgEncodeLicrHistoneEsb4InputME0C57bl6StdRawDataRep2.fastq.gz \
  | gunzip -c | head -n 4
@SOLEXA2_0001:2:1:0:9#0/1
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SOLEXA2_0001:2:1:0:9#0/1
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

But for the ones from Caltech, I just get encoded gibberish back:

wget -qO- \
  ftp://hgdownload.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeCaltechHist/wgEncodeCaltechHistC2c12InputFCntrl50bE2p60hPcr1xRawDataRep1.fastq.gz \
  | gunzip -c | head -n 4

or

wget -qO- \
  ftp://hgdownload.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeCaltechTfbs/wgEncodeCaltechTfbsC2c12InputFCntrl36bPcr1xRawDataRep1.fastq.gz \
  | gunzip -c | head -n 4
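
One way to narrow this down might be to look at the first few bytes of
each stream instead of trusting the .gz extension, since most
compression formats start with a fixed magic number. The following is
only a minimal sketch: detect_format is a hypothetical helper name (not
an existing tool), it assumes head -c and od are available, and it only
recognises a handful of common formats.

```shell
# detect_format (hypothetical helper): read a stream on stdin and print
# a guess at its compression format, based on the leading magic bytes.
detect_format() {
    # Hex-dump the first 6 bytes as one lowercase string, e.g. "1f8b0800...".
    magic=$(head -c 6 | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        1f8b*)         echo gzip ;;      # 1f 8b
        1f9d*)         echo compress ;;  # 1f 9d (Unix compress, .Z)
        425a68*)       echo bzip2 ;;     # "BZh"
        fd377a585a00*) echo xz ;;        # fd "7zXZ" 00
        504b0304*)     echo zip ;;       # "PK" 03 04
        *)             echo unknown ;;   # anything else, including plain text
    esac
}
```

It could then be applied to any of the URLs above by replacing
"| gunzip -c | head -n 4" with "| detect_format". A result of "unknown"
would only mean the stream does not start with one of the magic numbers
listed here; it could still be uncompressed text or some other format.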

Thanks in advance,

Cheers,

Albert.
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
