Please note the README file there that explains the contents of the files: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/README.txt
You want to pick up the chromFa.tar.gz file.

See also:
http://en.wikipedia.org/wiki/FASTA_format
http://en.wikipedia.org/wiki/Tar_%28file_format%29
http://en.wikipedia.org/wiki/Gzip

--Hiram

----- Original Message -----
From: "Eyal Lev" <[email protected]>
To: [email protected]
Sent: Tuesday, March 15, 2011 5:04:28 PM
Subject: [Genome] I need some human DNA (data)

Hi,

This may sound rather stupid, but since I'm not too familiar with how this "works", I'll just go ahead and ask.

I need some DNA data. I want to run some statistical tests on the frequencies of certain "words" (DNA "letter" combinations), and for that I need a simple text file with the A, G, C, T's of the different chromosomes (two files for each, one for either direction).

Is such a thing even available anywhere, or is the human DNA only partially mapped? (I'm actually just as interested in the junk DNA part as in the protein-coding parts.)

I went to http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/ and downloaded the file "est.fa.gz". It seems to have some nice letters, but I can't understand what the ">AA000972 1" parts mean (is it a "place holder" for 972 "letters"?).

Hope you can help me out.

Thanks

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
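
For the word-frequency test described in the question, a minimal Python sketch follows. In FASTA format, a line starting with ">" is a header that names the sequence that follows, so ">AA000972 1" is most likely a sequence identifier (an accession) rather than a count of letters. The function names read_fasta, reverse_complement and count_words are made up for this illustration and are not part of any UCSC tool. The sketch also assumes that lowercase letters should be counted like uppercase ones and that words containing anything other than A, C, G, T (such as N) should be skipped; check the README for what those characters mean in the files you download. The opposite strand is computed from the same file, so a second file per chromosome is not needed.

```python
from collections import Counter
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# Translation table for complementing bases; other characters (e.g. N) pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read in its own direction."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_words(seq, k):
    """Count overlapping words of length k, skipping words with non-ACGT letters."""
    seq = seq.upper()  # assumption: treat lowercase bases like uppercase ones
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in "ACGT" for c in word):
            counts[word] += 1
    return counts

# Small in-memory example standing in for a real chromosome file.
example = io.StringIO(">chrTest\nACGTNacg\nTT\n")
for name, seq in read_fasta(example):
    forward = count_words(seq, 2)
    reverse = count_words(reverse_complement(seq), 2)
    print(name, dict(forward), dict(reverse))
```

For a real file, the in-memory example can be replaced with an opened file: open(path) for an unpacked .fa file, or gzip.open(path, "rt") from the standard library for a .fa.gz file, where path is whatever file you extracted from the download. This loop is written for clarity, not speed, and holds each sequence fully in memory, so a whole chromosome will take a while.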
