Arthur,

When the data comes from CASAVA 1.8 (actually I believe from 1.5 and above), I think it is already in the proper format. An excellent overview is here: http://en.wikipedia.org/wiki/FASTQ_format

Basically, the headers of the FASTQ reads are my indication at the moment. Since 1.8 the header format changed to @SOMETHING<space>READINFO. Most current sequencing labs deliver that format.

Good luck!
Alex

PS: Correct me if I am wrong about the Phred scoring. PeterC probably knows this best, since he wrote the Python groomer.
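To make the header check concrete, here is a rough Python sketch of how one might look at a FASTQ file and guess its quality encoding. It is not how the groomer works internally. The function names, the header regular expression, and the thresholds are my own assumptions: quality characters below ';' (ASCII 59) cannot occur in the offset-64 encodings, while characters above 'J' (ASCII 74) are unusual for Sanger-style offset-33 data. It also assumes plain four-line records without wrapped sequences. The quality characters are stronger evidence than the header alone.

```python
import re
from itertools import islice

# Assumed shape of a CASAVA 1.8 header: "@<id fields> <read>:<filtered Y/N>:<control>:<index>"
CASAVA18_HEADER = re.compile(r"^@\S+ [12]:[YN]:\d+:\S*$")


def looks_like_casava18(header):
    """Return True if the header line has the '@SOMETHING<space>READINFO' shape."""
    return bool(CASAVA18_HEADER.match(header.strip()))


def quality_lines(handle, max_records=10000):
    """Yield the quality line (every 4th line) of the first max_records records."""
    for i, line in enumerate(islice(handle, 4 * max_records)):
        if i % 4 == 3:
            yield line.rstrip("\n")


def guess_offset(qualities):
    """Heuristically guess the Phred offset (33 or 64); None if undecidable."""
    codes = [ord(c) for q in qualities for c in q.strip()]
    if not codes:
        return None
    if min(codes) < 59:
        # Characters below ';' never appear in offset-64 encodings.
        return 33
    if max(codes) > 74:
        # Characters above 'J' are not expected in typical offset-33 data.
        return 64
    # All characters fall in the overlapping range; sample more reads.
    return None
```

For a local file (the name reads.fastq is only a placeholder) one could call `guess_offset(quality_lines(open("reads.fastq")))`. A result of 33 suggests the file is already Sanger-encoded, 64 suggests the older Illumina encoding, and None means the sampled reads were not decisive.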
From: Arthur Zheng [mailto:[email protected]]
Sent: Tuesday, February 14, 2012 6:01
To: Bossers, Alex
CC: [email protected]
Subject: Re: [galaxy-user] Large local file of NGS for FASTAQ Groomer

Dear Alex,

Thank you for the reminder. I noticed that I am using Illumina CASAVA 1.8. How can I check whether it is already in Sanger format?

Arthur

On Mon, Feb 13, 2012 at 4:53 AM, Bossers, Alex <[email protected]> wrote:

Are you sure the FASTQ files are in the older format? Otherwise you won't need to groom the files anymore (as far as I understood), since the newer format already uses quality scores comparable to Sanger. That saves huge resources!

Alex

