Greetings,

The latest versions of the ABI basecaller do indeed give quality scores. Nicola Vitacolonna wrote a Perl module that accesses the metadata encoded in the ab1 files.

    use strict;
    use warnings;
    use Bio::Trace::ABIF;

    my $abif = Bio::Trace::ABIF->new();
    $abif->open_abif('/Path/to/my/file.ab1')
        or die "Cannot open ab1 file\n";
    my $sequence       = $abif->sequence();
    my @quality_values = $abif->quality_values();

    # FASTQ record: @name, sequence, '+', then each Phred score
    # encoded as a single ASCII character (score + 33).
    print '@', $abif->sample_name(), "\n";
    print $sequence, "\n";
    print "+\n";
    print join('', map { chr($_ + 33) } @quality_values), "\n";
    $abif->close_abif();

Will generate a fastq-sanger format.

regards,
Tom

Thomas (Tom) Keller, PhD
kellert at ohsu.edu
503.494.2442
6339b R Jones Hall (BSc/CROET)
www.ohsu.edu/xd/research/research-cores/dna-analysis/

On Jul 22, 2010, at 7:42 AM, Chevreux, Bastien wrote:

AFAIK ab1 files do not have phred quality scores included. At least they did not a couple of years ago. You need to mangle them through a basecaller (TraceTuner, phred, others) to get these scores.

B.

--
DSM Nutritional Products AG
R&D Human Nutrition & Health
Bioinformatics - Bldg. 203.4 / 188
P.O. Box 2676
CH-4002 Basel / Switzerland
Tel. +41 61 815 8264

-----Original Message-----
From: emboss-boun...@lists.open-bio.org On Behalf Of Peter
Sent: Thursday, 22 July 2010 15:14
To: Peter Rice
Cc: emboss@lists.open-bio.org
Subject: Re: [EMBOSS] ABI to FASTQ with seqret

On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice <p...@ebi.ac.uk> wrote:

On 22/07/10 12:22, Peter C. wrote:

I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33 (PHRED quality 1, quality 0), which is rather strange. The sequence appears to agree with the provided file pGEM_(ABI)_A01.seq

Have I just been unlucky with the AB1 files that I have looked at? Thus far all the quality scores seem meaningless.

There are two sets of quality scores in that file. Both are the alternating characters 1 and 0. Adding 33 gives the scores you see. Looks as though EMBOSS is just reporting what it finds.
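
The "adding 33" step mentioned above can be sketched as follows. This is only an illustration, assuming the usual Sanger FASTQ convention in which each Phred score is stored as the ASCII character with code (score + 33); the helper names phred_to_sanger and sanger_to_phred are hypothetical and are not part of EMBOSS or Bio::Trace::ABIF.

```perl
use strict;
use warnings;

# Hypothetical helper: encode a list of Phred scores as a
# Sanger FASTQ quality string (ASCII code = score + 33).
sub phred_to_sanger {
    my @scores = @_;
    return join '', map { chr($_ + 33) } @scores;
}

# Hypothetical helper: decode a Sanger FASTQ quality string
# back into a list of Phred scores.
sub sanger_to_phred {
    my ($qual) = @_;
    return map { ord($_) - 33 } split //, $qual;
}

# Alternating raw values 1 and 0, as reported for the pGEM file,
# become alternating ASCII 34 (") and ASCII 33 (!).
print phred_to_sanger(1, 0, 1, 0), "\n";    # prints "!"!
```

Under this convention a stored byte of 1 or 0 is a Phred score of 1 or 0, so a string made only of `"` and `!` characters means the file holds placeholder values rather than real basecaller output.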

The file offset is the value returned by function ajSeqABIGetConfidOffset. It simply reads one byte from there for each base of sequence length.

Looks like that particular random example from the internet was just odd. I went back through my old emails, and see you had been testing with http://www.appliedbiosystems.com/support/software_community/ab1_files.zip (I had trouble downloading this with curl - Firefox worked). Looking at these ABI files with seqret as FASTQ does seem to give meaningful quality scores. Curious.

It should look for a PCON tag in the file and pick up the second of two, or the first if there is only one. Can anyone on the list enlighten us further on what is intended for the quality scores in these example files?

The pGEM example I have no idea - I just found it with Google.

I can send you a couple of our locally produced AB1 files off list if you wouldn't mind having a look at them. It may be that however these are being generated there simply are no useful scores inside.

Peter

_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss