On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice <p...@ebi.ac.uk> wrote:
>
> On 22/07/10 12:22, Peter C. wrote:
>
>>> I truncated this for brevity. Here the quality string repeats ASCII 34,
>>> ASCII 33 (PHRED quality 1, quality 0), which is rather strange. The
>>> sequence appears to agree with the provided file pGEM_(ABI)_A01.seq
>>>
>>> Have I just been unlucky with the AB1 files that I have looked at? Thus
>>> far all the quality scores seem meaningless.
>
> There are two sets of quality scores in that file. Both are the
> alternating characters 1 and 0. Adding 33 gives the scores you see.
>
> Looks as though EMBOSS is just reporting what it finds.
>
> The file offset is the value returned by function
> ajSeqABIGetConfidOffset. It simply reads one byte from there for each
> base of sequence length.

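For anyone following along, the behaviour described above (read one byte per base from a given offset, then add 33 to get the FASTQ character) can be sketched in Python. This is only an illustration, not the EMBOSS code: the function names, the `blob` contents and the offset of 8 are made up for the example.

```python
def read_confidences(blob, offset, n_bases):
    """Read one confidence byte per base, starting at `offset`."""
    chunk = blob[offset:offset + n_bases]
    if len(chunk) != n_bases:
        raise ValueError("data too short for the requested number of bases")
    return list(chunk)  # iterating over bytes yields integers


def to_fastq_quality(scores):
    """Encode PHRED scores as FASTQ quality characters (ASCII offset 33)."""
    return "".join(chr(q + 33) for q in scores)


# Hypothetical stand-in for a file: 8 filler bytes, then alternating 1, 0.
blob = b"\xff" * 8 + bytes([1, 0] * 4)

scores = read_confidences(blob, offset=8, n_bases=8)
quality = to_fastq_quality(scores)

print(scores)   # [1, 0, 1, 0, 1, 0, 1, 0]
print(quality)  # "!"!"!"!
```

With the alternating 1 and 0 bytes, the output is the repeating ASCII 34, ASCII 33 pattern reported for the pGEM file.
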
Looks like that particular random example from the internet was just odd.

>> I went back through my old emails, and see you had been testing with
>> http://www.appliedbiosystems.com/support/software_community/ab1_files.zip
>> (I had trouble downloading this with curl - Firefox worked). Looking at these
>> ABI files with seqret as FASTQ does seem to give meaningful quality scores.
>> Curious.
>
> It should look for a PCON tag in the file and pick up the second of two,
> or the first if there is only one.
>
> Can anyone on the list enlighten us further on what is intended for the
> quality scores in these example files?

I have no idea about the pGEM example - I just found it with Google. I can
send you a couple of our locally produced AB1 files off list if you wouldn't
mind having a look at them. It may be that, however these are being
generated, there simply are no useful scores inside.

Peter
_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
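
As a footnote to the PCON rule quoted above (take the second of two PCON tags, or the first if there is only one), here is a minimal Python sketch of just that selection step. It assumes the tags have already been read into a list of (name, payload) pairs in file order. That list, the function name and the example values are all hypothetical, and whether "second" means file order or tag number is an assumption on my part.

```python
def pick_pcon(entries):
    """Return the payload of the PCON entry to use, or None if there is none.

    `entries` is a hypothetical list of (tag_name, payload) pairs, in the
    order the tags were encountered.
    """
    pcon = [payload for name, payload in entries if name == "PCON"]
    if not pcon:
        return None
    # Second of two; otherwise the only one available.
    return pcon[1] if len(pcon) >= 2 else pcon[0]


# Made-up example data: one file with two PCON entries, one with a single entry.
two = [("PBAS", b"ACGT"),
       ("PCON", bytes([1, 0, 1, 0])),
       ("PCON", bytes([40, 38, 35, 20]))]
one = [("PBAS", b"ACGT"),
       ("PCON", bytes([40, 38, 35, 20]))]

chosen_two = pick_pcon(two)
chosen_one = pick_pcon(one)
chosen_none = pick_pcon([("PBAS", b"ACGT")])

print(list(chosen_two))  # [40, 38, 35, 20]
print(list(chosen_one))  # [40, 38, 35, 20]
print(chosen_none)       # None
```
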