I guess it is because that needle is optimized for global alignment but not semi-global alignment.
On Fri, Dec 7, 2018 at 2:25 AM Jan T Kim <[email protected]> wrote: > > Hi All, > > I've coded up a pairwise semiglobal aligner and then checked it by > comparing its output to that of needle. My expectation was to find > problems in my code, but now, to my surprise, I found a case where > the pairwise alignment computed by my code has a higher score than > that returned by needle. The sequences are: > > >x1a > CGTGATACATTACTTTTTA > >x1b > GTGGACTTGACGCGTCATGGAAAGTACAAGATACTTCGACCTGGCAGTGCAAG > > Storing these in separate files x1a.fasta and x1b.fasta and running > > needle -auto x1a.fasta x1b.fasta x1.txt > > gives this alignment with score 21.5: > > >x1a > ------------CGTGAT--ACATTACTTTTTA-------------------- > >x1b > GTGGACTTGACGCGTCATGGAAAGTACAAGATACTTCGACCTGGCAGTGCAAG > > However, my aligner finds this with score 22.0: > > >x1a x1a, aligned semiglobally, score 22.000000 > ------------CGTGA-------TACA--TTACTTTTTA----------------- > >x1b x1b, aligned semiglobally, score 22.000000 > GTGGACTTGACGCGTCATGGAAAGTACAAGATACTT----CGACCTGGCAGTGCAAG > > The Needleman-Wunsch algorithm computes an optimal alignment, i.e. > one with maximal score, given the input sequences. Therefore, if my > score computation is correct, the alignment reported by needle is > not optimal, which would mean that there's a bug in needle. > > Of course it's also possible that I'm doing something wrong / stupid, > so I append position by position breakdowns of my score calculations. > > I also attach the full needle output and my fasta files, so you can > run the above command to check whether you get the same result. I've > found this with Ubuntu xenial / emboss6.6.0+dfsg-3build1, and confirmed > that I get the same with Ubuntu bionic / emboss 6.6.0+dfsg-6 (both amd64). > The needle output file also confirms that needle computes the score of > its alignment as 21.5, consistently with my scoring. > > I notice that it's a somewhat peculiar feature of my alignment that > the internal gap of length 4 in x1b ends where the terminal (unpenalised) > gap in x1a starts. I'm not aware of any criterion / rule violated by that, > though. > > Best regards & thanks in advance for any scrutiny and comments, Jan > > > ----- 8< --- breakdown of needle alignment scoring ------- > > +0.0 # -~G > +0.0 # -~T > +0.0 # -~G > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~T > +0.0 # -~T > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~G > +5.0 # C-C > +5.0 # G-G > +5.0 # T-T > -4.0 # G.C > +5.0 # A-A > +5.0 # T-T > -10.0 # - G > -0.5 # - G > +5.0 # A-A > -4.0 # C.A > +5.0 # A-A > -4.0 # T.G > +5.0 # T-T > +5.0 # A-A > +5.0 # C-C > -4.0 # T.A > -4.0 # T.A > -4.0 # T.G > -4.0 # T.A > +5.0 # T-T > +5.0 # A-A > +0.0 # -~C > +0.0 # -~T > +0.0 # -~T > +0.0 # -~C > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~C > +0.0 # -~T > +0.0 # -~G > +0.0 # -~G > +0.0 # -~C > +0.0 # -~A > +0.0 # -~G > +0.0 # -~T > +0.0 # -~G > +0.0 # -~C > +0.0 # -~A > +0.0 # -~A > +0.0 # -~G > +21.5 > > ----- 8< --- breakdown of scoring of "my" alignment ------- > > +0.0 # -~G > +0.0 # -~T > +0.0 # -~G > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~T > +0.0 # -~T > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~G > +5.0 # C-C > +5.0 # G-G > +5.0 # T-T > -4.0 # G.C > +5.0 # A-A > -10.0 # - T > -0.5 # - G > -0.5 # - G > -0.5 # - A > -0.5 # - A > -0.5 # - A > -0.5 # - G > +5.0 # T-T > +5.0 # A-A > +5.0 # C-C > +5.0 # A-A > -10.0 # - A > -0.5 # - G > -4.0 # T.A > +5.0 # T-T > +5.0 # A-A > +5.0 # C-C > +5.0 # T-T > +5.0 # T-T > -10.0 # T - > -0.5 # T - > -0.5 # T - > -0.5 # A - > +0.0 # -~C > +0.0 # -~G > +0.0 # -~A > +0.0 # -~C > +0.0 # -~C > +0.0 # -~T > +0.0 # -~G > +0.0 # -~G > +0.0 # -~C > +0.0 # -~A > +0.0 # -~G > +0.0 # -~T > +0.0 # -~G > +0.0 # -~C > +0.0 # -~A > +0.0 # -~A > +0.0 # -~G > +22.0 > > _______________________________________________ > EMBOSS mailing list > [email protected] > http://mailman.open-bio.org/mailman/listinfo/emboss _______________________________________________ EMBOSS mailing list [email protected] http://mailman.open-bio.org/mailman/listinfo/emboss
