On 13/10/2011 08:45, Jon Ison wrote:
Hi chaps (Aengus !)

If I understood Aengus' msg., what's needed is something that simply
combines overlapping hits (for a given pattern) into one or more
non-overlapping "regions of hits", and reports those regions, e.g.

    Start     End  Strand Pattern_name Mismatch Sequence
       54      65       + pattern1            5 GCCAAATAAGGG
      104     115       + pattern1            5 CCTAAATAAGGG
      179     188       + pattern1            2 CCTTGCTTGG
      190     200       + pattern1            6 CCGATTAGAGC

Mismatch in this case is reporting the sum of mismatches from before.  A
column for the number of (sub)matches would also be needed.  Is that
right, Aengus?

I'm not sure that adding the mismatches is sound. I'd assume just a best hit from the overlapping matches.
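
For illustration, the merging step could be sketched roughly as below. This is not EMBOSS code: the hit tuples, the Region class and merge_hits are hypothetical names, and coordinates are assumed to be 1-based and inclusive. It tracks both the summed mismatches and the best (lowest) mismatch count, so either could be reported, along with the number of (sub)matches.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical hit layout: (start, end, strand, pattern_name, mismatches)
Hit = Tuple[int, int, str, str, int]


@dataclass
class Region:
    start: int
    end: int
    strand: str
    pattern: str
    hits: int            # number of (sub)matches merged into this region
    mismatch_sum: int    # sum of mismatches over the merged hits
    mismatch_best: int   # fewest mismatches among the merged hits


def merge_hits(hits: Iterable[Hit]) -> List[Region]:
    """Merge overlapping hits of the same pattern and strand into regions."""
    regions: List[Region] = []
    # Sorting by pattern, strand, then position lets one pass find overlaps.
    ordered = sorted(hits, key=lambda h: (h[3], h[2], h[0], h[1]))
    for start, end, strand, pattern, mismatches in ordered:
        last = regions[-1] if regions else None
        if (last is not None and last.pattern == pattern
                and last.strand == strand and start <= last.end):
            # Overlaps the current region: extend it and update the tallies.
            last.end = max(last.end, end)
            last.hits += 1
            last.mismatch_sum += mismatches
            last.mismatch_best = min(last.mismatch_best, mismatches)
        else:
            # No overlap: this hit starts a new region.
            regions.append(
                Region(start, end, strand, pattern, 1, mismatches, mismatches))
    return regions
```

Given two made-up overlapping hits at 54-61 and 58-65 with 2 and 3 mismatches, this would report a single region 54-65 with 2 (sub)matches, a mismatch sum of 5 and a best mismatch of 2. Hits that merely sit next to each other without sharing a position are left as separate regions.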

The above might give a useful result depending on the input pattern.  It
would, I think, be easy enough to implement.

This is a report output, so post-processing could be done by trimming the results before output using an associated qualifier.

I'm still not sure how useful it would be. We need more feedback from other users on this one, please!

Peter Rice
EMBOSS Team

_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
