On Wed, 11 Feb 2026 21:41:25 -0500
Graydon Saunders via BaseX-Talk <[email protected]>
wrote:
>
> If I have two (fairly long) sequences of text, ('The', 'words',
> 'are', 'sequence', 'members') and I want all the index numbers of
> matching pairs despite the sequences only mostly matching (so a word,
> or several words, can be missing from sequence A or sequence B), is
> there an established algorithm for doing this?
There are several: Myers, McIlroy (of Unix fame), and others have
published papers on the longest common subsequence (LCS) problem, which
is at the heart of, e.g., the Unix diff program.
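
To make the idea concrete, here is a minimal sketch of the textbook
dynamic-programming approach to that problem, written in Python only
for illustration. It is not the Myers algorithm and not what diff
actually runs; the function name lcs_index_pairs and the 1-based
indexing (chosen to match XQuery sequence positions) are my own
assumptions.

```python
def lcs_index_pairs(a, b):
    """Return 1-based (i, j) pairs of matching items forming one LCS.

    Hypothetical helper for illustration: plain O(len(a) * len(b))
    dynamic programming, not an optimised diff algorithm.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table, recording each matched pair.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i + 1, j + 1))  # 1-based positions
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1  # skip an item present only in a
        else:
            j += 1  # skip an item present only in b
    return pairs


a = ['The', 'words', 'are', 'sequence', 'members']
b = ['The', 'words', 'sequence', 'of', 'members']
print(lcs_index_pairs(a, b))
```

The table needs time and memory proportional to the product of the two
lengths, so for really long sequences one of the published algorithms
(Myers' in particular, when the sequences mostly match) is likely the
better choice.
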
liam
--
Liam Quin: Delightful Computing - Training and Consultancy in
XSLT / XML Markup / Typography / CSS / Accessibility / and more...
Outreach for the GNU Image Manipulation Program
Vintage art digital files - fromoldbooks.org