On Thu, Mar 14, 2002 at 10:38:29AM -0500, Ted Zlatanov wrote: > I remember a lengthy LCS discussion, and the solutions for the most > part used the regex engine. I also looked on CPAN, but couldn't find > a canonical LCS module. > > Has anyone developed an implementation of the well-known bitstring > algorithm? Basically you convert your data strings to bitstrings, AND > the two, and look for the longest match. Then you rotate the shorter > data string by one, and repeat the test. Repeat for the number of > bits in the shorter data string. I want to do it, but just wanted to > check if it had come up before on FWP. > > There are even better algorithms to do this. I found them with > Google; for instance the Goeman & Clausen 1999 "A New Practical Linear > Space Algorithm for the LCS Problem" paper looks interesting but is > beyond what I remember of my college math. It describes a > O(ns + min(mp, (m log(m)) + p(n-p))) time, linear space algorithm and a > O(ns + min(mp, p(n-p))) time, O(ns) space algorithm, and claims the > latter to be the fastest known such algorithm (m = size of string A, > n = size of string B, n>= m, s = alphabet size, p = length of LCS). > I'm pretty sure implementations of the Goeman & Clausen algorithms > haven't been done in Perl yet.
I'd like to point out that the LCS problem from the literature is usually a harder problem than the LCS problem discussed here. Here we usually talk about consecutive common substrings, while the literature drops the consective restriction. Abigail