Longest Common Substring

Daniel R. Allen Thu, 14 Feb 2002 08:13:13 -0800

Hi,

Somebody asked on the Toronto Perlmongers list about finding the longest
common (consecutive) substring of two thousand-character texts using perl.
I did a quick search and found nothing definitive. [1]


She doesn't want the longest common subsequence, where elements can be
chopped out of either text to make them match. I could find code for
that, and it is essentially what 'diff' computes.
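
To make the distinction concrete, here is a minimal sketch of the usual
dynamic-programming approach to the longest common *substring*. It is not
the code from the earlier thread; the function name
longest_common_substring is made up for illustration, and ties are broken
in favour of the first match found in the first text.

```perl
use strict;
use warnings;

# Sketch only: classic O(n*m) dynamic programming.
# $curr[$j] is the length of the longest run of characters that ends at
# position $i in $s and at position $j in $t (both 1-based).
sub longest_common_substring {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    my ($best_len, $best_end) = (0, 0);
    my @prev = (0) x (scalar(@t) + 1);

    for my $i (1 .. scalar @s) {
        my @curr = (0) x (scalar(@t) + 1);
        for my $j (1 .. scalar @t) {
            # A mismatch breaks the run, so the entry stays 0.
            next unless $s[$i - 1] eq $t[$j - 1];
            $curr[$j] = $prev[$j - 1] + 1;
            if ($curr[$j] > $best_len) {
                $best_len = $curr[$j];
                $best_end = $i;    # run ends just before offset $i in $s
            }
        }
        @prev = @curr;
    }
    return substr($s, $best_end - $best_len, $best_len);
}

print longest_common_substring("xabcdey", "zzabcdq"), "\n";   # prints "abcd"
```

Only two rows of the table are kept at a time, so memory stays
proportional to the length of the second text. For two thousand-character
texts that is about a million character comparisons.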

As I understand it, perl uses the Boyer-Moore algorithm in the regex
engine, so should it be possible to backtrack to the longest common
substring using a regex, even if you don't know the text of the match
beforehand?
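
One way this might be done with backtracking alone is sketched below,
under some loud assumptions: neither text contains a NUL character (it is
used as a separator), and the function name lcs_by_regex is made up for
illustration. The two texts are joined, and a lookahead with a
backreference is tried at every starting position. Because the capture is
greedy, the first match found at each position is the longest candidate
starting there.

```perl
use strict;
use warnings;

# Sketch only: relies on regex backtracking, so it can be very slow on
# long inputs. Assumes neither string contains "\0".
sub lcs_by_regex {
    my ($s, $t) = @_;
    my $joined = join "\0", $s, $t;
    my $best = '';

    # The lookahead is zero-width, so /g advances one character at a time
    # and overlapping candidates are all examined.
    # ([^\0]+)  candidate substring; it cannot cross the separator
    # [^\0]*\0  skip the rest of the first text and the separator
    # .*\1      the same candidate must occur somewhere in the second text
    while ($joined =~ /(?=([^\0]+)[^\0]*\0.*\1)/gs) {
        $best = $1 if length($1) > length($best);
    }
    return $best;
}

print lcs_by_regex("xabcdey", "zzabcdq"), "\n";   # prints "abcd"
```

This does not depend on Boyer-Moore in any way that I can vouch for; it
only shows that the engine's backtracking can find the answer without
knowing the matched text in advance.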

Cheers,

Daniel

[1] There was a conversation on this very list in Nov. 2000, but I
couldn't get the code to work...
www.bumppo.net/lists/fun-with-perl/2000/11/msg00009.html


$_='[EMAIL PROTECTED] 519-575-3733  /Prescient Code Solutions/  coder.com
';s/-/ /g;s/([.@])/ $1/g;@y=(42*1476312054+7*3,14120504e4,-42*330261-33,
42*5436+3,42*2886+10,42*434987+5);s/(.)/ord(uc($1))/ge;for(@x=split/32/;
@y; map{print chr} split /(..)/, shift(@x) + shift(@y)) {perlmonk.da.ru}
