Thanks to all of the people who came up with answers.  Ultimately the one
I'm happiest with was the one Japhy emailed me, which shows that it can
indeed be done in a regex.  And it even works in perl 5.005.  (see below).

-Daniel


#!/usr/bin/perl
sub lcs {

    my $this = shift;
    my $that = shift;

    my $str = join "\0", $this, $that;
    my $len = 1;
    my $lcs;
    while ($str =~ m{ ([^\0]{$len,}) (?= [^\0]* \0 [^\0]*? \1 ) }xg) {
        $lcs = $1;
        $len = 1 + length($1);
    }

    if ($len == 1) { print("No common substring\n"); }
    else {
        print("Longest common substring of length $len: \"");
        print("$lcs");
        print("\"\n");
    }
}

# Testing
lcs("This is one piece of text", "Another piece of text here");
lcs("Another piece of text here", "This is one piece of text");
lcs("bbbbcccbbbd", "dbbbbeeebbbf");
lcs("abbbbcccbbbd", "bbbbeeebbbf");
lcs("aaaabcd", "dfhaaaa");
lcs("dfhaaaa", "aaaabcd");
lcs("abcd", "efgh");



On Thu, 14 Feb 2002, Jeff 'japhy' Pinyan wrote:

> On Feb 14, Daniel R. Allen said:
> 
> >Somebody asked on the Toronto Perlmongers list about finding the longest
> >common (consecutive) substring of two thousand-character texts using perl.  
> >I've given a quick search and can find nothing definitive. [1]
> >
> >She doesn't want the longest common subsequence, where elements can be
> >chopped out of either text to make them similar, which I could find code
> >for, and which is essentially 'diff'.
> 
> So basically, from "this is the way we brush our teeth" and "is his the
> way to do it?" you would want "is the way ", right?  If so, I can come up
> with a regex solution for you.
> 
> -- 
> Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
> RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
> ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
> <stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
> [  I'm looking for programming work.  If you like my work, let me know.  ]
> 
> 

Reply via email to