Helliwell, Kim am Dienstag, 12. Dezember 2006 21:56:
> Is there a function (perhaps in a library module) that would take two
> strings and return the common substring (if any) contained in the
> arguments? I've been looking for such a beast on CPAN, but no luck so
> far.
>
>
>
> If not, I guess I have to write it myself...
While I was writing it myself ;-) because I did not find anything via google
or CPAN - missing term LCSS... John posted Algorithm:LCCS.
I thought I post it anyway instead of copying it to /dev/null.
The script contains a testcase with "long" strings, it takes 1.2 secs on my
old machine (the test case is certainly not a worst case scenario).
It is just a dirty hack, using a naive aproach, and not proved to work
correctly.
Here it is, comments are welcome:
#!/usr/bin/perl
use strict;
use warnings;
sub lcss {
my ($s1, $s2)[EMAIL PROTECTED];
my $max1=length($s1)-1;
my $max2=length($s2)-1;
# make $s1 the shorter string
#
($s1, $s2, $max1, $max2)=($s2, $s1, $max2, $max1)
if $max1 > $max2;
my %found;
my $longest=0;
foreach my $i (0..$max1) {
foreach my $j ($i..$max1) {
my $searchlen=$j-$i+1;
next if $searchlen < $longest; # because longest css searched
my $search=substr($s1, $i, $searchlen); # pattern to search
$found{$1}++ for ($s2=~/($search)/g); # although count not used below
# not optimal because no test if match succeeded above
#
$longest=$searchlen if defined $1;
}
}
# (should) select only one random longest string if several present:
#
print '(one) LCSS found: ',
(sort {length($b) <=> length($a)} keys %found)[0], "\n";
}
### Test case:
my $pat=join '', 'hello' x 100;
my $bar=join '', 'hello' x 99;
my $foo=join '', 'a' x 1000000;
lcss ($pat, $bar.$foo.$bar.$pat.$bar.$foo.$bar);
lcss ('donut', 'I just want to eat one donut please!');
lcss ('I just want to eat one donut please!', 'donut');
__END__
Dani
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>