On Mon, Jan 21, 2002 at 05:45:55PM +0100, [EMAIL PROTECTED] wrote:
> On Mon, Jan 21, 2002 at 03:48:58PM +0000, Robin Houston wrote:
> > On Mon, Jan 21, 2002 at 04:32:59PM +0100, [EMAIL PROTECTED] wrote:
> > > You are right, I had forgotten a case, [...]
> > > This results in (a sloooow program):
> > 
> > There's something wrong here. Your regex matches "abcb", which isn't
> > shrinkable.
> 
> 
> Duh! I was writing  (...)+  where I should have written (...)\1*.
> 
> Here's a corrected version (making for quite a faster regex):


On reflection, my second case is a special case of the third, so it
can be dropped. That makes for a smaller regex:


     #!/usr/bin/perl

     use strict;
     use warnings qw /all/;

     my @strings;
     my $p   = 0;    # Number of capture groups emitted so far.
     my $max = 255;  # Use a higher number for Unicode (and then emit
                     # \x{%x} instead of \x%02x, which stops at 0xff).
     foreach my $c (1 .. $max) {
         # Strings of the form:  XbYXaZ,  a lt b.  One group: X.
         push @strings => sprintf '(.*)\x%02x.*\%d[\x00-\x%02x].*' =>
                                   $c, ++ $p, $c - 1;
         # Strings of the form: (XaY)+XbZXaY, a lt b.
         # Two groups: $p + 1 is XaY, $p + 2 is X.
         push @strings => sprintf '((.*)[\x00-\x%02x].*)\%d*\%d\x%02x.*\%d' =>
                                   $c - 1, $p + 1, $p + 2, $c, $p + 1;
         $p += 2;
     }

     # One alternative per line; /x makes the newlines insignificant.
     my $regex = join "|\n " => map {"(?:$_)"} @strings;
        $regex = "^(?:$regex)\$";

     print "/$regex/sx\n";

     __END__
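
To try the generated pattern against a few inputs rather than just printing it, something along these lines should work. This is only a sketch: the sub name build_regex and the sample strings are my own additions, not part of the script above. It compiles the alternation with qr// and checks "abcb", the counterexample raised earlier in the thread, alongside strings that fit each of the two forms.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): same generator as
# above, but it returns a compiled pattern instead of printing it.
sub build_regex {
    my ($max) = @_;
    my @strings;
    my $p = 0;    # number of capture groups emitted so far
    foreach my $c (1 .. $max) {
        # XbYXaZ, a lt b: one group (X).
        push @strings, sprintf '(.*)\x%02x.*\%d[\x00-\x%02x].*',
                               $c, ++$p, $c - 1;
        # (XaY)+XbZXaY, a lt b: two groups (XaY and X).
        push @strings, sprintf '((.*)[\x00-\x%02x].*)\%d*\%d\x%02x.*\%d',
                               $c - 1, $p + 1, $p + 2, $c, $p + 1;
        $p += 2;
    }
    my $regex = join "|\n ", map {"(?:$_)"} @strings;
    return qr/^(?:$regex)$/sx;
}

my $re = build_regex(255);

# "ba" fits the first form, "abba" the second; "ab" and "abcb" fit neither.
foreach my $s (qw(ba abba ab abcb)) {
    printf "%-4s %s\n", $s, ($s =~ $re ? "matches" : "does not match");
}
```

Returning a qr// object keeps the /s and /x flags attached to the pattern, so callers cannot forget them.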



Abigail
