On Jan 25, Bill -OSX- Jones said:
>Greetings, On Friday, January 25, 2002, at 10:56 AM,
>[EMAIL PROTECTED] wrote:
>
>> /[fF][oO][oO]/ better than /foo/i
>>
>> That'd be Mr. RE Jeffrey Friedl in the Mastering RE/Owl book.
>> The parser
>> has to do less backtracking or something.
>
>
>I have to disagree; while I have read the RE/Owl (I hear there is
>another in the works?) I feel that
>
>/[fF][oO][oO]/ lacks clarity over /foo/i

As a matter of speed, I find that /i is better than [] in some cases, such
as this example from bleadperl:

  use Benchmark 'cmpthese';

  my $str = "anajkCKLJLFnCVKCJkjccajRKchcbanRccjahrjaCJHECcca";

  cmpthese(-5, {
    '/i' => sub { $str =~ /cj+a/i },
    '[]' => sub { $str =~ /[Cc][Jj]+[Aa]/ },
  });

  __END__
  Benchmark: running /i, [] for at least 5 CPU seconds...
          /i:  6 wallclock secs ( 5.31 usr +  0.00 sys =  5.31 CPU) @ 44571.56/s (n=236675)
          []:  5 wallclock secs ( 5.27 usr +  0.00 sys =  5.27 CPU) @ 22501.71/s (n=118584)
        Rate   []   /i
  [] 22502/s   -- -50%
  /i 44572/s  98%   --

In Perl 5.005_02, [] was faster than /i, but not by much. I think regex
optimizations have come a long way since Perl 5.005.

In light of that, I wonder why /cj+a/i yields a scan for EXACTF "c" (that
is, a "c", case-folded), instead of a scan for EXACTF "cj" followed by a
check for zero or more j's.
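
For anyone who wants to look at this themselves, one way to see what the
engine compiles a pattern into is the re pragma's debug mode. This is only a
sketch: the dump goes to STDERR, and its layout and node names differ from one
perl version to the next, so I am not showing output here. The variable names
are mine.

```perl
use strict;
use warnings;

# 'use re "debug"' makes perl dump each regex compiled in this scope
# (the node list plus any start-of-match optimizations) to STDERR.
use re 'debug';

# Both patterns describe the same set of strings; compare the two dumps
# to see which literal prefix the optimizer picks for each.
my $plus = qr/cj+a/i;
my $star = qr/cjj*a/i;
```

The same dump is available from the command line with
perl -Mre=debug -e '/cj+a/i'.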

On the same string as before, I benchmarked the following two regexes (each
test is named after the regex it runs). Here are the results:

  Benchmark: running cj+a, cjj*a for at least 5 CPU seconds...
        cj+a:  6 wallclock secs ( 5.20 usr +  0.03 sys =  5.23 CPU) @ 44552.01/s (n=233007)
       cjj*a: 11 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 68402.31/s (n=355008)
           Rate  cj+a cjj*a
  cj+a  44552/s    --  -35%
  cjj*a 68402/s   54%    --

Here's a case where expanding X+ to XX* yields good results.
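
A rough sketch of that second benchmark, for anyone who wants to repeat it.
It assumes both patterns were run with /i, as in the first test, and it uses
-1 rather than -5 so it finishes quickly; raise that for steadier numbers. The
sanity check at the top and the variable names are my additions. Timings will
depend on your perl version and machine.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

my $str = "anajkCKLJLFnCVKCJkjccajRKchcbanRccjahrjaCJHECcca";

# X+ and XX* describe the same language, so both patterns must capture
# the same text; check that before comparing their speed.
my ($plus) = $str =~ /(cj+a)/i;
my ($star) = $str =~ /(cjj*a)/i;
die "patterns disagree\n"
    unless defined $plus && defined $star && $plus eq $star;

# Run each test for at least 1 CPU second (the runs above used -5).
cmpthese(-1, {
    'cj+a'  => sub { $str =~ /cj+a/i },
    'cjj*a' => sub { $str =~ /cjj*a/i },
});
```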
--
Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/
RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
<stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.