I recently had a similar problem. A regex that worked fine in sample code
was a dog in the web-server code. It only happened with really long strings.
I tracked the problem down to this warning in the 'perlre' manpage:

    WARNING: Once Perl sees that you need one of "$&", "$`", or "$'"
    anywhere in the program, it has to provide them for every pattern
    match. This may substantially slow your program. Perl uses the same
    mechanism to produce $1, $2, etc, so you also pay a price for each
    pattern that contains capturing parentheses. (To avoid this cost
    while retaining the grouping behaviour, use the extended regular
    expression "(?: ... )" instead.) But if you never use "$&", "$`" or
    "$'", then patterns without capturing parentheses will not be
    penalized. So avoid "$&", "$'", and "$`" if you can, but if you
    can't (and some algorithms really appreciate them), once you've used
    them once, use them at will, because you've already paid the price.
    As of 5.005, "$&" is not so costly as the other two.

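If a module only needs the text before, inside, or after a match, one way to avoid the three variables is to take the match offsets from @- and @+ and cut the string with substr. This is only a rough sketch, assuming Perl 5.6 or later (where @- and @+ exist); the helper name match_parts and the sample string are made up for illustration and are not taken from any of the modules mentioned here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text before, inside, and after the first
# match of $re in $str, without using the three match variables in code.
sub match_parts {
    my ($str, $re) = @_;
    return unless $str =~ $re;                # empty list when nothing matches
    my ($start, $end) = ($-[0], $+[0]);       # offsets of the whole match
    return (
        substr($str, 0, $start),              # the part before the match
        substr($str, $start, $end - $start),  # the matched text itself
        substr($str, $end),                   # the part after the match
    );
}

my ($pre, $match, $post) = match_parts("AATTCGGA", qr/CG/);
print "pre=$pre match=$match post=$post\n";   # prints: pre=AATT match=CG post=GA
```

Whether this is a practical fix depends on how heavily the module leans on those variables; for a handful of uses it is usually a mechanical change.
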
Basically, one of the modules the web-app was 'use'ing needed $', but my
test code didn't 'use' that module. The result was pretty dramatic in this
case: something that took approx 1 second in the test code was timing out
after 2 minutes in the web-server.
What I did in the end was something like this. Somewhere in the code that
runs when a request hits, add:

    open(F, '>/tmp/modulelist') or die "can't write /tmp/modulelist: $!";
    print F join("\n", values %INC), "\n";
    close(F);

This creates a file which lists all the loaded modules. Then after sticking
a request through the browser, do something like:

    grep \$\' `cat /tmp/modulelist`
    grep \$\& `cat /tmp/modulelist`
    grep \$\` `cat /tmp/modulelist`

to try and track down the offending module. You'll get quite a few false
hits (comments, etc), but you might find an offending module. The main ones
I found were:

    Parse::RecDescent
    Net::DNS

and a couple of others I can't remember now. I fixed Net::DNS myself and
sent a patch to the maintainer, but haven't heard anything. If you find this
happens to be your problem as well, ask me for the patched version.
Parse::RecDescent makes heavy use of the above vars, so there's no chance
of fixing that in a hurry.
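
The dump-and-grep steps can also be done in one pass from Perl, which lets you drop the most obvious false hits. This is a rough sketch only: the helper name suspicious_lines is made up, it skips full-line comments but not POD or trailing comments, so expect some false hits to remain. Under mod_perl the loop at the bottom would need to run inside a handler so that %INC reflects what the server has actually loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of one source file that mention
# one of the three match variables outside of full-line comments.
sub suspicious_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;      # unreadable file: no hits
    my @hits;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*#/;            # skip full-line comments
        push @hits, "$.: $line" if $line =~ /\$[&`']/;
    }
    close($fh);
    return @hits;
}

# Report every loaded module whose source has at least one suspect line.
for my $module (sort keys %INC) {
    my $path = $INC{$module};
    next unless defined $path && !ref $path; # skip entries with no plain path
    my @hits = suspicious_lines($path);
    next unless @hits;
    print "$module ($path)\n";
    print "  $_" for @hits;
}
```

The output lists each module followed by the numbered lines worth a look, which is quicker to scan than raw grep output across every loaded file.
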
Rob
----- Original Message -----
From: "Paul Mineiro" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, January 24, 2002 11:01 AM
Subject: Re: slow regex [BENCHMARK]
> Paul Mineiro wrote:
>
> i've cleaned up the example to tighten the case:
>
> the mod perl code snippet is:
>
> ---
>
> my @cg;
>
> open DIL, '>', "/tmp/seqdata";
> print DIL $seq;
> close DIL;
>
> warn "length seq = @{[length ($seq)]}";
>
> my $t = timeit (1, sub {
>     while ($seq =~ /CG/g)
>     {
>         push @cg, pos ($seq);
>     }
> });
>
> print STDERR timestr ($t), "\n";
>
> ---
>
> which yields
> length seq = 200001 at
> /home/aerives/genegrokker-interface/mod_perl/genomic_img.pm line 634,
> <GEN1> line 102
> 16 wallclock secs (15.56 usr + 0.01 sys = 15.57 CPU) @ 0.06/s (n=1)
>
> and the perl script (command line) version is:
>
> ---
>
> #!/usr/bin/perl
>
> use Benchmark;
> use strict;
>
> open DIL, '<', "/tmp/seqdata";
> my $seq = <DIL>;
> close DIL;
>
> warn "length seq is @{[length $seq]}";
>
> my @cg;
>
> my $t = timeit (1, sub {
>     while ($seq =~ /CG/g)
>     {
>         push @cg, pos ($seq);
>     }
> });
>
> print STDERR timestr ($t), "\n";
>
> ---
> which yields:
>
> length seq is 200001 at ./t.pl line 10.
> 0 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
>
> the data is pretty big, so i didn't attach it, but feel free to contact
> me directly for it.
>
> -- p
>
> >hi. i'm running mod_perl 1.26 + apache 1.3.14 + perl 5.6.1
> >
> >i have a loop in a mod_perl handler like so:
> >----
> > my $stime = time ();
> >
> > while ($seq =~ /CG/og)
> > {
> >     push @cg, pos ($seq);
> > }
> >
> > my $etime = time ();
> >
> > warn "time was: ", scalar localtime ($stime), " ",
> >     scalar localtime ($etime), " ", $etime - $stime;
> >----
> >
> >under mod_perl this takes 23 seconds. running the perl "by hand" (via
> >extracting this piece into a separate perl script) on the same data takes
> >less than 1 second.
> >
> >has anyone seen this kind of extreme slowdown before?
> >
> >-- p
> >
> >info:
> >
> >apache build options:
> >
> >CFLAGS="-g -g -O3 -funroll-loops" \
> >LDFLAGS="-L/home/aerives/lib -L/home/aerives/lib/mysql" \
> >LIBS="-L/home/aerives/genegrokker-interface/lib
> >-L/home/aerives/genegrokker-interface/ext/lib -L/home/aerives/lib
> >-L/home/aerives/lib/mysql" \
> >./configure \
> >"--prefix=/home/aerives/genegrokker-interface/ext" \
> >"--enable-rule=EAPI" \
> >"--enable-module=most" \
> >"--enable-shared=max" \
> >"--with-layout=GNU" \
> >"--disable-rule=EXPAT" \
> >"$@"
> >
> >mod_perl build options:
> >
> >configure_options="PERL_USELARGEFILES=0 USE_APXS=1
> >WITH_APXS=$PLAYPEN_ROOT/ext/sbin/apxs EVERYTHING=1
> >INC=$PLAYPEN_ROOT/ext/include -DEAPI"
> >
> >perl -V:
> >Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
> > Platform:
> > osname=linux, osvers=2.4.13, archname=i386-linux
> > uname='linux duende 2.4.13 #1 wed oct 31 19:18:07 est 2001 i686 unknown '
> > config_args='-Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux
> >-Dprefix=/usr -Dprivlib=/usr/share/perl/5.6.1 -Darchlib=/usr/lib/perl/5.6.1
> >-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5
> >-Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.6.1
> >-Dsitearch=/usr/local/lib/perl/5.6.1 -Dman1dir=/usr/share/man/man1
> >-Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl
> >-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Duseshrplib
> >-Dlibperl=libperl.so.5.6.1 -Dd_dosuid -des'
> > hint=recommended, useposix=true, d_sigaction=define
> > usethreads=undef use5005threads=undef useithreads=undef
> >usemultiplicity=undef
> > useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
> > use64bitint=undef use64bitall=undef uselongdouble=undef
> > Compiler:
> > cc='cc', ccflags ='-DDEBIAN -fno-strict-aliasing -I/usr/local/include
> >-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
> > optimize='-O2',
> > cppflags='-DDEBIAN -fno-strict-aliasing -I/usr/local/include'
> > ccversion='', gccversion='2.95.4 (Debian prerelease)', gccosandvers=''
> > intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
> > d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
> > ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
> >lseeksize=8
> > alignbytes=4, usemymalloc=n, prototype=define
> > Linker and Libraries:
> > ld='cc', ldflags =' -L/usr/local/lib'
> > libpth=/usr/local/lib /lib /usr/lib
> > libs=-lgdbm -ldb -ldl -lm -lc -lcrypt
> > perllibs=-ldl -lm -lc -lcrypt
> > libc=/lib/libc-2.2.4.so, so=so, useshrplib=true, libperl=libperl.so.5.6.1
> > Dynamic Linking:
> > dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
> > cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
> >
> >
> >Characteristics of this binary (from libperl):
> > Compile-time options: USE_LARGE_FILES
> > Built under linux
> > Compiled at Jan 11 2002 04:09:18
> > %ENV:
> > PERL5LIB="/home/aerives/genegrokker-interface/lib/perl5:/home/aerives/genegrokker-interface/ext/lib/perl5:/home/aerives/lib/perl5"
> > @INC:
> > /home/aerives/genegrokker-interface/lib/perl5
> > /home/aerives/genegrokker-interface/ext/lib/perl5
> > /home/aerives/lib/perl5
> > /usr/local/lib/perl/5.6.1
> > /usr/local/share/perl/5.6.1
> > /usr/lib/perl5
> > /usr/share/perl5
> > /usr/lib/perl/5.6.1
> > /usr/share/perl/5.6.1
> > /usr/local/lib/site_perl
> >