Re: why a.pl is faster than b.pl

2005-12-29 Thread Bob Showalter

Jeff Pang wrote:

Hi, Bob,

You said:

3. It will probably be faster to use a single regex of the format:

/pata|patb|patc|patd/


In fact, you may be wrong about this.


Darn. First time this year :-)


Based on my test case, the regex written as follows:

/pata/ || /patb/ || /patc/ || /patd/

is much faster than yours.


OK. Perhaps it's due to backtracking. Go with what works!

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response




Re: why a.pl is faster than b.pl

2005-12-29 Thread Chris Devers
On Thu, 29 Dec 2005, Bob Showalter wrote:

 Jeff Pang wrote:
  Hi, Bob,
  
  You said:
  
  3. It will probably be faster to use a single regex of the format:
  
  /pata|patb|patc|patd/
  
  
  In fact, you may be wrong about this.
 
 Darn. First time this year :-)
 
  Based on my test case, the regex written as follows:
  
  /pata/ || /patb/ || /patc/ || /patd/
  
  is much faster than yours.
 
 OK. Perhaps it's due to backtracking. Go with what works!
 
Several Perl books, including _Mastering Regular Expressions_ and, if I 
remember correctly, _Learning Perl_, use variants of this example. In 
essence, yes, if you want to match one of several constant strings like 
this, the match will happen faster with a series of static regexes than 
it would with one compound regex with alternation.
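
A minimal sketch of how the two forms could be compared, using the core Benchmark module. The sample lines and the sub names match_series and match_alternation are made up for illustration, and pata through patd are the placeholder patterns from the thread. Which form wins may depend on the Perl version and on the data, so treat the output as a measurement on your own machine rather than a rule.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: mostly non-matching lines plus two matches.
my @lines = map { "line $_ of some log text" } 1 .. 1000;
push @lines, 'contains patc here', 'ends with patd';

# Approach 1: a series of constant regexes joined by ||.
sub match_series {
    my $hits = 0;
    for (@lines) {
        $hits++ if /pata/ || /patb/ || /patc/ || /patd/;
    }
    return $hits;
}

# Approach 2: one regex with alternation.
sub match_alternation {
    my $hits = 0;
    for (@lines) {
        $hits++ if /pata|patb|patc|patd/;
    }
    return $hits;
}

# Both approaches must agree before the timings mean anything.
die "results differ" unless match_series() == match_alternation();

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    series      => \&match_series,
    alternation => \&match_alternation,
});
```

Checking that both subs return the same count first guards against benchmarking two pieces of code that do not do the same work.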


-- 
Chris Devers
DO NOT LEAVE IT IS NOT REAL





Re: why a.pl is faster than b.pl

2005-12-28 Thread Chris Devers
On Wed, 28 Dec 2005, Jeff Pang wrote:

 Why is a.pl faster than b.pl? I would have expected the opposite.
 Thanks.

The easiest way to answer such questions is to benchmark the scripts and 
profile where each one spends its time.

These two scripts are so different in composition that it isn't 
immediately obvious to me how they're similar or dissimilar.

You have two approaches you can try for answering such questions:

* have two nearly identical scripts, and measure how the small
  part that differs affects performance.

* break each script into components and measure how long each
  component takes to complete its task.

These approaches can be intermixed as needed, but it's up to you to do 
the fundamental measuring of your code for yourself. 

Distill the question down to something clearer -- why is statement (or 
subroutine) A faster than statement B while having the same result -- 
and you may find more concrete advice from the list members.
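
As a rough sketch of the second approach, each component can be wrapped in a small timing helper built on the core Time::HiRes module. The helper name timed and the two stand-in components (building a list of records, then filtering it) are hypothetical and only show the shape of the measurement.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Run a code reference and return (elapsed seconds, its result).
# timed() is a hypothetical helper, not part of any module.
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $result = $code->();
    return (time() - $start, $result);
}

# Hypothetical stand-ins for two components of a script:
# producing the input, then filtering it.
my ($read_secs, $lines) = timed(sub {
    return [ map { "record $_\n" } 1 .. 50_000 ];
});
my ($filter_secs, $hits) = timed(sub {
    return scalar grep { /7$/ } @$lines;
});

printf "read:   %.4fs\n", $read_secs;
printf "filter: %.4fs (%d hits)\n", $filter_secs, $hits;
```

Timing each piece separately shows which component dominates before any effort goes into rewriting it.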


-- 
Chris Devers
DO NOT LEAVE IT IS NOT REAL





Re: why a.pl is faster than b.pl

2005-12-28 Thread Bob Showalter

Jeff Pang wrote:

Hi list,

I have the following two Perl scripts:

a.pl:

#!/usr/bin/perl
use strict;

my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";

foreach my $log (@logs) {
    open (HD,$log) or die $!;
    while(<HD>){

        if (
            ($_ =~ /注册/o) ||
            ($_ =~ /征文/o) ||
            ($_ =~ /圣诞快乐/) ||
            ($_ =~ /应聘/o) ||
            ($_ =~ /�ø�¨/o) ||
            ($_ =~ /发货/o) ||
            ($_ =~ /北京/o) ||
            ($_ =~ /×Ê��/o) ||
            ($_ =~ /�Å�¢/o) ||
            ($_ =~ /�ãɽ/o) ||
            ($_ =~ /°Ù�ò/o) ||
            ($_ =~ /免费/o) )  {

            print $_;
        }
    }
    close HD;
}


b.pl

#!/usr/bin/perl
use strict;

my $ref = sub {
    $_[0] =~ /注册/o ||
    $_[0] =~ /征文/o ||
    $_[0] =~ /圣诞快乐/o ||
    $_[0] =~ /应聘/o ||
    $_[0] =~ /�ø�¨/o ||
    $_[0] =~ /发货/o ||
    $_[0] =~ /北京/o ||
    $_[0] =~ /×Ê��/o ||
    $_[0] =~ /�Å�¢/o ||
    $_[0] =~ /�ãɽ/o ||
    $_[0] =~ /°Ù�ò/o ||
    $_[0] =~ /免费/o
};


my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";

foreach my $log (@logs) {
    open (HD,$log) or die $!;
    while(<HD>){
        print if $ref->($_);
    }
    close HD;
}


I ran the 'time' command to measure the running speed:

time perl a.pl > /dev/null


real    0m0.190s
user    0m0.181s
sys     0m0.008s


time perl b.pl > /dev/null


real    0m0.286s
user    0m0.278s
sys     0m0.007s


Why is a.pl faster than b.pl? I would have expected the opposite. Thanks.



Well, the time differences aren't dramatic. But off hand, I would say 
that a.pl is faster because no subroutine call is involved.
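
One way to check the subroutine-call explanation is to benchmark the same test inline and through a code reference. This is only a sketch: the sample lines, the two placeholder patterns and the sub names count_inline and count_via_sub are assumptions for illustration, not taken from the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data with a single matching line.
my @lines = map { "log entry number $_" } 1 .. 1000;
push @lines, 'has patb inside';

# The same test wrapped in an anonymous sub, in the style of b.pl.
my $ref = sub { $_[0] =~ /pata/ || $_[0] =~ /patb/ };

# Inline matching against $_, in the style of a.pl.
sub count_inline {
    my $hits = 0;
    for (@lines) { $hits++ if /pata/ || /patb/ }
    return $hits;
}

# One call through the code reference per line.
sub count_via_sub {
    my $hits = 0;
    for (@lines) { $hits++ if $ref->($_) }
    return $hits;
}

# Run each variant for at least 1 CPU second and compare.
cmpthese(-1, {
    inline  => \&count_inline,
    via_sub => \&count_via_sub,
});
```

Because the regexes are identical in both variants, any difference in the table comes from the extra call per line.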


A couple of other observations:

1. /o is useless on these regexes, since they don't interpolate any 
variables.


2. $_ is the default target for the m// operator, so

   $_ =~ /regex/

can be replaced with simply

   /regex/

3. It will probably be faster to use a single regex of the format:

   /pata|patb|patc|patd/

If the alternation can stay inside the regex engine rather than happening 
out at the Perl opcode level, it might be faster.
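
Observations 2 and 3 can be illustrated with a short sketch. The word list and sample lines are placeholders. Building the alternation with quotemeta and qr// is one common way to do it and is shown here as an assumption about how the single regex might be constructed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder patterns, as in the pata|patb|patc|patd example.
my @words = qw(pata patb patc patd);

# Observation 3: build one alternation from the list. quotemeta escapes
# any regex metacharacters; qr// compiles the combined pattern.
my $alt = join '|', map { quotemeta($_) } @words;
my $re  = qr/$alt/;

# Hypothetical sample lines.
my @lines = ('xx patc yy', 'nothing here', 'patapatb');

for (@lines) {
    # Observation 2: m// matches against $_ when no target is given.
    my $explicit = ($_ =~ /patc/) ? 1 : 0;
    my $implicit = /patc/ ? 1 : 0;
    die "mismatch" unless $explicit == $implicit;

    print "match: $_\n" if /$re/;
}
```

Whether the combined pattern is actually faster than the chain of separate matches is a separate question that only a measurement on real data can answer.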






Re: why a.pl is faster than b.pl

2005-12-28 Thread Jeff Pang
Hi, Bob,

You said:

3. It will probably be faster to use a single regex of the format:

/pata|patb|patc|patd/


In fact, you may be wrong about this. Based on my test case, the regex written 
as follows:

/pata/ || /patb/ || /patc/ || /patd/

is much faster than yours.


-Original Message-
From: Bob Showalter [EMAIL PROTECTED]
Sent: Dec 29, 2005 2:54 AM
To: Jeff Pang [EMAIL PROTECTED]
Cc: beginners@perl.org
Subject: Re: why a.pl is faster than b.pl

Well, the time differences aren't dramatic. But off hand, I would say 
that a.pl is faster because no subroutine call is involved.

A couple of other observations:

1. /o is useless on these regexes, since they don't interpolate any 
variables.

2. $_ is the default target for the m// operator, so

$_ =~ /regex/

can be replaced with simply

/regex/

3. It will probably be faster to use a single regex of the format:

/pata|patb|patc|patd/

If the alternation can stay inside the regex code rather than happening 
out at the Perl opcode level, it might be faster.

