Re: why a.pl is faster than b.pl
Jeff Pang wrote:
> Hi, Bob,
>
> You said:
>
>   3. It will probably be faster to use a single regex of the format:
>      /pata|patb|patc|patd/
>
> In fact maybe you are wrong on this.

Darn. First time this year :-)

> Based on my test case, the RE written as below:
>
>   /pata/ || /patb/ || /patc/ || /patd/
>
> is much faster than yours.

OK. Perhaps it's due to backtracking. Go with what works!

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response
Re: why a.pl is faster than b.pl
On Thu, 29 Dec 2005, Bob Showalter wrote:

> Jeff Pang wrote:
> > Hi, Bob,
> >
> > You said:
> >
> >   3. It will probably be faster to use a single regex of the format:
> >      /pata|patb|patc|patd/
> >
> > In fact maybe you are wrong on this.
>
> Darn. First time this year :-)
>
> > Based on my test case, the RE written as below:
> >
> >   /pata/ || /patb/ || /patc/ || /patd/
> >
> > is much faster than yours.
>
> OK. Perhaps it's due to backtracking. Go with what works!

Several Perl books, including _Mastering Regular Expressions_ and, if I remember correctly, _Learning Perl_, use variants of this example.

In essence, yes: if you want to match one of several constant strings like this, the match will happen faster with a series of static regexes than it would with one compound regex with alternation.

-- 
Chris Devers
DO NOT LEAVE IT IS NOT REAL
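The claim about a series of static regexes versus one alternation is easy to check on your own machine. Below is a minimal sketch using the core Benchmark module. The sample lines and the sub names count_alternation and count_series are made up for illustration, and the outcome is an assumption to verify rather than a given: it depends on the data and on the perl version (perl 5.10 and later optimize alternations of literal strings, so the gap may shrink or reverse).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: mostly lines that match nothing,
# plus a few that contain one of the constant strings.
my @lines = (
    ("GET /index.html 200 plain log line with no keyword") x 500,
    ("mail about patc offers") x 50,
);

# One compound regex with alternation.
sub count_alternation {
    my $n = 0;
    for (@lines) {
        $n++ if /pata|patb|patc|patd/;
    }
    return $n;
}

# A series of constant regexes joined by ||.
sub count_series {
    my $n = 0;
    for (@lines) {
        $n++ if /pata/ || /patb/ || /patc/ || /patd/;
    }
    return $n;
}

# Both forms must agree before their timings are worth comparing.
die "results differ\n" unless count_alternation() == count_series();

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    alternation => \&count_alternation,
    series      => \&count_series,
});
```

The equality check before cmpthese matters: a timing comparison only means something if both variants select the same lines.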
Re: why a.pl is faster than b.pl
On Wed, 28 Dec 2005, Jeff Pang wrote:

> Why is a.pl faster than b.pl? I would have expected the opposite. Thanks.

The easiest way to answer such questions is to benchmark and profile where the time in each script is being spent. These two scripts are so different in composition that it isn't immediately obvious to me how they're similar or dissimilar.

You have two approaches you can try for answering such questions:

 * have two nearly identical scripts, and measure how the small part that differs impacts performance.

 * break each script into components and measure how long each component takes to complete its task.

These approaches can be intermixed as needed, but it's up to you to do the fundamental measuring of your code for yourself.

Distill the question down to something clearer -- why is statement (or subroutine) A faster than statement B while having the same result -- and you may find more concrete advice from the list members.

-- 
Chris Devers
DO NOT LEAVE IT IS NOT REAL
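The first approach (two nearly identical pieces of code that differ in one small part) could be sketched roughly as follows with the core Benchmark module. Everything here is illustrative: the in-memory @lines array stands in for the real log files, the patterns are placeholders, and the sub names inline_test and coderef_test are made up. The only difference between the two variants is whether the match is written inline or called through a code reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for the log lines: every tenth line matches.
my @lines = map { "line $_ " . ($_ % 10 ? "nothing here" : "patb inside") } 1 .. 1000;

# The matching test wrapped in a code reference.
my $ref = sub { $_[0] =~ /pata/ || $_[0] =~ /patb/ };

# Variant A: the test is written inline in the loop.
sub inline_test {
    my $hits = 0;
    for (@lines) {
        $hits++ if $_ =~ /pata/ || $_ =~ /patb/;
    }
    return $hits;
}

# Variant B: identical loop, but every line costs one subroutine call.
sub coderef_test {
    my $hits = 0;
    for (@lines) {
        $hits++ if $ref->($_);
    }
    return $hits;
}

# Same result first, then compare the timings.
die "variants disagree\n" unless inline_test() == coderef_test();

cmpthese(-1, {
    inline  => \&inline_test,
    coderef => \&coderef_test,
});
```

Keeping the data in memory takes file I/O out of the measurement, so whatever difference the table shows comes from the one part that was changed.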
Re: why a.pl is faster than b.pl
Jeff Pang wrote:
> Hi, list,
>
> I have two Perl scripts, as follows:
>
> a.pl:
>
>   #!/usr/bin/perl
>   use strict;
>   my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";
>   foreach my $log (@logs) {
>       open (HD,$log) or die $!;
>       while(<HD>){
>           if ( ($_ =~ /注册/o) ||
>                ($_ =~ /征文/o) ||
>                ($_ =~ /圣诞快乐/) ||
>                ($_ =~ /应聘/o) ||
>                ($_ =~ /�ø�¨/o) ||
>                ($_ =~ /发货/o) ||
>                ($_ =~ /北京/o) ||
>                ($_ =~ /×Ê��/o) ||
>                ($_ =~ /�Å�¢/o) ||
>                ($_ =~ /�ãɽ/o) ||
>                ($_ =~ /°Ù�ò/o) ||
>                ($_ =~ /免费/o) )
>           {
>               print $_;
>           }
>       }
>       close HD;
>   }
>
> b.pl:
>
>   #!/usr/bin/perl
>   use strict;
>   my $ref = sub {
>       $_[0] =~ /注册/o ||
>       $_[0] =~ /征文/o ||
>       $_[0] =~ /圣诞快乐/o ||
>       $_[0] =~ /应聘/o ||
>       $_[0] =~ /�ø�¨/o ||
>       $_[0] =~ /发货/o ||
>       $_[0] =~ /北京/o ||
>       $_[0] =~ /×Ê��/o ||
>       $_[0] =~ /�Å�¢/o ||
>       $_[0] =~ /�ãɽ/o ||
>       $_[0] =~ /°Ù�ò/o ||
>       $_[0] =~ /免费/o
>   };
>   my @logs = glob "~/logs/rcptstat/da2005_12_28/da.127.0.0.1.*";
>   foreach my $log (@logs) {
>       open (HD,$log) or die $!;
>       while(<HD>){
>           print if $ref->($_);
>       }
>       close HD;
>   }
>
> I ran the 'time' command to measure the running speed:
>
>   time perl a.pl > /dev/null
>
>   real    0m0.190s
>   user    0m0.181s
>   sys     0m0.008s
>
>   time perl b.pl > /dev/null
>
>   real    0m0.286s
>   user    0m0.278s
>   sys     0m0.007s
>
> Why is a.pl faster than b.pl? I would have expected the opposite. Thanks.

Well, the time differences aren't dramatic. But offhand, I would say that a.pl is faster because no subroutine call is involved.

A couple of other observations:

1. /o is useless on these regexes, since they don't interpolate any variables.

2. $_ is the default target for the m// operator, so $_ =~ /regex/ can be replaced with simply /regex/

3. It will probably be faster to use a single regex of the format:

     /pata|patb|patc|patd/

   If the alternation can stay inside the regex code rather than happening out at the Perl opcode level, it might be faster.
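The three observations can be illustrated side by side. This is a small sketch with made-up sample lines and placeholder patterns (pata, patd), not the scripts from the post. It only shows that the three spellings select the same lines and says nothing about which is faster.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input.
my @lines = ("has pata", "has patd", "has nothing");

# As posted: explicit $_ and /o on constant patterns.
my @verbose = grep { $_ =~ /pata/o || $_ =~ /patd/o } @lines;

# Observations 1 and 2 applied: /o dropped, $_ left implicit.
my @short = grep { /pata/ || /patd/ } @lines;

# Observation 3: a single regex with alternation.
my @single = grep { /pata|patd/ } @lines;

print "verbose: @verbose\n";
print "short:   @short\n";
print "single:  @single\n";
```

Since /o only affects patterns that interpolate variables, dropping it from constant patterns changes nothing about what is matched.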
Re: why a.pl is faster than b.pl
Hi, Bob,

You said:

> 3. It will probably be faster to use a single regex of the format:
>    /pata|patb|patc|patd/

In fact, maybe you are wrong on this. Based on my test case, the RE written as below:

    /pata/ || /patb/ || /patc/ || /patd/

is much faster than yours.

-----Original Message-----
From: Bob Showalter [EMAIL PROTECTED]
Sent: Dec 29, 2005 2:54 AM
To: Jeff Pang [EMAIL PROTECTED]
Cc: beginners@perl.org
Subject: Re: why a.pl is faster than b.pl