Re: [OT] anchoring regexp
On Mon, 10 Apr 2000, Jason Simms wrote:

> I have a question first, then some insight as to why you may be having
> the problem...
>
> First, is this Knoxville, TN? I lived there for 17 years of my life, and
> only recently (1.5 years ago) moved up to New York City. I left due to
> lack of businesses in Knoxville using Linux / UNIX and Perl, along with
> other more advanced Web technologies. I can say the market for hardcore
> UNIX / perl (what I do) is much stronger up here than down there, but I
> am always interested in the possibility of moving back down there with
> my skills.
>
> As to your problem, I doubt people can be of much assistance (though
> we'll see) without seeing the regex and sample data. Perhaps if you
> resend that??
>
> In any case, good luck. And perhaps, stay in touch, or put me on a
> mailing list, or something. Thanks!

Well, there isn't really anything in Knoxville, but Oak Ridge is pretty busy.

You have a good point. Sorry about the lack of code. Here is my analyzer_benchmark script that will show you the code.

J.

J. Horner
Linux, Apache, Perl, Unix, Stronghold
[EMAIL PROTECTED]  http://www.knoxlug.org
System has been up: 9 days.

#!/usr/bin/perl -w
#
# This script is used to benchmark various algorithms for checking the log file.
# It uses the Benchmark module and some sloppy coding.

use Benchmark;

my @internals = ("192.168","ornl.gov","134.167","199.201","128.219",
                 "198.124","198.207","160.91","198.136","198.148",
                 "fueleconomy.gov","172.17","172.20");

sub first {
    my $i;
    my @fields = ("134.167","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals ; $i++) {
        if ($internals[$i] =~ /$source/) {
        }
    }
}

sub anchored {
    my $i;
    my @fields = ("134.167","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $request = join(" ",$method,$uri);
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals ; $i++) {
        if ($internals[$i] =~ /$source/) {
        }
    }
}

timethese(5000, { first => 'first()', anchored => 'anchored()', });
Re: [OT] anchoring regexp
"J. Horner" wrote:
> On Mon, 10 Apr 2000, Jason Simms wrote:
> > As to your problem, I doubt people can be of much assistance (though
> > we'll see) without seeing the regex and sample data. Perhaps if you
> > resend that??
> >
> > In any case, good luck. And perhaps, stay in touch, or put me on a
> > mailing list, or something. Thanks!
>
> You have a good point. Sorry about the lack of code. Here is my
> analyzer_benchmark script that will show you the code.

$ diff first anchored
1c1
< sub first {
---
> sub anchored {
9a10
> my $request = join(" ",$method,$uri);

The only differences between your two benchmark subroutines are their names and the fact that the anchored one also composes your $request variable. Neither regex is actually anchored. Of course anchored will take a little longer -- it has one extra statement.

--
Devin Ben-Hur | President / CTO | mailto:[EMAIL PROTECTED]
The eMarket Group | eMerchandise.com | http://www.eMerchandise.com
503/944-5044 x228 | "Where do you want to go today?"
"Confutatis maledictis, flammis acribus addictis"
(The damned and accursed are convicted to the flames of hell)
Re: [OT] anchoring regexp
On Mon, 10 Apr 2000, Devin Ben-Hur wrote:

> $ diff first anchored
> 1c1
> < sub first {
> ---
> > sub anchored {
> 9a10
> > my $request = join(" ",$method,$uri);
>
> The only difference between your two benchmark subroutines are their
> names, and that the anchored one also composes your $request variable.
> Of course anchored will take a little longer -- it has one extra
> statement.

Sorry, it is a Monday. I attached the right file.

J.

J. Horner
Linux, Apache, Perl, Unix, Stronghold
[EMAIL PROTECTED]  http://www.knoxlug.org
System has been up: 9 days.

#!/usr/bin/perl -w
#
# This script is used to benchmark various algorithms for checking the log file.
# It uses the Benchmark module and some sloppy coding.

use Benchmark;

my @internals = ("192.168","ornl.gov","134.167","199.201","128.219",
                 "198.124","198.207","160.91","198.136","198.148",
                 "fueleconomy.gov","172.17","172.20");

sub first {
    my $i;
    my @fields = ("134.167.1.1","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals ; $i++) {
        # unanchored: the entry may match anywhere in $source
        if ($source =~ /$internals[$i]/) {
        }
    }
}

sub anchored {
    my $i;
    my @fields = ("134.167.1.1","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals ; $i++) {
        # anchored: the entry must match at the start of $source
        if ($source =~ /^$internals[$i]/) {
        }
    }
}

timethese(5000, { first => 'first()', anchored => 'anchored()', });
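A side note on the loop above, offered as a sketch rather than as part of the posted script: the entries of @internals are interpolated into the pattern as-is, so each "." matches any character, and the pattern is rebuilt for every entry on every call. One possible alternative, assuming the goal is a plain "starts with one of these strings" test, is to escape the entries with quotemeta() and precompile a single anchored alternation with qr//. The helper name is_internal() and the sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# Same list as in the script above.
my @internals = ("192.168","ornl.gov","134.167","199.201","128.219",
                 "198.124","198.207","160.91","198.136","198.148",
                 "fueleconomy.gov","172.17","172.20");

# quotemeta() escapes the dots, so "134.167" no longer matches "134x167".
my $alternatives = join '|', map { quotemeta($_) } @internals;

# One precompiled pattern, anchored at the start of the string.
my $internal_re = qr/^(?:$alternatives)/;

# is_internal() is a hypothetical helper name, not part of the original script.
sub is_internal {
    my ($source) = @_;
    return $source =~ $internal_re ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $source ("134.167.1.1", "134x167.1.1", "10.1.2.3") {
    printf "%-12s %s\n", $source, is_internal($source) ? "internal" : "external";
}
```

This keeps the start-of-string anchoring from the anchored() sub and only changes how the pattern is built. Whether host names such as "ornl.gov" should instead be matched at the end of the string is a separate question that the thread does not settle.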
Re: [OT] anchoring regexp
"J. Horner" wrote:
> Sorry, it is a Monday. I attached the right file.

Your problem is that you have too much other junk in addition to the statements you're trying to compare. Also, the strings you're matching against are all so short that you won't see much difference between an anchored and an unanchored regex.

If you want to see the timing advantage demonstrated, try something like this instead:

use Benchmark;

my $iter = 1;
my $listsize = 100;
my @internals = ();

# Build a random string of 1 to $maxstr printable ASCII characters.
sub make_rand_str (;$) {
    my $maxstr = shift || 200;
    my $str = '';
    for (1 .. int(rand($maxstr)+1)) {
        $str .= chr( ord(' ') + int(rand(127-ord(' '))) );
    }
    return $str;
}

for (1 .. $listsize) {
    push @internals, make_rand_str();
}

sub unanchored {
    grep { /134\.167/ } @internals;
}

sub anchored {
    grep { /^134\.167/ } @internals;
}

timethese($iter, {
    unanchored => 'unanchored()',
    anchored   => 'anchored()',
});

--
Devin Ben-Hur | President / CTO | mailto:[EMAIL PROTECTED]
The eMarket Group | eMerchandise.com | http://www.eMerchandise.com
503/944-5044 x228 | "Where do you want to go today?"
"Confutatis maledictis, flammis acribus addictis"
(The damned and accursed are convicted to the flames of hell)