Re: [OT] anchoring regexp

2000-04-10 Thread J. Horner

On Mon, 10 Apr 2000, Jason Simms wrote:

> I have a question first, then some insight as to why you may be having the
> problem...  First, is this Knoxville, TN?  I lived there for 17 years of my
> life, and only recently (1.5 years ago) moved up to New York City.  I left
> due to lack of businesses in Knoxville using Linux / UNIX and Perl, along
> with other more advanced Web technologies.  I can say the market for
> hardcore UNIX / perl (what I do) is much stronger up here than down there,
> but I am always interested in the possibility of moving back down there with
> my skills.
>
> As to your problem, I doubt people can be of much assistance (though we'll
> see) without seeing the regex and sample data.  Perhaps if you resend that?
>
> In any case, good luck.  And perhaps, stay in touch, or put me on a mailing
> list, or something.  Thanks!

Well, there isn't really anything in Knoxville, but Oak Ridge is pretty
busy.

You have a good point.  Sorry about the lack of code.

Here is my analyzer_benchmark script that will show you the code.

J. J. Horner
Linux, Apache, Perl, Unix, Stronghold
[EMAIL PROTECTED] http://www.knoxlug.org
System has been up: 9 days.



#!/usr/bin/perl -w
#
# This script is used to benchmark various algorithms for checking the log file. 
# It uses the Benchmark module and some sloppy coding.

use Benchmark;

my @internals = 
("192.168","ornl.gov","134.167","199.201","128.219","198.124","198.207","160.91","198.136","198.148","fueleconomy.gov","172.17","172.20");


sub first {
    my $i;
    my @fields = ("134.167","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals; $i++) {
        if ($internals[$i] =~ /$source/) {
        }
    }
}

sub anchored {
    my $i;
    my @fields = ("134.167","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $request = join(" ",$method,$uri);
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals; $i++) {
        if ($internals[$i] =~ /$source/) {
        }
    }
}

timethese(5000, { first => 'first()', anchored => 'anchored()' });



Re: [OT] anchoring regexp

2000-04-10 Thread Devin Ben-Hur

"J. Horner" wrote:
> On Mon, 10 Apr 2000, Jason Simms wrote:
> > As to your problem, I doubt people can be of much assistance (though we'll
> > see) without seeing the regex and sample data.  Perhaps if you resend that?
> >
> > In any case, good luck.  And perhaps, stay in touch, or put me on a mailing
> > list, or something.  Thanks!
>
> You have a good point.  Sorry about the lack of code.
>
> Here is my analyzer_benchmark script that will show you the code.

$ diff first anchored
1c1
< sub first {
---
> sub anchored {
9a10
> my $request = join(" ",$method,$uri);

The only differences between your two benchmark subroutines are their
names and the fact that the anchored one also builds your $request
variable.  Of course anchored will take a little longer: it has one
extra statement.

-- 
Devin Ben-Hur | President / CTO  | mailto:[EMAIL PROTECTED]
The eMarket Group | eMerchandise.com | http://www.eMerchandise.com
503/944-5044 x228 | 
"Where do you want to go today?"
   "Confutatis maledictis, flammis acribus addictis"
   (The damned and accursed are convicted to the flames of hell)



Re: [OT] anchoring regexp

2000-04-10 Thread J. Horner

On Mon, 10 Apr 2000, Devin Ben-Hur wrote:

> $ diff first anchored
> 1c1
> < sub first {
> ---
> > sub anchored {
> 9a10
> > my $request = join(" ",$method,$uri);
>
> The only differences between your two benchmark subroutines are their
> names and the fact that the anchored one also builds your $request
> variable.  Of course anchored will take a little longer: it has one
> extra statement.

Sorry, it is a Monday.  I attached the right file.



#!/usr/bin/perl -w
#
# This script is used to benchmark various algorithms for checking the log file. 
# It uses the Benchmark module and some sloppy coding.

use Benchmark;

my @internals = 
("192.168","ornl.gov","134.167","199.201","128.219","198.124","198.207","160.91","198.136","198.148","fueleconomy.gov","172.17","172.20");


sub first {
    my $i;
    my @fields = ("134.167.1.1","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals; $i++) {
        # Unanchored: the prefix may match anywhere in $source.
        if ($source =~ /$internals[$i]/) {
        }
    }
}

sub anchored {
    my $i;
    my @fields = ("134.167.1.1","","","","","GET","/","HTTP/1.0","404","");
    my $source = $fields[0];
    $fields[5] =~ s/\"//;
    $fields[7] =~ s/\"//;
    my $method = $fields[5];
    my $uri = $fields[6];
    my $protocol = $fields[7];
    my $status = $fields[$#fields-1];
    for ($i = 0; $i <= $#internals; $i++) {
        # Anchored: the prefix must match at the start of $source.
        if ($source =~ /^$internals[$i]/) {
        }
    }
}

timethese(5000, { first => 'first()', anchored => 'anchored()' });
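
Apart from timing, the anchor also changes what matches. The entries
in @internals are interpolated into the pattern as-is, so each "." matches
any character, and without "^" a prefix such as 134.167 can match in the
middle of an address. Below is a minimal sketch of a stricter check. It
assumes the entries are literal IP prefixes (the domain-name entries such
as ornl.gov would need a different test), and is_internal_prefix is a
hypothetical helper name, not something from the script above:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $source begins with one of the literal
# prefixes in @$prefixes, followed by a dot.
sub is_internal_prefix {
    my ($source, $prefixes) = @_;
    for my $prefix (@$prefixes) {
        # \Q...\E quotes regex metacharacters, so "." only matches a dot.
        # The trailing \. stops 134.167 from matching 134.1670.x.x.
        return 1 if $source =~ /^\Q$prefix\E\./;
    }
    return 0;
}

my @prefixes = ("192.168", "134.167");
for my $addr ("134.167.1.1", "10.134.167.1", "134x167.1.1") {
    printf "%-14s %s\n", $addr,
        is_internal_prefix($addr, \@prefixes) ? "internal" : "external";
}
```

Only the first address is reported as internal. The second contains the
prefix but not at the start, and the third only matched the original
pattern because of the unescaped dot.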



Re: [OT] anchoring regexp

2000-04-10 Thread Devin Ben-Hur

"J. Horner" wrote:
> Sorry, it is a Monday.  I attached the right file.

Your problem is that you have too much other junk in addition to the
statements you're trying to compare.  Also, the strings you're matching
against are all so short that you won't see much difference between an
anchored and unanchored regex.

If you want to see the timing advantage demonstrated, try something like
this instead:

use Benchmark;

my $iter = 1;
my $listsize = 100;
my @internals = ();

sub make_rand_str (;$) {
    my $maxstr = shift || 200;
    my $str = '';
    # Build a string of 1..$maxstr random printable ASCII characters.
    for (1 .. int(rand($maxstr)+1)) {
        $str .= chr( ord(' ') + int(rand(127-ord(' '))) );
    }
    return $str;
}

for (1 .. $listsize) { push @internals, make_rand_str(); }

# Same pattern, without and with the ^ anchor.
sub unanchored { grep { /134\.167/ } @internals; }
sub anchored   { grep { /^134\.167/ } @internals; }

timethese($iter, { unanchored => 'unanchored()', anchored => 'anchored()' });
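
The same comparison can be sketched with code references and cmpthese
from the same Benchmark module, so that nothing but the match itself is
timed. The test string here is an assumption chosen for illustration: one
long string that never contains the pattern. How large the gap turns out
to be depends on the Perl version and its regex optimizations, so treat
the output as indicative only:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: a long string with no match anywhere in it.
my $haystack = 'x' x 10_000;

my %tests = (
    # May have to look at every position in the string before failing.
    unanchored => sub { $haystack =~ /134\.167/ },
    # Can fail as soon as the start of the string does not match.
    anchored   => sub { $haystack =~ /^134\.167/ },
);

# A negative count tells Benchmark to run each sub for at least that
# many CPU seconds, rather than a fixed number of iterations.
cmpthese(-1, \%tests);
```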
