Mike M wrote:
> 
> Hi,

Hello,

> I'm new to Perl and have what I hope is a simple question:  I have a Perl
> script that parses a log file from our proxy server and reformats it to a
> more easily readable space-delimited text file.  I also have another file
> that has a categorized list of internet domains, also space-delimited.  A
> snippet of both text files is below:
> 
> Proxy Log
> ----snip----
> 10/23/2003 4:18:32 192.168.0.100 http://www.squid-cache.org OK
> 10/23/2003 4:18:33 192.168.1.150 http://msn.com OK
> 10/23/2003 4:18:33 192.168.1.150 http://www.playboy.com DENIED
> ----snip----
> 
> Categorized Domains List
> ----snip----
> msn.com news
> playboy.com porn
> squid-cache.com software
> ----snip----
> 
> What I would like to do is write a script that compares the URL in the proxy
> log with the categorized domains list file and creates a new file that looks
> something like this:
> 
> New File
> ----snip----
> 10/23/2003 4:18:32 192.168.0.100 http://www.squid-cache.org software OK
> 10/23/2003 4:18:33 192.168.1.150 http://msn.com news OK
> 10/23/2003 4:18:33 192.168.1.150 http://www.playboy.com porn DENIED
> ----snip----
> 
> Is this possible with Perl??  I've been trying to do this by importing the
> log files into SQL and then running queries, but it's so much slower than
> Perl (the proxy logs are roughly 1 million lines).  Any ideas?

You could do something like this:

#!/usr/bin/perl -w
use strict;

my $file = 'domains.txt';
my $log  = 'access.log';
my $out  = 'access.out';

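# Load the category list into a hash keyed by lower-cased domain name.
my %domains = do {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my %map;
    while ( <$fh> ) {
        my ( $domain, $category ) = split or next;   # skip blank lines
        $map{ lc $domain } = $category;
        }
    %map;
    };

# One case-insensitive alternation of all known domains.
my $search = qr/@{[ join '|', map "\Q$_", keys %domains ]}/i;

open my $out_fh, '>>', $out or die "Cannot open $out: $!";
open my $log_fh, '<',  $log or die "Cannot open $log: $!";

while ( <$log_fh> ) {
    # Insert the category after a known domain that ends the URL field.
    # The match is case-insensitive, so look the domain up in lower case.
    s/\b($search)(?=\s+(?:OK|DENIED)$)/ "$1 " . $domains{ lc $1 } /e;
    print $out_fh $_;
    }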

__END__
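
To see what the substitution does to the sample lines from the question, here is a small self-contained sketch. The inline %domains data is copied from the question, and categorize() is only an illustrative helper name, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inline copy of the category list from the question (illustration only).
my %domains = (
    'msn.com'         => 'news',
    'playboy.com'     => 'porn',
    'squid-cache.com' => 'software',
);

# Case-insensitive alternation of all known domains, metacharacters escaped.
my $alt    = join '|', map quotemeta, keys %domains;
my $search = qr/$alt/i;

# categorize() is a hypothetical helper: it returns the line with the
# category inserted after the domain, or the line unchanged if no known
# domain ends the URL field.
sub categorize {
    my ($line) = @_;
    $line =~ s/\b($search)(?=\s+(?:OK|DENIED)$)/ "$1 " . $domains{ lc $1 } /e;
    return $line;
}

print categorize("10/23/2003 4:18:33 192.168.1.150 http://msn.com OK\n");
print categorize("10/23/2003 4:18:33 192.168.1.150 http://www.playboy.com DENIED\n");
print categorize("10/23/2003 4:18:32 192.168.0.100 http://www.squid-cache.org OK\n");
```

The first two lines come back with "news" and "porn" inserted before the status. The third comes back unchanged, because the sample list has squid-cache.com while the log has squid-cache.org, so the list would need a squid-cache.org entry for that line to be categorized.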


John
-- 
use Perl;
program
fulfillment
