Phil Miller wrote on Friday, 27 October 2006, 15:36:
> I am working on my very first program and have run into a bit of a
> roadblock. I am trying to print a report of users who show up in an IIS
> Log file. The good news is that the format of the userid is
> WINDOWSDOMAIN\USERID. The bad news is that it is not always at the same
> place in the IIS Log file due to some variable length fields that come
> before it. Its location can vary left or right by about 10 bytes.
>
>
>
> I read the IIS Log file in one line at a time. I have gotten far enough
> that I can identify the lines with WINDOWSDOMAIN on it, but am stuck
> there. The code $userid = substr($logfile_in, 33, 12); gets me close
> but depending on the length of the date, the time or the IP address, it
> is usually off by a few bytes. A sample of the input is below to
> explain what I am talking about.
>
>
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/main.css
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/contents.aspx
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/footer.aspx
>
>
>
> Essentially what I need to do is find the WINDOWSDOMAIN on a line, and
> write to a file the matched string plus \USERID data (up to the next
> space). Does anyone have any suggestions? I'm thinking there must be
> some very easy way to do it since Perl is made for this sort of thing.
> I remember reading about some Perl built-in capability that would take a
> scalar variable and parse it into an array based on a delimiter, but I
> can't remember what it is. That would probably do it for me. But if
> you know of a better way, I'm all ears.
Here is some demonstration code showing how to do it either with a regex or
with split.
The code assumes that each log record, including the GET request, is on a
single line in the log.
Both demonstration subs return 1 on a match and 0 otherwise, so a sub's
return value can be added directly to the hit counter.
$miss_counter is calculated only once, after the loop, from the number of
hits and the number of lines read ($.).
The data after __DATA__ should be exactly 4 lines; rejoin them if your mail
client has wrapped them.
I'm not sure whether "WINDOWSDOMAIN" is meant as a hardcoded constant, so the
code below matches any domain name.
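#!/usr/bin/perl
use strict;
use warnings;

# see perldoc perlre
#
sub do_regex {
    my ($line) = @_;    # copy the argument instead of clobbering the global $_
    # /x allows whitespace inside the pattern for readability
    if ($line =~ m; \w+ \\ (\w+) .* \s/itd/ ;ix) { # NOT OPTIMAL!
        print "userid (regex): $1\n";
        return 1;
    }
    return 0;
}

# see perldoc -f split
#
sub do_split {
    my ($line) = @_;
    my @parts = split ' ', $line;    # split on whitespace
    # field 7 is the request path, field 3 is DOMAIN\USERID
    if (defined $parts[7] && $parts[7] =~ m;/itd/;i) {
        if ( my ($domain, $userid) = split m;\\;, $parts[3] ) {
            print "userid (split): $userid\n";
            return 1;
        }
    }
    return 0;
}

my $hit_counter = 0;
while (<DATA>) {
    $hit_counter += do_regex($_);
    do_split($_);
}
# $. still holds the number of lines read from DATA
my $miss_counter = $. - $hit_counter;
print "hits: $hit_counter / missed: $miss_counter / read: $. lines\n";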
__DATA__
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css
blubb blubb foo bar dummy asdf 44 44 55 66
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/contents.aspx
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/footer.aspx
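If "WINDOWSDOMAIN" does turn out to be a fixed string, something like the
following sketch might be closer to what you asked for: it finds the domain
on a line and takes everything up to the next space. The sub name
extract_user and the $domain variable are my own inventions for illustration,
and I am assuming the domain never changes within one log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: the domain is a fixed, known string; adjust as needed.
my $domain = 'WINDOWSDOMAIN';

# Hypothetical helper: returns "DOMAIN\USERID" from a log line, or undef
# if the line does not contain the domain followed by a backslash.
sub extract_user {
    my ($line) = @_;
    # \Q...\E quotes any regex metacharacters in $domain;
    # \\ matches the literal backslash;
    # \S+ takes everything up to the next whitespace.
    return $line =~ m/\b(\Q$domain\E\\\S+)/i ? $1 : undef;
}

my $sample = '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css';
my $user = extract_user($sample);
print defined $user ? "$user\n" : "no match\n";
```

Because the pattern is anchored on the domain itself, it does not matter how
many bytes the date, time or IP address take up in front of it.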
==============
Some random annotations on your code (more could be made),
UNTESTED:
> Below is the code I am using.
# with the following statements your life will be easier!
#
use strict; use warnings;
> open USERIDOUT, ">userid.out.txt";
# perldoc -f open
# perldoc perlvar
#
open my $outf, '>', 'userid.out.txt' or die "Cannot open userid.out.txt: $!";
> open IISLOG, "<ex061023.log";
open my $log, '<', 'ex061023.log' or die "Cannot open ex061023.log: $!";
> $ctr = 0;
> $hit_counter = 0;
> $miss_counter = 0;
> $logfile_in;
> $userid;
Put "my" in front of all these declarations/definitions; "use strict"
requires it.
> while (<IISLOG>)
while (<$log>)
> {
> $logfile_in = $_;
> if ( ($logfile_in =~ m/WINDOWSDOMAIN/i && $logfile_in =~
> m/itd/i)
I think you can omit one () pair here.
> )
> {
> print "\n** Found success\n";
> $hit_counter += 1;
# same as
#
$hit_counter++;
> $userid = substr($logfile_in, 33, 12);
> # This is not correct but is somewhat close
> print "\n", $userid;
> }
> else
> {
> print "Did not find success\n";
> $miss_counter += 1;
> }
> }
> print "\n Hit Counter = ", $hit_counter;
> print "\n Miss Counter = ", $miss_counter;
> print "\n Total Records Counter = ", $hit_counter + $miss_counter;
>
> close USERIDOUT;
close $outf or die $!;
> close IISLOG;
close $log or die $!;
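Putting these annotations together, the whole loop might look roughly like
the sketch below. Treat it as one possible arrangement, not the definitive
one: the sub name report_users is made up, and so that the example runs on
its own it reads from and writes to in-memory strings. In your real script
you would open ex061023.log and userid.out.txt as shown above and pass those
handles in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads log lines from $in, writes each matching
# DOMAIN\USERID to $out, and returns the list (hits, misses).
sub report_users {
    my ($in, $out) = @_;
    my ($hits, $misses) = (0, 0);
    while (my $line = <$in>) {
        # A hit needs a DOMAIN\USERID field and "/itd/" later on the line.
        if ($line =~ m{ (\w+ \\ \w+) .* /itd/ }ix) {
            print {$out} "$1\n";
            $hits++;
        }
        else {
            $misses++;
        }
    }
    return ($hits, $misses);
}

# Sample data standing in for the real log file (ASSUMPTION: one record
# per line, as in the demonstration code).
my $sample = <<'END';
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css
blubb blubb foo bar dummy asdf 44 44 55 66
END

my $report = '';
open my $log,  '<', \$sample or die "Cannot read sample data: $!";
open my $outf, '>', \$report or die "Cannot write report: $!";
my ($hits, $misses) = report_users($log, $outf);
close $log  or die $!;
close $outf or die $!;

print $report;
print "hits: $hits / missed: $misses / read: ", $hits + $misses, " lines\n";
```

Passing the filehandles into the sub keeps the matching logic separate from
the file names, which makes it easy to try out on a few sample lines first.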
Dani