Phil Miller wrote on Friday, 27 October 2006 at 15:36:
> I am working on my very first program and have run into a bit of a
> roadblock. I am trying to print a report of users who show up in an IIS
> Log file. The good news is that the format of the userid is
> WINDOWSDOMAIN\USERID. The bad news is that it is not always at the same
> place in the IIS Log file due to some variable length fields that come
> before it. Its location can vary left or right by about 10 bytes.
>
> I read the IIS Log file in one line at a time. I have gotten far enough
> that I can identify the lines with WINDOWSDOMAIN on it, but am stuck
> there. The code $userid = substr($logfile_in, 33, 12); gets me close
> but depending on the length of the date, the time or the IP address, it
> is usually off by a few bytes. A sample of the input is below to
> explain what I am talking about.
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/main.css
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/contents.aspx
>
> 2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
> GET /itd/styles/footer.aspx
>
> Essentially what I need to do is find the WINDOWSDOMAIN on a line, and
> write to a file the matched string plus \USERID data (up to the next
> space). Does anyone have any suggestions? I'm thinking there must be
> some very easy way to do it since Perl is made for this sort of thing.
> I remember reading about some Perl built-in capability that would take a
> scalar variable and parse it into an array based on a delimiter, but I
> can't remember what it is. That would probably do it for me. But if
> you know of a better way, I'm all ears.

Here is some demonstration code showing how you can do it with a regex
or with split. The code assumes that the GET line and the line above it
are on one line in the log.

The two demonstration subs return 1 on a match and 0 otherwise, so the
counter can be updated from the subs' return value. The $miss_counter is
calculated only once, from the hits and the number of lines read.

The data after __DATA__ may be wrapped by your mail client (it should be
4 lines).

I'm not sure if "WINDOWSDOMAIN" is meant as a hardcoded constant.

#!/usr/bin/perl

use strict;
use warnings;

# see perldoc perlre
#
sub do_regex {
    $_=shift;
    if (m; \w+ \\ (\w+) .* \s/itd/ ;ix) { # NOT OPTIMAL!
        print "userid (regex): $1\n";
        return 1;
    }
    return 0;
}

# see perldoc -f split
#
sub do_split {
    $_=shift;
    my @parts=split;    # split $_ on whitespace
    # the defined() check avoids a warning on lines with fewer fields
    if (defined $parts[7] && $parts[7]=~m;/itd/;i) {
        if ( my ($domain, $userid)=split m;\\;, $parts[3] ) {
            print "userid (split): $userid\n";
            return 1;
        }
    }
    return 0;
}

my $hit_counter=0;

while (<DATA>) {
    $hit_counter+=do_regex($_);
    do_split($_);
}

my $miss_counter=$.- $hit_counter;

print "hits: $hit_counter / missed: $miss_counter / read: $. lines\n";

__DATA__
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css
blubb blubb foo bar dummy asdf 44 44 55 66
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/contents.aspx
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/footer.aspx

==============

Some random annotations to your code (there are others as well),
UNTESTED:

> Below is the code I am using.

# with the following statements your life will be easier!
#
use strict;
use warnings;

> open USERIDOUT, ">userid.out.txt";

# perldoc -f open
# perldoc perlvar
#
open my $outf, '>', 'userid.out.txt' or die $!;

> open IISLOG, "<ex061023.log";

open my $log, '<', 'ex061023.log' or die $!;

> $ctr = 0;
> $hit_counter = 0;
> $miss_counter = 0;
> $logfile_in;
> $userid;

Put "my" in front of all these declarations/definitions.

> while (<IISLOG>)

while (<$log>)

> {
>     $logfile_in = $_;
>     if ( ($logfile_in =~ m/WINDOWSDOMAIN/i && $logfile_in =~
> m/itd/i)

I think you can omit one () pair here.

>     )
>     {
>         print "\n** Found success\n";
>         $hit_counter += 1;

# same as
# $hit_counter++;

>         $userid = substr($logfile_in, 33, 12);
>         # This is not correct but is somewhat close
>         print "\n", $userid;
>     }
>     else
>     {
>         print "Did not find success\n";
>         $miss_counter += 1;
>     }
> }
> print "\n Hit Counter = ", $hit_counter;
> print "\n Miss Counter = ", $miss_counter;
> print "\n Total Records Counter = ", $hit_counter + $miss_counter;
>
> close USERIDOUT;

close $outf or die $!;

> close IISLOG;

close $log or die $!;

Dani
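
For reference, the annotations above can be pulled together into one small script. This is only a minimal sketch, not code from the thread: the helper name extract_user is hypothetical, "WINDOWSDOMAIN" is assumed to be a literal constant, and the sample lines are hardcoded in an array rather than read from ex061023.log. It prints the whole DOMAIN\USERID token (up to the next whitespace) for lines that also mention /itd/.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): returns the
# "WINDOWSDOMAIN\USERID" token from one log line, or undef if the
# line has none. Assumes the domain is the literal WINDOWSDOMAIN
# and that the token ends at the next whitespace character.
sub extract_user {
    my ($line) = @_;
    return $line =~ m{\b(WINDOWSDOMAIN\\\S+)}i ? $1 : undef;
}

# Sample lines taken from the original post, plus one line that
# should not match. Single quotes keep the backslash literal.
my @sample = (
    '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css',
    'blubb blubb foo bar dummy asdf 44 44 55 66',
    '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/footer.aspx',
);

my ($hits, $misses) = (0, 0);

for my $line (@sample) {
    my $user = extract_user($line);
    if (defined $user && $line =~ m{/itd/}i) {
        print "$user\n";    # print to a lexical filehandle here to write a report file
        $hits++;
    }
    else {
        $misses++;
    }
}

print "hits: $hits / missed: $misses / read: ", scalar(@sample), " lines\n";
```

Because the regex anchors on the domain name instead of a byte offset, it does not matter how long the date, time, or IP address fields are. To process the real log, the loop over @sample would be replaced by while (my $line = <$log>) using the lexical filehandle opened above.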