SuccessInMind wrote:
> Hi all,
>
> I have a little sorting dilemma.
>
> I have combined two server access logs and I want to sort the records
> by the date/time field. I can perform simple sorts, but this one is beyond my
> expertise. The problem I am running into is that not only does each record
> include different field separators, but the date is also alphanumeric, which
> compounds the problem. I am sure that there is a way to solve this issue.
>
> Here is a sample of three records in the file. This is a rather large file,
> so an efficient, streamlined routine would be needed.
>
> 65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140
> "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128
> Netscape6/6.2.1"
>
> 65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css
> HTTP/1.1" 200 3147 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US;
> rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"
>
> 65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET
> /template/images/clear.gif HTTP/1.1" 200 43 "http://www.yahoo.com/"
> "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128
> Netscape6/6.2.1"
>
> Does anyone have a snippet of code that would accomplish this?

If you have enough memory to do it all in Perl, something like the code
below should work. If not, you may need to either create a file that maps
each line number to its epoch time, or convert each timestamp to epoch,
write a new file, and sort on that field with a disk sort program.

use strict;
use warnings;
use Time::Local;

# month abbreviation => 0-based month number, as timegm expects
my %mons = ('jan' => 0, 'feb' => 1, 'mar' => 2, 'apr' => 3, 'may' => 4,
  'jun' => 5, 'jul' => 6, 'aug' => 7, 'sep' => 8, 'oct' => 9, 'nov' => 10,
  'dec' => 11);

my @lines = (
  '65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
  '65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css HTTP/1.1" 200 3147 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
  '65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET /template/images/clear.gif HTTP/1.1" 200 43 "http://www.yahoo.com/" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
);

# show the records before and after sorting
print $_, "\n" foreach @lines;
print "\n";

# note: log_to_epoch runs on every comparison, which is fine for a
# sample but slow on a large file
my @sorted = sort { log_to_epoch ($a) <=> log_to_epoch ($b) } @lines;
print $_, "\n" foreach @sorted;
exit;

#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sub log_to_epoch {    # $epoch = log_to_epoch ($log_line)

  my @l = split /\s+/, $_[0];
  my $date = $l[3];    # e.g. [25/Mar/2002:12:41:31
  my $tz = $l[4];      # e.g. -0500]

  $date =~ s/^\[//;
  $tz =~ s/\]$//;

  my @t = split ':', $date;     # (dd/Mon/yyyy, hh, mm, ss)
  my @d = split /\//, $t[0];    # (dd, Mon, yyyy)
  $d[1] = $mons{lc $d[1]};

  # seconds to add to the logged local time to get GMT
  my $off = 0;
  if ($tz =~ /^([+-])(\d{2})(\d{2})$/) {
    $off = $2 * 3600 + $3 * 60;
    $off = 0 - $off if $1 eq '+';
  } else {
    warn "TZ error ($tz); date=$date; using GMT\n";
  }

  my $epoch = timegm ($t[3], $t[2], $t[1], $d[0], $d[1], $d[2] - 1900);
  return $epoch + $off;
}

__END__

-- 
  ,-/-  __      _  _         $Bill Luebkert   ICQ=162126130
 (_/   /  )    // //       DBE Collectibles   Mailto:[EMAIL PROTECTED]
  / ) /--<  o // //      http://dbecoll.tripod.com/ (Free site for Perl)
-/-' /___/_<_</_</_     Castle of Medieval Myth & Magic
http://www.todbe.com/
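
Since the original question asks for something efficient on a large file, note that the sort above calls log_to_epoch twice per comparison. A common Perl idiom for avoiding that is to compute each key once, sort on the cached key, and then strip it off again (often called a Schwartzian transform). The sketch below is only an illustration under some assumptions: epoch_of and sort_by_time are hypothetical names that do not appear in the post, the sample lines are shortened versions of the ones quoted above, and lines without a parsable timestamp are arbitrarily given key 0 so that they sort first.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation => 0-based month number, as timegm expects.
my %MONTH = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4, jun => 5,
             jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Hypothetical helper: find the bracketed timestamp in a log line and
# return it as UTC epoch seconds. Returns an explicit undef (not an empty
# list) when nothing parsable is found, so it is safe inside a list.
sub epoch_of {
    my ($line) = @_;
    my ($day, $mon, $year, $h, $m, $s, $sign, $oh, $om) = $line =~
        m{\[(\d{1,2})/([A-Za-z]{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\]}
        or return undef;
    my $mi = $MONTH{lc $mon};
    return undef unless defined $mi;

    # Logged time = UTC + offset, so UTC = logged time - offset.
    my $offset = $oh * 3600 + $om * 60;
    $offset = -$offset if $sign eq '-';
    return timegm($s, $m, $h, $day, $mi, $year) - $offset;
}

# Hypothetical helper: decorate each line with its key, sort on the key,
# then undecorate. Each line is parsed exactly once.
sub sort_by_time {
    my @lines = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { my $e = epoch_of($_); [ defined $e ? $e : 0, $_ ] }
           @lines;
}

# Shortened sample records (user-agent fields dropped for brevity).
my @sample = (
    '65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140',
    '65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css HTTP/1.1" 200 3147',
    '65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET /template/images/clear.gif HTTP/1.1" 200 43',
);

print "$_\n" for sort_by_time(@sample);
```

With the three sample records this prints the 20/Mar line first, then 25/Mar, then 26/Mar. The whole file still has to fit in memory; if it does not, the same epoch_of key could instead be written out as a prefix on each line and the result handed to a disk sort program, as suggested above.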