SuccessInMind wrote:
> Hi all,
> 
> I have a little sorting dilemma.
> 
> I have combined two server access logs and want to sort the records
> by the date/time field. I can perform simple sorts, but this one is beyond my
> expertise. The problem is that not only does each record use
> different field separators, but the date is also alphanumeric, which
> compounds the problem. I am sure there is a way to solve this, though.
> 
> Here is a sample of three records in the file - this is a rather large file,
> so an efficient, streamlined routine is needed.
> 
> 65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140
> "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128
> Netscape6/6.2.1"
> 
> 65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css
> HTTP/1.1" 200 3147 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US;
> rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"
> 
> 65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET
> /template/images/clear.gif HTTP/1.1" 200 43 "http://www.yahoo.com/"
> "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128
> Netscape6/6.2.1"
> 
> Does anyone have a snippet of code that would accomplish this?

If you have enough memory to do it in Perl, try the script below. (If not, you
may need to create a file that maps each line number to its epoch time, or
convert the time to epoch, write a new file, and sort on that field with a
disk sort program.)

use strict;

my @lines = (
'65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
'65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css HTTP/1.1" 200 3147 "-" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
'65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET /template/images/clear.gif HTTP/1.1" 200 43 "http://www.yahoo.com/" "Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1"',
);

print $_, "\n" foreach @lines;
print "\n";
my @sorted = sort { log_to_epoch ($a) <=> log_to_epoch ($b) } @lines;
print $_, "\n" foreach @sorted;
exit;

#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

use Time::Local;                # provides timegm()

sub log_to_epoch {              # $epoch = log_to_epoch ($log_line)
        my @l = split /\s+/, $_[0];
        my $date = $l[3];       # e.g. [25/Mar/2002:12:41:31
        my $tz = $l[4];         # e.g. -0500]
        my %mons = ('jan' => 0, 'feb' => 1, 'mar' => 2, 'apr' => 3, 'may' => 4,
          'jun' => 5, 'jul' => 6, 'aug' => 7, 'sep' => 8, 'oct' => 9,
          'nov' => 10, 'dec' => 11);

        $date =~ s/^\[//; $tz =~ s/\]$//;
        my @t = split ':', $date;       # (dd/Mon/yyyy, hh, mm, ss)
        my @d = split /\//, $t[0];      # (dd, Mon, yyyy)
        $d[1] = $mons{lc $d[1]};

        # The logged time is local: a -hhmm zone is behind GMT, so its offset
        # is added to reach GMT; a +hhmm zone is ahead, so it is subtracted.
        my $off = 0;
        if ($tz =~ /^([+-])(\d{2})(\d{2})$/) {
                $off = $2 * 3600 + $3 * 60;
                $off = -$off if $1 eq '+';
        } else {
                warn "TZ error ($tz); date=$date; using GMT\n";
        }
        # timegm takes a four-digit year as the actual year
        my $epoch = timegm ($t[3], $t[2], $t[1], $d[0], $d[1], $d[2]);
        return $epoch + $off;
}

__END__
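The sort block in the script above calls log_to_epoch for both operands on every comparison, so on a large file each line is parsed many times. A minimal sketch of one way around that is to compute each key once, sort on the keys, and then strip them again (the usual decorate/sort/undecorate idiom, often called a Schwartzian transform). The names epoch_of and sort_by_time and the single-regex parse are my own assumptions, not part of the script above, and the demo records are the sample lines cut off after the byte count.

```perl
use strict;
use warnings;
use Time::Local;

my %MON = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4,  jun => 5,
           jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Hypothetical helper: find the [dd/Mon/yyyy:hh:mm:ss +zzzz] stamp in a log
# line and return UTC epoch seconds; returns 0 if no stamp is found.
sub epoch_of {
    my ($line) = @_;
    my ($d, $mon, $y, $h, $m, $s, $sign, $zh, $zm) = $line =~
        m{\[(\d{1,2})/([A-Za-z]{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)\]}
        or return 0;
    my $off = $zh * 3600 + $zm * 60;    # zone offset in seconds
    $off = -$off if $sign eq '-';
    # UTC = local time minus the signed zone offset
    return timegm($s, $m, $h, $d, $MON{lc $mon}, $y) - $off;
}

# Parse each line once, sort the [epoch, line] pairs, then keep only the lines.
sub sort_by_time {
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ epoch_of($_), $_ ] } @_;
}

# Demo on the three sample records, truncated after the byte count.
my @demo = (
    '65.93.183.185 - - [25/Mar/2002:12:41:31 -0500] "GET / HTTP/1.1" 200 33140',
    '65.93.183.185 - - [20/Mar/2002:12:41:31 -0500] "GET /includes/style.css HTTP/1.1" 200 3147',
    '65.93.183.185 - - [26/Mar/2002:12:41:31 -0500] "GET /template/images/clear.gif HTTP/1.1" 200 43',
);
print "$_\n" for sort_by_time(@demo);
```

This still holds the whole file in memory, plus one small array per line, so it only helps with CPU time, not with memory.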

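If the file is too big for memory, the disk-sort route mentioned at the top could look roughly like this: a filter that writes each line prefixed with its epoch and a tab, leaving the actual sorting to an external sort program. This is only a sketch under assumptions; the name decorate, the key of 0 for unparseable lines, and the file names in the usage note are all made up for illustration.

```perl
use strict;
use warnings;
use Time::Local;

my %MON = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4,  jun => 5,
           jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Hypothetical helper: return "epoch<TAB>line" so an external sort can order
# on the first field. Lines without a recognisable stamp get key 0.
sub decorate {
    my ($line) = @_;
    my $key = 0;
    if ($line =~ m{\[(\d{1,2})/([A-Za-z]{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)\]}) {
        # signed zone offset in seconds; UTC = local time minus this offset
        my $off = ($8 * 3600 + $9 * 60) * ($7 eq '-' ? -1 : 1);
        $key = timegm($6, $5, $4, $1, $MON{lc $2}, $3) - $off;
    }
    return "$key\t$line";
}

# Stream every file named on the command line, one line at a time,
# so memory use stays flat no matter how large the logs are.
for my $file (@ARGV) {
    open my $in, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        print decorate($line);
    }
    close $in;
}
```

Assuming the script is saved as decorate.pl, something like `perl decorate.pl combined.log | sort -n | cut -f2- > sorted.log` would then sort numerically on the epoch and strip the key again.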


-- 
   ,-/-  __      _  _         $Bill Luebkert   ICQ=162126130
  (_/   /  )    // //       DBE Collectibles   Mailto:[EMAIL PROTECTED]
   / ) /--<  o // //      http://dbecoll.tripod.com/ (Free site for Perl)
-/-' /___/_<_</_</_     Castle of Medieval Myth & Magic http://www.todbe.com/

_______________________________________________
Perl-Unix-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
