I wrote a small script that uses message IDs as unique values and extracts recipient address info. The goal is to count 1019 events per message ID and to sum the recipient counts for those events. The script works fine, but when it runs against a very large file (2 GB+) I get an out-of-memory error.
Is there a more efficient way of handling the hash portion that is less memory-intensive and preferably faster?

--Paul

    # Tracking log parser
    use strict;

    my $recips;
    my %event_id;
    my $counter;
    my $total_recips;
    my $count;

    # Get log file
    die "You must enter a tracking log.\n" if $#ARGV < 0;
    my $logfile = shift;
    open(LOGFILE, $logfile) || die "Unable to open $logfile because\n$!\n";

    foreach (<LOGFILE>) {
        next if /^#/;                    # skip comment lines (lines starting with #)
        my @fields = split /\t/, $_;     # split the line on tabs
        $recips = $fields[13];           # number-of-recipients column
        my $message_id = $fields[9];     # message ID column
        if ($fields[8] eq '1019') {      # event ID column; eq for an exact string match
            $event_id{$message_id}++ unless exists $event_id{$message_id};
            $counter++;
            $total_recips += $recips;
        }
    }
    close LOGFILE;                       # close once, after the loop

    print "\n\nTotal instances of 1019 events in \"$logfile\" is $counter.\n\n";
    print "\nTotal single instances of 1019 event per message ID is ";
    #print keys %event_id;
    foreach my $key (keys %event_id) {
        $count++;
    }
    print $count;
    print "\n\nTotal # of recipients per message ID is ";
    print $total_recips;
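Would switching the read loop from foreach to while help? My understanding is that foreach (<LOGFILE>) expands the entire file into an in-memory list before the loop starts, whereas while (<LOGFILE>) reads one line per iteration. A sketch of what I have in mind, reusing the same columns and names as above:

    # Sketch: stream the log line by line instead of slurping it.
    use strict;

    die "You must enter a tracking log.\n" if @ARGV < 1;
    my $logfile = shift;
    open(LOGFILE, $logfile) || die "Unable to open $logfile: $!\n";

    my %event_id;           # message IDs that had at least one 1019 event
    my $counter      = 0;   # total 1019 events seen
    my $total_recips = 0;   # running sum of recipient counts

    while (<LOGFILE>) {                  # one line at a time, not the whole file
        next if /^#/;                    # skip comment lines
        my @fields = split /\t/;
        next unless $fields[8] eq '1019';
        $event_id{ $fields[9] } = 1;     # remember this message ID
        $counter++;
        $total_recips += $fields[13];
    }
    close LOGFILE;

    print "Total 1019 events: $counter\n";
    print "Unique message IDs with a 1019 event: ", scalar keys %event_id, "\n";
    print "Total recipients: $total_recips\n";

The hash here still grows with the number of unique message IDs, but only for lines that match the 1019 event, so I would expect the line-by-line read to be the main saving on a 2 GB file.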