I've written a program (before I knew much... not that I know more now, mind you
*smirk*) that reads in 10 data files containing student computer-use
information and creates a report for each student. The original program
took only seconds to execute, but as the file sizes have grown past 25MB
it brings the system it runs on to its knees, because it "slurps" all of
the data into memory. (One student might use several computers during
the semester, so every record has to be kept around until the end.)
I'm looking for ideas to solve this... I was thinking of using
temp files to cut down on the memory overhead: something along the lines of
a foreach $student { write a temp file } arrangement that the program can
come back to later to do the actual sorting.
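For what it's worth, here is a minimal sketch of what I mean by the temp-file idea: append each record to a per-student bucket file as it is read, then process one student's (much smaller) file at a time afterwards. The names $tmp_dir and bucket_row, and the tab-separated record layout, are all made up for illustration.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $tmp_dir = '/tmp/student_buckets';    # assumed scratch location
    mkdir $tmp_dir unless -d $tmp_dir;

    my %fh_for;    # cache of open filehandles, one per student

    # Append one record to the student's bucket file, opening it on first use.
    sub bucket_row {
        my ($user, @rest) = @_;
        unless ($fh_for{$user}) {
            open $fh_for{$user}, '>>', "$tmp_dir/$user.tmp"
                or die "Can't open bucket for $user: $!";
        }
        print { $fh_for{$user} } join("\t", $user, @rest), "\n";
    }

    # Inside the fetchrow loop you would call something like:
    #   bucket_row($data[1], @data[0,2,6,7,8]);

    # Once all input files have been read, close everything before the
    # second pass over the per-student files.
    sub finish_buckets {
        close $_ for values %fh_for;
        %fh_for = ();
    }

Memory then scales with the number of distinct students (one open handle each), not with the total number of records.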
Here's a snippet of the code that I know is at issue; I took out
some stuff in the while loop that isn't related. Any comments would be
appreciated.
foreach my $file (@files) {
    read_dbf($file);
}

sub read_dbf {    # read the .dbf file
    my $file = shift;
    my $sth  = $dbh->prepare("SELECT * FROM $data_dir/$file")
        or die $dbh->errstr();
    $sth->execute()
        or die $sth->errstr();
    while (my @data = $sth->fetchrow_array()) {
        # here's the part that puts all of it into memory
        my %fields = (
            comp_name  => $data[0],
            user_name  => $data[1],
            exe_file   => $data[2],
            start_date => $data[6],
            start_time => $data[7],
            duration   => $data[8],
        );
        push @LoH, { %fields };
        $users{$data[1]} = ' ';
        $days{$data[6]}  = ' ';
    } # END while
} # END read_dbf()
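One other thought I had: if the per-student report only needs totals (not every raw record), the push onto @LoH could be replaced with a running aggregate computed row by row, so nothing accumulates at all. This is just a sketch under that assumption; the %totals hash and tally_row are invented names, and I'm guessing duration in $data[8] is a plain number.

    use strict;
    use warnings;

    my %totals;    # $totals{$user}{$day} accumulates duration

    # Fold one row into the running per-student, per-day totals.
    sub tally_row {
        my ($user, $day, $duration) = @_;
        $totals{$user}{$day} += $duration;
    }

    # Inside the fetchrow loop, instead of push @LoH, { %fields }:
    #   tally_row($data[1], $data[6], $data[8]);

The %users and %days hashes would come along for free afterwards as keys %totals and the keys of each inner hash.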