Dear all,
the script below counts word occurrences in an input file. It uses a
simple hash to store each unique word and its frequency.
--------------------
use strict;
use warnings;

my %words;
while (<>) {
    chomp;                      # chomp, not chop: strip only the trailing newline
    foreach my $wd (split) {
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
--------------------
To process large amounts of data (10,000,000 lines) without running out
of memory, I use the DB_File module to tie the hash %words to a local
file and then read the data back from it.
--------------------
use strict;
use warnings;
use DB_File;

# Check the tie: a silent failure here would leave %words an ordinary
# in-memory hash, defeating the purpose.
tie my %words, 'DB_File', 'words.db'
    or die "Cannot tie words.db: $!";
while (<>) {
    chomp;
    foreach my $wd (split) {
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
untie %words;
--------------------
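One possible refinement, sketched below under my own assumptions (the explicit open flags, the use of $DB_BTREE, and the `each` loop are mine, not part of the original): tying with $DB_BTREE instead of the default hash format keeps the keys sorted on disk, so the report comes out in alphabetical order for free, and iterating with `each` avoids pulling the full key list into memory the way `keys` does on a tied hash.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;    # exports O_RDWR and O_CREAT for the tie call

# $DB_BTREE stores keys in sorted order; 'words.db' is a hypothetical filename.
tie my %words, 'DB_File', 'words.db', O_RDWR|O_CREAT, 0644, $DB_BTREE
    or die "Cannot tie words.db: $!";

while (<>) {
    chomp;
    $words{$_}++ for split;
}

# each() walks the B-tree pair by pair, so keys arrive sorted
# without building a full key list in memory.
while ( my ($w, $n) = each %words ) {
    print "$w|$n\n";
}
untie %words;
```

Note that every `$words{$wd}++` on a tied hash is a fetch plus a store against the database file, so this trades memory for disk I/O; that is usually the right trade at 10,000,000 lines, but it will be noticeably slower than a pure in-memory hash on small inputs.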
Is this a sound solution in terms of good programming practice?
Thanks in advance for any opinion,
Andrej
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>