Dear all,

The script below counts word occurrences in an input file. It uses a simple hash to store the unique words and their frequencies.
--------------------
use strict;
use warnings;

my %words;
while (<>) {
    chomp;                       # chomp, not chop: strip only the trailing newline
    foreach my $wd (split) {     # split $_ on whitespace
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
--------------------

In order to process large amounts of data (10,000,000 lines) and to avoid memory problems, I use the DB_File module to store the hash %words in a local file and then read the data back from it.

--------------------
use strict;
use warnings;
use DB_File;

# Tie %words to a file on disk so the counts never have to fit in memory
tie my %words, 'DB_File', 'words.db'
    or die "Cannot tie words.db: $!";

while (<>) {
    chomp;                       # chomp, not chop: strip only the trailing newline
    foreach my $wd (split) {
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
untie %words;
--------------------
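
A variant I also sketched (untested at this scale) uses DB_BTREE instead of the default DB_HASH, so the report comes out in sorted key order, and iterates with each() instead of keys(), since keys() on a tied hash still builds the complete key list in memory:

--------------------
use strict;
use warnings;
use Fcntl;       # for the O_CREAT / O_RDWR flags
use DB_File;

# DB_BTREE keeps keys in sorted (byte-wise) order on disk
tie my %words, 'DB_File', 'words.db', O_CREAT|O_RDWR, 0644, $DB_BTREE
    or die "Cannot tie words.db: $!";

while (<>) {
    chomp;
    $words{$_}++ for split;
}

# each() fetches one key/value pair at a time from the tied file,
# so the full key list never has to sit in memory
while (my ($w, $n) = each %words) {
    print "$w|$n\n";
}
untie %words;
--------------------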


Is this a sensible solution in terms of good programming practice...?

Thanks in advance for any opinion,
Andrej
