Dear all,
the script below counts word occurrences in an input file. It uses a
simple hash to store each unique word and its frequency.
--------------------
use strict;
use warnings;

my %words;
while (<>) {
    chomp;                      # chomp, not chop: strip only the trailing newline
    foreach my $wd (split) {
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
--------------------
To process large amounts of data (10,000,000 lines) without running out
of memory, I use the DB_File module to tie the hash %words to a local
file and then read the data back from it.
--------------------
use strict;
use warnings;
use DB_File;

# Check the tie: a silent failure here would leave %words an ordinary
# in-memory hash, defeating the purpose.
tie my %words, 'DB_File', 'words.db'
    or die "Cannot tie words.db: $!";
while (<>) {
    chomp;
    foreach my $wd (split) {
        $words{$wd}++;
    }
}
foreach my $w (keys %words) {
    print "$w|$words{$w}\n";
}
untie %words;
--------------------
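One possible refinement, sketched below under my own assumptions (the explicit open flags, the use of $DB_BTREE, and the `each` loop are mine, not part of the original): tying with $DB_BTREE instead of the default hash format keeps the keys sorted on disk, so the report comes out in alphabetical order for free, and iterating with `each` avoids pulling the full key list into memory the way `keys` does on a tied hash.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;    # exports O_RDWR and O_CREAT for the tie call

# $DB_BTREE stores keys in sorted order; 'words.db' is a hypothetical filename.
tie my %words, 'DB_File', 'words.db', O_RDWR|O_CREAT, 0644, $DB_BTREE
    or die "Cannot tie words.db: $!";

while (<>) {
    chomp;
    $words{$_}++ for split;
}

# each() walks the B-tree pair by pair, so keys arrive sorted
# without building a full key list in memory.
while ( my ($w, $n) = each %words ) {
    print "$w|$n\n";
}
untie %words;
```

Note that every `$words{$wd}++` on a tied hash is a fetch plus a store against the database file, so this trades memory for disk I/O; that is usually the right trade at 10,000,000 lines, but it will be noticeably slower than a pure in-memory hash on small inputs.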
Is this a sound solution in terms of good programming practice?
Thanks in advance for any opinion,
Andrej
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>