Dear Perl users,
I am trying to parse a file with 20,000,000 records, but I have run into
a problem. To describe it, I have collected my previous posts on this list.
I have a bar-separated file (FILE_A):
name1|10
name2|20
name3|5
name4|30
etc.
I processed it with the following code:
my %scores;    # name => score
while ( <FILE_A> ) {
    chomp;
    my ($name, $score) = split /\|/;
    $scores{$name} = $score;
}
Then I have another file (FILE_B) which looks like:
____________
ID - 001
NA - name1
NA - name2
ID - 002
NA - name2
NA - name4
etc.
____________
The code below reads each record from FILE_B (the ID and NA fields) and
sums the FILE_A scores of the names listed under each ID:
my ( $ID, %ids );
while ( <FILE_B> ) {
    if ( /^ID\s*-\s*(.+)/ ) {
        $ID = $1;                       # start of a new record
    }
    elsif ( /^NA\s*-\s*(.+)/ ) {
        $ids{ $ID } += $scores{ $1 };   # add this name's score to the current ID
    }
}
for my $id ( keys %ids ) {
    print "$id | $ids{$id}\n";
}
So we obtain:
001 | 30    # ID is 001 and 10+20=30
002 | 50    # ID is 002 and 20+30=50
The script works perfectly, but when I try to process larger files (e.g.
with 20 million records!), it hangs. How should I modify this script so
that I can process each record from FILE_B separately?
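To make the question concrete, here is a minimal sketch of what "processing each record separately" could look like. It is not a tested solution for the full data set. It assumes that all NA lines of a record directly follow its ID line, that names missing from FILE_A should count as 0, and that Perl 5.10 or later is available (for the // operator). The sub name sum_records and the in-memory sample input are made up for illustration. Each total is printed as soon as the next ID line is seen, so the %ids hash is never built; the %scores hash from FILE_A still has to fit in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: reads ID/NA records from $in and prints each ID's
# total to $out as soon as the record ends, so no %ids hash is built.
sub sum_records {
    my ( $in, $out, $scores ) = @_;
    my ( $id, $total );
    while ( my $line = <$in> ) {
        if ( $line =~ /^ID\s*-\s*(.+)/ ) {
            # A new ID line closes the previous record.
            print {$out} "$id | $total\n" if defined $id;
            ( $id, $total ) = ( $1, 0 );
        }
        elsif ( defined $id && $line =~ /^NA\s*-\s*(.+)/ ) {
            # Assumption: names missing from FILE_A count as 0.
            $total += $scores->{$1} // 0;
        }
    }
    # Flush the last record, which has no following ID line.
    print {$out} "$id | $total\n" if defined $id;
}

# Small demonstration with the sample data and an in-memory filehandle.
my %scores = ( name1 => 10, name2 => 20, name3 => 5, name4 => 30 );
my $file_b = "ID - 001\nNA - name1\nNA - name2\n"
           . "ID - 002\nNA - name2\nNA - name4\n";
open my $in, '<', \$file_b or die "Cannot open in-memory input: $!";
sum_records( $in, \*STDOUT, \%scores );
close $in;
```

With real files, $in would be a handle opened on FILE_B. Unlike the loop over keys %ids, this prints the totals in the order the IDs appear in the file.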
Thanks for any help.
Cheers, Andrej