Dear Perl users,

I am trying to parse a file with 20,000,000 records, but I have run into a problem. To explain it, I have collected my previous posts to this list below.

I have a bar-separated file (FILE_A):
name1|10
name2|20
name3|5
name4|30
etc.

I processed it with the following code:

my %scores;
while ( <FILE_A> ) {
   chomp;
   my ($name, $score) = split /\|/;
   $scores{$name} = $score;
}

Then I have another file (FILE_B) which looks like:
____________
ID - 001
NA - name1
NA - name2

ID - 002
NA - name2
NA - name4

etc.
____________

The code below reads each record from FILE_B (its ID and NA fields) and, for each ID, sums the scores that FILE_A assigns to the listed NA names:

my ( $ID, %ids );
while ( <FILE_B> ) {
   if ( /^ID\s*-\s*(.+)/ ) {
      $ID = $1;                       # start of a new record
   }
   elsif ( /^NA\s*-\s*(.+)/ ) {
      $ids{ $ID } += $scores{ $1 };   # add this name's score to the current ID
   }
}

for my $id ( keys %ids ) {
   print "$id | $ids{$id}\n";
}

So we obtain (in no particular order, since keys %ids is unordered):
001 | 30  # ID is 001 and 10+20=30
002 | 50  # ID is 002 and 20+30=50

The script works perfectly, but when I try to process larger files (e.g. with 20 million records!), it hangs. How should I modify this script so that it processes each record from FILE_B separately?
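
To make the question concrete, per-record processing might look roughly like the sketch below. It is only an illustration under some assumptions: every record in FILE_B starts with an ID line, output in file order is acceptable, and names missing from FILE_A count as 0. The sub name process_records and the in-memory sample data are hypothetical. Note that it only removes the %ids hash; %scores still has to fit in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: loads the scores from $fh_a, then streams $fh_b and
# prints each ID total to $out as soon as the record ends (no %ids hash).
sub process_records {
    my ( $fh_a, $fh_b, $out ) = @_;

    # Build the name => score lookup from FILE_A
    my %scores;
    while ( my $line = <$fh_a> ) {
        chomp $line;
        my ( $name, $score ) = split /\|/, $line;
        next unless defined $score;    # skip blank or malformed lines
        $scores{$name} = $score;
    }

    # Stream FILE_B, keeping only the current record's ID and running sum
    my ( $id, $sum );
    while ( my $line = <$fh_b> ) {
        if ( $line =~ /^ID\s*-\s*(.+)/ ) {
            # A new ID line closes the previous record, so print it now
            print {$out} "$id | $sum\n" if defined $id;
            ( $id, $sum ) = ( $1, 0 );
        }
        elsif ( defined $id && $line =~ /^NA\s*-\s*(.+)/ ) {
            my $score = $scores{$1};
            $sum += $score if defined $score;    # assumption: unknown names add 0
        }
    }
    print {$out} "$id | $sum\n" if defined $id;   # flush the last record
    return;
}

# Sample data from this post, read through in-memory filehandles
my $file_a = "name1|10\nname2|20\nname3|5\nname4|30\n";
my $file_b = "ID - 001\nNA - name1\nNA - name2\n\n"
           . "ID - 002\nNA - name2\nNA - name4\n";
open my $fh_a, '<', \$file_a or die "Cannot open FILE_A data: $!";
open my $fh_b, '<', \$file_b or die "Cannot open FILE_B data: $!";
process_records( $fh_a, $fh_b, \*STDOUT );
```

With the sample data this prints "001 | 30" and "002 | 50", one per line. For the real files, the two in-memory handles would be replaced by handles opened on FILE_A and FILE_B.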

Thanks for any help.

Cheers, Andrej
