It's hard to suggest improvements when seeing only a fragment of the
code, but I would expect it to be slow because you have 5 nested loops:

while ( my @data_files = grep(/\.csv$/,readdir(DH)) )

   ...
   foreach my $file ( @data_files )
   ...
      while ( my $data = <$FH> )
      ...
         foreach my $up_acs ( keys %{$up_map} )
         ...
            foreach my $ensemble_id (@{$up_map->{$up_acs}{'Ensembl_TRS'}} )
            ...

I am not sure how big keys %{$up_map} and
@{$up_map->{$up_acs}{'Ensembl_TRS'}} can get, but nesting this deeply
multiplies the loop sizes together: every line of every file is compared
against every Ensembl ID in the map, so the total number of iterations
grows dramatically.

I found a good explanation here:
http://pages.cs.wisc.edu/~vernon/cs367/notes/3.COMPLEXITY.html

See the section on nested loops.

Maybe try to simplify your code to reduce the number of nested loops if
possible.
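
One way to do that, as a rough sketch only: since the two innermost loops
just check whether $records[1] appears somewhere in the map, the Ensembl
IDs could be collected into a hash once, before any file is read, and each
line then needs a single hash lookup. The names build_wanted and
filter_lines, the map keys UP_A and UP_B, and the sample lines are made up
for illustration; only the shape of $up_map is taken from your code.

```perl
use strict;
use warnings;

# Made-up sample map with the same shape as $up_map in the quoted code.
my $up_map = {
    UP_A => { Ensembl_TRS => ['ENSMUST00000001326'] },
    UP_B => { Ensembl_TRS => ['ENSMUST00000066003'] },
};

# Walk the map once and remember every Ensembl transcript ID in a hash.
sub build_wanted {
    my ($map) = @_;
    my %wanted;
    for my $up_acs ( keys %{$map} ) {
        $wanted{$_} = 1 for @{ $map->{$up_acs}{'Ensembl_TRS'} || [] };
    }
    return \%wanted;
}

# Return the lines whose second tab-separated field is a known ID.
sub filter_lines {
    my ( $wanted, @lines ) = @_;
    my @accepted;
    for my $line (@lines) {
        chomp $line;
        my @records = split /\t/, $line;
        next unless defined $records[1];    # skip lines with no second field
        push @accepted, $line if exists $wanted->{ $records[1] };
    }
    return @accepted;
}

my $wanted = build_wanted($up_map);
my @lines  = (
    "TF0000210\tENSMUST00000001326\tSP1_MOUSE\n",
    "TF0000999\tENSMUST00000000000\tXXX_MOUSE\n",    # ID not in the map
);
print "$_\n" for filter_lines( $wanted, @lines );
```

Note that this changes the counts slightly compared with your version: a
record is accepted at most once, even if its ID is listed under several
map entries, and "rejected" would have to be counted per record rather
than per comparison.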

 


On 11/06/2012 15:31, "venkates" <venka...@nt.ntnu.no> wrote:

>Hi all,
>
>I am trying to filter files from a directory (code provided below)  by
>comparing the contents of each file with a hash ref (a parsed id map
>file provided as an argument). The code is working however, is extremely
>slow.  The .csv files (81 files) that I am reading are not very large
>(largest file is 183,258 bytes).  I would appreciate if you could
>suggest improvements to the code.
>
>sub filter {
>     my ( $pazar_dir_path, $up_map, $output ) = @_;
>     croak "Not enough arguments! " if ( @_ < 3 );
>
>     my $accepted = 0;
>     my $rejected = 0;
>
>     opendir DH, $pazar_dir_path or croak ("Error in opening directory
>'$pazar_dir_path': $!");
>     open my $OUT, '>', $output or croak ("Cannot open file for writing
>'$output': $!");
>     while ( my @data_files = grep(/\.csv$/,readdir(DH)) ) {
>         my @records;
>         foreach my $file ( @data_files ) {
>             open my $FH, '<', "$pazar_dir_path/$file" or croak ("Cannot
>open file '$file': $!");
>             while ( my $data = <$FH> ) {
>                 chomp $data;
>                 my $record_output;
>                 @records = split /\t/, $data;
>                 foreach my $up_acs ( keys %{$up_map} ) {
>                     foreach my $ensemble_id (
>@{$up_map->{$up_acs}{'Ensembl_TRS'}} ){
>                         if ( $records[1] eq $ensemble_id ) {
>                             $record_output = join( "\t", @records );
>                             print $OUT "$record_output\n";
>                             $accepted++;
>                         }
>                         else {
>                             $rejected++;
>                             next;
>                         }
>                     }
>                 }
>             }
>             close $FH;
>         }
>     }
>     close $OUT;
>     closedir (DH);
>     print "accepted records: $accepted\n, rejected records: $rejected\n";
>     return $output;
>}
>
>__DATA__
>
>TF0000210    ENSMUST00000001326    SP1_MOUSE    GS0000422
>ENSMUSG00000037974    7    148974877    149005136    Mus musculus
>MUC5AC    14570593    ELECTROPHORETIC MOBILITY SHIFT ASSAY
>(EMSA)::SUPERSHIFT
>TF0000211    ENSMUST00000066003    SP3_MOUSE    GS0000422
>ENSMUSG00000037974    7    148974877    149005136    Mus musculus
>MUC5AC    14570593    ELECTROPHORETIC MOBILITY SHIFT ASSAY
>(EMSA)::SUPERSHIFT
>
>
>Thanks a lot,
>
>Aravind
>
