Frank Bax wrote:

> Rather than create/store/sort many billion entities, my script creates
> these entities dynamically and maintains a hash of the "top 100". As
> each entity is created, I search my hash for the entity with the "lowest"
> value, based on a number of elements in the hash; then the "low" element
> gets replaced with the "new" element. Code looks like:
>
>     my $low = 0;
>     for( my $new=1; $new<=$iSuit; ++$new ) {
>         my $snew = sprintf("%4d%4d",$aSuit{$new}{'rescap'},$aSuit{$new}{'resval'});
>         my $slow = sprintf("%4d%4d",$aSuit{$low}{'rescap'},$aSuit{$low}{'resval'});
Using sprintf() to concatenate numbers is (AFAIK) going to be slower than
plain concatenation:

    my $snew = $aSuit{ $new }{ rescap } . $aSuit{ $new }{ resval };
    my $slow = $aSuit{ $low }{ rescap } . $aSuit{ $low }{ resval };

(Note that this drops the "%4d" padding. If the field values can differ in
digit count, keep them fixed-width, or the concatenated keys won't compare
correctly.)

> if( $snew lt $slow ) { $low = $new; }

You are comparing numbers, so:

    $low = $new if $snew < $slow;

> }
>
> I needed to change this code so that 'rescap' and 'resval' are runtime
> options and there could be any number of them, so I created an array
> @f_seq and rewrote my simple loop as:
>
>     my $low = 0;
>     for( my $new=1; $new<=$iSuit; ++$new ) {
>         $a=$new; $b=$low;
>         if( cmpSuits() > 0 ) { $low = $new; }
>     }
>
>     sub cmpSuits {
>         my $aval=''; map { $aval=$aval.sprintf("%4d",$aSuit{$a}{$_}); } @f_seq;
>         my $bval=''; map { $bval=$bval.sprintf("%4d",$aSuit{$b}{$_}); } @f_seq;

You shouldn't use map in void context; use a foreach loop instead. But you
don't even need a loop there, since a hash slice builds the key in one step:

    my $aval = join '', @{ $aSuit{ $a } }{ @f_seq };
    my $bval = join '', @{ $aSuit{ $b } }{ @f_seq };

> $bval cmp $aval; # a<b=1 a=b=0 a>b=-1 ... sorts descending

Or just (keeping your descending order):

    join( '', @{ $aSuit{ $b } }{ @f_seq } ) cmp join( '', @{ $aSuit{ $a } }{ @f_seq } );

> }
>
> I use $a and $b because at the end of my script, the aSuit hash is sorted
> for output, also using the function "cmpSuits". The problem is that my
> script is now horribly slower than the original!! Is this because I used
> "map"? If so, what should I have used instead?
>
> Running the script on a small test sample from our database, the original
> code runs in 85-90 seconds. The modified code using "map" takes 160-165
> seconds. Processing of my "real" database took 69 hours on the original
> code, but at 80 hours, my modified script is not even half way! I must
> find something a bit faster than "map", but more flexible than my
> original code.

You might be able to use a Schwartzian Transform or a Guttman-Rosler
Transform to speed up the sort.
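A Guttman-Rosler Transform for a run-time @f_seq might look something like
the sketch below. The data and the eight-digit field width are made up for
illustration, and it assumes the key fields are non-negative integers small
enough to fit that width:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data shaped like the %aSuit hash in this thread;
# the real script builds it from the database.
my %aSuit = (
    1 => { rescap => 12, resval =>  7 },
    2 => { rescap => 12, resval => 30 },
    3 => { rescap =>  5, resval => 99 },
);
my @f_seq = qw(rescap resval);    # key fields chosen at runtime

# Guttman-Rosler Transform: pack each record's sort fields into one
# fixed-width string, append the record id, sort the plain strings
# (no comparison sub, so sort runs at full speed), then strip the
# packed key back off.
my $keylen = 8 * @f_seq;          # each field padded to 8 digits
my @sorted_ids =
    map  { substr $_, $keylen }   # 3. drop the packed key, keep the id
    sort                          # 2. default lexical sort, no sub
    map  {                        # 1. build "packed-key . id"
        my $id = $_;
        # assumes non-negative integers below 10**8
        join('', map { sprintf '%08d', $aSuit{$id}{$_} } @f_seq) . $id;
    } keys %aSuit;

# Ascending by (rescap, resval); wrap the sort in reverse() to get the
# descending order used in cmpSuits().
print "@sorted_ids\n";   # 3 1 2
```

Because the comparison sub is gone entirely, this tends to beat both the
Schwartzian Transform and a sort { cmpSuits() } call when the key fields
can be packed into strings like this.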
John
--
use Perl;
program fulfillment