Frank Bax wrote:

> Rather than create/store/sort many billion entities, my script creates
> these entities dynamically and maintains a hash of the "top 100". As
> each entity is created, I search my hash for the entity with the "lowest"
> value, based on a number of elements in the hash; then the "low" element
> gets replaced with the "new" element. Code looks like:
>
>     my $low = 0;
>     for( my $new=1; $new<=$iSuit; ++$new ) {
>         my $snew = sprintf("%4d%4d",$aSuit{$new}{'rescap'},$aSuit{$new}{'resval'});
>         my $slow = sprintf("%4d%4d",$aSuit{$low}{'rescap'},$aSuit{$low}{'resval'});
Using sprintf() to concatenate numbers is (AFAIK) going to be slower than
plain concatenation:

    my $snew = $aSuit{ $new }{ rescap } . $aSuit{ $new }{ resval };
    my $slow = $aSuit{ $low }{ rescap } . $aSuit{ $low }{ resval };

(Note that this drops the "%4d" padding. If the field values can differ in
digit count, keep them fixed-width, or the concatenated keys won't compare
correctly.)

> if( $snew lt $slow ) { $low = $new; }

You are comparing numbers, so:

    $low = $new if $snew < $slow;

> }
>
> I needed to change this code so that 'rescap' and 'resval' are runtime
> options and there could be any number of them, so I created an array
> @f_seq and rewrote my simple loop as:
>
>     my $low = 0;
>     for( my $new=1; $new<=$iSuit; ++$new ) {
>         $a=$new; $b=$low;
>         if( cmpSuits() > 0 ) { $low = $new; }
>     }
>
>     sub cmpSuits {
>         my $aval=''; map { $aval=$aval.sprintf("%4d",$aSuit{$a}{$_}); } @f_seq;
>         my $bval=''; map { $bval=$bval.sprintf("%4d",$aSuit{$b}{$_}); } @f_seq;

You shouldn't use map in void context; use a foreach loop instead. But you
don't even need a loop there, since a hash slice builds the key in one step:

    my $aval = join '', @{ $aSuit{ $a } }{ @f_seq };
    my $bval = join '', @{ $aSuit{ $b } }{ @f_seq };

> $bval cmp $aval; # a<b=1 a=b=0 a>b=-1 ... sorts descending

Or just (keeping your descending order):

    join( '', @{ $aSuit{ $b } }{ @f_seq } ) cmp join( '', @{ $aSuit{ $a } }{ @f_seq } );

> }
>
> I use $a and $b because at the end of my script, the aSuit hash is sorted
> for output, also using the function "cmpSuits". The problem is that my
> script is now horribly slower than the original!! Is this because I used
> "map"? If so, what should I have used instead?
>
> Running the script on a small test sample from our database, the original
> code runs in 85-90 seconds. The modified code using "map" takes 160-165
> seconds. Processing of my "real" database took 69 hours on the original
> code, but at 80 hours, my modified script is not even half way! I must
> find something a bit faster than "map", but more flexible than my
> original code.

You might be able to use a Schwartzian Transform or a Guttman-Rosler
Transform to speed up the sort.
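A Guttman-Rosler Transform for a run-time @f_seq might look something like
the sketch below. The data and the eight-digit field width are made up for
illustration, and it assumes the key fields are non-negative integers small
enough to fit that width:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data shaped like the %aSuit hash in this thread;
# the real script builds it from the database.
my %aSuit = (
    1 => { rescap => 12, resval =>  7 },
    2 => { rescap => 12, resval => 30 },
    3 => { rescap =>  5, resval => 99 },
);
my @f_seq = qw(rescap resval);    # key fields chosen at runtime

# Guttman-Rosler Transform: pack each record's sort fields into one
# fixed-width string, append the record id, sort the plain strings
# (no comparison sub, so sort runs at full speed), then strip the
# packed key back off.
my $keylen = 8 * @f_seq;          # each field padded to 8 digits
my @sorted_ids =
    map  { substr $_, $keylen }   # 3. drop the packed key, keep the id
    sort                          # 2. default lexical sort, no sub
    map  {                        # 1. build "packed-key . id"
        my $id = $_;
        # assumes non-negative integers below 10**8
        join('', map { sprintf '%08d', $aSuit{$id}{$_} } @f_seq) . $id;
    } keys %aSuit;

# Ascending by (rescap, resval); wrap the sort in reverse() to get the
# descending order used in cmpSuits().
print "@sorted_ids\n";   # 3 1 2
```

Because the comparison sub is gone entirely, this tends to beat both the
Schwartzian Transform and a sort { cmpSuits() } call when the key fields
can be packed into strings like this.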
John
--
use Perl;
program fulfillment