Robin Sheat wrote:
> I have a section of a program that is this:
>
>         my $users = get_all_users();
>         foreach my $user (@$users) {
>                 my $details = get_user_details($user);
>                 my $sum=0;
>                 my $count=0;
>                 for (my $i=1; $i<@$details; $i+=2) {
>                         $sum += @$details[$i];
>                         $count++;
>                 }
>                 $means{$user} = $sum/$count;
>         }
>
> The $users var is a reference to an array of about 450,000 entries. The
> $details var is an array reference, typically around 500 entries (note,
> however, that the inner loop goes up in twos). Overall the innermost part is
> executed about 100 million times. How can I optimise it? If I were using C,
> I'd use pointer arithmetic on the array to save the multiplication in the
> array indexing. A naive reading of this implies that Perl does a multiply for
> every array access; is that the case, or is it optimised away? If not, is
> there a way to use pointer arithmetic (or something that behaves similarly
> but is safer)? It is probably possible to restructure the array so that it
> goes up in ones and use a foreach, but in most cases I'll want both values,
> which is why I didn't do that.
>
> Anyone know any good resources for this kind of thing?

Hello Robin

List::Util's sum() is implemented in C, and as such is a lot faster than
accumulating a total inside a Perl loop. Try this:

use List::Util qw/sum/;

my $users = get_all_users();

foreach my $user (@$users) {
  my $details = get_user_details($user);
  my $f = 1;
  $means{$user} = sum(grep $f ^= 1, @$details) / int(@$details / 2);
}
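To see what that grep is doing: $f flips between 0 and 1 on every element, so
only the odd-indexed elements (your values) survive, exactly the elements your
original inner loop summed. A small self-contained demonstration, using a
made-up pair list:

use strict;
use warnings;
use List::Util qw/sum/;

my @details = ('alice', 10, 'bob', 20, 'carol', 30);  # hypothetical (key, value) pairs

my $f = 1;
my @values = grep $f ^= 1, @details;  # $f: 1->0 drops 'alice', 0->1 keeps 10, and so on
print "@values\n";                    # 10 20 30
print sum(@values) / int(@details / 2), "\n";   # 20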

And if your details list is guaranteed to have an even number of elements, you
can remove the call to int() (but not the parentheses!).
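That is, with an even-length list the statement becomes simply:

  my $f = 1;
  $means{$user} = sum(grep $f ^= 1, @$details) / (@$details / 2);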

Another thought: if the even-indexed array elements are unique within each
record, so that they can serve as hash keys, your loop could become:

foreach my $user (@$users) {
  my $details = get_user_details($user);
  my %details = @$details;
  $means{$user} = sum(values %details) / values %details;
}
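(The division puts values %details in scalar context, where it returns the
number of values in the hash, so you get the count for free. Bear in mind that
duplicate keys would silently collapse entries and skew the mean, which is why
the uniqueness condition above matters.)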

And if you could rewrite get_user_details() so that it built a hash in the
first place, you'd be better off still. Taking this a step further, it would
make sense to alter the unpack() call in get_user_details() so that it returns
only the significant data (the values). Your loop then reduces to:

foreach my $user (@$users) {
  my $details = get_user_details($user);
  $means{$user} = sum(@$details) / @$details;   # scalar @$details is the element count
}

which is about as fast as you're going to get!
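For what it's worth, here is a sketch of the kind of change to
get_user_details() I have in mind. I don't know your real record layout, so the
pack/unpack templates, field widths and sample data below are all invented; the
point is only that a '(x10 N)*'-style template can skip the keys and return
just the values:

use strict;
use warnings;
use List::Util qw/sum/;

# Hypothetical layout: each entry is a 10-byte label followed by a 32-bit value.
# Pack a sample record purely so there is something to unpack.
my $record = pack '(A10 N)*', 'apples', 3, 'oranges', 5, 'pears', 7;

my @details = unpack '(A10 N)*', $record;  # old: labels and values interleaved
my @values  = unpack '(x10 N)*', $record;  # new: skip each label, keep only the values

print sum(@values) / @values, "\n";        # prints 5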

HTH,

Rob

