Robin Sheat wrote:
> I have a section of a program that is this:
>
> my $users = get_all_users();
> foreach my $user (@$users) {
>     my $details = get_user_details($user);
>     my $sum   = 0;
>     my $count = 0;
>     for (my $i = 1; $i < @$details; $i += 2) {
>         $sum += @$details[$i];
>         $count++;
>     }
>     $means{$user} = $sum / $count;
> }
>
> The $users var is a reference to an array that is about 450,000 entries. The
> $details var is an array reference typically of size around 500 or so (note
> however, the inner loop goes up in twos). Overall the innermost part is
> executed 100 million times. How can I optimise it? If I was using C, I'd use
> pointer arithmetic on the array to save the multiplication on the array
> indexing. A naive reading of this implies that Perl would be doing a multiply
> for every array access, is this the case, or is it optimised away? If not, is
> there a way to use pointer arithmetic (or something that behaves similarly,
> but probably safer)? It is probably possible to restructure the array so that
> it can go up in ones, and do a foreach, but in most cases it'll be common to
> want both values, which is why I didn't do that.
>
> Anyone know any good resources for this kind of thing?
Hello Robin
The List::Util module's functions are implemented in C, so its sum() is a lot
faster than accumulating a total inside a Perl loop. Try this:
use List::Util qw/sum/;

my $users = get_all_users();
foreach my $user (@$users) {
    my $details = get_user_details($user);
    my $f = 1;
    $means{$user} = sum(grep $f ^= 1, @$details) / int(@$details / 2);
}
and if your details list is guaranteed to have an even number of elements, the
int() is redundant and you can remove it (but keep the parentheses, or the
division will parse differently!).
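In case the `$f ^= 1` trick looks opaque: $f starts at 1 and is flipped by the
grep block for every element, so the block is false at index 0, true at index 1,
and so on, keeping exactly the odd-indexed elements. A minimal sketch with
made-up data:

```perl
use strict;
use warnings;
use List::Util qw/sum/;

# Hypothetical record: (key, value) pairs flattened into one list,
# so the values of interest sit at odd indices 1, 3, 5, ...
my @details = ('a', 10, 'b', 20, 'c', 30);

# $f ^= 1 yields 0 for index 0 (dropped), 1 for index 1 (kept), 0, 1, ...
my $f = 1;
my $mean = sum(grep $f ^= 1, @details) / int(@details / 2);

print "$mean\n";    # (10 + 20 + 30) / 3 = 20
```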
Another thought: if the contents of the even-indexed array elements are unique
within each record, so that they can serve as hash keys, your loop could become:

foreach my $user (@$users) {
    my $details = get_user_details($user);
    my %details = @$details;
    # values() in scalar context (forced here by the division) is the count
    $means{$user} = sum(values %details) / values %details;
}
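To see the hash conversion in isolation: assigning a flat list to a hash pairs
up even-indexed elements as keys and odd-indexed elements as values. A small
sketch, with hypothetical sample data standing in for one get_user_details()
result:

```perl
use strict;
use warnings;
use List::Util qw/sum/;

# Hypothetical record: even indices are unique keys, odd indices are values
my $details = ['a', 10, 'b', 20, 'c', 30];

my %details = @$details;    # ('a' => 10, 'b' => 20, 'c' => 30)

# values() in list context feeds sum(); in scalar context it is the count
my $mean = sum(values %details) / values %details;

print "$mean\n";    # (10 + 20 + 30) / 3 = 20
```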
and if you could rewrite get_user_details() so that it builds a hash in the
first place, you'd be better off still. Following this line of thought further,
it would make sense to alter the unpack() call in get_user_details() so that it
returns only the significant data. Your loop then reduces to:
foreach my $user (@$users) {
    my $details = get_user_details($user);
    $means{$user} = sum(@$details) / @$details;
}
which is about as fast as you're going to get!
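As a sanity check that the rewrite computes the same thing as the original
indexed loop, here is a quick comparison on made-up data (get_user_details()
is stubbed out with literal lists, one in the original pair shape and one in
the proposed values-only shape):

```perl
use strict;
use warnings;
use List::Util qw/sum/;

# Hypothetical pair-shaped record, as get_user_details() returns it today...
my @pairs  = ('a', 10, 'b', 20, 'c', 30);
# ...and a values-only record, as it could return after the unpack() change
my @values = (10, 20, 30);

# Original indexed loop over the odd elements of the pair list
my ($sum, $count) = (0, 0);
for (my $i = 1; $i < @pairs; $i += 2) {
    $sum += $pairs[$i];
    $count++;
}
my $loop_mean = $sum / $count;

# Final values-only version: @$details in scalar context is its length
my $fast_mean = sum(@values) / @values;

print "loop=$loop_mean fast=$fast_mean\n";    # both print 20
```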
HTH,
Rob