Wow, Puneet really stirred us all up (again).  Puneet, as David said, your PDL code is slow because you are using a complicated expression, which forces PDL to create and destroy intermediate PDLs (every binary operation has to allocate a complete temporary PDL to store its result, and then free it!).  I attach a variant of your test, with the operation carried out in place as much as possible to eliminate extra allocations.  PDL runs almost exactly a factor of 10 faster on my computer than does raw Perl in this case.
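
To make the temporaries concrete, here is a minimal sketch (not the attached test; the variable names and the array size of 1000 are made up for illustration). It assumes a stock PDL install and compares a single compound expression with a stepwise version that does most of its work in place:

```perl
use strict;
use warnings;
use PDL;

my $x = random(1000);   # hypothetical sample data
my $y = random(1000);

# Single expression: $x**2, $y**2 and their sum are each a full-size
# temporary, and sqrt allocates one more PDL for the result.
my $r1 = sqrt($x**2 + $y**2);

# Stepwise version: two allocations, everything else in place.
my $r2  = $x * $x;      # allocation 1: will hold the final result
my $tmp = $y * $y;      # allocation 2: scratch space
$r2 += $tmp;            # accumulate in place
$r2->inplace->sqrt;     # square root in place

print "max difference: ", abs($r1 - $r2)->max, "\n";
```

Both versions should give the same numbers; the second simply asks PDL for fewer full-size buffers along the way.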

Note that the original ingestion of the Perl array to PDL is quite slow:  it generally takes slightly longer to create the PDL than to generate the random numbers and create the Perl array in the first place!  That is because PDL has to make several passes through the Perl array to determine its size, and then has to individually probe and convert each numeric value in the Perl array.
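
If the data do not have to start life as a Perl array, the conversion can be skipped altogether. A rough sketch, assuming the same 0..9 integer values as in the test (the row count `$n` is a made-up figure, and the values come out as integer-valued doubles rather than a true integer type):

```perl
use strict;
use warnings;
use PDL;

my $n = 100_000;        # hypothetical row count

# Slow path: build a Perl array of array refs, then convert it.
my @dat = map { [ int(rand(10)), int(rand(10)), int(rand(10)) ] } 1 .. $n;
my $from_perl = pdl(\@dat);                 # dims: 3 x $n

# Direct path: let PDL generate the values itself, no Perl array needed.
my $direct = floor(random(3, $n) * 10);     # dims: 3 x $n, values 0..9

print "from Perl array: ", join(' x ', $from_perl->dims), "\n";
print "direct:          ", join(' x ', $direct->dims), "\n";
```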

 
On Jul 9, 2010, at 1:09 AM, David Mertens wrote:

FYI, for really thorough timing results, check out Devel::NYTProf: http://search.cpan.org/~timb/Devel-NYTProf-4.03/lib/Devel/NYTProf.pm

You have a lot of things going on to mix up the results - you have both a memory allocation and a calculation. As I understand it, Perl will likely outperform PDL in the memory allocation portion of this exercise, but PDL should eat Perl's lunch for the calculation portion.

Perl will outperform PDL in the memory allocation because in all likelihood, it doesn't perform any allocation with the push. It likely already allocated more than three elements for (all of) its arrays, so pushing the new value on the array does not cost anything, except for a higher up-front memory cost. I suspect this is where PDL is losing to Perl - Perl is performing the allocation ahead of where you start the timer.

In terms of the calculation itself, PDL should far outperform Perl. The reason is that the actual contents of the calculation loop are very slim, so the Perl stack manipulation should account for a significant share of the loop's cost. Perl for loops usually make sense because the code inside them often involves I/O operations or other such things, in which case the Perl stack manipulations comprise only a small portion of the total compute time.

Try a situation in which both Perl and PDL allocate their memory as part of the timing and see what that gives.
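
One way such a test might look, as a rough sketch only: each side builds its 4-column result from scratch inside the timed region. The size `$n` and all variable names are made up for illustration, and no particular outcome is implied.

```perl
use strict;
use warnings;
use PDL;
use Time::HiRes qw(time);

my $n   = 100_000;      # hypothetical row count
my @dat = map { [ map { int(rand(10)) } 1 .. 3 ] } 1 .. $n;
my $p   = pdl(\@dat);   # dims: 3 x $n

# Perl: build a fresh 4-element row per input row, so allocation is timed.
my $t0  = time();
my @out = map { [ @$_, sqrt($_->[0]**2 + $_->[1]**2) ] } @dat;
my $perl_secs = time() - $t0;

# PDL: allocate the 4 x $n result inside the timed region as well.
$t0 = time();
my $res = zeroes(4, $n);
$res->slice('0:2') .= $p;
$res->slice('(3)') .= sqrt($p->slice('(0)')**2 + $p->slice('(1)')**2);
my $pdl_secs = time() - $t0;

printf "perl: %.4g secs\npdl:  %.4g secs\n", $perl_secs, $pdl_secs;
```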

David

--
Sent via my carrier pigeon.
_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl

use PDL;
use PDL::NiceSlice;
use Time::HiRes q/gettimeofday/;

sub get_time{
    my ($sec, $usec) = gettimeofday();
    my $out =  $sec + $usec*1e-6;
    return $out;
}

for my $c(10000, 100000, 1000000, 2000000, 3000000)  {
    
    my @dat = ();
    my $t0 = get_time();
    for (1 .. $c) {
        push @dat, [int(rand(10)), int(rand(10)), int(rand(10))];
    }
    my $t1 = get_time();
    printf("\n\n\ndat took %.3g secs to prepare\n", $t1 - $t0);

    $t0 = get_time();    # reuse the timers declared above
    $a = pdl @dat;
    $t1 = get_time();
    printf("pdl took %.3g secs to prepare\n", $t1 - $t0);
    
    print "testing...\n";
    by_pdl($a, $c);
    by_arr(\@dat, $c);
}
#


sub by_arr {
    my $t0 = get_time();
    
    my ($dat, $count) = @_;
    
    for (@$dat) {
        push @$_, (($_->[0] ** 2) + ($_->[1] ** 2)) ** 0.5;
    }
    
    my $t1 = get_time();
    
    printf("array: $count: %.4g secs\n",  ($t1 - $t0) );
}

sub by_pdl {
    my $t0 = get_time();
    
    my ($a, $count) = @_;
    
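    # Allocate the 4 x N result once; all later steps write into it.
    my $b = PDL->new_from_specification(4, $a->((0))->dims);
    $b->(0:2) .= $a;                    # copy the three input columns
    $b->(0:2) *= $b->(0:2);             # square them in place
    # Note: this sums all three squared columns, while by_arr uses only
    # the first two; sum only columns 0 and 1 to match it exactly.
    $b->((3)) .= $b->(0:2)->sumover;
    $b->((3))->inplace->sqrt;           # square root in place
    $b->(0:2) .= $a;                    # restore the original values
    $bb = $b;                           # stash the result in a global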
    my $t1 = get_time();
    
    printf("pdl:   $count: %.4g secs\n",$t1-$t0);
}

