Re: [Perldl] Request for my response to a PDL question on Stack Overflow

Chris Marshall Fri, 20 Apr 2012 13:45:55 -0700

You could also point out that the vec() approach can
be used on ${$pdl->get_dataref}, so it is possible
to have your cake and eat it too...
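
A minimal sketch of that idea, not taken from the attached test code. It assumes a byte-typed piddle, because vec() with BITS of 16 or more reads and writes big-endian values, which would not match PDL's native layout on little-endian machines. The size and the indices are arbitrary illustration values.

```perl
use strict;
use warnings;
use PDL;

# Byte-typed piddle: one byte per element, so vec() with BITS=8 lines up
# with PDL's element layout regardless of machine byte order.
my $pdl  = zeroes(byte, 10);
my $dref = $pdl->get_dataref;    # reference to the packed data string

# Increment counters in place through vec(); the indices are arbitrary.
for my $i (3, 7, 3) {
    vec($$dref, $i, 8) = vec($$dref, $i, 8) + 1;
}

$pdl->upd_data;                  # tell PDL the underlying string was modified
print "$pdl\n";
```

After upd_data the same piddle can be used with the usual PDL operations, so the vec() access and the PDL access share one block of memory.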


2012/4/20 Chris Marshall <[email protected]>:
> David-
>
> I reran the final tests but included the vec approach
> as well.  vec is roughly comparable to the perl list
> for performance, while PDL with indadd starts to
> outperform the perl list approach somewhere between
> $N=100 and $N=1000.  PDL can be 3-6X faster (or more)
> for large iteration counts.  Considering that PDL:
>
>  - has the same memory footprint as processing with vec
>
>  - has much more powerful and flexible data manipulation
>
>  - has simpler syntax for accessing elements (NiceSlice!)
>
>  - does all of this without custom XS, PP, or Inline::C
>
> It is arguably the best alternative for this general
> class of problems.  The output files from my
> timing runs and the test code are attached.
>
> --Chris
>
> On Fri, Apr 20, 2012 at 2:52 PM, Chris Marshall <[email protected]> 
> wrote:
>> I reran the timings for both cygwin perl and
>> asperl on the same computer (without the
>> virus scan running) and got more consistent
>> answers (without the low-$N anomalies) than
>> in yesterday's runs.
>>
>> The performance cross-over between perl and
>> PDL is somewhere between $N=100 and
>> $N=1000 with PDL being from 3-6X faster
>> than perl for $N>10000.  I also note that
>> cygwin is 2X slower than the native perl
>> so there is good reason to make win32
>> PDL fully compatible.  :-)
>>
>> $N         Cygwin   ASperl
>> ---------------------------
>> 1          0.02     0.02
>> 10         0.05     0.08
>> 100        0.4      0.6
>> 1000       1.9      3.2
>> 10000      3.4      6.6
>> 100000     3.6      6.5
>> 1000000    3.6      6.1
>>
>> --Chris
>>
>> On Fri, Apr 20, 2012 at 8:29 AM, David Mertens <[email protected]> 
>> wrote:
>>> No, I'm just not very experienced with Benchmarks. :-)
>>>
>>> I'll look into it later today.
>>>
>>> On Apr 19, 2012 4:29 PM, "Chris Marshall" <[email protected]> wrote:
>>>>
>>>> I notice you are still running a very small
>>>> number of timings.  How are your timings
>>>> if you change the -1 to -10 in cmpthese?
>>>> Is there a specific reason to leave that off
>>>> of the update?
>>>>
>>>> --Chris
>>>>
>>>>
>>>> On Thu, Apr 19, 2012 at 5:11 PM, David Mertens <[email protected]>
>>>> wrote:
>>>> > Wow, that makes a sizeable difference, especially for larger values of
>>>> > $updates_per_round, where it doesn't need to constantly re-allocate
>>>> > temporary memory. I've updated the post to reflect that.
>>>> >
>>>> > Thanks, Chris!
>>>> > David
>>>> >
>>>> >
>>>> > On Thu, Apr 19, 2012 at 3:19 PM, Chris Marshall <[email protected]>
>>>> > wrote:
>>>> >>
>>>> >> I don't have a stackoverflow.com account but if I use
>>>> >> your original test code, increase the cpu seconds for
>>>> >> the benchmark and replace the dataflow increment
>>>> >> code by the indadd() routine, I get speeds from 50X
>>>> >> for $N=1, down to 1.9X for $N=1000, and then up to
>>>> >> 3-4X as the sizes increase to 10000000 where I
>>>> >> stopped testing.
>>>> >>
>>>> >> The key here is to avoid creating and destroying piddles
>>>> >> since the computational work involved here is *very*
>>>> >> light.  In fact, an Inline::PP routine to handle the core
>>>> >> of an algorithm of interest (if more than indadd) + the
>>>> >> existing PDL should perform very well for this use
>>>> >> case---according to the benchmarks on my system:
>>>> >>
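>>>> >> use strict;
>>>> >> use warnings;
>>>> >> use PDL;
>>>> >> use Benchmark qw/cmpthese/;
>>>> >>
>>>> >> # Number of random increments per benchmark iteration (default 1)
>>>> >> my $updates_per_round = shift || 1;
>>>> >>
>>>> >> my $N = 1_000_000;
>>>> >> my @perl = (0 .. $N - 1);
>>>> >> my $pdl = zeroes $N;
>>>> >>
>>>> >> # Run each candidate for at least 10 CPU seconds
>>>> >> cmpthese(-10,{
>>>> >>    perl => sub{
>>>> >>        $perl[int(rand($N))]++ for (1..$updates_per_round);
>>>> >>    },
>>>> >>    pdl => sub{
>>>> >>        my $to_update = long(random($updates_per_round) * $N);
>>>> >>        # indadd updates $pdl in place at the given indices
>>>> >>        indadd(1,$to_update,$pdl);
>>>> >>        # Original dataflow increment, kept for comparison:
>>>> >>        ## $pdl->index($to_update)++;
>>>> >>    }
>>>> >> });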
>>>> >>
>>>> >>
>>>> >> Cheers,
>>>> >> Chris
>>>> >>
>>>> >> On Wed, Apr 18, 2012 at 10:11 AM, David Mertens
>>>> >> <[email protected]> wrote:
>>>> >> > Hey folks -
>>>> >> >
>>>> >> > There's a PDL question on Stack Overflow
>>>> >> > (http://stackoverflow.com/questions/9730678/c-like-arrays-in-perl)
>>>> >> > to which I submitted an answer. I believe my answer is better
>>>> >> > than the currently marked best answer, but it looks like the OP
>>>> >> > either disagrees or simply has not returned to read my answer.
>>>> >> > I would appreciate it if those of you who have Stack Overflow
>>>> >> > accounts could read the responses and up-vote what you think
>>>> >> > is the best answer.
>>>> >> >
>>>> >> > Thanks!
>>>> >> > David
>>>> >> >
>>>> >> > --
>>>> >> >  "Debugging is twice as hard as writing the code in the first place.
>>>> >> >   Therefore, if you write the code as cleverly as possible, you are,
>>>> >> >   by definition, not smart enough to debug it." -- Brian Kernighan
>>>> >> >
>>>> >> >
>>>> >> > _______________________________________________
>>>> >> > Perldl mailing list
>>>> >> > [email protected]
>>>> >> > http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>>>> >> >

