Wow, I didn't drink my coffee this morning.  I see that I've only got
half the story...
Sorry about the noise.

On Thu, Sep 18, 2014 at 10:12 AM, Diab Jerius <[email protected]> wrote:
> For just this reason, if I don't need to initialize the piddle, I tend
> to use new_from_specification, even though it is marked internal only.
>
> use PDL;
> use Benchmark;
>
> my $pdl;
> my $nelem = 1_000_000;
>
> timethese(
>     100000,
>     {
>         zeroes => sub { $pdl = zeroes( byte, $nelem ) },
>         new_from_specification => sub {
>             $pdl = PDL->new_from_specification( byte, $nelem );
>         },
>     } );
>
> results in:
>
> Benchmark: timing 100000 iterations of new_from_specification, zeroes...
> new_from_specification:  1 wallclock secs ( 0.83 usr +  0.00 sys =  0.83 CPU) @ 120481.93/s (n=100000)
>                 zeroes: 74 wallclock secs (74.34 usr +  0.01 sys = 74.35 CPU) @ 1344.99/s (n=100000)
>
> Maybe we can remove the "internal only" flag on new_from_specification
> (or wrap it with something more intuitively named)?
>
>
>
>
> On Thu, Sep 18, 2014 at 8:52 AM, Chris Marshall <[email protected]> wrote:
>> Hi Roey-
>>
>> You haven't missed anything.  zeros() is a pretty high level
>> implementation and definitely is not optimized for anything
>> like byte operation.  I would guess that a lot of the difference
>> is just from a byte loop in PDL::zeros() versus a memory
>> copy.
>>
>> Have you tried just pre-allocating a zero piddle and then
>> copying it to create each new buffer?
>>
>>   $zero_1M = zeros(byte, 1_000_000);
>>   $buffer = $zero_1M->copy;
>>
>> Also, please feel free to put a ticket on our sf.net Feature
>> Request tracker requesting improved performance, with
>> sample code to demonstrate the timing difference.
>>
>> --Chris
>>
>>
>> On Thu, Sep 18, 2014 at 7:22 AM, Roey Almog (Infoneto Ltd)
>> <[email protected]> wrote:
>>> Hi,
>>>
>>> I am a bit new to PDL, so if the answer is obvious I have somehow
>>> missed it...
>>>
>>> I have around 1 million vectors, each about 1 million elements long;
>>> each element is one byte in size.
>>>
>>> The information is streamed in, one vector at a time, with input data. The
>>> data itself is not read from a file; it's real-life information being
>>> collected, so every time I run the program I get something else.
>>>
>>> I found that initializing the vectors with zeros takes a long time. It seems
>>> that zeros iterates over all the cells, setting them to zero.
>>>
>>> So I created an empty pdl with zeros, and each time I need to create a PDL I
>>> use this method:
>>>
>>> # this is done once
>>> my $pdl_template = zeros(byte, $size);
>>> my $buffer_template = $pdl_template->get_dataref;
>>>
>>>
>>> # now when getting the information
>>> my $pdl = PDL->new_from_specification(byte, $size);
>>> my $ptr_buffer = $pdl->get_dataref;
>>> $$ptr_buffer = ${$buffer_template};
>>> $pdl->upd_data;
>>>
>>> This is about 7 times faster than zeros. However, it would be better if zeros
>>> (and also ones) had some optimization to clear an empty vector even faster,
>>> using C's memset function when applicable (bytes, shorts, integers, etc.).
>>>
>>> As this seems very useful, I wonder if I missed something?
>>>
>>> Thanks
>>> Roey
>>>
>>> _______________________________________________
>>> Perldl mailing list
>>> [email protected]
>>> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>>>
>>
