The PDL matmult routine was originally implemented
as a loop over inner products.  Last year it was redone
as a cache-friendly tiled implementation.
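The difference between the two is roughly this (a pure-Python sketch of the tiling idea only; the real routine is compiled C, and the tile size there is chosen with the cache in mind):

```python
def matmult_tiled(a, b, tile=64):
    """Multiply 2-D lists a (n x m) and b (m x p) in tile-sized
    blocks, so each block of b stays hot in cache while it is
    reused.  A sketch of the idea, not the actual PDL code."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, tile):          # block rows of a and c
        for kk in range(0, m, tile):      # block cols of a / rows of b
            for jj in range(0, p, tile):  # block cols of b and c
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, p)):
                            c[i][j] += aik * b[k][j]
    return c
```

The original loop-over-inner-products version is the same computation with the three outer block loops removed; the tiling changes only the traversal order, not the result.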

--Chris

On 5/22/11, Daniel Carrera <[email protected]> wrote:
> Hi Dima,
>
> On 05/22/2011 11:45 PM, Dima Kogan wrote:
>> The new functionality in PDL is able to distribute operations created
>> by PDL threading into separate processor threads. This takes effect
>> if, for example, you use PDL to multiply a 5000x5000x5 piddle by a
>> 5000x5000 piddle. PDL threading treats this as 5 separate
>> multiplications of 5000x5000 matrices, and the new code will
>> parallelize this. However, if you're simply multiplying two 5000x5000
>> matrices together, there is no PDL threading involved, so the new patch
>> will do nothing.
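To make Dima's point concrete, here is a toy pure-Python model of those semantics (my names, and a sketch only; in PDL the slices are piddles and the work is done in compiled code across processor threads):

```python
def matmult(a, b):
    """Plain multiply of 2-D lists a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def threaded_matmult(stack, b):
    """PDL-threading semantics over the extra leading dimension:
    a stacked multiply is just N independent 2-D multiplies, and
    each one can be handed to a separate processor thread."""
    return [matmult(a, b) for a in stack]
```

In the 5000x5000x5 case, `stack` holds 5 matrices and the 5 calls are the independent units the new code parallelizes; with a single 5000x5000 multiply there is only one unit, so there is nothing to distribute.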
>
>
> Ah, thanks. That makes everything a lot clearer now.
>
>
>> It COULD do something if we defined matrix multiplication as a bunch
>> of matrix-vector multiplications threaded together. Then the
>> parallelization would 'just work', but we don't define matrix
>> multiplication this way. (Sorta off-topic: should we change the
>> multiplication definition to this?)
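For the record, that decomposition looks like this in toy pure-Python form (my names; the point is that the per-column products are independent, which is exactly the shape PDL threading can parallelize):

```python
def matvec(a, v):
    """Multiply 2-D list a (n x m) by a vector v of length m."""
    return [sum(a[i][k] * v[k] for k in range(len(v)))
            for i in range(len(a))]

def matmult_by_columns(a, b):
    """A*B computed as one matrix-vector product per column of B.
    Each column is independent, so threading them 'just works'."""
    cols = [matvec(a, [row[j] for row in b])   # j-th column of B
            for j in range(len(b[0]))]
    # cols[j][i] holds element (i, j) of the product; transpose back
    return [[cols[j][i] for j in range(len(cols))]
            for i in range(len(a))]
```

Whether slicing the multiply into columns this way beats a single tuned whole-matrix kernel is a separate question, as Daniel's Fortran experience below suggests.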
>
>
> This may not apply to PDL, but last year I tried something like this
> using OpenMP (i.e. threads) and Fortran, and the "parallel" code was
> actually slower.
>
> In Fortran, when I just did "matmul(A,B)" the compiler wrote a loop that
> accessed memory very efficiently, and by forcing matrix-vector products
> I ruined that optimization and made the code slower. But I have no idea
> if this has any relevance to PDL.
>
> --
> I'm not overweight, I'm undertall.
>
> _______________________________________________
> Perldl mailing list
> [email protected]
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>
