On 3/12/2011 4:51 PM, Ivan Shmakov wrote:
>>>>>> Chris Marshall <[email protected]>  writes:
>>>>>> On 3/12/2011 3:44 PM, Ivan Shmakov wrote:
>
>   >>  However, the only way I see for the topmost operator to be “seen” to
>   >>  PDL before its arguments are actually computed is the delayed (AKA
>   >>  lazy) evaluation.  Which is quite common in certain languages (and
>   >>  some, like Haskell, are built all around such a notion), but which
>   >>  I've rarely (if at all) seen being done in Perl.  And I'm not sure
>   >>  that extending PDL “to do it the lazy way” could be at all easy.
>   >>  (And we'd have to watch for all the PDL instances involved in such a
>   >>  computation to not become tampered along the way, as such tampering
>   >>  should wait for the computation to complete first.)
>
>   >  Lazy evaluation is one approach of interest.  The current dataflow
>   >  support for slicing is similar to that.
>
>       ACK.  I'd probably take a glance at that.
>
>   >  Another option would be adding something like a forall construct to
>   >  directly support these types of operations.

http://en.wikipedia.org/wiki/Fortran_95_language_features#The_FORALL_Statement_and_Construct

As you can see, this notation makes simple, independent
parallel operations easy to write and, I hope, to implement.

>       It could do the thing.  A simple model could be like:
>
>      my $a
>          = distribute {
>              ## .
>              $_[0]->plus ($_[1], 0)->mult ($_[2], 0);
>          } ($x, $y, $z);
>
>       (Or with a code reference.)
>
>       It'd be necessary for distribute () to degenerate all the inner
>       distribute ()'s into mere function applications; thus, some
>       global flag is to be maintained.
>
>       Unfortunately, such an approach (without lazy evaluation)
>       “breaks” the use of functions, or, to be more precise, it
>       somewhat “penalizes” the use of composition.  E. g., the
>       computation of $a = f (g ($b)), should f () and g () be both
>       distribute ()'d, will still imply that g ($b) will be computed
>       first, and the result is to be clumped together by the
>       distribute () in g () and split back again by the distribute ()
>       in f (), which doesn't look particularly efficient.

forall { $a($i,$j) = f( g( $b($i,$j) ) ) } ($i=1:$n, $j=1:$m);
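To make the idea concrete, here is a minimal plain-Perl sketch of
what such a forall() helper might look like.  This is purely
hypothetical (forall() is not an existing PDL function), and it uses
nested Perl arrays rather than real ndarrays, just to show the
intended calling convention:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical forall(): apply an elementwise block over the index
# ranges 0..$n-1 and 0..$m-1.  Each ($i,$j) iteration is independent,
# which is what would make the loop a candidate for parallel or GPU
# execution.  Nothing here is existing PDL API.
sub forall {
    my ($body, $n, $m) = @_;
    for my $i (0 .. $n - 1) {
        for my $j (0 .. $m - 1) {
            $body->($i, $j);
        }
    }
}

# Toy use: $a(i,j) = f(g(b(i,j))) with g(x) = 2*x and f(x) = x + 1,
# composed inside a single pass rather than two separate ones.
my @b = ([1, 2], [3, 4]);
my @a;
forall(sub {
    my ($i, $j) = @_;
    $a[$i][$j] = (2 * $b[$i][$j]) + 1;
}, 2, 2);
# @a is now ([3, 5], [7, 9]).
```

Note that the composition f(g(...)) happens per element inside the
block, so there is no intermediate "clump and split" step of the
kind the nested distribute() calls would incur.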


>       Perhaps I'm exaggerating the whole issue, though.
>
>   >>  Still, individual methods could probably be made
>   >>  “multicore-enabled” without much effort.  I'm, however, unsure
>   >>  about the overall effect it may have upon the performance, etc.
>
>   >  These type of issues will also be involved in effective GPU use with
>   >  PDL.
>
>       IIUC, contemporary GPUs are considerably less flexible than
>       CPUs.  It'd require some research to get an idea of how to
>       plan a computation for a GPU so that it could be performed
>       more or less efficiently.

A forall-style distributed computation maps naturally
onto the CUDA programming model for Nvidia GPUs.
The key to efficient GPU computation is to minimize IO
between the GPU and CPU memory spaces.

--Chris

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
