Hi,

in addition to what Philippe said, let me give you a short code snippet
used as a prototype for a couple of reductions in ViennaCL. It takes a
little while to get your head around it, but once you have figured it out,
it's like a Swiss army knife ;-)

The overall workflow is always the same: You decompose the full data
into large chunks that are distributed among the work groups (e.g.
individual rows for a matrix-vector product). Within each work group you
further decompose the work among the threads. Then you need to sum
(alternatives: min/max/xor/etc.) the per-thread values:

  // place thread results in __local array:
  shared_array[get_local_id(0)] = value_computed_by_respective_thread();

  // reduction loop (assumes the work group size is a power of two):
  for (uint stride = get_local_size(0)/2; stride > 0; stride /= 2)
  {
    barrier(CLK_LOCAL_MEM_FENCE);
    if (get_local_id(0) < stride)
      shared_array[get_local_id(0)] += shared_array[get_local_id(0) + stride];
  }

  // process result in shared_array[0], e.g. write to global memory:
  if (get_local_id(0) == 0)
    do_something_with_result(shared_array[0]);

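The barrier in the body of the for-loop is required to avoid data races:
it makes sure that all writes to shared_array from the previous step are
visible before any thread reads them. The last if-statement is merely for
processing the result and may also involve all threads rather than just
the first thread in the work group; in that case you need one more
barrier after the loop, so that every thread sees the final value in
shared_array[0].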
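
If you want to play with the pattern without an OpenCL device, here is a
minimal host-side C++ sketch that emulates it sequentially for row sums.
This is not ViennaCL code: the function names, the row-major layout, and
the way rows are handed out to work groups are assumptions made purely for
illustration. Running the "threads" one after another is safe here because
within one reduction step the entries being read (index >= stride) and the
entries being written (index < stride) never overlap, which is the same
property the barrier guarantees on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emulates one work group of `local_size` threads summing a single row.
// Hypothetical helper; local_size must be a power of two.
double emulate_workgroup_row_sum(const double* row, std::size_t cols,
                                 std::size_t local_size)
{
  assert(local_size > 0 && (local_size & (local_size - 1)) == 0);

  // stands in for the __local array
  std::vector<double> shared_array(local_size, 0.0);

  // each "thread" accumulates a strided slice of the row
  for (std::size_t lid = 0; lid < local_size; ++lid)
    for (std::size_t i = lid; i < cols; i += local_size)
      shared_array[lid] += row[i];

  // reduction loop: one pass of the inner loop corresponds to one
  // barrier-separated step on the device
  for (std::size_t stride = local_size / 2; stride > 0; stride /= 2)
    for (std::size_t lid = 0; lid < stride; ++lid)
      shared_array[lid] += shared_array[lid + stride];

  return shared_array[0];
}

// Row sums of a row-major rows x cols matrix A (assumed layout).
// Work group g handles rows g, g + num_groups, g + 2*num_groups, ...
std::vector<double> row_sums(const std::vector<double>& A,
                             std::size_t rows, std::size_t cols,
                             std::size_t num_groups, std::size_t local_size)
{
  std::vector<double> result(rows, 0.0);
  for (std::size_t group = 0; group < num_groups; ++group)
    for (std::size_t r = group; r < rows; r += num_groups)
      result[r] = emulate_workgroup_row_sum(&A[r * cols], cols, local_size);
  return result;
}
```

For column sums you would keep the reduction loop unchanged and only alter
which entries each thread accumulates in the first step.
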
Hope this helps :-)
Best regards,
Karli
On 07/27/2015 07:07 PM, Charles Determan wrote:
> Philippe,
>
> I definitely understand and support such a solution for ViennaCL. I
> don't mean to say it should be included, I was just curious what the
> current approach was. However, I am interested in additional OpenCL
> development outside of the framework. Do you have any recommendations
> in learning more about coding OpenCL reductions? As I mentioned above,
> I have only found very basic reduction approaches so far, nothing for
> slightly more complex scenarios like the column and row sum examples.
>
> If this is transitioning too far from the mailing list focus I would
> certainly appreciate a reply off list.
>
> Thank you,
> Charles
>
> On Mon, Jul 27, 2015 at 11:46 AM, Philippe Tillet wrote:
>
> Hi,
>
> Such row-wise / column-wise reductions could be generate-able by the
> OpenCL backend, but this won't work on the Host or CUDA backend.
> Plus, this is not really maintained at the moment. I would recommend
> Karl's solution, even though it won't be optimal when the vector
> does not fit in the L2 cache of the OpenCL device (Maxwell for
> example has 2MB of L2 cache), as the current algorithm for GEMV
> accesses the entire vector get_num_groups(0) times.
>
> Philippe
>
> 2015-07-27 9:40 GMT-07:00 Karl Rupp:
>
>
> > Excellent, thank you. I thought that would be the way to go initially
> > but I hesitated because of concerns about having additional temporary
> > objects taking up memory when matrices begin to get larger but it
> > certainly is simpler this way.
>
> Just pushed:
>
> https://github.com/viennacl/viennacl-dev/commit/4063c941235d46804cd448db7ddecf0c3238548f
>
> Yeah, it's a bit of a trade-off: Sure, one could optimize the summation
> kernel, but this also implies more code to maintain. On the other hand,
> I'm not aware (which, of course, does not deny a possible existence) of
> a scenario where such summation routines are the performance bottleneck.
>
> > Glad to hear that 1.7.0 is nearly completed. Does that mean we should
> > expect a formal release soon?
>
> Yep. Expect the release on Wednesday.
>
> Best regards,
> Karli
>
>
>
> > On Mon, Jul 27, 2015 at 9:57 AM, Karl Rupp wrote:
> >
> > Hi Charles,
> >
> > > I am working on writing some additional opencl kernels (potentially to
> > > incorporate in to viennacl) which involve column-wise reductions. A
> > > simple case would simply be the sum of each column of a matrix.
> > > However, I am having an extremely difficult time getting my kernel
> > > correct (reductions are tricky to me). That said, after searching for
> > > some resources I came across an old post on sourceforge referring to
> > > column-wise kernels
> > > (http://sourceforge.net/p/viennacl/mailman/message/27542552/) with
> > > viennacl. This leads me to my primary question.
> > >
> > > Are there such kernels already in ViennaCL that I have overlooked?
> >
> > Yes ;-) Have a look here at how row-wise sums reduce to a standard
> > matrix-vector product:
> > https://sourceforge.net/p/viennacl/discussion/1143678/thread/38e942a0/
> >
> > That is, in order to compute a row-sum and a column-sum you can use
> >   row_sum = prod(A, ones);
> >   col_sum = prod(trans(A), ones);
> >
> > In an hour or two I will push convenience functions for summation fixing
> > the only remaining issue for the 1.7.0 release:
> > https://github.com/viennacl/viennacl-dev/issues/127
> >
> >
> > > If not, are there any examples or resources you would recommend to help
> > > learn this topic? I have tried searching further but the only thing I
> > > can really find is a reduction of an entire matrix (which is relatively
> > > simple) as opposed to by column or row.
> >
> > At this point I can only recommend to think about how such operations
> > can be recast in terms of (standard) linear algebra. For example, row-
> > and column-wise updates to a matrix are special cases of the more general
> >   A += outer_prod(u, v);
> > operation (rank-1 updates). I'll improve the documentation in that
> > direction.
> >
> > Best regards,
> > Karli
> >
> >
> >
> ------------------------------------------------------------------------------
> > _______________________________________________
> > ViennaCL-devel mailing list
> > [email protected]
> <mailto:[email protected]>
> > <mailto:[email protected]
> <mailto:[email protected]>>
> > https://lists.sourceforge.net/lists/listinfo/viennacl-devel
> >
> >
>
>
>