The clpp project is also out there. The cantankerous issue with OpenCL is getting the workgroups mapped properly to each compute unit so that they can stream independently in each phase.
I CC'd Sean Baxter, who is the current guru. I will be traveling to San Diego later this week, so I won't be able to help out until next Monday.

-Chad

On Jan 16, 2012 2:19 PM, "Dieter Morgenroth" <[email protected]> wrote:

> Hi,
> I also worked on a radix sort implementation. I had a rough working
> implementation, but I found that numpy.argsort was much faster on my
> machine, so I delayed that task for now. But if someone comes up with a
> fast generic solution I would also be interested.
> I used the sorting for an SPH simulation.
> http://youtu.be/1hHELRSCIm8
> I have only a notebook with an ATI graphics card. At least on that, the
> numpy sort was about 5 times faster, even on several million entries.
> Dieter
>
> On 15.01.2012 22:22, Ian Johnson wrote:
>
> Hi Andreas,
>
> That code is the latest. I haven't touched it in a long time, since my
> work has taken me away from OpenCL for the time being. As for the
> licensing, I put in an MIT license, so it's free as far as I'm concerned.
> Some of the radix code comes straight from the NVIDIA SDK example. We had
> to modify it a good bit to sort keys and values, but I'm not sure what
> their licenses are.
>
> This is also definitely not the best implementation of radix, as there
> is a much faster (and open) CUDA implementation. I would have hoped it
> would be ported to OpenCL by now, and there is this project:
> http://code.google.com/p/ocl-radix-sort/ which is GPL.
>
> Good luck! I'd like to hear about any improvements that come along!
> Ian
>
> On Sun, Jan 15, 2012 at 9:35 AM, Andreas Kloeckner <[email protected]> wrote:
>
>> Hi Ian,
>>
>> On Sun, 17 Apr 2011 22:29:41 -0400, Ian Johnson <[email protected]> wrote:
>> > I finally bit the bullet and got radix working in PyOpenCL :)
>> > It's also improved over the SDK example because it does keys and
>> > values, mostly thanks to my advisor.
>> > Additionally, this sort will handle any size array as long as it is a
>> > power of 2. The shipped example does not allow for arrays smaller
>> > than 32768, but I've hooked up their naive scan to allow all smaller
>> > arrays.
>> >
>> > https://github.com/enjalot/adventures_in_opencl/tree/master/experiments/radix/nv
>> > All you really need are radix.py, RadixSort.cl and Scan_b.cl.
>> >
>> > Some simple tests are at the bottom of radix.py.
>> >
>> > I hammered this out because I need it for a project. It's not all
>> > that clean, and I didn't add support for sorting on keys only
>> > (although it wouldn't take much to add that, and I intend to at a
>> > later time when I need the functionality). Hopefully this helps
>> > someone else out there. I'll also be porting it using my own OpenCL
>> > C++ wrappers to include in my fluid simulation library at some point.
>> >
>> > I also began looking at AMD's radix from their SPH tutorial, but they
>> > use local atomics, which are not supported on my 9600M.
>>
>> Out of personal need, I'm thinking of bringing some kind of sort
>> functionality into PyOpenCL. I saw that you made a number of
>> enhancements to your sort code since you sent the announcement. Is your
>> most recent sort code still in the repo above? What is the license for
>> that code? More generally, what course of action would you recommend?
>>
>> Thanks in advance for your help,
>> Andreas
>
> --
> Ian Johnson
> http://enja.org
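
For anyone following the thread who wants something to check a GPU sort against, here is a minimal CPU-side sketch of a key/value radix sort in plain Python. It is not taken from radix.py or from any of the projects mentioned above; the function name `radix_sort_pairs` and its parameters are made up for illustration, and it assumes non-negative integer keys that fit in `key_bits` bits. It only shows the per-pass phases (digit histogram, exclusive scan, stable scatter) that a GPU version would spread across workgroups.

```python
def radix_sort_pairs(keys, values, bits_per_pass=4, key_bits=32):
    """Stable LSD radix sort of non-negative integer keys, carrying values along.

    Hypothetical reference helper, not part of PyOpenCL or any project above.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    radix = 1 << bits_per_pass
    mask = radix - 1
    keys, values = list(keys), list(values)

    for shift in range(0, key_bits, bits_per_pass):
        # Phase 1: histogram of the digit examined in this pass.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & mask] += 1

        # Phase 2: exclusive prefix sum (scan) gives each digit's start offset.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]

        # Phase 3: stable scatter of keys and their values to the new positions.
        out_keys = [0] * len(keys)
        out_values = [None] * len(values)
        for k, v in zip(keys, values):
            d = (k >> shift) & mask
            pos = offsets[d]
            out_keys[pos] = k
            out_values[pos] = v
            offsets[d] = pos + 1

        keys, values = out_keys, out_values

    return keys, values
```

Calling `radix_sort_pairs(gpu_input_keys, gpu_input_values)` on a copy of the input and comparing the two returned lists with the device output is one way to validate a GPU implementation on small arrays. Because every pass is stable, equal keys keep their original relative order.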
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
