Well, after a bit more reading, I now think that bitwise shifting is available. I was confused because all the floating-point operations are explicitly listed, whereas for the integer operations only the "atomic" operations are listed. But "atomic" appears to just mean a memory-locking (read-modify-write) operation, not the full set of integer operations. On page 50 of the CUDA programming guide they use a bitwise right shift to demonstrate fast division by 2, so clearly it's supported (and takes 4 clock cycles).
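To make the shift discussion concrete, here is a minimal sketch in plain C (the same shift expressions should be valid in CUDA device code). It shows the right-shift-as-division trick, and also Bill's suggestion (quoted below) of simulating a right shift by n with a left shift by 32 - n inside a 64-bit word. The function names are made up for illustration and do not come from the programming guide.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned division by 2 via a right shift. For unsigned values this
   gives exactly x / 2 (rounded down). */
uint32_t half_by_shift(uint32_t x)
{
    return x >> 1;
}

/* Bill's idea: put x in the low 32 bits of a 64-bit word and shift
   LEFT by 32 - n; the high 32 bits then hold x >> n.
   The final ">> 32" only selects the high half. On hardware where a
   64-bit value lives in a pair of 32-bit registers, that would amount
   to reading the upper register (an assumption, not something checked
   against the CUDA docs). */
uint32_t shr_via_shl(uint32_t x, unsigned n)
{
    assert(n <= 32);                          /* valid shift counts: 0..32 */
    uint64_t wide = (uint64_t)x << (32u - n); /* left shift only */
    return (uint32_t)(wide >> 32);            /* take the high word */
}
```

For example, shr_via_shl(0xDEADBEEF, 4) gives 0x0DEADBEE, the same as 0xDEADBEEF >> 4.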
The document also mentions that 32-bit multiply is currently very slow but will be available in a 4-clock-cycle version in future hardware.

--jason

On Sun, Nov 23, 2008 at 9:54 PM, Bill Hart <[EMAIL PROTECTED]> wrote:
>
> I think it is impossible to implement shift right with those
> primitives. Each of those primitives propagates information only to
> the left or not at all.
>
> However, given a 32 bit quantity in the lower 32 bits of a 64 bit
> word, shift right by n can be simulated by shift left by 32 - n.
>
> Bill.
>
> 2008/11/23 Jason Martin <[EMAIL PROTECTED]>:
>>
>> So, as I look over the CUDA specification I don't see support for some
>> important integer operations like: shift, rot, mul, and div. I
>> suppose that left shift could be implemented by repeated adds, but I
>> can't see an easy way to implement right shift (if I'm missing
>> something, or if rot is a simple thing to implement with just add,
>> sub, and, xor, or, then please correct me). Likewise, there's no
>> access to carry bits. Of course, we can deal without carry bits for
>> this much parallelism...
>>
>> But, it looks like a GeForce GTX 260 is only $220. So, I think I'll
>> go ahead and order one and start playing around with it.
>>
>> Jason Worth Martin
>> Asst. Professor of Mathematics
>> http://www.math.jmu.edu/~martin
>>
>> On Sun, Nov 23, 2008 at 4:22 PM, Bill Hart <[EMAIL PROTECTED]> wrote:
>>>
>>> Sorry I mean M4RI, not GF2X.
>>>
>>> 2008/11/23 Bill Hart <[EMAIL PROTECTED]>:
>>>> I looked up the NVIDIA Cuda docs here:
>>>> http://developer.download.nvidia.com/compute/cuda/2_0/docs/NVIDIA_CUDA_Programming_Guide_2.0.pdf
>>>>
>>>> It looks like section C2.3 describes an atomic Xor function. That
>>>> should be just what is needed for GF2X.
>>>>
>>>> I can see some definite potential in doing exact arithmetic too. One
>>>> would implement a floating point FFT. It doesn't matter that if one
>>>> wanted a proved result one would have to work with a hopelessly slow
>>>> bound.
>>>> With that many cores it would be irrelevant. You'd still be a
>>>> factor of 30-100 times faster than a single core machine!
>>>>
>>>> Bill.
>>>>
>>>> 2008/11/23 mabshoff <[EMAIL PROTECTED]>:
>>>>>
>>>>> On Nov 23, 12:38 pm, "Bill Hart" <[EMAIL PROTECTED]> wrote:
>>>>>> Perhaps if you, me, John C, mabshoff and the people he is working with
>>>>>> all signed off on it.
>>>>>
>>>>> The people I am working with here is basically Clement Pernet. There
>>>>> are also other people from the LinBox universe working on GPU code,
>>>>> i.e. Pascal Giorgi.
>>>>>
>>>>> Another interesting angle here could be m4ri since the XORing engine
>>>>> on the GPU should be insanely fast, but last time I talked to malb he
>>>>> wasn't very enthusiastic about it.
>>>>>
>>>>>> I could also mention the "seed funding" EPSRC have given me through my
>>>>>> grant, for hardware and my salary, specifically for developing "fast
>>>>>> core arithmetic for parallel processors and platforms".
>>>>>
>>>>> Cool.
>>>>>
>>>>>> We could actually make the application look quite impressive I think.
>>>>>
>>>>> One would hope so.
>>>>>
>>>>>> Bill.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Michael

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---