On Jul 20, 1:12 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> What is CUDA? Why should the typical reader of sage-devel or user
> of Sage care? Any chance you could write a paragraph or two about
> this? It might get a lot more Sage developers excited about what
> you're doing (which I'm sure is extremely exciting).
I'm going to try to respond to the entire thread in this message.

First, a disclaimer: I've read a lot about CUDA, but I've never
actually written any CUDA programs. (Hopefully this will change very
soon... my new computer with a CUDA-capable graphics card should be
ready later this week.)

CUDA is NVidia's programming environment for exposing the
computational power of their graphics cards for general-purpose
computation. Current graphics cards are immensely powerful; for
instance, the current top-of-the-line NVidia card has (basically) 30
cores, each of which can do 8 single-precision floating-point
operations (including a multiply-and-add) per cycle at about 1.3 GHz,
and is available for $450. To get this speed, you need to be doing
"the same" computation on lots of different data; it's somewhat
similar to programming for SSE/Altivec/etc., although you end up with
code that looks quite different. (There's a small sketch of what this
looks like in the postscript below.) So for the sorts of things the
graphics card can do well, it's actually much faster than a CPU.

CUDA includes two programming languages (a C dialect and a low-level
assembly language), and a library to load programs into the graphics
card, send data back and forth, call the programs, etc. (There's also
a mode where you write your program in a combination of regular C and
CUDA's dialect; the CUDA tools compile the CUDA part themselves, pass
the regular parts to your regular C compiler, and automatically
construct glue code to tie the two together.)

Actually, the above is a simplification: CUDA includes two separate
libraries to load programs, exchange data, and call the programs, and
you apparently cannot mix and match them. CUDA includes fast BLAS and
FFT implementations that run on the GPU; to use these, you must use
the "high-level" API, but pycuda is based on the "low-level" API.
(Sketches of both APIs, and of a CUDA BLAS call, are also in the
postscript.)

Although CUDA is best known for fast single-precision floating point,
it does have a full complement of integer operations, so it should
also be useful for arbitrary-precision arithmetic, modular arithmetic
(preferably with a modulus < 2^24; see the last sketch below for why),
computations over GF(2), etc.

Until very recently, CUDA could only handle single-precision floating
point. The most recent products (the GTX 260, the GTX 280, the Tesla
C1060, and the Tesla S1070) support double-precision floating point,
but each core has only one double-precision FPU, so double-precision
operations happen at 1/8 the rate of single-precision ones.

mabshoff doesn't like the idea of recreating pycuda using Cython, but
I think it's reasonable. pycuda is actually pretty small (650 lines of
Python and 1325 lines of C++; the C++ would probably be replaced by a
much smaller number of lines of Cython). Doing the rewrite would also
give us a chance to switch from the low-level to the high-level API,
which would make it much easier (possible?) to use the CUDA BLAS and
FFT.

Note that the CUDA single-precision FPU is not quite IEEE-compliant:
denormal numbers (very small numbers) are not handled correctly,
division is slightly inaccurate, and there are a few other issues.

I was actually planning to start incorporating CUDA into Sage myself
sometime in the next few months, probably starting by rewriting pycuda
in Cython.

Carl
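P.S. Since I promised sketches above: here's roughly what a trivial
CUDA program looks like in the mixed C/CUDA-dialect mode, as far as I
can tell from the documentation (untested, since I don't have the
hardware yet). The same multiply-and-add runs in parallel, one array
element per GPU thread:

// saxpy.cu: kernel plus "high-level" (runtime) API host code.
// Allocate device memory, copy data over, launch, copy back.
// Compile with: nvcc saxpy.cu
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// __global__ marks a kernel: a function that runs on the GPU, once
// per thread.  Each thread does one multiply-and-add.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *x = (float *)malloc(bytes), *y = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    float *dx, *dy;
    cudaMalloc((void **)&dx, bytes);
    cudaMalloc((void **)&dy, bytes);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);

    // 4096 blocks of 256 threads: one thread per element.
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);

    cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", y[0]);   // expect 5.0
    cudaFree(dx); cudaFree(dy); free(x); free(y);
    return 0;
}

The <<<blocks, threads>>> launch syntax is the "glue code" part: nvcc
turns it into calls into the runtime library behind your back.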
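The "low-level" (driver) API, the one pycuda wraps, does the same job
much more manually: you compile the kernel to a .cubin file ahead of
time, load it at runtime, and marshal the arguments by hand at
explicit byte offsets. Roughly like this (again untested and with
error checking omitted; "saxpy.cubin" is a hypothetical precompiled
module holding the kernel above):

// driver_api.c: sketch of the low-level API; link with -lcuda.
#include <cuda.h>

void run_saxpy(int n, float a, CUdeviceptr dx, CUdeviceptr dy)
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;
    int offset = 0;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    cuModuleLoad(&mod, "saxpy.cubin");      // load GPU code at runtime
    cuModuleGetFunction(&fn, mod, "saxpy");

    // Arguments go in by hand at explicit byte offsets.
    cuParamSeti(fn, offset, n);               offset += sizeof(int);
    cuParamSetf(fn, offset, a);               offset += sizeof(float);
    cuParamSetv(fn, offset, &dx, sizeof(dx)); offset += sizeof(dx);
    cuParamSetv(fn, offset, &dy, sizeof(dy)); offset += sizeof(dy);
    cuParamSetSize(fn, offset);

    cuFuncSetBlockShape(fn, 256, 1, 1);
    cuLaunchGrid(fn, (n + 255) / 256, 1);
    cuCtxDetach(ctx);
}

This byte-offset marshalling is exactly the sort of boilerplate a
Cython rewrite of pycuda would wrap once and for all; it's also why I
expect the 1325 lines of C++ to collapse into much less Cython.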
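The CUDA BLAS sits on top of the high-level API and, apart from the
explicit transfers, looks like ordinary BLAS. A sketch of C = A*A in
single precision (again written from the docs, untested; link with
-lcublas):

// cublas_demo.c: calling the CUDA BLAS through the high-level API.
#include <stdio.h>
#include <stdlib.h>
#include <cublas.h>

int main(void)
{
    const int n = 512;
    float *a = (float *)malloc(n * n * sizeof(float));
    float *c = (float *)malloc(n * n * sizeof(float));
    for (int i = 0; i < n * n; i++) a[i] = 1.0f;

    cublasInit();
    float *da, *dc;
    cublasAlloc(n * n, sizeof(float), (void **)&da);
    cublasAlloc(n * n, sizeof(float), (void **)&dc);
    cublasSetMatrix(n, n, sizeof(float), a, n, da, n);

    // C = A * A on the GPU (column-major, BLAS conventions).
    cublasSgemm('n', 'n', n, n, n, 1.0f, da, n, da, n, 0.0f, dc, n);

    cublasGetMatrix(n, n, sizeof(float), dc, n, c, n);
    printf("c[0] = %f\n", c[0]);   // expect 512.0
    cublasFree(da); cublasFree(dc);
    cublasShutdown();
    free(a); free(c);
    return 0;
}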
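Finally, the integer side. I believe the "modulus < 2^24" preference
comes from the hardware's fast 24-bit integer multiply (the __umul24
intrinsic); a full 32x32-bit multiply compiles to several instructions
on current cards. In the sketch below I keep the modulus under 2^16 so
that the 32-bit result of __umul24 holds the exact product of two
residues; moduli all the way up to 2^24 would also need the high word
of the product (e.g. via __umulhi). Untested, as before:

// modp.cu: elementwise arithmetic mod a small prime p (p < 2^16),
// plus vector addition over GF(2).
#include <cuda_runtime.h>

__global__ void mulmod(unsigned int *c, const unsigned int *a,
                       const unsigned int *b, unsigned int p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        // Fast 24-bit multiply; exact here since a[i], b[i] < 2^16.
        // (Integer % is slow on the GPU but correct.)
        c[i] = __umul24(a[i], b[i]) % p;
}

__global__ void addgf2(unsigned int *c, const unsigned int *a,
                       const unsigned int *b, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        // Each 32-bit word packs 32 GF(2) coefficients, so one XOR
        // does 32 field additions at once.
        c[i] = a[i] ^ b[i];
}

That XOR trick is part of why these cards look interesting for GF(2)
computations, even though the marketing is all about floating point.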