On Jul 20, 1:12 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> What is CUDA?  Why should the typical reader of sage-devel or user
> of Sage care?  Any chance you could write a paragraph or two
> about this?  It might get a lot more Sage developers excited about
> what you're doing (which I'm sure is extremely exciting).

I'm going to try to respond to the entire thread in this message.

First, a disclaimer: I've read a lot about CUDA, but I've never
actually written any CUDA programs.  (Hopefully this will change very
soon... my new computer with a CUDA-capable graphics card should be
ready later this week.)

CUDA is NVidia's programming environment for exposing the computational
power of their graphics cards for general-purpose computation.  Current
graphics cards are immensely powerful; for instance, the current
top-of-the-line NVidia card has (basically) 30 cores, each of which can
do 8 single-precision floating-point operations (including a
multiply-and-add) per cycle at about 1.3 GHz, and it is available for
$450.  (To get this speed, you need to be doing "the same" computation
on lots of different data; it's somewhat similar to programming for
SSE/Altivec/etc., although you end up with code that looks quite
different.)

So for the sorts of things the graphics card can do well, it's
actually much faster than a CPU.
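
To put a rough number on that, here is a back-of-envelope sketch in
Python that uses only the figures quoted above.  Counting a
multiply-and-add as two floating-point operations is my assumption, and
the result is a theoretical peak, not something a real program would
reach.

```python
# Back-of-envelope peak throughput from the figures quoted in this message.
CORES = 30               # "(basically) 30 cores"
OPS_PER_CORE_CYCLE = 8   # single-precision operations per core per cycle
CLOCK_HZ = 1.3e9         # "about 1.3 GHz"


def peak_ops_per_second(cores, ops_per_cycle, clock_hz):
    """Upper bound on operations per second if every unit is busy every cycle."""
    return cores * ops_per_cycle * clock_hz


peak_ops = peak_ops_per_second(CORES, OPS_PER_CORE_CYCLE, CLOCK_HZ)

# Assumption: each operation is a multiply-and-add, counted as two flops.
peak_flops = 2 * peak_ops

print(f"{peak_ops / 1e9:.0f} G multiply-adds/s")
print(f"{peak_flops / 1e9:.0f} GFLOP/s (single precision, theoretical peak)")
```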

CUDA includes 2 programming languages (a C dialect and a low-level
assembly language), and a library to load programs into the graphics
card, send data back and forth, call the programs, etc.  (There's also
a mode where you write your program in a combination of regular C and
CUDA's dialect; the CUDA tools compile the CUDA part themselves, pass
the regular parts to your regular C compiler, and automatically
construct glue code to tie the two together.)

Actually, the above is a simplification: CUDA includes 2 separate
libraries to load programs/exchange data/call the programs, and you
apparently cannot mix and match.  CUDA includes fast BLAS and FFT
implementations that run on the GPU; to use these, you must use the
"high-level" API, but pycuda is based on the "low-level" API.

Although CUDA is best known for fast single-precision floating point,
it does have a full complement of integer operations, so it should
also be useful for arbitrary-precision arithmetic, modular arithmetic
(preferably with a modulus <2^24), computations over GF(2), etc.
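
I didn't say why 2^24 is the preferred bound.  Two likely reasons
(both my assumptions about the motivation): the hardware offers a fast
24-bit integer multiply, and a single-precision float has a 24-bit
significand, so integers up to 2^24 are represented exactly.  The sketch
below illustrates only the second point, in plain Python; the helper
to_float32 is a hypothetical name, not part of CUDA or pycuda.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


LIMIT = 2 ** 24  # single precision carries a 24-bit significand

# Every integer up to 2^24 survives a round trip through single precision...
exact_below = to_float32(float(LIMIT - 1)) == LIMIT - 1
exact_at = to_float32(float(LIMIT)) == LIMIT

# ...but 2^24 + 1 does not: it rounds to a neighbouring representable value.
exact_above = to_float32(float(LIMIT + 1)) == LIMIT + 1

print(exact_below, exact_at, exact_above)
```

So residues modulo anything below 2^24 can be stored in a float without
loss, although their products still need more than 24 bits.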

Until very recently, CUDA could only handle single-precision floating
point.  The most recent products (the GTX 260, the GTX 280, the Tesla
C1060, and the Tesla S1070) support double-precision floating point,
but each core only has one double-precision FPU (so double-precision
operations happen at 1/8 the rate of single-precision).
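
The 1/8 figure follows directly from the unit counts.  A small sketch,
reusing the 30-core, 1.3 GHz numbers from earlier in this message (the
peak it prints is a theoretical upper bound, and the variable names are
mine):

```python
# Double- versus single-precision issue rate, from the unit counts above.
CORES = 30
CLOCK_HZ = 1.3e9
SP_OPS_PER_CORE_CYCLE = 8   # single-precision operations per core per cycle
DP_OPS_PER_CORE_CYCLE = 1   # one double-precision FPU per core

dp_to_sp_ratio = DP_OPS_PER_CORE_CYCLE / SP_OPS_PER_CORE_CYCLE
dp_peak_ops = CORES * DP_OPS_PER_CORE_CYCLE * CLOCK_HZ

print(f"double/single rate: {dp_to_sp_ratio}")
print(f"{dp_peak_ops / 1e9:.0f} G double-precision operations/s (peak)")
```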

mabshoff doesn't like the idea of recreating pycuda using Cython, but
I think it's reasonable.  pycuda is actually pretty small (650 lines
of Python, 1325 lines of C++; the 1325 lines of C++ would probably be
replaced by a much smaller number of lines of Cython).  Doing the
rewrite would also give a chance to switch from the low-level to the
high-level API, which would make it much easier (possible?) to use the
BLAS and FFT implementations mentioned above.

Note that the CUDA single-precision FPU is not quite IEEE-compliant...
denormal numbers (very small numbers) are not handled correctly,
division is slightly inaccurate, and there are a few other issues.
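
To show why the denormal issue can matter, here is a plain-Python model
of one common non-IEEE behaviour, flushing denormals to zero.  That the
hardware behaves exactly this way is my assumption; the helpers
to_float32 and flush_to_zero are hypothetical names used only for this
illustration.

```python
import struct

FLT_MIN = 2.0 ** -126  # smallest positive normal single-precision value


def to_float32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def flush_to_zero(x):
    """Model an FPU that replaces denormal single-precision results by zero."""
    y = to_float32(x)
    return 0.0 if 0.0 < abs(y) < FLT_MIN else y


a = to_float32(1.5 * FLT_MIN)
b = to_float32(1.0 * FLT_MIN)

diff_ieee = to_float32(a - b)   # 2^-127, a denormal that IEEE keeps
diff_ftz = flush_to_zero(a - b)  # the same result under flush-to-zero

print(a != b, diff_ieee != 0.0, diff_ftz == 0.0)
```

Under this model two distinct numbers can have a difference of exactly
zero, which breaks code that tests x != y before dividing by x - y.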

I was actually planning to start incorporating CUDA into Sage myself
sometime in the next few months, probably starting by rewriting pycuda
in Cython.

