On Jul 20, 1:12 pm, "William Stein" <[EMAIL PROTECTED]> wrote:
> What is CUDA?  Why should the typical reader of sage-devel or user
> of Sage care?  Any chance you could write a paragraph or two
> about this?  It might get a lot more Sage developers excited about
> what you're doing (which I'm sure is extremely exciting).

I'm going to try to respond to the entire thread in this message.

First, a disclaimer: I've read a lot about CUDA, but I've never
actually written any CUDA programs.  (Hopefully this will change very
soon... my new computer with a CUDA-capable graphics card should be
ready later this week.)

CUDA is NVidia's programming environment that exposes the computational
power of its graphics cards for general-purpose computation.  Current
graphics cards are immensely powerful; for instance, the current
top-of-the-line NVidia card is available for $450 and has (basically)
30 cores, each of which can do 8 single-precision floating-point
operations (including a multiply-and-add) per cycle at about 1.3 GHz.
(To get this speed, you need to be doing "the same" computation on
lots of different data; it's somewhat similar to programming for SSE,
Altivec, etc., although you end up with code that looks quite
different.)
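
As a rough check on those figures, here is a back-of-the-envelope
calculation in Python.  It is only a sketch: the core count, lane count
and clock are the numbers quoted above, and counting a multiply-and-add
as two floating-point operations is my assumption about the usual
convention, not something measured.

```python
# Back-of-the-envelope peak throughput from the figures quoted above.
CORES = 30            # "(basically) 30 cores"
LANES_PER_CORE = 8    # single-precision operations per core per cycle
CLOCK_HZ = 1.3e9      # "about 1.3 GHz"
FLOPS_PER_MAD = 2     # assumption: count a multiply-and-add as two flops


def peak_ops_per_second(cores, lanes, clock_hz):
    """Operations issued per second if every lane is busy on every cycle."""
    return cores * lanes * clock_hz


ops = peak_ops_per_second(CORES, LANES_PER_CORE, CLOCK_HZ)
peak_gflops = ops * FLOPS_PER_MAD / 1e9

print(f"{ops / 1e9:.0f} billion operations per second")
print(f"{peak_gflops:.0f} GFLOPS if every operation is a multiply-and-add")
```

This is a theoretical peak; real code only approaches it when every
lane is kept busy with the same operation on different data.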

So for the sorts of things the graphics card can do well, it's
actually much faster than a CPU.

CUDA includes two programming languages (a C dialect and a low-level
assembly language), and a library to load programs into the graphics
card, send data back and forth, call the programs, etc.  (There's also
a mode where you write your program in a combination of regular C and
CUDA's dialect; the CUDA tools compile the CUDA part themselves, pass
the regular parts to your regular C compiler, and automatically
construct glue code to tie the two together.)

Actually, the above is a simplification: CUDA includes two separate
libraries (a "high-level" API and a "low-level" API) for loading
programs, exchanging data, and calling the programs, and you
apparently cannot mix and match.  CUDA also includes fast BLAS and FFT
implementations that run on the GPU; to use these, you must use the
"high-level" API, but pycuda is based on the "low-level" API.

Although CUDA is best known for fast single-precision floating point,
it does have a full complement of integer operations, so it should
also be useful for arbitrary-precision arithmetic, modular arithmetic
(preferably with a modulus <2^24), computations over GF(2), etc.
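
To make the GF(2) case concrete, here is a small Python sketch of how
vectors over GF(2) reduce to plain integer operations.  The 32-bit word
size and the helper names (pack, gf2_add, gf2_dot) are my own choices
for illustration; this is not CUDA code, just the word-level arithmetic
that a GPU kernel would apply to many words at once.

```python
# Vectors over GF(2) packed into machine words: bit i holds coordinate i.
MASK32 = 0xFFFFFFFF  # assumption: 32-bit words


def pack(bits):
    """Pack a list of 0/1 coordinates (at most 32) into one integer word."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word & MASK32


def gf2_add(u, v):
    """Addition of packed GF(2) vectors is coordinate-wise XOR."""
    return (u ^ v) & MASK32


def gf2_dot(u, v):
    """Dot product over GF(2): parity of the number of shared 1 bits."""
    return bin(u & v).count("1") & 1


u = pack([1, 0, 1, 1])
v = pack([1, 1, 0, 1])
print(gf2_add(u, v) == pack([0, 1, 1, 0]))
print(gf2_dot(u, v))
```

One XOR adds 32 coordinates at a time, which is why integer throughput
matters for this kind of computation.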

Until very recently, CUDA could only handle single-precision floating
point.  The most recent products (the GTX 260, the GTX 280, the Tesla
C1060, and the Tesla S1070) support double-precision floating point,
but each core only has one double-precision FPU (so double-precision
operations happen at 1/8 the rate of single-precision).

mabshoff doesn't like the idea of recreating pycuda using Cython, but
I think it's reasonable.  pycuda is actually pretty small (650 lines
of Python, 1325 lines of C++; the 1325 lines of C++ would probably be
replaced by a much smaller number of lines of Cython).  Doing the
rewrite would also give a chance to switch from the low-level to the
high-level API, which would make it much easier (possible?) to use the
CUDA BLAS and FFT.

Note that the CUDA single-precision FPU is not quite IEEE-compliant:
denormal numbers (very small numbers) are not handled correctly,
division is slightly inaccurate, and there are a few other issues.
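
To show what a denormal is, here is a Python sketch that models one way
of mishandling them.  It assumes the hardware flushes denormals to
zero, which is how I understand this generation of GPUs to behave, but
I have not tested it on a card; flush_to_zero is a model written for
this message, not a CUDA function.

```python
import struct

MIN_NORMAL_F32 = 2.0 ** -126  # smallest positive normal single-precision value


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_to_zero(x):
    """Assumed model: single-precision denormals are replaced by a signed zero."""
    x = to_f32(x)
    if x != 0.0 and abs(x) < MIN_NORMAL_F32:
        return -0.0 if x < 0 else 0.0
    return x


tiny = 2.0 ** -130                  # a denormal in single precision
print(to_f32(tiny) == tiny)         # IEEE single precision keeps the value
print(flush_to_zero(tiny))          # the flush-to-zero model loses it
print(flush_to_zero(MIN_NORMAL_F32) == MIN_NORMAL_F32)
```

Under this model, anything smaller in magnitude than 2^-126 silently
becomes zero, which matters for code that relies on gradual underflow.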

I was actually planning to start incorporating CUDA into Sage myself
sometime in the next few months, probably starting by rewriting pycuda
in Cython.

Carl
