I'm using a GTX 480 (with a 700W PSU) for a physics client. We're seeing 1000* and 154* speed-ups on the two main simulators, using full double-precision math, against the equivalent single-threaded CPU versions.
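Speed-up figures like these depend on whether the host-to-device copies are counted. A rough sketch of the arithmetic follows; the timings are made-up placeholders rather than measurements from these simulators, and `effective_speedup` and `bottleneck` are hypothetical helper names, not part of PyCUDA.

```python
def effective_speedup(cpu_s, kernel_s, transfer_s):
    """End-to-end speed-up once host<->device copies are included."""
    return cpu_s / (kernel_s + transfer_s)


def bottleneck(kernel_s, transfer_s):
    """Label a run by whichever phase takes longer."""
    if transfer_s > kernel_s:
        return "memory-transfer-bound"
    return "compute-bound"


# Hypothetical timings in seconds, chosen only for illustration.
cpu_s = 60.0       # single-threaded CPU run
kernel_s = 0.5     # time spent inside the GPU kernel
transfer_s = 1.5   # time spent copying data to and from the card

kernel_only = cpu_s / kernel_s
end_to_end = effective_speedup(cpu_s, kernel_s, transfer_s)
# Upper limit if the kernel took no time at all: only the copies remain.
ceiling = cpu_s / transfer_s

print(f"kernel-only speed-up: {kernel_only:.0f}x")
print(f"end-to-end speed-up:  {end_to_end:.0f}x")
print(f"ceiling with an infinitely fast kernel: {ceiling:.0f}x")
print(bottleneck(kernel_s, transfer_s))
```

With these placeholder numbers the kernel alone is 120 times faster than the CPU, but the end-to-end figure is 30 times, and faster arithmetic cannot push it past 40 times until the copies shrink.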
That said - we're now looking at Teslas for the next step up in power; as mentioned here, they're a mite more expensive. Using profiling I've noticed that my memory bottlenecks probably remove a lot of the performance gain in the math for the algorithm we're really interested in (the one with the 154* speed-up). The Tesla will be an interesting experiment, but until I fix my memory bottleneck I don't think it'll help a huge deal.

Ask yourself - are you likely to be compute-bound or memory-transfer-bound? If memory is your bottleneck then it might make sense to get the GTX 480 and experiment, then upgrade to a (better) Tesla when/if the time is right. If you're only ever going to care about double-precision math and money is no object then I'd suggest going for the Tesla first.

HTH,
Ian.

On 30 October 2010 01:28, Neal Becker <ndbeck...@gmail.com> wrote:
> Julien Cornebise wrote:
>
>> Bought a few months ago a random graphic card based on GTX 480, works
>> very well, available in any PC store in the street -- easier than
>> buying an older generation one (e.g. GTX 280, much mentioned in
>> tutorials but harder to find and slower)
>>
>> See http://en.wikipedia.org/wiki/GeForce_400_Series for a comparison
>> of the models in the series. Basic number to look at is the first one
>> in the triple in column "Config core": it is the number of shaders,
>> aka "cuda cores".
>>
>> Caution: don't forget that you may need a stronger power supply than
>> standardly shipped in your computer: minimum 500 or 550 Watts (depend
>> on the GPU model). Bringing back the GPU only to find the lack of
>> power supply was like one of those Christmases when you open a
>> wonder-toy that the batteries are not included and that all the stores
>> are closed for the week-end ;)
>>
>> Hope this helps, enjoy the shopping.
>>
>> Julien
>>
>
> http://en.wikipedia.org/wiki/GeForce_400_Series says:
>
> "In the more expensive "Tesla" configurations, the chip features optional
> ECC protection on the memory, and can perform one double-precision floating-
> point operation per cycle per core; the consumer GeForce cards are
> artificially driver restricted to one DP operation per four cycles. "
>
> So, if I want double-precision float performance, what are my choices?
>
>
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda
>

--
Ian Ozsvald (A.I. researcher, screencaster)
i...@ianozsvald.com

http://IanOzsvald.com
http://MorConsulting.com/
http://blog.AICookbook.com/
http://TheScreencastingHandbook.com
http://FivePoundApp.com/
http://twitter.com/IanOzsvald