I'm using a GTX 480 (with a 700W PSU) for a physics client. We're seeing
1000x and 154x speed-ups on the two main simulators, using full
double-precision math, against equivalent single-threaded CPU systems.

That said - we're now looking at Teslas for the next step up in power;
as mentioned here, they're a mite more expensive.

From profiling I've noticed that my memory bottlenecks probably
remove a lot of the performance gain in the math for the algorithm
we're really interested in (the one with the 154x speed-up). The Tesla
will be an interesting experiment, but until I fix my memory
bottleneck I don't think it'll help a great deal.
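
A rough way to see why faster double-precision math may not help much
while memory dominates is Amdahl's law. The sketch below is only an
illustration: the 30% math fraction is a made-up number (not one of my
profiling results), and the 4x figure is just the per-cycle DP ratio
from the Wikipedia text quoted further down.

```python
def overall_speedup(math_fraction, math_speedup):
    """Amdahl's law: whole-run speed-up when only the math part gets faster.

    math_fraction -- share of current runtime spent in arithmetic (0..1)
    math_speedup  -- how many times faster the arithmetic becomes
    """
    if not 0.0 <= math_fraction <= 1.0:
        raise ValueError("math_fraction must be between 0 and 1")
    if math_speedup <= 0:
        raise ValueError("math_speedup must be positive")
    # The memory-bound share (1 - math_fraction) is assumed unchanged.
    return 1.0 / ((1.0 - math_fraction) + math_fraction / math_speedup)


if __name__ == "__main__":
    # Hypothetical profile: 30% of the time in math, 70% waiting on memory.
    for fraction in (0.3, 0.6, 0.9):
        gain = overall_speedup(fraction, 4.0)
        print("math share %.0f%% -> overall speed-up %.2fx" % (fraction * 100, gain))
```

With a small math share the overall gain stays close to 1x however
fast the arithmetic gets, which is why I'd fix the memory side first.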

Ask yourself - are you likely to be compute-bound or
memory-transfer-bound? If memory is your bottleneck then it might make
sense to get the GTX 480 and experiment, then upgrade to a (better)
Tesla when/if the time is right.
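
If it helps, here is a back-of-envelope sketch of that question. It
compares the time to do the arithmetic with the time to move the data,
using peak figures you supply yourself. The function name and the
numbers in the example are placeholders of mine, not measured values
for any card, and real kernels rarely reach peak rates, so treat the
answer as a hint and profile to confirm.

```python
def classify_bound(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Estimate whether a job is compute-bound or memory-transfer-bound.

    Returns (label, compute_seconds, transfer_seconds), where each time
    is the ideal time at the given peak rate.
    """
    if peak_flops_per_s <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("peak rates must be positive")
    compute_s = flops / float(peak_flops_per_s)
    transfer_s = bytes_moved / float(bandwidth_bytes_per_s)
    if compute_s > transfer_s:
        label = "compute-bound"
    else:
        label = "memory-transfer-bound"
    return label, compute_s, transfer_s


if __name__ == "__main__":
    # Example: adding two double-precision vectors of n elements does
    # 1 flop per element but touches 24 bytes (two reads, one write).
    n = 10 ** 7
    # Placeholder peak rates; substitute the figures for your own card.
    label, compute_s, transfer_s = classify_bound(
        flops=n, bytes_moved=24 * n,
        peak_flops_per_s=1e11, bandwidth_bytes_per_s=1e11)
    print(label, compute_s, transfer_s)
```

A job that does many flops per byte moved will come out compute-bound,
and that is the case where the Tesla's faster double precision should
pay off.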

If you're only ever going to care about double-precision math and
money is no object then I'd suggest going for the Tesla first.

HTH,
Ian.

On 30 October 2010 01:28, Neal Becker <ndbeck...@gmail.com> wrote:
> Julien Cornebise wrote:
>
>> I bought a random graphics card based on the GTX 480 a few months
>> ago. It works very well and is available in any high-street PC
>> store -- easier than buying an older-generation one (e.g. GTX 280,
>> much mentioned in tutorials but harder to find and slower).
>>
>> See http://en.wikipedia.org/wiki/GeForce_400_Series for a comparison
>> of the models in the series. Basic number to look at is the first one
>> in the triple in column "Config core": it is the number of shaders,
>> aka "cuda cores".
>>
>> Caution: don't forget that you may need a stronger power supply than
>> the one shipped as standard in your computer: minimum 500 or 550
>> Watts (depending on the GPU model). Bringing the GPU home only to
>> find the power supply lacking was like one of those Christmases when
>> you open a wonder-toy, find that the batteries are not included, and
>> all the stores are closed for the week-end ;)
>>
>> Hope this helps, enjoy the shopping.
>>
>> Julien
>>
>
>  http://en.wikipedia.org/wiki/GeForce_400_Series says:
>
> "In the more expensive "Tesla" configurations, the chip features optional
> ECC protection on the memory, and can perform one double-precision floating-
> point operation per cycle per core; the consumer GeForce cards are
> artificially driver restricted to one DP operation per four cycles. "
>
> So, if I want double-precision float performance, what are my choices?
>
>
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda
>



-- 
Ian Ozsvald (A.I. researcher, screencaster)
i...@ianozsvald.com

http://IanOzsvald.com
http://MorConsulting.com/
http://blog.AICookbook.com/
http://TheScreencastingHandbook.com
http://FivePoundApp.com/
http://twitter.com/IanOzsvald
