[PyCUDA] PyCUDA poor FP32 performance on Fermi ?

Roberto Colistete Jr. Mon, 02 Jul 2012 12:37:35 -0700

Hi,

This is my first post to the PyCUDA group. I am comparing the performance of PyCUDA vs. CUDA vs. Mathematica 8 CUDA on some problems in Physics.

Up to CC 1.3, the DP/SP (FP64/FP32) performance ratio with PyCUDA was as expected (near 1/8 or 1/12), comparable to what I get when running CUDA or Mathematica 8 CUDA.

But using the same source code on any GPU device with CC 2.0/2.1 (Fermi), the FP32 (SP) performance is poor, with:

- a DP/SP ratio of approx. 1/3 to 1/2;

- the better GPU device (Tesla C2050, CC 2.0) being slower in FP32 (0.77 s vs. 0.33 s) than the older GPU (Tesla C1060, CC 1.3), while in FP64 it is faster (0.89 s vs. 4.48 s).
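
To make the arithmetic behind these figures explicit, here is a minimal sketch that derives the DP/SP performance ratio and the card-to-card speedups from the four timings quoted above. It assumes that performance is inversely proportional to run time and that each "a x b" pair reads as "Tesla C2050 vs. Tesla C1060". The names `timings`, `dp_sp_ratio` and `speedup` are made up for this illustration and are not part of PyCUDA.

```python
# Timings quoted in the post, in seconds.
# Assumption: each pair is ordered "Tesla C2050 vs. Tesla C1060".
timings = {
    "Tesla C2050 (CC 2.0)": {"fp32": 0.77, "fp64": 0.89},
    "Tesla C1060 (CC 1.3)": {"fp32": 0.33, "fp64": 4.48},
}


def dp_sp_ratio(t):
    """DP/SP performance ratio, taking performance as 1 / run time."""
    return t["fp32"] / t["fp64"]


def speedup(old, new, precision):
    """How many times faster `new` is than `old` (values below 1 mean slower)."""
    return old[precision] / new[precision]


ratios = {name: dp_sp_ratio(t) for name, t in timings.items()}

c2050 = timings["Tesla C2050 (CC 2.0)"]
c1060 = timings["Tesla C1060 (CC 1.3)"]
fp32_speedup = speedup(c1060, c2050, "fp32")
fp64_speedup = speedup(c1060, c2050, "fp64")

for name, ratio in ratios.items():
    print(f"{name}: DP/SP performance ratio = {ratio:.3f}")
print(f"C2050 over C1060, FP32: {fp32_speedup:.2f}x")
print(f"C2050 over C1060, FP64: {fp64_speedup:.2f}x")
```

A DP/SP ratio close to 1 means FP32 runs barely faster than FP64 on that card, and a speedup below 1 means the newer card is the slower one for that precision.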

The same behaviour occurs with other CC 2.x GPU devices (GTX 480, GT 540M, etc.) and on any Linux distribution (Ubuntu, Fedora, etc.).

Do you have an explanation for this issue? And a recommendation for solving it?


        Regards,

        Roberto

