Hi Georg,

On Fri, 21 May 2010 15:10:27 +0200, "Teichtmeister, Georg" <georg.teichtmeis...@joanneum.at> wrote:
> I installed PyCuda 0.93/Cuda 2.3 on my computer (Win7-32bit, Xeon
> W3520, GTX285) and after some troubles it seemed to execute the
> examples without problems. Then I started test_gpuarray_random.py and
> it shows a gpu speed up around 1e-7 (so the cpu is 10 million times
> faster). It seems there is something wrong.
>
> Size    |Time GPU        |Size/Time GPU|Time CPU         |Size/Time CPU    |GPU vs CPU speedup
> --------+----------------+-------------+-----------------+-----------------+------------------
> 1024    |0.00725786230469|141088.375201|2.88000004366e-09|355555550165.0   |3.96811061268e-07
> 2048    |0.00724965625   |282496.152835|2.88000004366e-09|711111100331.0   |3.97260220946e-07
> 4096    |0.00723105859375|566445.416932|2.91200005449e-09|1.40659338027e+12|4.02707296136e-07
> 8192    |0.00733181494141|1117322.25451|2.97600007616e-09|2.7526881016e+12 |4.0590223566e-07
> 16384   |0.00751718310547|2179539.83163|2.91200005449e-09|5.62637352108e+12|3.87379157011e-07
> 32768   |0.00753562646484|4348410.86576|2.91200005449e-09|1.12527470422e+13|3.86431050966e-07
> 65536   |0.00768003027344|8533299.69631|2.88000004366e-09|2.27555552106e+13|3.74998527496e-07
> 131072  |0.00763785009766|17160850.0199|2.94400006533e-09|4.45217381425e+13|3.85448788296e-07
> 262144  |0.0091615       |28613654.9692|2.91200005449e-09|9.00219763374e+13|3.17851886099e-07
> 524288  |0.00921462597656|56897371.7798|2.91200005449e-09|1.80043952675e+14|3.16019343802e-07
> 1048576 |0.0095365390625 |109953515.959|2.97600007616e-09|3.52344077005e+14|3.12062904231e-07
> 2097152 |0.0102445153809 |204709732.187|2.94400006533e-08|7.1234781028e+13 |2.87373287645e-06
> 4194304 |0.0117750354004 |356203090.469|2.91200005449e-08|1.4403516214e+14 |2.47302870478e-06
> 8388608 |0.0144500805664 |580523268.466|2.84800003283e-08|2.9454381683e+14 |1.97092328983e-06
> 16777216|0.0199746191406 |839926703.077|2.94400006533e-08|5.69878248224e+14|1.47387043758e-06

This is what I get on my 260:

Size    |Time GPU         |Size/Time GPU|Time CPU         |Size/Time CPU|GPU vs CPU speedup
--------+-----------------+-------------+-----------------+-------------+------------------
1024    |0.00142396020508 |719121.220065|2.40813121796e-05|42522599.7805|0.0169115064408
2048    |0.00067283581543 |3043833.21017|4.50990066528e-05|45411199.758 |0.067028249119
4096    |0.000669760253906|6115621.18849|8.75118713379e-05|46805078.4126|0.130661487939
8192    |0.000686293457031|11936584.7307|0.000170064575195|48169937.7462|0.247801539492
16384   |0.000707383789063|23161401.5663|0.000337699188232|48516551.3301|0.477391754595
32768   |0.000822115112305|39858165.2491|0.000671733520508|48781248.8131|0.817079640617
65536   |0.000892228027344|73452075.0207|0.00134026708984 |48897716.3557|1.50215757494
131072  |0.00101073547363 |129679825.651|0.00267541870117 |48991210.2142|2.64700188226
262144  |0.00114172753906 |229602940.309|0.00536326074219 |48877728.0467|4.69749617023
524288  |0.00136735131836 |383433279.334|0.0118434707031  |44268104.6073|8.66161501006
1048576 |0.00160290478516 |654172356.156|0.0242522578125  |43236221.8853|15.1301924089
2097152 |0.00214327362061 |978480759.45 |0.0483249169922  |43396908.4797|22.5472457308
4194304 |0.00346559783936 |1210268529.25|0.153984335938   |27238510.8165|44.4322576004
8388608 |0.00638256225586 |1314301006.98|0.338394179687   |24789457.0993|53.01854743
16777216|0.0112450183105  |1491968757.78|0.591380039063   |28369601.4269|52.5904024992

It looks like Windows is mis-measuring the CPU time: your CPU column
never exceeds about 3e-08 s, even for 16M elements. Can you re-test with
the example from the wiki? test_gpuarray_random.py has been gone from
the repository for a while now, so try getting git or 0.94rc and then
using the wiki examples (use the wiki download script under examples/)
to see if that makes this any better. If not, we would need to stop
using GPU events to measure CPU time...

Andreas
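
For reference, a minimal sketch of what timing the CPU side with a host clock instead of GPU events could look like, and of how the "GPU vs CPU speedup" column follows from the two time columns (CPU time divided by GPU time). The names time_on_host, speedup and cpu_work are hypothetical and not part of PyCUDA or of the benchmark script; time.perf_counter assumes Python 3.3 or later.

```python
import time

def time_on_host(fn, repeats=20):
    """Average wall-clock seconds per call of fn, measured with a host timer."""
    fn()  # warm-up call, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

def speedup(gpu_seconds, cpu_seconds):
    """'GPU vs CPU speedup' as in the tables: CPU time divided by GPU time."""
    return cpu_seconds / gpu_seconds

def cpu_work(n=1024):
    # Hypothetical stand-in for the CPU side of the benchmark.
    return sum(i * i for i in range(n))

cpu_seconds = time_on_host(cpu_work)

# Size-1024 rows taken from the two tables in this thread.
georg = speedup(0.00725786230469, 2.88000004366e-09)
andreas = speedup(0.00142396020508, 2.40813121796e-05)

print("host-timed cpu_work: %.3e s per call" % cpu_seconds)
print("speedup at size 1024: Georg %.3e, Andreas %.3e" % (georg, andreas))
```

If the GPU side were timed with the same host clock, the device would presumably have to be synchronized (for example with pycuda.driver.Context.synchronize()) before reading the timer, since kernel launches return before the work has finished.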
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda