On 2015-08-14 05:40, Henry Gomersall wrote:
I haven't been able to test with an Nvidia machine today and I won't be able to until Monday, but from memory the numbers were similar and were what prompted my last email.
I performed some micro-optimizations based on the profiles you submitted, and according to my measurements, most of the code paths that showed up in those profiles should now essentially be gone. The mysterious thing is that these code paths were never a big deal on my machine in the first place, so the net gain from these optimizations on my machine is pretty small, perhaps 10%. But then, the machine is already spending about 60% of its time waiting for OpenCL anyway, so even the ideal-case gain would be pretty small. I would love to hear what you find with the updated code, which is currently in git master.
Once again, thanks very much for reporting this, and for taking the time to submit detailed profiles.
Andreas
_______________________________________________
PyOpenCL mailing list
http://lists.tiker.net/listinfo/pyopencl
