Gregor Thalhammer <[email protected]> writes:
> on my macbook pro (running os x 10.11) I have been plagued with nasty
> segfaults when using the nvidia 750M GPU and the high level methods of
> pyopencl.array together with complex64 arrays. Ultimately it seems to
> be due to buggy nvidia OpenCL drivers on os x, but I found a
> workaround:
>
> The typedef in pyopencl/cl/pyopencl-complex.h for cfloat_t (after macro
> expansion) as a
> "union {struct {float x,y;}; struct {float real, imag;}}"
> is too sophisticated. Same for a simpler "struct {float x,y;}". But segfaults
> go away if instead I use "typedef float2 cfloat_t;" and then replace a.real
> by a.x (same for .imag -> .y). All tests pass (could not test for complex
> double). This is not beautiful, but it works for me.
>
> The docs state that the struct has been introduced to avoid silent
> bugs, e.g. complex + real not giving the expected result, so I
> understand that my workaround is not acceptable for a PR.

I would be happy to accept a pull request to this effect that, based on
some flag, replaces the struct-based definition of the complex number
with a vector-based one, to work around broken OpenCL implementations,
of which there appear to be many. The one requirement would be that the
struct-based implementation remains the default, and that all code
continues to work with it.

Andreas
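
For illustration, here is a minimal sketch of what such a flag-based switch could look like. It is not PyOpenCL's actual code: the function name complex_typedef_source, the use_vector flag, and the CFLOAT_REAL/CFLOAT_IMAG accessor macros are all hypothetical names made up for this example. It only builds the OpenCL C source text for cfloat_t and does not touch a device.

```python
def complex_typedef_source(use_vector: bool = False) -> str:
    """Return OpenCL C source that defines cfloat_t plus accessor macros.

    Hypothetical sketch: neither `use_vector` nor the CFLOAT_* macros
    are existing PyOpenCL names.
    """
    if use_vector:
        # Workaround for broken drivers: plain float2, components .x/.y.
        lines = [
            "typedef float2 cfloat_t;",
            "#define CFLOAT_REAL(a) ((a).x)",
            "#define CFLOAT_IMAG(a) ((a).y)",
        ]
    else:
        # Default: the struct-based definition quoted in the message above.
        lines = [
            "typedef union {",
            "    struct { float x, y; };",
            "    struct { float real, imag; };",
            "} cfloat_t;",
            "#define CFLOAT_REAL(a) ((a).real)",
            "#define CFLOAT_IMAG(a) ((a).imag)",
        ]
    return "\n".join(lines) + "\n"


# The struct-based definition stays the default; the vector-based one
# must be requested explicitly.
default_src = complex_typedef_source()
workaround_src = complex_typedef_source(use_vector=True)
```

Kernel code that reads components only through the accessor macros would compile against either definition, which is one way to meet the requirement that all code keeps working with the struct-based default.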
