Hi all, dear Andreas,

On my MacBook Pro (running OS X 10.11) I have been plagued by nasty segfaults 
when using the NVIDIA 750M GPU and the high-level methods of pyopencl.array 
together with complex64 arrays. Ultimately it seems to be due to buggy NVIDIA 
OpenCL drivers on OS X, but I found a workaround: 

The typedef for cfloat_t in pyopencl/cl/pyopencl-complex.h, which after macro 
expansion reads 
"union {struct {float x,y;}; struct {float real, imag;};}", 
is apparently too sophisticated for the driver. The same goes for a simpler 
"struct {float x,y;}". But the segfaults go away if I instead use 
"typedef float2 cfloat_t;" and then replace a.real with a.x (and likewise 
.imag with .y). All tests pass (I could not test complex double). 
This is not beautiful, but it works for me. 

The docs state that the struct was introduced to avoid silent bugs, 
e.g. complex + real not giving the expected result, so I understand that my 
workaround is not acceptable for a PR.
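
To make the silent-bug concern concrete, here is a toy model in plain Python. It is only a sketch, not PyOpenCL code: the Float2 class is a hypothetical stand-in for OpenCL's built-in float2, written under the assumption that vector arithmetic is component-wise and that a scalar operand is applied to every component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Float2:
    """Hypothetical stand-in for OpenCL's float2 (not part of PyOpenCL)."""
    x: float  # would hold the real part under the float2 workaround
    y: float  # would hold the imaginary part under the float2 workaround

    def __add__(self, other):
        if isinstance(other, Float2):
            return Float2(self.x + other.x, self.y + other.y)
        # A scalar operand is added to BOTH components.
        return Float2(self.x + other, self.y + other)

    def __mul__(self, other):
        if isinstance(other, Float2):
            # Component-wise product, not the complex product.
            return Float2(self.x * other.x, self.y * other.y)
        return Float2(self.x * other, self.y * other)


z = Float2(1.0, 2.0)

# "complex + real": the vector type also shifts the imaginary part.
vector_sum = z + 3.0                    # Float2(x=4.0, y=5.0)
complex_sum = complex(1.0, 2.0) + 3.0   # (4+2j)

# "complex * complex": the vector type multiplies component by component.
vector_prod = z * Float2(3.0, 4.0)                    # Float2(x=3.0, y=8.0)
complex_prod = complex(1.0, 2.0) * complex(3.0, 4.0)  # (-5+10j)

print(vector_sum, complex_sum)
print(vector_prod, complex_prod)
```

With a plain struct, expressions like these would not compile at all, which is presumably why the docs prefer it over a bare float2.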

Does anybody know about other ways to avoid the segfaults? 

Gregor


_______________________________________________
PyOpenCL mailing list
[email protected]
https://lists.tiker.net/listinfo/pyopencl
