Hello,

I finally have the time to contribute something to compyte, so I had a look at its sources. As far as I understand, at the moment it contains:

- sources for GPU platform-dependent memory operations (malloc()/free()/...);
- sources for the array class, which uses the abstract API of these operations;
- some high-level Python code, like scan.py with generalized kernels.
So I have a few questions about this layout:

1. It does not have its own setup script; is it supposed to be part of PyCUDA/PyOpenCL and get compiled with them, or is that just a temporary solution? If it is the former, my second question:

2. Why was it decided to keep the low-level memory operations in compyte? They require platform-specific makefiles (and the one currently committed to the repo is quite specific and belongs to Frederic, as I understand from the paths inside it). The only reason I can see is to keep the memory-operations API inside a single module, but in that case we would have to copy the specialized build code from the setup scripts of PyCUDA/PyOpenCL, which, I think, is a more serious violation of DRY. The memory API is small and unlikely to change much; we could instead create separate modules in PyCUDA/PyOpenCL and pass pointers to the memory functions to compyte using capsules.

3. Moreover, we could export a simple memory API from each of PyCUDA/PyOpenCL (something like an opaque Buffer object and memory functions that operate on it, the way it is done in PyOpenCL) for people who want fine-grained control and do not want to use our general ndarray-like object. In fact, compyte developers are such people too. There can be some problems, of course, if you are inclined to write the ndarray module in C (is that really necessary?), but they are solvable.

Hope this makes sense. In any case, at the moment I am mostly interested in the answer to the first question, because it will remove some uncertainty in my current understanding.

Best regards,
Bogdan

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
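P.S. To make the capsule idea from point 2 concrete, here is a minimal sketch. The capsule name "compyte.malloc" is made up, and libc's malloc stands in for a real GPU allocator; the point is only the mechanism of handing a raw function pointer across module boundaries via a named capsule.

```python
import ctypes

# ctypes bindings for CPython's capsule API.
PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# On the PyCUDA/PyOpenCL side: take the address of the platform's
# allocator (libc's malloc stands in here for a GPU allocator) and
# wrap it in a named capsule.
libc = ctypes.CDLL(None)
malloc_addr = ctypes.cast(libc.malloc, ctypes.c_void_p).value
capsule = PyCapsule_New(malloc_addr, b"compyte.malloc", None)

# On the compyte side: unwrap the capsule back into a raw function
# pointer and call through it.
addr = PyCapsule_GetPointer(capsule, b"compyte.malloc")
malloc_t = ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.c_size_t)
passed_malloc = malloc_t(addr)
p = passed_malloc(64)              # allocate 64 bytes through the passed pointer
libc.free(ctypes.c_void_p(p))
```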
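And a sketch of the opaque Buffer object from point 3. Class name, constructor signature, and the use of host memory via libc are all my own assumptions, not the actual PyOpenCL API; a real version would take the device allocator/deallocator exported by PyCUDA or PyOpenCL.

```python
import ctypes

libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]


class Buffer:
    """Opaque memory handle: a raw pointer plus the routine that
    releases it. Hypothetical sketch; host memory stands in for
    device memory."""

    def __init__(self, alloc, release, nbytes):
        self._release = release
        self.nbytes = nbytes
        self.ptr = alloc(nbytes)
        if not self.ptr:
            raise MemoryError("allocation of %d bytes failed" % nbytes)

    def free(self):
        # Idempotent explicit release, also used by the finalizer.
        if self.ptr:
            self._release(self.ptr)
            self.ptr = None

    def __del__(self):
        self.free()


buf = Buffer(libc.malloc, libc.free, 1024)
```

Users who want fine tuning would work with Buffer directly, while the ndarray-like object would be built on top of it.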