Update for those who are still interested: djinni is a nice tool for
generating Java/C++ bindings. Until today, djinni's Java support targeted only
Android, but it now also works on (at least) Debian, Ubuntu, and CentOS.
djinni will help you run C++ code in-process with the caveat that
To get a major speedup from applying *single-pass* map/filter/reduce
operations on an array in GPU memory, wouldn't you need to stream the
columnar data directly into GPU memory somehow? You might find in your
experiments that GPU memory allocation is a bottleneck. See e.g. John
Canny's
Paul: I've worked on running C++ code on Spark at scale before (via JNA, ~200
cores) and am working on something more contribution-oriented now (via JNI).
A few comments:
* If you need something *today*, try JNA. It can be slow (e.g. when calling a
short native function in a tight loop) but works if you