Random question for the PySpark and Python experts/enthusiasts on here: how big of a deal would it be for PySpark and its users if you could run numpy on PyPy?
PySpark already supports running on PyPy <https://github.com/apache/spark/pull/2144>, but libraries like MLlib that depend on numpy are not supported there. There is an ongoing initiative to support numpy on PyPy <http://morepypy.blogspot.com/2015/02/numpypy-status-january-2015.html>, and the PyPy project is taking donations <http://pypy.org/numpydonate.html> to fund the effort.

I’m wondering if any companies using PySpark in production would be interested in pushing this initiative along, or if it’s not that big of a deal.

Nick