Random question for the PySpark and Python experts/enthusiasts on here:

How big of a deal would it be for PySpark and its users if you could
run numpy on PyPy?

PySpark already supports running on PyPy
<https://github.com/apache/spark/pull/2144>, but libraries like MLlib that
use numpy are not supported.

There is an ongoing initiative to support numpy on PyPy
<http://morepypy.blogspot.com/2015/02/numpypy-status-january-2015.html>,
and they are taking donations <http://pypy.org/numpydonate.html> to support
the effort.

I’m wondering if any companies using PySpark in production would be
interested in pushing this initiative along, or if it’s not that big of a
deal.

Nick
