Hello Folks:
Since Spark exposes Python bindings and allows you to express your logic in
Python, is there a way to leverage sophisticated libraries like
NumPy, SciPy, and scikit-learn in a Spark job and run at scale?
What's been your experience? Any insights you can share?
These libraries can be used in PySpark easily. For example, MLlib
uses NumPy heavily; it can accept an np.array or a SciPy sparse matrix
as a vector.
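A minimal sketch of the idea: a worker-side function that uses NumPy can be mapped over a distributed dataset unchanged. The `sc.parallelize(...).map(...)` call in the comment is the hypothetical PySpark usage; to keep the example self-contained it applies the same function with plain Python instead.

```python
import numpy as np

def l2_norm(row):
    # Any NumPy code can run inside a map function shipped to the workers.
    return float(np.linalg.norm(np.array(row)))

# In a real PySpark job this would be (hypothetical, not run here):
#   sc.parallelize(rows).map(l2_norm).collect()
rows = [[3.0, 4.0], [6.0, 8.0]]
norms = [l2_norm(r) for r in rows]  # -> [5.0, 10.0]
```

The same pattern applies to SciPy or scikit-learn: as long as the library is installed on every worker, the closure you pass to `map` can call it.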
On Mon, Nov 24, 2014 at 10:56 AM, Rohit Pujari rpuj...@hortonworks.com wrote: