Hi,
I've filed a JIRA (https://issues.apache.org/jira/browse/SPARK-6553) and
suggested a fix (https://github.com/apache/spark/pull/5206).
On 2015-03-25 19:49, Davies Liu wrote:
It’s good to support functools.partial. Could you file a JIRA for it?
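For anyone following the functools.partial request: one common reason partial objects trip up code written for plain functions is that a partial has no `__name__` attribute. The sketch below only illustrates that behaviour in plain Python. It is not the change in the pull request, and `display_name` is a hypothetical helper, not a PySpark API.

```python
import functools


def add(x, y):
    return x + y


# A partial object is callable, but it is not a plain function.
add_one = functools.partial(add, 1)


def display_name(f):
    """Hypothetical helper: best-effort readable name for any callable."""
    # Unwrap partials first so we report the underlying function's name.
    if isinstance(f, functools.partial):
        return display_name(f.func)
    # Plain functions and classes carry __name__; fall back to the type name.
    return getattr(f, "__name__", type(f).__name__)


result = add_one(2)
name = display_name(add_one)
```

Code that reads `f.__name__` directly would need a fallback along these lines before it could accept a partial.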
On Wednesday, March 25, 2015 at 5:42 AM,
I mentioned this earlier in the thread, but I'll put it out again: dense
BLAS routines are not very important for most machine learning workloads,
at least for non-image workloads in industry (and for image processing you
would probably want a deep learning/SGD solution with convolution
kernels). e.g.
On binary file formats - I looked at HDF5+Spark a couple of years ago and
found it barely JVM-friendly and very Hadoop-unfriendly (e.g. the APIs
needed filenames as input, you couldn't pass it anything like an
InputStream). I don't know if it has gotten any better.
Parquet plays much more nicely
I'm not at all surprised ;-) I fully expect the GPU performance to get
better automatically as the hardware improves.
Netlib natives still need to be shipped separately. I'd also oppose any
move to make OpenBLAS the default: it is not always better, and I think
natives really need DevOps buy-in.
Btw, OpenBLAS requires GPL runtime binaries which are typically considered
system libraries (and these fall under something similar to the Java
classpath exception rule)... so it's basically impossible to distribute
OpenBLAS the way you're suggesting, sorry. Indeed, there is work ongoing in
Spark