Hi all,

I ran a very quick comparison between SystemML's LibMatrixMult and Breeze matrix multiplication using native BLAS (OpenBLAS through netlib-java). In this very small comparison I see a performance difference for dense-dense matrices of size 1000 x 1000 (our default blocksize), with Breeze being about 5-6 times faster. The code I used can be found here: https://github.com/fschueler/incubator-systemml/blob/model_types/src/test/scala/org/apache/sysml/api/linalg/layout/local/SystemMLLocalBackendTest.scala

Running this code with 50 iterations each gives me, for example, average times of:
Breeze:     49.74 ms
SystemML:  363.44 ms
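
To make the measurement setup concrete, here is a minimal sketch of a timing loop of this kind: a few untimed warm-up runs, then the average wall-clock time over a fixed number of iterations. This is not the code from the linked test. The names `naive_matmul`, `average_ms`, `speedup` and `random_matrix` are hypothetical, and the plain-loop multiply only stands in for whichever kernel is being measured (it is neither LibMatrixMult nor a BLAS call). The matrix size is kept small so the sketch runs quickly.

```python
import random
import time


def naive_matmul(a, b):
    """Dense multiply with plain loops; a stand-in for the kernel under test."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        row_a, row_c = a[i], c[i]
        for p in range(k):
            a_ip = row_a[p]
            row_b = b[p]
            for j in range(m):
                row_c[j] += a_ip * row_b[j]
    return c


def average_ms(fn, iterations=50, warmup=5):
    """Average wall-clock time of fn() in milliseconds, after untimed warm-up runs."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iterations


def speedup(slow_ms, fast_ms):
    """Ratio of the two average times: how many times faster the fast variant is."""
    return slow_ms / fast_ms


def random_matrix(rows, cols, rng):
    """Dense matrix of uniform random values as a list of rows."""
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(42)
    n = 32  # assumption: kept small here; the comparison above used 1000 x 1000
    a, b = random_matrix(n, n, rng), random_matrix(n, n, rng)
    avg = average_ms(lambda: naive_matmul(a, b), iterations=5, warmup=1)
    print(f"naive multiply: {avg:.2f} ms on average")
    # Ratio of the two example averages quoted above.
    print(f"ratio of the example averages: {speedup(363.44, 49.74):.1f}x")
```

For the two example averages quoted above, the ratio 363.44 / 49.74 works out to roughly 7.3x for that particular run. On the JVM, warm-up runs matter even more than in this sketch because of JIT compilation, so a dedicated harness would be the safer choice for the more advanced benchmarks.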

I am not claiming this holds for every operation, but these results support the hypothesis that native BLAS can give a significant speedup for certain operations, which is worth testing with more advanced benchmarks.

Btw: I am definitely not saying we should use Breeze here. I am looking more at native BLAS and LAPACK implementations in general (as provided by OpenBLAS, MKL, etc.).

Let me know what you think!
Felix
