There are a couple of differences between the ml-matrix implementation and the one used in AMPCamp:
- I think the AMPCamp one uses JBLAS, which tends to ship native BLAS libraries along with it. In ml-matrix we switched to Breeze + Netlib BLAS, which is faster but needs some setup [1] to pick up native libraries. If native libraries are not found, it falls back to a JVM implementation, so that might explain the slowdown.
- The other difference, if you are comparing the whole image pipeline, is that I think the AMPCamp version used NormalEquations, which is around 2-3x faster (just in terms of number of flops) compared to TSQR.

[1] https://github.com/fommil/netlib-java#machine-optimised-system-libraries

Thanks
Shivaram

On Tue, Mar 10, 2015 at 9:57 AM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:

> I'm trying to play with the implementation of the least squares solver (Ax = b)
> in mlmatrix.TSQR, where A is a 50000*1024 matrix and b a 50000*10 matrix.
> It works, but I notice that it's 8 times slower than the implementation
> given in the latest ampcamp:
> http://ampcamp.berkeley.edu/5/exercises/image-classification-with-pipelines.html
> As far as I know these two implementations come from the same basis.
> What is the difference between these two codes?
>
> On Tue, Mar 3, 2015 at 8:02 PM, Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
>> There are a couple of solvers that I've written that are part of the AMPLab
>> ml-matrix repo [1,2]. These aren't part of MLlib yet though, and if you are
>> interested in porting them I'd be happy to review it.
>>
>> Thanks
>> Shivaram
>>
>> [1]
>> https://github.com/amplab/ml-matrix/blob/master/src/main/scala/edu/berkeley/cs/amplab/mlmatrix/TSQR.scala
>> [2]
>> https://github.com/amplab/ml-matrix/blob/master/src/main/scala/edu/berkeley/cs/amplab/mlmatrix/NormalEquations.scala
>>
>> On Tue, Mar 3, 2015 at 9:01 AM, Jaonary Rabarisoa <jaon...@gmail.com>
>> wrote:
>>
>>> Dear all,
>>>
>>> Is there a least squares solver based on DistributedMatrix that we can
>>> use out of the box in the current (or the master) version of Spark?
>>> It seems that the only least squares solver available in Spark is private
>>> to the recommender package.
>>>
>>> Cheers,
>>>
>>> Jao
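
As a rough sanity check on the flop comparison between NormalEquations and TSQR mentioned in this thread, here is a minimal sketch using the standard leading-order textbook flop counts for a dense n x d least squares problem. It is not taken from ml-matrix: the function names are made up for illustration, it counts only the factorization work (not applying the result to b), and it treats TSQR as a single Householder QR, ignoring the extra work of combining the per-partition R factors. The real gap is therefore likely somewhat larger than what this prints.

```python
def normal_equations_flops(n: int, d: int) -> float:
    """Leading-order flops: form A^T A (n*d^2, using symmetry) plus Cholesky (d^3/3)."""
    return n * d**2 + d**3 / 3


def householder_qr_flops(n: int, d: int) -> float:
    """Leading-order flops for Householder QR of an n x d matrix: 2*n*d^2 - (2/3)*d^3."""
    return 2 * n * d**2 - 2 * d**3 / 3


# Shape of A from the question in this thread (50000 x 1024).
n, d = 50_000, 1024
ratio = householder_qr_flops(n, d) / normal_equations_flops(n, d)
print(f"QR / normal equations flop ratio: {ratio:.2f}")  # about 1.97
```

For n much larger than d the ratio approaches 2, which lines up with the lower end of the 2-3x estimate. It does not account for an 8x slowdown on its own, so the native BLAS setup is still worth checking first.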