It runs faster, but there are some drawbacks. It seems to consume more memory: I get a java.lang.OutOfMemoryError: Java heap space error if I don't have enough partitions for a fixed amount of memory. With the older (AMPCamp) implementation I didn't get this error for the same data size.
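As a rough back-of-envelope check of why the partition count matters here, the sketch below estimates the size of the dense block each partition holds for the 50000*1024 matrix from this thread. It assumes 8-byte doubles and rows split evenly across partitions; the function name and the partition counts are only illustrative, and real JVM heap usage will be higher because the local QR needs working copies.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: dense storage of 64-bit doubles


def block_megabytes(rows, cols, partitions):
    """Approximate size in MB of the dense block held by one partition.

    Hypothetical helper: assumes rows are split evenly across partitions
    and ignores JVM object overhead and temporary copies.
    """
    rows_per_partition = math.ceil(rows / partitions)
    return rows_per_partition * cols * BYTES_PER_DOUBLE / 1e6


if __name__ == "__main__":
    # A is 50000 x 1024, as in the original question.
    for p in (1, 4, 16, 64):
        print(p, "partitions ->", round(block_megabytes(50000, 1024, p), 1), "MB per block")
```

The per-partition block shrinks roughly linearly with the number of partitions, which is consistent with the error disappearing once there are enough partitions for a fixed heap size.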

On Thu, Mar 12, 2015 at 11:36 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> On Thu, Mar 12, 2015 at 3:05 PM, Jaonary Rabarisoa <jaon...@gmail.com>
> wrote:
>
>> In fact, by activating netlib with native libraries it goes faster.
>>
> Glad you got it work ! Better performance was one of the reasons we made
> the switch.
>
>> Thanks
>>
>> On Tue, Mar 10, 2015 at 7:03 PM, Shivaram Venkataraman <
>> shiva...@eecs.berkeley.edu> wrote:
>>
>>> There are a couple of differences between the ml-matrix implementation
>>> and the one used in AMPCamp
>>>
>>> - I think the AMPCamp one uses JBLAS which tends to ship native BLAS
>>> libraries along with it. In ml-matrix we switched to using Breeze + Netlib
>>> BLAS which is faster but needs some setup [1] to pick up native libraries.
>>> If native libraries are not found it falls back to a JVM implementation, so
>>> that might explain the slow down.
>>>
>>> - The other difference if you are comparing the whole image pipeline is
>>> that I think the AMPCamp version used NormalEquations which is around 2-3x
>>> faster (just in terms of number of flops) compared to TSQR.
>>>
>>> [1]
>>> https://github.com/fommil/netlib-java#machine-optimised-system-libraries
>>>
>>> Thanks
>>> Shivaram
>>>
>>> On Tue, Mar 10, 2015 at 9:57 AM, Jaonary Rabarisoa <jaon...@gmail.com>
>>> wrote:
>>>
>>>> I'm trying to play with the implementation of least square solver (Ax =
>>>> b) in mlmatrix.TSQR where A is a 50000*1024 matrix and b a 50000*10
>>>> matrix. It works but I notice
>>>> that it's 8 times slower than the implementation given in the latest
>>>> ampcamp :
>>>> http://ampcamp.berkeley.edu/5/exercises/image-classification-with-pipelines.html
>>>> . As far as I know these two implementations come from the same basis.
>>>> What is the difference between these two codes ?
>>>>
>>>> On Tue, Mar 3, 2015 at 8:02 PM, Shivaram Venkataraman <
>>>> shiva...@eecs.berkeley.edu> wrote:
>>>>
>>>>> There are couple of solvers that I've written that is part of the
>>>>> AMPLab ml-matrix repo [1,2]. These aren't part of MLLib yet though and if
>>>>> you are interested in porting them I'd be happy to review it
>>>>>
>>>>> Thanks
>>>>> Shivaram
>>>>>
>>>>> [1]
>>>>> https://github.com/amplab/ml-matrix/blob/master/src/main/scala/edu/berkeley/cs/amplab/mlmatrix/TSQR.scala
>>>>> [2]
>>>>> https://github.com/amplab/ml-matrix/blob/master/src/main/scala/edu/berkeley/cs/amplab/mlmatrix/NormalEquations.scala
>>>>>
>>>>> On Tue, Mar 3, 2015 at 9:01 AM, Jaonary Rabarisoa <jaon...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Is there a least square solver based on DistributedMatrix that we can
>>>>>> use out of the box in the current (or the master) version of spark ?
>>>>>> It seems that the only least square solver available in spark is
>>>>>> private to recommender package.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Jao