Sean, please also add a benchmark in integration so that we can track the progression.
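Something as simple as the sketch below would do; the class name and the factorization call are placeholders for whatever solver variant we want to track, not an existing integration test:

    public class AlsTimingBenchmark {

      public static void main(String[] args) {
        int iterations = 10;
        long totalStart = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
          long iterationStart = System.nanoTime();
          runSingleAlsIteration(); // placeholder for the sweep under test
          System.out.printf("iteration %d took %.1f s%n",
              i, (System.nanoTime() - iterationStart) / 1e9);
        }
        System.out.printf("total: %.1f s%n",
            (System.nanoTime() - totalStart) / 1e9);
      }

      private static void runSingleAlsIteration() {
        // plug in the solver under test here, e.g. the JBlas-backed solve
      }
    }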
Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Thu, Apr 18, 2013 at 4:12 PM, Sebastian Schelter <s...@apache.org> wrote:
> Let us know the results! :)
>
> I think in the case of ALS, we can even use Solve.solveSymmetric()
>
> Best,
> Sebastian
>
> On 18.04.2013 23:07, Sean Owen wrote:
> > Good lead -- from
> > https://github.com/mikiobraun/jblas/blob/master/src/main/java/org/jblas/Solve.java
> > it looks like it's an SVD. Definitely took a search to figure out what
> > 'gelsd' does in LAPACK! I'll see if I can test-drive this too to see
> > if it bumps performance. That would be great, JNI is a much smaller
> > requirement than a GPU!
> >
> > On Thu, Apr 18, 2013 at 10:01 PM, Sebastian Schelter <s...@apache.org> wrote:
> >> Hi Sean,
> >>
> >> I simply used the Solve.solve() method, I guess it uses a QR
> >> decomposition internally. I can provide a copy of the code if you want
> >> to have a look.
> >>
> >> Best,
> >> Sebastian
> >>
> >> On 18.04.2013 22:56, Sean Owen wrote:
> >>> I'm always interested in optimizing the bit where you solve Ax=B, which
> >>> I so recently went on about. That's a dense-matrix problem. Is there a
> >>> QR decomposition available?
> >>>
> >>> I tried getting this part to run on a GPU, and it worked, but wasn't
> >>> faster. Somehow it was still slower to push the smallish dense matrix
> >>> onto the card so many times per second. The same issue is identified here,
> >>> so I'm interested to hear whether the direct-buffer approach makes it a win.
> >>>
> >>> On Thu, Apr 18, 2013 at 9:51 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >>>> I looked at jblas some time ago, a year or two back.
> >>>>
> >>>> It's a fast bridge to LAPACK, and LAPACK is very hard to beat. But I
> >>>> think I convinced myself that it lacks support for sparse stuff. It will
> >>>> still work nicely, though, for many blockified algorithms such as ALS-WR
> >>>> that try to avoid doing BLAS level 3 operations on sparse data.
> >>>>
> >>>>
> >>>> On Thu, Apr 18, 2013 at 1:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
> >>>>
> >>>>> BTW did this include the changes I made in trunk recently? I would also
> >>>>> like to profile that code and see if we can squeeze more out of our
> >>>>> Vectors and Matrices. Could you point me to how I can run the 1M example?
> >>>>>
> >>>>> Robin
> >>>>>
> >>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
> >>>>>
> >>>>>
> >>>>> On Thu, Apr 18, 2013 at 3:43 PM, Robin Anil <robin.a...@gmail.com> wrote:
> >>>>>
> >>>>>> I was just emailing something similar on Mahout (see my email). I saw
> >>>>>> the TU Berlin name and I thought you would do something about it :)
> >>>>>> This is excellent. One of the next-generation pieces of work on Vectors
> >>>>>> might be to investigate this.
> >>>>>>
> >>>>>>
> >>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Apr 18, 2013 at 3:37 PM, Sebastian Schelter <s...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi there,
> >>>>>>>
> >>>>>>> With regard to Robin mentioning JBlas [1] recently when we talked
> >>>>>>> about the performance of our vector operations, I ported the solving
> >>>>>>> code for ALS to JBlas today and got some awesome results.
> >>>>>>>
> >>>>>>> For the movielens 1M dataset and a factorization of rank 100, the
> >>>>>>> runtimes per iteration dropped from 50 seconds to less than 7 seconds.
> >>>>>>> I will run some tests with the distributed version and larger datasets
> >>>>>>> in the next few days, but from what I've seen we should really take a
> >>>>>>> closer look at JBlas, at least for operations on dense matrices.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Sebastian
> >>>>>>>
> >>>>>>> [1] http://mikiobraun.github.io/jblas/
> >>>>>>
> >>>>>
> >>
> >
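For anyone who wants to play with this before Sebastian posts his code, here is a rough sketch of what the JBlas-based per-user solve could look like. The regularized normal-equations form is my guess at the ALS-WR update being ported, and the class and method names are only illustrative:

    import org.jblas.DoubleMatrix;
    import org.jblas.Solve;

    public class JBlasAlsSketch {

      // Solves the regularized normal equations (Y'Y + lambda * n * I) x = Y'r
      // for one user, where Y holds the feature vectors of the n items the
      // user rated and r holds the corresponding ratings.
      static DoubleMatrix solveUser(DoubleMatrix Y, DoubleMatrix r, double lambda) {
        int rank = Y.columns;
        int n = Y.rows;
        DoubleMatrix A = Y.transpose().mmul(Y)
            .addi(DoubleMatrix.eye(rank).muli(lambda * n));
        DoubleMatrix b = Y.transpose().mmul(r);
        // general solve; since A is symmetric positive definite,
        // Solve.solveSymmetric(A, b) should work here as well
        return Solve.solve(A, b);
      }
    }

Solve.solve() goes through a general factorization, while Solve.solveSymmetric() exploits the symmetry of Y'Y, which is presumably why Sebastian suggests it for the ALS case.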