Sean, please also add a benchmark in integration so that we can track the progression.
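Something as simple as the sketch below would do; the class name and the factorization call are placeholders for whatever solver variant we want to track, not an existing integration test:

    public class AlsTimingBenchmark {

      public static void main(String[] args) {
        int iterations = 10;
        long totalStart = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
          long iterationStart = System.nanoTime();
          runSingleAlsIteration(); // placeholder for the sweep under test
          System.out.printf("iteration %d took %.1f s%n",
              i, (System.nanoTime() - iterationStart) / 1e9);
        }
        System.out.printf("total: %.1f s%n",
            (System.nanoTime() - totalStart) / 1e9);
      }

      private static void runSingleAlsIteration() {
        // plug in the solver under test here, e.g. the JBlas-backed solve
      }
    }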
Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Thu, Apr 18, 2013 at 4:12 PM, Sebastian Schelter <s...@apache.org> wrote:
> Let us know the results! :)
>
> I think in the case of ALS, we can even use Solve.solveSymmetric()
>
> Best,
> Sebastian
>
> On 18.04.2013 23:07, Sean Owen wrote:
> > Good lead -- from
> > https://github.com/mikiobraun/jblas/blob/master/src/main/java/org/jblas/Solve.java
> > it looks like it's an SVD. Definitely took a search to figure out what
> > 'gelsd' does in LAPACK! I'll see if I can test-drive this too to see
> > if it bumps performance. That would be great, JNI is a much smaller
> > requirement than a GPU!
> >
> > On Thu, Apr 18, 2013 at 10:01 PM, Sebastian Schelter <s...@apache.org> wrote:
> >> Hi Sean,
> >>
> >> I simply used the Solve.solve() method, I guess it uses a QR
> >> decomposition internally. I can provide a copy of the code if you want
> >> to have a look.
> >>
> >> Best,
> >> Sebastian
> >>
> >> On 18.04.2013 22:56, Sean Owen wrote:
> >>> I'm always interested in optimizing the bit where you solve Ax=B, which
> >>> I so recently went on about. That's a dense-matrix problem. Is there a
> >>> QR decomposition available?
> >>>
> >>> I tried getting this part to run on a GPU, and it worked, but wasn't
> >>> faster. Somehow it was still slower to push the smallish dense matrix
> >>> onto the card so many times per second. The same issue is identified here,
> >>> so I'm interested to hear whether the direct-buffer approach makes it a win.
> >>>
> >>> On Thu, Apr 18, 2013 at 9:51 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >>>> I looked at jblas some time ago, a year or two back.
> >>>>
> >>>> It's a fast bridge to LAPACK, and LAPACK is very hard to beat. But I
> >>>> think I convinced myself that it lacks support for sparse stuff. It will
> >>>> still work nicely, though, for many blockified algorithms such as ALS-WR
> >>>> that try to avoid doing BLAS level 3 operations on sparse data.
> >>>>
> >>>>
> >>>> On Thu, Apr 18, 2013 at 1:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
> >>>>
> >>>>> BTW did this include the changes I made in trunk recently? I would also
> >>>>> like to profile that code and see if we can squeeze more out of our
> >>>>> Vectors and Matrices. Could you point me to how I can run the 1M example?
> >>>>>
> >>>>> Robin
> >>>>>
> >>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
> >>>>>
> >>>>>
> >>>>> On Thu, Apr 18, 2013 at 3:43 PM, Robin Anil <robin.a...@gmail.com> wrote:
> >>>>>
> >>>>>> I was just emailing something similar on Mahout (see my email). I saw
> >>>>>> the TU Berlin name and I thought you would do something about it :)
> >>>>>> This is excellent. One of the next-generation pieces of work on Vectors
> >>>>>> might be to investigate this.
> >>>>>>
> >>>>>>
> >>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Apr 18, 2013 at 3:37 PM, Sebastian Schelter <s...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi there,
> >>>>>>>
> >>>>>>> With regard to Robin mentioning JBlas [1] recently when we talked
> >>>>>>> about the performance of our vector operations, I ported the solving
> >>>>>>> code for ALS to JBlas today and got some awesome results.
> >>>>>>>
> >>>>>>> For the movielens 1M dataset and a factorization of rank 100, the
> >>>>>>> runtimes per iteration dropped from 50 seconds to less than 7 seconds.
> >>>>>>> I will run some tests with the distributed version and larger datasets
> >>>>>>> in the next few days, but from what I've seen we should really take a
> >>>>>>> closer look at JBlas, at least for operations on dense matrices.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Sebastian
> >>>>>>>
> >>>>>>> [1] http://mikiobraun.github.io/jblas/
> >>>>>>
> >>>>>
> >>
> >
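For anyone who wants to play with this before Sebastian posts his code, here is a rough sketch of what the JBlas-based per-user solve could look like. The regularized normal-equations form is my guess at the ALS-WR update being ported, and the class and method names are only illustrative:

    import org.jblas.DoubleMatrix;
    import org.jblas.Solve;

    public class JBlasAlsSketch {

      // Solves the regularized normal equations (Y'Y + lambda * n * I) x = Y'r
      // for one user, where Y holds the feature vectors of the n items the
      // user rated and r holds the corresponding ratings.
      static DoubleMatrix solveUser(DoubleMatrix Y, DoubleMatrix r, double lambda) {
        int rank = Y.columns;
        int n = Y.rows;
        DoubleMatrix A = Y.transpose().mmul(Y)
            .addi(DoubleMatrix.eye(rank).muli(lambda * n));
        DoubleMatrix b = Y.transpose().mmul(r);
        // general solve; since A is symmetric positive definite,
        // Solve.solveSymmetric(A, b) should work here as well
        return Solve.solve(A, b);
      }
    }

Solve.solve() goes through a general factorization, while Solve.solveSymmetric() exploits the symmetry of Y'Y, which is presumably why Sebastian suggests it for the ALS case.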