Re: Top-N recommendations from SVD

2013-03-07 Thread Josh Devins
So the good news is that the patch runs ;) The bad news is that it's slower, going from 1600-1800ms to ~2500ms to calculate a single user's top-k recommendations. For kicks, I ran a couple of other experiments, progressively removing code to isolate the problem area. Results are detailed here:

Re: Top-N recommendations from SVD

2013-03-07 Thread Sebastian Schelter
I've created a small benchmark to play around with the way the dot product is computed. It tries to mimic Josh's use case: multiplying 2.5M dense item vectors of size 20 by a dense user vector of size 20. https://gist.github.com/sscdotopen/5108988 I compared using the .dot() method from the
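For readers who don't want to chase the gist, here is a rough sketch of that kind of micro-benchmark. All class and method names are my own, not the gist's, and the item count is scaled down from 2.5M so it finishes quickly; the point is scoring many dense size-20 item vectors against one user vector with a raw-array dot product.

```java
import java.util.Random;

public class DotBenchmark {

    // Plain-array dot product: a tight loop the JIT can inline and unroll,
    // with no virtual dispatch per element.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int numItems = 250_000;  // scaled down from 2.5M for a quick run
        int numFeatures = 20;
        Random random = new Random(42);

        double[][] items = new double[numItems][numFeatures];
        double[] user = new double[numFeatures];
        for (int f = 0; f < numFeatures; f++) {
            user[f] = random.nextDouble();
        }
        for (double[] item : items) {
            for (int f = 0; f < numFeatures; f++) {
                item[f] = random.nextDouble();
            }
        }

        // Time one full pass of user-against-all-items scoring.
        long start = System.nanoTime();
        double checksum = 0.0;
        for (double[] item : items) {
            checksum += dot(item, user);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum=" + checksum + ", elapsed=" + elapsedMs + "ms");
    }
}
```

The checksum is printed only to keep the JIT from dead-code-eliminating the loop.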

Re: Top-N recommendations from SVD

2013-03-07 Thread Sebastian Schelter
Hi Josh, I made another attempt today. It directly computes the dot products, introduces a mutable version of RecommendedItem and uses Lucene's PriorityQueue to keep the top k. I hope this gives you some improvements. Here's the patch (must be applied against trunk):
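The patch itself isn't reproduced in the archive, but the approach it describes — direct dot products feeding a size-bounded min-heap of the top k — can be sketched roughly like this. This stand-in uses java.util.PriorityQueue in place of Lucene's PriorityQueue and a tiny immutable holder instead of a mutable RecommendedItem; all names are illustrative.

```java
import java.util.PriorityQueue;

public class TopK {

    static final class Scored {
        final int itemId;
        final double score;
        Scored(int itemId, double score) { this.itemId = itemId; this.score = score; }
    }

    // Returns the ids of the k highest-scoring item vectors for one user,
    // best first.
    static int[] topK(double[][] itemVectors, double[] userVector, int k) {
        // Min-heap ordered by score: the head is the weakest of the current top k.
        PriorityQueue<Scored> heap =
            new PriorityQueue<>(k, (a, b) -> Double.compare(a.score, b.score));
        for (int item = 0; item < itemVectors.length; item++) {
            // Direct dot product, no per-element method dispatch.
            double score = 0.0;
            double[] v = itemVectors[item];
            for (int f = 0; f < v.length; f++) {
                score += v[f] * userVector[f];
            }
            if (heap.size() < k) {
                heap.offer(new Scored(item, score));
            } else if (score > heap.peek().score) {
                heap.poll();
                heap.offer(new Scored(item, score));
            }
        }
        // Polling a min-heap yields ascending scores; fill from the back
        // so the result comes out best-first.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll().itemId;
        }
        return result;
    }
}
```

With k=100 and millions of items, almost every candidate fails the heap.peek() comparison, so the per-item cost stays close to the raw dot product.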

Re: Top-N recommendations from SVD

2013-03-07 Thread Josh Devins
I'm running a job right now that uses your static `dot` method from your previous post, on top of v0.7 (nothing from trunk). This has cut the time down to about a third, but it's still around 500ms per user. I'll give your latest patch a go hopefully tomorrow and report back. We're working on another

Re: Top-N recommendations from SVD

2013-03-06 Thread Ted Dunning
Well, it would definitely not be the first time I counted incorrectly. Any time I do arithmetic, the result should be considered suspect. I do think my numbers are correct, but then again, I always do. But the OP did say 20 dimensions, which gives me back 5x. Inclusion of learning time is a

Re: Top-N recommendations from SVD

2013-03-06 Thread Josh Devins
So the 80-hour estimate is _only_ for the U*M' top-n calculation and not the factorization. Factorization is on the order of 2 hours. For the interested, here's the pertinent code from the ALS `RecommenderJob`:

Re: Top-N recommendations from SVD

2013-03-06 Thread Sebastian Schelter
Hi Josh, The factorization should be quite a bit faster with the current trunk, as we reworked the QR decomposition used for solving the least squares problems of ALS. I think we can also remove a lot of object instantiations in ParallelALSFactorizationJob. /s On 06.03.2013 11:25, Josh Devins

Re: Top-N recommendations from SVD

2013-03-06 Thread Sean Owen
Yeah that's right, he said 20 features, oops. And yes he says he's talking about the recs only too, so that's not right either. That seems way too long relative to factorization. And the factorization seems quite fast; how many machines, and how many iterations? I thought the shape of the

Re: Top-N recommendations from SVD

2013-03-06 Thread Josh Devins
The factorization at 2 hours is kind of a non-issue (certainly fast enough). It was run with (if I recall correctly) 30 reducers across a 35-node cluster, with 10 iterations. I was a bit shocked at how long the recommendation step took and will throw some timing debug in to see where the problem

Re: Top-N recommendations from SVD

2013-03-06 Thread Sebastian Schelter
Btw, all important jobs in ALS are map-only, so it's the number of map slots that counts. On 06.03.2013 12:11, Sean Owen wrote: OK, that's reasonable on 35 machines. (You can turn up to 70 reducers, probably, as most machines can handle 2 reducers at once). I think the recommendation step

Re: Top-N recommendations from SVD

2013-03-06 Thread Sean Owen
OK and he mentioned that 10 mappers were running, when it ought to be able to use several per machine. The # of mappers is a function of the input size really, so probably needs to turn down the max file split size to induce more mappers? On Wed, Mar 6, 2013 at 11:16 AM, Sebastian Schelter

Re: Top-N recommendations from SVD

2013-03-06 Thread Ted Dunning
I would recommend against a mutable object on maintenance grounds. Better is to keep the threshold that a new score must meet and only construct the object when needed. That cuts the allocation down to negligible levels. On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen sro...@gmail.com wrote: OK, that's
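A hypothetical sketch of that pattern (none of this is Mahout code): keep a min-heap of the current top k, treat its head score as the threshold, and pass primitives into offer() so an object is only constructed for the rare candidate that beats it. A static counter is added purely to make the allocation behavior visible.

```java
import java.util.PriorityQueue;

public class LazyTopK {

    static int allocations = 0;  // instrumentation: counts Entry constructions

    static final class Entry {
        final int itemId;
        final double score;
        Entry(int itemId, double score) {
            this.itemId = itemId;
            this.score = score;
            allocations++;
        }
    }

    private final int k;
    private final PriorityQueue<Entry> heap;

    LazyTopK(int k) {
        this.k = k;
        this.heap = new PriorityQueue<>(k, (a, b) -> Double.compare(a.score, b.score));
    }

    // Primitives in: nothing is allocated for candidates below the threshold.
    void offer(int itemId, double score) {
        if (heap.size() < k) {
            heap.offer(new Entry(itemId, score));
        } else if (score > heap.peek().score) {  // threshold check before allocating
            heap.poll();
            heap.offer(new Entry(itemId, score));
        }
    }

    // Score of the weakest item currently kept, i.e. the threshold to beat.
    double worstKeptScore() {
        return heap.peek().score;
    }
}
```

Once the heap is full, a pass over millions of items allocates only as many objects as there are candidates that actually improve the top k.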

Re: Top-N recommendations from SVD

2013-03-06 Thread Ted Dunning
That sounds way too long. How is the U matrix stored? What type? On Wed, Mar 6, 2013 at 6:44 AM, Josh Devins h...@joshdevins.com wrote: First bit of feedback. The `M.forEachPair` loop is about 1600-1800 millis per user (recall the size is ~2.6M users x ~2.8M items). There doesn't appear to

Re: Top-N recommendations from SVD

2013-03-06 Thread Sean Owen
That too, even better. Isn't that already done? Could be in one place but not another. IIRC there were also cases where it was a lot easier to pass around an object internally and mutability solved the performance issue, without much risk since it was only internal. You can (nay, must) always copy

Re: Top-N recommendations from SVD

2013-03-06 Thread Sebastian Schelter
I already committed a fix in that direction. I modified our FixedSizePriorityQueue to allow inspection of its head for direct comparison. This obviates the need to instantiate a Comparable and offer it to the queue. /s On 06.03.2013 17:01, Ted Dunning wrote: I would recommend against a mutable

Re: Top-N recommendations from SVD

2013-03-06 Thread Josh Devins
Got sidetracked today but I'll run Sebastian's version in trunk tomorrow and report back. On 6 March 2013 17:07, Sebastian Schelter s...@apache.org wrote: I already committed a fix in that direction. I modified our FixedSizePriorityQueue to allow inspection of its head for direct comparison.

Top-N recommendations from SVD

2013-03-05 Thread Josh Devins
Hi all, I have a conceptually simple problem. A user-item matrix, A, whose dimensions are ~2.6M rows x ~2.8M cols (~65M non-zeros). Running ALS with 20 features reduces this in the usual way to A = UM'. Trying to generate top-n (where n=100) recommendations for all users in U is quite a long

Re: Top-N recommendations from SVD

2013-03-05 Thread Sean Owen
Without any tricks, yes you have to do this much work to really know which are the largest values in UM' for every row. There's not an obvious twist that speeds it up. (Do you really want to compute all user recommendations? how many of the 2.6M are likely to be active soon, or, ever?) First,

Re: Top-N recommendations from SVD

2013-03-05 Thread Josh Devins
Thanks Sean, at least I know I'm mostly on the right track ;) So in our case (a large, social, consumer website), this is already a small subset of all users (and items for that matter) and is really only the active users. In fact, in future iterations, the number of users will likely grow by

Re: Top-N recommendations from SVD

2013-03-05 Thread Sean Owen
Ah OK, so this is quite a big problem. Still, it is quite useful to be able to make recommendations in real-time, or near-real-time. It saves the relatively quite large cost of precomputing, and lets you respond immediately to new data. If the site has a lot of occasional or new users, that can

Re: Top-N recommendations from SVD

2013-03-05 Thread Ted Dunning
Hmm... each user's recommendations cost about 2.8M items x 20 flops each = ~60M flops. You should get about a Gflop per core in Java, so this should take about 60 ms. You can make this faster with more cores or by using ATLAS. Are you expecting 3 million unique people every 80 hours? If no, then it is
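The back-of-the-envelope above can be checked mechanically, assuming (as Ted does) roughly one fused multiply-add counted per feature per item and ~1 Gflop/s per core; the names here are illustrative.

```java
public class FlopEstimate {

    // Milliseconds to score one user against all items, counting one
    // multiply-add per feature per item.
    static double millisPerUser(long numItems, int numFeatures, double flopsPerSecond) {
        double flops = (double) numItems * numFeatures;
        return flops / flopsPerSecond * 1000.0;
    }

    public static void main(String[] args) {
        // 2.8M items x 20 features = 56M ≈ 60M flops; at 1 Gflop/s ≈ 56 ms.
        System.out.println(millisPerUser(2_800_000L, 20, 1e9) + " ms per user");
    }
}
```

At that rate, 2.6M users x ~56 ms is on the order of 40 core-hours, which is why the observed 80 wall-clock hours looked suspiciously slow.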