I ran with what's in trunk as of this morning. I didn't dig in further to see where the extra time was coming from, but I can do so when I get some time soon.
On 7 March 2013 15:56, Sebastian Schelter <s...@apache.org> wrote:

> Hi Josh,
>
> Did you run the patch from the jira issue or did you run the trunk? I
> made some follow-up changes after uploading the patch. I can't imagine
> why those small changes would lead to an increase of 50% in the runtime.
>
> /s
>
> On 07.03.2013 15:02, Josh Devins wrote:
> > So the good news is that the patch runs ;) The bad news is that it's
> > slower, going from 1600-1800ms to ~2500ms to calculate a single
> > user's topK recommendations. For kicks, I ran a couple of other
> > experiments, progressively removing code to isolate the problem area.
> > Results are detailed here: https://gist.github.com/joshdevins/5106930
> >
> > Conclusions thus far:
> > * the patch is not helpful (for performance) and should be reverted
> >   or fixed again (sorry Sebastian)
> > * the dot product operation in `Vector` is not efficient enough for
> >   large vectors/matrices, when used as it is in the ALS
> >   `RecommenderJob`, inside a loop over `M`
> >
> > I've tried a few other experiments with Colt (for example) but there
> > was no noticeable gain. Parallelizing inside the map task (manually
> > or with Parallel Colt) is possible but obviously not ideal in an
> > environment like Hadoop -- this would save memory, since you only
> > need a few map tasks loading the matrices, but it isn't playing very
> > nicely within a shared cluster :)
> >
> > Next step at this point is to look at either reducing the number of
> > items to recommend over, LSH, or a third secret plan that "the PhDs"
> > are thinking about. Paper forthcoming, no doubt :D
> >
> > @Sebastian, happy to run any patches on our cluster/dataset before
> > making more commits.
> >
> > On 6 March 2013 20:58, Josh Devins <h...@joshdevins.com> wrote:
> >
> >> Got sidetracked today but I'll run Sebastian's version in trunk
> >> tomorrow and report back.
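[Editor's note: the hot loop under discussion can be sketched as below. This is a minimal, hypothetical reconstruction under stated assumptions — the method names (`dot`, `scoreAllItems`) and the plain `double[]` layout are illustrative, not Mahout's actual `Vector` API. It shows why the recommendation step is O(|M| × numFeatures) per user: one dense dot product against every row of the item-feature matrix M.]

```java
public class ScoreLoopSketch {

  // Hand-rolled dense dot product: the inner operation being profiled.
  // Mahout's Vector.dot() performs the same computation with more
  // abstraction/indirection, which is the suspected overhead for large
  // dense feature vectors.
  static double dot(double[] u, double[] m) {
    double sum = 0.0;
    for (int f = 0; f < u.length; f++) {
      sum += u[f] * m[f];
    }
    return sum;
  }

  // Score every item row of M for a single user: the loop over M that
  // dominates the recommendation step's runtime.
  static double[] scoreAllItems(double[] userFeatures, double[][] M) {
    double[] scores = new double[M.length];
    for (int item = 0; item < M.length; item++) {
      scores[item] = dot(userFeatures, M[item]);
    }
    return scores;
  }

  public static void main(String[] args) {
    double[] user = {1.0, 0.0, 2.0};
    double[][] M = {{1.0, 1.0, 1.0}, {2.0, 0.0, 2.0}};
    double[] scores = scoreAllItems(user, M);
    System.out.println(scores[0] + " " + scores[1]);  // prints "3.0 6.0"
  }
}
```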
> >> On 6 March 2013 17:07, Sebastian Schelter <s...@apache.org> wrote:
> >>
> >>> I already committed a fix in that direction. I modified our
> >>> FixedSizePriorityQueue to allow inspection of its head for direct
> >>> comparison. This obviates the need to instantiate a Comparable and
> >>> offer it to the queue.
> >>>
> >>> /s
> >>>
> >>> On 06.03.2013 17:01, Ted Dunning wrote:
> >>>> I would recommend against a mutable object on maintenance grounds.
> >>>>
> >>>> Better is to keep the threshold that a new score must meet and
> >>>> only construct the object when needed. That cuts the allocation
> >>>> down to negligible levels.
> >>>>
> >>>> On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen <sro...@gmail.com> wrote:
> >>>>
> >>>>> OK, that's reasonable on 35 machines. (You can probably turn up
> >>>>> to 70 reducers, as most machines can handle 2 reducers at once.)
> >>>>> I think the recommendation step loads one whole matrix into
> >>>>> memory. You're not running out of memory, but if you're turning
> >>>>> up the heap size to accommodate it, you might be hitting
> >>>>> swapping, yes. I think (?) the conventional wisdom is to turn
> >>>>> off swap for Hadoop.
> >>>>>
> >>>>> Sebastian, yes, that is probably a good optimization; I've had
> >>>>> good results reusing a mutable object in this context.
> >>>>>
> >>>>> On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins
> >>>>> <h...@joshdevins.com> wrote:
> >>>>>
> >>>>>> The factorization at 2 hours is kind of a non-issue (certainly
> >>>>>> fast enough). It was run with (if I recall correctly) 30
> >>>>>> reducers across a 35-node cluster, with 10 iterations.
> >>>>>>
> >>>>>> I was a bit shocked at how long the recommendation step took
> >>>>>> and will throw some timing debug in to see where the problem
> >>>>>> lies exactly.
> >>>>>> There were no other jobs running on the cluster during these
> >>>>>> attempts, but it's certainly possible that something is
> >>>>>> swapping or the like. I'll be looking more closely today before
> >>>>>> I start to consider other options for calculating the
> >>>>>> recommendations.
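[Editor's note: the allocation pattern Ted and Sebastian discuss upthread can be sketched as below. This is a hypothetical illustration, not Mahout's actual FixedSizePriorityQueue code — the `Candidate` class and `offerTopK` method are made-up names. The point is that a min-heap's head is the weakest of the current top-k, so comparing a raw score against the head first means a candidate object is only allocated when it will actually enter the top-k, which is Sebastian's head-inspection fix realizing Ted's threshold idea.]

```java
import java.util.PriorityQueue;

public class ThresholdSketch {

  // Immutable candidate object; in the wasteful version, one of these
  // would be allocated for every item just to be offered and rejected.
  static final class Candidate {
    final int item;
    final double score;
    Candidate(int item, double score) {
      this.item = item;
      this.score = score;
    }
  }

  // Min-heap keyed on score: the head is the weakest of the current
  // top-k, i.e. exactly the threshold a new score must beat.
  static void offerTopK(PriorityQueue<Candidate> heap, int k,
                        int item, double score) {
    if (heap.size() < k) {
      heap.offer(new Candidate(item, score));  // queue not yet full
    } else if (score > heap.peek().score) {    // inspect head first
      heap.poll();
      heap.offer(new Candidate(item, score));  // allocate only now
    }
    // else: the score can't enter the top-k, so nothing is allocated
  }

  public static void main(String[] args) {
    PriorityQueue<Candidate> heap =
        new PriorityQueue<>((a, b) -> Double.compare(a.score, b.score));
    double[] scores = {3.0, 6.0, 0.0, 7.0};
    for (int item = 0; item < scores.length; item++) {
      offerTopK(heap, 2, item, scores[item]);
    }
    // top-2 are items 3 (7.0) and 1 (6.0); the head is the weaker one
    System.out.println(heap.peek().item);  // prints "1"
  }
}
```

Note that only items 0, 1, and 3 trigger an allocation in this run; item 2 (score 0.0) is rejected by the head comparison alone.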