I've already committed a fix along those lines: I modified our FixedSizePriorityQueue to allow inspecting its head directly, so a candidate's score can be compared against the current minimum without first instantiating a Comparable just to offer it to the queue.
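Roughly the idea looks like this (a minimal sketch only, with illustrative names, not the actual FixedSizePriorityQueue API):

    import java.util.PriorityQueue;

    // Sketch: a bounded min-heap over scored items that exposes its head,
    // so a raw score can be checked against the current minimum before
    // any entry object is allocated.
    final class TopKQueue<T> {

      static final class Scored<T> {
        final T item;
        final double score;
        Scored(T item, double score) { this.item = item; this.score = score; }
      }

      private final int k;
      private final PriorityQueue<Scored<T>> heap =
          new PriorityQueue<>((a, b) -> Double.compare(a.score, b.score)); // min-heap

      TopKQueue(int k) { this.k = k; }

      /** Lowest-scored entry once the queue is full, else null. */
      Scored<T> peek() {
        return heap.size() < k ? null : heap.peek();
      }

      /** Inspect the head first; only allocate a Scored entry if it qualifies. */
      void offer(T item, double score) {
        Scored<T> head = peek();
        if (head == null) {
          heap.add(new Scored<>(item, score));   // still filling up
        } else if (score > head.score) {
          heap.poll();                           // evict current minimum
          heap.add(new Scored<>(item, score));
        }                                        // else: nothing allocated
      }
    }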
/s

On 06.03.2013 17:01, Ted Dunning wrote:
> I would recommend against a mutable object on maintenance grounds.
>
> Better is to keep the threshold that a new score must meet and only
> construct the object on need. That cuts the allocation down to negligible
> levels.
>
> On Wed, Mar 6, 2013 at 6:11 AM, Sean Owen <sro...@gmail.com> wrote:
>
>> OK, that's reasonable on 35 machines. (You can turn up to 70 reducers,
>> probably, as most machines can handle 2 reducers at once).
>> I think the recommendation step loads one whole matrix into memory. You're
>> not running out of memory but if you're turning up the heap size to
>> accommodate, you might be hitting swapping, yes. I think (?) the
>> conventional wisdom is to turn off swap for Hadoop.
>>
>> Sebastian yes that is probably a good optimization; I've had good results
>> reusing a mutable object in this context.
>>
>> On Wed, Mar 6, 2013 at 10:54 AM, Josh Devins <h...@joshdevins.com> wrote:
>>
>>> The factorization at 2 hours is kind of a non-issue (certainly fast
>>> enough). It was run with (if I recall correctly) 30 reducers across a
>>> 35-node cluster, with 10 iterations.
>>>
>>> I was a bit shocked at how long the recommendation step took and will
>>> throw some timing debug in to see where the problem lies exactly. There
>>> were no other jobs running on the cluster during these attempts, but
>>> it's certainly possible that something is swapping or the like. I'll be
>>> looking more closely today before I start to consider other options for
>>> calculating the recommendations.
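For reference, Ted's threshold suggestion quoted above would combine with the queue sketch roughly like this (again only a sketch; the candidate array and scoring function are hypothetical):

    import java.util.function.LongToDoubleFunction;

    // Track the admission bar as a primitive double so rejected candidates
    // cost neither an allocation nor a heap operation. TopKQueue is the
    // sketch from earlier in this thread.
    final class TopKRunner {
      static TopKQueue<Long> topK(long[] candidates, LongToDoubleFunction computeScore, int k) {
        TopKQueue<Long> queue = new TopKQueue<>(k);
        double threshold = Double.NEGATIVE_INFINITY;
        for (long itemID : candidates) {
          double score = computeScore.applyAsDouble(itemID);
          if (score <= threshold) {
            continue;                          // dropped without touching the queue
          }
          queue.offer(itemID, score);          // boxes the id only when it qualifies
          TopKQueue.Scored<Long> head = queue.peek();
          if (head != null) {
            threshold = head.score;            // tighten the bar once the queue is full
          }
        }
        return queue;
      }
    }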