On Fri, Mar 27, 2009 at 4:39 PM, Sid123 <itis...@gmail.com> wrote:

> But I was thinking of grouping the values and generating a key using a
> random number generator in the collector of the mapper. The values will now
> be uniformly distributed over a few keys. Say the number of keys will be
> 0.1% of the # of values or at least 1, whichever is higher. So if there are
> 20000 values, 2000-odd values should be under a single key, and 10 reducers
> should spawn to do the sum in parallel. Now I can at least run 10 sums in
> parallel rather than just 1 reducer doing the whole work. How does that
> theory seem?
>

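What you want to do is write a combiner, which is essentially a reducer that
runs on the output of each map task before it is sent to the main reducer.
Then the real reducer gets roughly one partial sum per map task instead of
every raw value. A sum is associative and commutative, so the same reduce
logic can serve as the combiner; just don't rely on it running exactly once,
since Hadoop treats the combiner as an optimization.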
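
To make the idea concrete, here is a rough sketch in plain Python. It only simulates the data flow and does not use the Hadoop API; the function names (map_phase, combine, reduce_phase, run_job), the "sum" key, and the three input splits are all made up for illustration.

```python
from collections import defaultdict

def map_phase(values, key="sum"):
    """Emit every value under one shared key, as a mapper feeding a global sum would."""
    return [(key, v) for v in values]

def combine(pairs):
    """Combiner: collapse one map task's output to a single partial sum per key."""
    partial = defaultdict(int)
    for k, v in pairs:
        partial[k] += v
    return list(partial.items())

def reduce_phase(pairs):
    """Final reducer: add up whatever partial sums arrive for each key."""
    totals = defaultdict(int)
    for k, v in pairs:
        totals[k] += v
    return dict(totals)

def run_job(splits):
    """Run map + combine per split, then one reduce over the combined output."""
    shuffled = []
    for split in splits:
        shuffled.extend(combine(map_phase(split)))
    return shuffled, reduce_phase(shuffled)

# Hypothetical input: 20000 values spread over three map tasks.
splits = [list(range(0, 5000)), list(range(5000, 10000)), list(range(10000, 20000))]
shuffled, totals = run_job(splits)

print(len(shuffled))   # number of records the reducer receives (one per map task)
print(totals["sum"])   # the overall sum
```

With the combiner in place the single reducer sees three records instead of 20000, so there is little left to parallelize on the reduce side and no need for random keys.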
