On Wed, Sep 16, 2009 at 12:54 PM, Anh Nguyen <[email protected]> wrote:

> Hi all,
>
> I am having some trouble with distributing workload evenly to reducers.
>
> I have 25 reducers and I intentionally created 25 different Map output keys
> so that each output set will go to one Reducer.
>
> But in practice, some Reducers get 2 sets and some do not get anything.
>
> I wonder if there is a way to fix this. Perhaps a custom Map output class?
>
> Any help is greatly appreciated.
>
>
The default HashPartitioner does this: (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks

So there's no guarantee that your 25 distinct map-output keys end up in 25
distinct partitions: two keys can hash to the same partition, which is exactly
why some of your reducers receive two key groups while others receive none.
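
Here's a minimal illustration in plain Java (the "keyN" names are hypothetical
stand-ins for your actual keys; note that Hadoop's Text.hashCode() hashes the
UTF-8 bytes rather than delegating to String.hashCode(), but the collision
behavior is analogous):

public class PartitionCheck {
    public static void main(String[] args) {
        int numReduceTasks = 25;
        for (int i = 0; i < 25; i++) {
            String key = "key" + i;  // hypothetical map-output key
            // Same formula the default HashPartitioner uses:
            int partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
            System.out.println(key + " -> partition " + partition);
        }
    }
}

Run it and you'll likely see several keys share a partition while a few
partitions get nothing at all.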
Btw, if you want custom partitioning behavior, just implement the Partitioner
interface in a class of your own and supply it to Hadoop via
JobConf.setPartitionerClass.
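
For example, here's a minimal sketch against the old mapred API, assuming
(purely for illustration) that your map-output keys are IntWritable values in
the range 0..24 and your values are Text; substitute your real types:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class OneKeyPerReducerPartitioner implements Partitioner<IntWritable, Text> {

    public int getPartition(IntWritable key, Text value, int numPartitions) {
        // With keys 0..24 and 25 reducers, every key gets its own partition.
        return key.get() % numPartitions;
    }

    public void configure(JobConf job) {
        // No per-job configuration needed for this sketch.
    }
}

Then wire it up in your job setup with
conf.setPartitionerClass(OneKeyPerReducerPartitioner.class) alongside
conf.setNumReduceTasks(25), and each key will land on its own reducer.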


-- 
Harish Mallipeddi
http://blog.poundbang.in
