Thanks, I will try what you suggested.
Best,

On Wed, Sep 16, 2009 at 2:59 AM, Harish Mallipeddi <[email protected]> wrote:

> On Wed, Sep 16, 2009 at 12:54 PM, Anh Nguyen <[email protected]> wrote:
>
> > Hi all,
> >
> > I am having some trouble with distributing workload evenly to reducers.
> >
> > I have 25 reducers, and I intentionally created 25 different map output
> > keys so that each output set will go to one reducer.
> >
> > But in practice, some reducers get 2 sets and some do not get anything.
> >
> > I wonder if there is a way to fix this. Perhaps a custom map output
> > class?
> >
> > Any help is greatly appreciated.
>
> The default HashPartitioner does this:
>
>   (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
>
> So there's no guarantee your 25 different map-output keys would in fact
> end up in different partitions.
>
> Btw, if you want some custom partitioning behavior, just implement the
> Partitioner interface in your custom Partitioner class and supply that to
> Hadoop (via JobConf.setPartitionerClass).
>
> --
> Harish Mallipeddi
> http://blog.poundbang.in

--
----------------------------
Anh Nguyen
http://www.im-nguyen.com
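
A minimal sketch of the custom Partitioner Harish describes, using the old mapred API (consistent with the JobConf mention above). The class name OneKeyPerReducerPartitioner, the Text/IntWritable types, and the assumption that the 25 keys are the strings "0" through "24" are hypothetical, not from the thread:

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.Partitioner;

  // Hypothetical example: assumes the 25 map-output keys are the
  // strings "0" through "24" (as Text), so each key can be mapped
  // directly to one of the 25 reducers.
  public class OneKeyPerReducerPartitioner implements Partitioner<Text, IntWritable> {

    public int getPartition(Text key, IntWritable value, int numPartitions) {
      // Parse the key's integer id and use it as the partition number;
      // with 25 such keys and numPartitions == 25, every reducer
      // receives exactly one key.
      return Integer.parseInt(key.toString()) % numPartitions;
    }

    public void configure(JobConf job) {
      // No job configuration needed for this sketch.
    }
  }

Wired into the job setup, something like:

  JobConf conf = new JobConf(MyJob.class);  // MyJob is a placeholder driver class
  conf.setNumReduceTasks(25);
  conf.setPartitionerClass(OneKeyPerReducerPartitioner.class);

Note that getPartition must return a value in [0, numPartitions); the modulo guarantees that here even if a stray key falls outside 0-24.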
