I'm (unfortunately) aware of this, and it isn't the issue here. My key object
contains only long, int, and String values.
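
To be concrete about what I've ruled out: I'm not holding references to the
objects Hadoop recycles; anything my reducer keeps is copied first, roughly
like this (hypothetical sketch, not my actual job):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical sketch of the pattern I already follow: Hadoop hands back the
// same Text instance on every iteration, so values are copied before storing.
public class CopyingReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  protected void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    List<Text> copies = new ArrayList<Text>();
    for (Text value : values) {
      // copies.add(value) would leave N references to one reused object;
      // new Text(value) makes a defensive copy instead.
      copies.add(new Text(value));
    }
    for (Text copy : copies) {
      context.write(key, copy);
    }
  }
}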

The job's map output is consistent, but the reduce input groups and the values
for each key vary from one run to the next on the same input. It's as if the
keys aren't being compared and partitioned correctly.

I have properly implemented hashCode(), equals(), and the WritableComparable
methods (write(), readFields(), and compareTo()).
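
For reference, the key class is shaped roughly like this (field names are
placeholders, not my actual code):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

// Placeholder sketch of my key: one long, one int, one String, with
// serialization, ordering, hashing and equality all defined over all three.
public class MyKey implements WritableComparable<MyKey> {

  private long id;
  private int type;
  private String name = "";

  public void write(DataOutput out) throws IOException {
    out.writeLong(id);
    out.writeInt(type);
    out.writeUTF(name);
  }

  public void readFields(DataInput in) throws IOException {
    id = in.readLong();
    type = in.readInt();
    name = in.readUTF();
  }

  public int compareTo(MyKey other) {
    if (id != other.id) {
      return id < other.id ? -1 : 1;
    }
    if (type != other.type) {
      return type < other.type ? -1 : 1;
    }
    return name.compareTo(other.name);
  }

  @Override
  public int hashCode() {
    int result = (int) (id ^ (id >>> 32));
    result = 31 * result + type;
    result = 31 * result + name.hashCode();
    return result;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof MyKey)) {
      return false;
    }
    MyKey other = (MyKey) o;
    return id == other.id && type == other.type && name.equals(other.name);
  }
}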

Also, not surprisingly, when I use a single reduce task the output is correct.
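
That points at partitioning for me: with one reduce task every key lands in
partition 0, so a bad partition assignment can't show up. As far as I know the
default HashPartitioner is essentially just the snippet below, so an unstable
hashCode() would scatter equal keys across reducers from one run to the next
(mine should be stable, since String.hashCode() is deterministic).

import org.apache.hadoop.mapreduce.Partitioner;

// Essentially what the default HashPartitioner does: route each key purely
// by hashCode(). If hashCode() varied between map tasks or runs, equal keys
// would not reliably reach the same reducer.
public class HashLikePartitioner<K, V> extends Partitioner<K, V> {
  @Override
  public int getPartition(K key, V value, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}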


On Tue, Jan 10, 2012 at 5:58 PM, W.P. McNeill <bill...@gmail.com> wrote:

> The Hadoop framework reuses Writable objects for key and value arguments,
> so if your code stores a pointer to that object instead of copying it you
> can find yourself with mysterious duplicate objects.  This has tripped me
> up a number of times. Details on what exactly I encountered and how I fixed
> it are here:
> http://cornercases.wordpress.com/2011/03/14/serializing-complex-mapreduce-keys/
> and here:
> http://cornercases.wordpress.com/2011/08/18/hadoop-object-reuse-pitfall-all-my-reducer-values-are-the-same/
