I have no Java implementation of my job, sorry.

Since it's all on the map side, IdentityMapper/IdentityReducer is fine, as long as both the splits and the number of reduce tasks are the same.
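
To illustrate why the reduce count matters for a map-side repro: the map output is sorted by partition first and then by key, and Hadoop's default HashPartitioner derives the partition from the key's hash and the number of reduce tasks. A minimal sketch in plain Java, assuming the default partitioner is in use (the thread doesn't say). The class name and sample keys are made up for illustration.

```java
// Hypothetical stand-alone sketch; not part of Hadoop itself.
class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the reduce count.
    public static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c", "d"}; // made-up sample keys
        for (int reduces : new int[] {2, 3}) {
            StringBuilder sb = new StringBuilder("reduces=" + reduces + ":");
            for (String k : keys) {
                sb.append(' ').append(k).append("->").append(partition(k, reduces));
            }
            System.out.println(sb);
        }
    }
}
```

The same keys land in different partitions when the reduce count changes, so the records the map-side sort sees in each partition change too.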

The data is a representation of log lines, and not exactly small, i.e. the
stuff has already been reduced once.

By "not exactly small," do you mean each line is long or that there are many records?

The interesting thing is that it happens inside the last Map task, not in the
reducer tasks.
As you can see above, the mapper cmd is rather on the simple side.

util.QuickSort is only used on the map side, so this shouldn't have anything to do with the reduce. Is it always and only the *last* map task that fails? If I sent you a patch that would print a trace with the partitions, would you mind running it? Do you have any other settings that differ from the defaults?

-C
