reduce task failing after 24 hours waiting

Billy Pearson Wed, 25 Mar 2009 19:24:15 -0700

On one of my long-running jobs (about 50-60 hours), I am seeing all active
reduce tasks fail after 24 hours with the error message:

java.io.IOException: Task process exit with nonzero status of 255.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)


Is there something in the config that I can change to stop this?

Every time, within 1 minute of the 24-hour mark, they all fail at the same time.
This wastes a lot of resources downloading the map outputs and merging them again.

Billy
