On Dec 3, 2008, at 5:49 AM, Zhou, Yunqing wrote:
I'm running a job on a 5 TB dataset, but it reports a checksum error in one block of the file. That causes a map task to fail, which then fails the whole job. Losing a single 64 MB block will barely affect the final result, so can I ignore some map task failures and continue to the reduce step? I'm using hadoop-0.18.2 with a replication factor of 1.
You can specify that your job can tolerate some percentage of task failures:

http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)

Arun
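To make the suggestion concrete, here is a minimal sketch of how those setters might be used in a 0.18-era job driver. The class name, job name, and the 1% threshold are illustrative assumptions, not from the thread; only the two `JobConf` setters come from the linked javadoc.

```java
import org.apache.hadoop.mapred.JobConf;

public class TolerantJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TolerantJob.class);
        conf.setJobName("tolerant-job"); // placeholder name

        // Allow up to 1% of map tasks to fail without failing the job --
        // enough headroom to skip the one corrupt 64 MB block.
        conf.setMaxMapTaskFailuresPercent(1);

        // Reduce-side counterpart, if reduce failures should also be tolerated:
        // conf.setMaxReduceTaskFailuresPercent(1);

        // ... set input/output paths and mapper/reducer classes here,
        // then submit with JobClient.runJob(conf);
    }
}
```

Note that a failed map task's entire input split is dropped from the output, so this trades completeness for job survival, which is acceptable here since the poster says the missing block won't affect the result.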