On Dec 3, 2008, at 5:49 AM, Zhou, Yunqing wrote:

I'm running a job on a dataset of about 5TB. Hadoop reports a checksum error
on one block in the file, which causes a map task failure and then fails the
whole job.
But losing a single 64MB block will barely affect the final result.
Can I ignore some map task failures and continue with the reduce step?

I'm using hadoop-0.18.2 with a replication factor of 1.


You can specify that your job can tolerate some percentage of failures:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
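
A minimal sketch of how this might look with the old org.apache.hadoop.mapred API
that ships with 0.18 (the class name, job name, and the 1% threshold are just
illustrative assumptions; the rest of the job setup is elided):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class TolerantJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(TolerantJob.class);
        conf.setJobName("tolerant-job");

        // Allow up to 1% of map tasks (and reduce tasks) to fail
        // without failing the whole job.
        conf.setMaxMapTaskFailuresPercent(1);
        conf.setMaxReduceTaskFailuresPercent(1);

        // ... set input/output paths, mapper, reducer, etc. ...

        JobClient.runJob(conf);
      }
    }

With only one 64MB block affected out of ~5TB, even a small percentage should be
enough to let the job ride over that single bad map task.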

Arun
