recovery after synchronous Mapper failures on some records
----------------------------------------------------------
Key: HADOOP-1706
URL: https://issues.apache.org/jira/browse/HADOOP-1706
Project: Hadoop
Issue Type: New Feature
Components: contrib/streaming
Reporter: arkady borkovsky
It is sometimes hard or impossible to make sure that the Mapper reacts
correctly to all the errors in the input data -- especially when reusing legacy
or third-party code.
It would be nice if the Streaming infrastructure had the following feature:
* check the exit code of the mapper command;
* if the command has crashed:
  * log the record that was being processed at the time of the failure to the error log;
  * restart the command;
  * feed it the remainder of the input.
This way most of the data still gets processed (a rough sketch of the loop appears below).
This feature should be disabled by default -- the user should explicitly
specify how many faults are allowed per task.
Once that number is exceeded, the whole job should fail without retries.
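For concreteness, here is a minimal sketch of the proposed loop, assuming a hypothetical wrapper class. The class name, command-line arguments, and per-record flushing are all illustrative and not part of the existing Streaming code; a real implementation would presumably live inside the Streaming pipe runner (PipeMapRed) and stream records rather than buffering the whole split in memory.

{code:java}
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- this class does not exist in contrib/streaming.
public class FaultTolerantMapRunner {

  public static void main(String[] args) throws Exception {
    String mapperCommand = args[0];            // e.g. "./legacy-mapper"
    int maxFaults = Integer.parseInt(args[1]); // user-specified fault budget

    // Read the task's input split up front so we can resume after a crash.
    List<String> records = new ArrayList<String>();
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    for (String line; (line = in.readLine()) != null; ) {
      records.add(line);
    }

    int faults = 0;
    int next = 0;                              // first unprocessed record
    while (next < records.size()) {
      ProcessBuilder pb = new ProcessBuilder("/bin/sh", "-c", mapperCommand);
      pb.redirectOutput(ProcessBuilder.Redirect.INHERIT); // mapper output flows through
      pb.redirectError(ProcessBuilder.Redirect.INHERIT);
      Process mapper = pb.start();
      BufferedWriter toMapper = new BufferedWriter(
          new OutputStreamWriter(mapper.getOutputStream()));

      int current = next;
      try {
        for (; current < records.size(); current++) {
          toMapper.write(records.get(current));
          toMapper.newLine();
          toMapper.flush(); // flush per record so a broken pipe surfaces
        }                   // near the record that triggered the crash
        toMapper.close();
      } catch (IOException brokenPipe) {
        // The mapper died mid-stream; fall through and handle the fault.
      }

      if (mapper.waitFor() == 0 && current >= records.size()) {
        break;              // clean finish: all records consumed
      }

      // Log the (approximate) offending record, charge the fault budget,
      // and restart the command on the remainder of the input.
      int bad = Math.min(current, records.size() - 1);
      System.err.println("Mapper crashed on record " + bad + ": " + records.get(bad));
      if (++faults > maxFaults) {
        throw new IOException("fault budget (" + maxFaults + ") exceeded; failing task");
      }
      next = bad + 1;       // skip the bad record
    }
  }
}
{code}

A hypothetical invocation with a budget of ten faults per task: java FaultTolerantMapRunner './legacy-mapper' 10 < input-split.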
BTW: this functionality was described in the original MapReduce paper, as the "Skipping Bad Records" mechanism.