[ http://issues.apache.org/jira/browse/HADOOP-153?page=comments#action_12375455 ]

eric baldeschwieler commented on HADOOP-153:
--------------------------------------------

Sounds good.  The acceptable percentage of failed records should probably be
configurable.  I'd be inclined to use something more like 1% than the
proposed 50%.  You could turn the feature off by configuring 0%, which
should arguably be the default.
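
To make the suggestion concrete, here is a rough sketch of how a configurable
threshold with 0% meaning "off" might behave.  The class and method names
(SkipPolicy, recordSuccess, recordFailure) are made up for illustration and
are not part of Hadoop; how the percentage would actually be read from the
job configuration is left open.

```java
// Hypothetical sketch only: none of these names exist in Hadoop.
class SkipPolicy {
    // Maximum percentage of records allowed to fail; 0 disables skipping.
    private final double maxFailurePercent;
    private long processed;
    private long failed;

    SkipPolicy(double maxFailurePercent) {
        if (maxFailurePercent < 0.0 || maxFailurePercent > 100.0) {
            throw new IllegalArgumentException("percentage must be in [0, 100]");
        }
        this.maxFailurePercent = maxFailurePercent;
    }

    // Call once for every record that was processed without an exception.
    void recordSuccess() {
        processed++;
    }

    // Call once for every record that threw an exception.
    // Returns true if the task may skip the record and continue,
    // false if the task should fail.
    boolean recordFailure() {
        processed++;
        failed++;
        if (maxFailurePercent == 0.0) {
            return false; // feature off: any exception fails the task
        }
        return 100.0 * failed / processed <= maxFailurePercent;
    }
}
```

One thing this sketch glosses over: with a small threshold such as 1%, a
failure on the very first records exceeds the limit immediately, so a real
implementation would probably want some minimum record count before the
percentage is enforced.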


> skip records that throw exceptions
> ----------------------------------
>
>          Key: HADOOP-153
>          URL: http://issues.apache.org/jira/browse/HADOOP-153
>      Project: Hadoop
>         Type: New Feature

>   Components: mapred
>     Versions: 0.2
>     Reporter: Doug Cutting
>     Assignee: Doug Cutting
>      Fix For: 0.2

>
> MapReduce should skip records that throw exceptions.
> If the exception is thrown under RecordReader.next() then RecordReader 
> implementations should automatically skip to the start of a subsequent record.
> Exceptions in map and reduce implementations can simply be logged, unless 
> they happen under RecordWriter.write().  Cancelling partial output could be 
> hard.  So such output errors will still result in task failure.
> This behaviour should be optional, but enabled by default.  A count of errors 
> per task and job should be maintained and displayed in the web ui.  Perhaps 
> if some percentage of records (>50%?) result in exceptions then the task 
> should fail.  This would stop jobs early that are misconfigured or have buggy 
> code.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
