[
https://issues.apache.org/jira/browse/PIG-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lorand Bendig updated PIG-2620:
-------------------------------
Attachment: rewrite_example.txt
error_flow.png
Based on the discussion on the wiki page I have further elaborated the
implementation details. Two files are attached:
* a possible rewrite of the ONERROR syntax with explain plans
* a diagram about the error/ignored result propagation back from the EvalFuncs.
Some notes:
The ONERROR syntax has been rewritten in terms of existing Pig operators, so
that the optimizers/visitors can fully understand and process the logical plan.
The diagram shows publish-subscribe communication between
EvalFuncs/POUserFuncs and POForeach, which lets a UDF report information back
at the moment it returns null because of an invalid record. This information
(the thrown exception, the detail message, etc.) is needed to create the
tuple for the error relation.
Guava's
[EventBus|http://docs.guava-libraries.googlecode.com/git-history/v11.0/javadoc/com/google/common/eventbus/EventBus.html]
could be a good candidate for this purpose.
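To make the error flow concrete, here is a minimal sketch of the
publish-subscribe idea, using only the JDK. It is not the attached design and
not Pig code: SimpleBus is a hand-rolled stand-in for Guava's EventBus (which
offers register(Object), post(Object) and @Subscribe-annotated handlers), and
UdfErrorEvent, ParseIntFunc and ForeachSketch are hypothetical names that only
play the roles of the event, an EvalFunc and POForeach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class ErrorFlowSketch {

    /** Hypothetical event: what the error relation needs to know about a failed record. */
    static final class UdfErrorEvent {
        final Object input;
        final Exception cause;
        final String detail;

        UdfErrorEvent(Object input, Exception cause, String detail) {
            this.input = input;
            this.cause = cause;
            this.detail = detail;
        }
    }

    /** Minimal synchronous bus; a stand-in for Guava's EventBus, not its API. */
    static final class SimpleBus {
        private final List<Consumer<UdfErrorEvent>> subscribers = new ArrayList<>();

        void register(Consumer<UdfErrorEvent> subscriber) {
            subscribers.add(subscriber);
        }

        void post(UdfErrorEvent event) {
            for (Consumer<UdfErrorEvent> s : subscribers) {
                s.accept(event);
            }
        }
    }

    /** Plays the role of an EvalFunc: returns null on a bad record and publishes the reason. */
    static final class ParseIntFunc {
        private final SimpleBus bus;

        ParseIntFunc(SimpleBus bus) {
            this.bus = bus;
        }

        Integer exec(String input) {
            try {
                return Integer.valueOf(input.trim());
            } catch (NumberFormatException e) {
                bus.post(new UdfErrorEvent(input, e, "not an integer"));
                return null;
            }
        }
    }

    /** Plays the role of POForeach: subscribes to the bus and splits output from errors. */
    static final class ForeachSketch {
        final List<Integer> output = new ArrayList<>();
        final List<String> errors = new ArrayList<>();
        private final ParseIntFunc func;
        private UdfErrorEvent pending;

        ForeachSketch(SimpleBus bus, ParseIntFunc func) {
            this.func = func;
            // Remember the most recent event posted while a record is being processed.
            bus.register(event -> pending = event);
        }

        void process(List<String> records) {
            for (String record : records) {
                pending = null;
                Integer value = func.exec(record);
                if (value != null) {
                    output.add(value);
                } else if (pending != null) {
                    // Null caused by an error: build the row for the error relation.
                    errors.add(pending.input + "\t"
                            + pending.cause.getClass().getSimpleName() + "\t"
                            + pending.detail);
                }
            }
        }
    }

    /** Wires the pieces together and runs them over the given records. */
    static ForeachSketch run(List<String> records) {
        SimpleBus bus = new SimpleBus();
        ForeachSketch foreach = new ForeachSketch(bus, new ParseIntFunc(bus));
        foreach.process(records);
        return foreach;
    }

    public static void main(String[] args) {
        ForeachSketch result = run(Arrays.asList("1", "abc", " 42 "));
        System.out.println("output: " + result.output);
        System.out.println("errors: " + result.errors);
    }
}
```

Because the bus delivers the event synchronously, the operator already holds
the exception and detail message when it sees the null, so it can tell a null
caused by a failure apart from a legitimate null result.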
What do you think?
> Customizable Error Handling in Pig
> ----------------------------------
>
> Key: PIG-2620
> URL: https://issues.apache.org/jira/browse/PIG-2620
> Project: Pig
> Issue Type: New Feature
> Reporter: Dmitriy V. Ryaboy
> Assignee: Lorand Bendig
> Attachments: error_flow.png, rewrite_example.txt
>
>
> The current behavior of Pig when handling exceptions thrown by UDFs is to
> fail and stop processing. We want to extend this behavior to give users
> finer-grained control over error handling.
> Depending on the use case, there are several options users would like to have:
> * Stop the execution and report an error
> * Ignore tuples that cause exceptions and log warnings
> * Ignore tuples that cause exceptions and redirect them to an error relation
>   (to enable statistics, debugging, ...)
> * Write their own error handler