https://github.com/apache/beam/pull/6586 is still open for review, but I
also wanted to gather feedback about a potential refactor as part of that
change.

We could refactor MapElements, FlatMapElements, and Filter to all inherit
from a common abstract base class SingleMessageTransform. The new code for
exception handling is nearly identical between the three classes and could
be consolidated without altering the current public interfaces. Are there
concerns with adding such a base class ?

On Thu, Oct 11, 2018 at 4:44 PM Jeff Klukas <[email protected]> wrote:

> The PR (https://github.com/apache/beam/pull/6586) is updated now with a
> coding solution for Failure. We use AvroCoder for the Exception and inherit
> whatever the input coder was for values.
>
> The unfortunate bit is that users might provide an Exception subclass that
> doesn't provide a no-argument constructor and thus isn't
> AvroCoder-compatible. I'm currently handling this through early failure
> with context about how to choose a different exception type.
>
>
> On Fri, Oct 5, 2018 at 3:59 PM Jeff Klukas <[email protected]> wrote:
>
>> It would be ideal to have some higher-level way of wrapping a PTransform
>> to handle errors inside, but that indeed seems like a substantially
>> trickier thing to implement.
>>
>>
>>
>>
>>
>> On Fri, Oct 5, 2018 at 3:38 PM Reuven Lax <[email protected]> wrote:
>>
>>> Cool! I've left a few comments.
>>>
>>> This also makes me think whether we can implement this on ParDo as well,
>>> though that might be a bit trickier since it involves hooking into
>>> DoFnInvoker.
>>>
>>> Reuven
>>>
>>> On Fri, Oct 5, 2018 at 10:33 AM Jeff Klukas <[email protected]> wrote:
>>>
>>>> I've posted a full PR for the Java exception handling API that's ready
>>>> for review: https://github.com/apache/beam/pull/6586
>>>>
>>>> It implements new WithErrors nested classes on MapElements,
>>>> FlatMapElements, Filter, AsJsons, and ParseJsons.
>>>>
>>>> On Wed, Oct 3, 2018 at 7:55 PM Jeff Klukas <[email protected]> wrote:
>>>>
>>>>> Jira issues for adding exception handling in Java and Python SDKs:
>>>>>
>>>>> https://issues.apache.org/jira/browse/BEAM-5638
>>>>> https://issues.apache.org/jira/browse/BEAM-5639
>>>>>
>>>>> I'll plan to have a complete PR for the Java SDK put together in the
>>>>> next few days.
>>>>>
>>>>> On Wed, Oct 3, 2018 at 1:29 PM Jeff Klukas <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I don't personally have experience with the Python SDK, so am not
>>>>>> immediately in a position to comment on how feasible it would be to
>>>>>> introduce a similar change there. I'll plan to write up two separate 
>>>>>> issues
>>>>>> for adding exception handling in the Java and Python SDKs.
>>>>>>
>>>>>> On Wed, Oct 3, 2018 at 12:17 PM Thomas Weise <[email protected]> wrote:
>>>>>>
>>>>>>> +1 for the proposal as well as the suggestion to offer it in other
>>>>>>> SDKs, where applicable
>>>>>>>
>>>>>>> On Wed, Oct 3, 2018 at 8:58 AM Chamikara Jayalath <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Sounds like a very good addition. I'd say this can be a single PR
>>>>>>>> since changes are related. Please open a JIRA for tracking.
>>>>>>>>
>>>>>>>> Have you though about introducing a similar change to Python SDK ?
>>>>>>>> (doesn't have to be the same PR).
>>>>>>>>
>>>>>>>> - Cham
>>>>>>>>
>>>>>>>> On Wed, Oct 3, 2018 at 8:31 AM Jeff Klukas <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> If this looks good for MapElements, I agree that it makes sense to
>>>>>>>>> extend to FlatMapElements and Filter and to keep the API consistent 
>>>>>>>>> between
>>>>>>>>> them.
>>>>>>>>>
>>>>>>>>> Do you have suggestions on how to submit changes with that wider
>>>>>>>>> scope? Would one PR altering MapElements, FlatMapElements, Filter,
>>>>>>>>> ParseJsons, and AsJsons be too large to reasonably review? Should I 
>>>>>>>>> open an
>>>>>>>>> overall JIRA ticket to track and break this into smaller  PRs?
>>>>>>>>>
>>>>>>>>> On Wed, Oct 3, 2018 at 10:31 AM Reuven Lax <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sounds cool. Why not support this on other transforms as well?
>>>>>>>>>> (FlatMapElements, Filter, etc.)
>>>>>>>>>>
>>>>>>>>>> Reuven
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 2, 2018 at 4:49 PM Jeff Klukas <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've seen a few Beam users mention the need to handle errors in
>>>>>>>>>>> their transforms by using a try/catch and routing to different 
>>>>>>>>>>> outputs
>>>>>>>>>>> based on whether an exception was thrown. This was particularly 
>>>>>>>>>>> nicely
>>>>>>>>>>> written up in a post by Vallery Lancey:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://medium.com/@vallerylancey/error-handling-elements-in-apache-beam-pipelines-fffdea91af2a
>>>>>>>>>>>
>>>>>>>>>>> I'd love to see this pattern better supported directly in the
>>>>>>>>>>> Beam API, because it currently requires the user to implement a 
>>>>>>>>>>> full DoFn
>>>>>>>>>>> even for the simplest cases.
>>>>>>>>>>>
>>>>>>>>>>> I propose we support for a MapElements-like transform that
>>>>>>>>>>> allows the user to specify a set of exceptions to catch and route 
>>>>>>>>>>> to a
>>>>>>>>>>> failure output. Something like:
>>>>>>>>>>>
>>>>>>>>>>> MapElements
>>>>>>>>>>> .via(myFunctionThatThrows)
>>>>>>>>>>> .withSuccessTag(successTag)
>>>>>>>>>>> .withFailureTag(failureTag, JsonParsingException.class)
>>>>>>>>>>>
>>>>>>>>>>> which would output a PCollectionTuple with both the successful
>>>>>>>>>>> outcomes of the map operation and also a collection of the inputs 
>>>>>>>>>>> that
>>>>>>>>>>> threw JsonParsingException.
>>>>>>>>>>>
>>>>>>>>>>> To make this more concrete, I put together a proof of concept
>>>>>>>>>>> PR: https://github.com/apache/beam/pull/6518  I'd appreciate
>>>>>>>>>>> feedback about whether this seems like a worthwhile addition and a 
>>>>>>>>>>> feasible
>>>>>>>>>>> approach.
>>>>>>>>>>>
>>>>>>>>>>

Reply via email to