Hi Derek - It sounds like this is a Dataflow-specific question, so I'd
recommend you also reach out through the google-cloud-dataflow Stack Overflow
<https://stackoverflow.com/questions/tagged/google-cloud-dataflow> tag.
I'm also cc'ing Thomas Groh, who might be able to help.



On 20 October 2017 at 11:35, Derek Hao Hu <[email protected]> wrote:

> ​Kindly ping as I'm really curious about this. :p
>
> Derek​
>
> On Thu, Oct 19, 2017 at 2:15 PM, Derek Hao Hu <[email protected]>
> wrote:
>
>> Hi,
>>
>> ​We are trying to use Dataflow in prod, and right now one of our main
>> concerns is this "infinite retry" behavior, which might stall the whole
>> pipeline.
>>
>> Right now, for all the DoFns we've implemented ourselves, we've added
>> error-handling or exception-swallowing mechanisms to make sure some bundles
>> can just fail while we log the exceptions. But we are a bit concerned about
>> the other Beam native transforms, which we cannot easily wrap, e.g.
>> PubSubIO transforms and DatastoreV1 transforms.
>>
>> A few days ago I asked a specific question in this group about how one
>> can catch exceptions in DatastoreV1 transforms, and the recommended approach
>> was to either 1) duplicate the code in the current DatastoreV1
>> implementation and swallow the exception instead of throwing, or 2) follow
>> the implementation of BigQueryIO to add support for a custom retry
>> policy. Both are feasible options, but I'm a bit concerned: doesn't
>> that mean that eventually all Beam native transforms will need to implement
>> something like 2) if we want to use them in prod?
>>
>> So in short, I want to know what the currently recommended approach or
>> workaround is to say: hey, just let this bundle fail so we can process the
>> rest of the elements, instead of stalling the pipeline.
>>
>> Thanks!
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>
>
>
>
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
>
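The exception-swallowing wrapper Derek describes for user-owned DoFns can be
sketched as below. This is a minimal, hedged illustration of the
catch-log-and-drop pattern only; it deliberately avoids the Beam SDK classes
so it is self-contained, and the `parseStrict` work function and class names
are hypothetical stand-ins for a real DoFn.processElement body, not Beam API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Sketch of the "let the element fail, log it, keep going" pattern.
// In a real Beam DoFn, the try/catch would live inside processElement;
// here a plain method over a List stands in for a bundle.
public class SwallowingTransform {
    private static final Logger LOG =
        Logger.getLogger(SwallowingTransform.class.getName());

    // Hypothetical per-element work that may throw at runtime.
    static int parseStrict(String element) {
        return Integer.parseInt(element);
    }

    // Process a "bundle" of elements: failures are logged and skipped
    // rather than propagating and forcing a retry of the whole bundle.
    static List<Integer> processBundle(List<String> bundle) {
        List<Integer> output = new ArrayList<>();
        for (String element : bundle) {
            try {
                output.add(parseStrict(element));
            } catch (RuntimeException e) {
                LOG.warning("Dropping element '" + element + "': " + e);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(processBundle(List.of("1", "oops", "3")));
    }
}
```

As the thread notes, this only helps for transforms you own; built-in IO
transforms such as PubSubIO and DatastoreV1 throw inside their own DoFns,
which is why options 1) and 2) above were suggested for those.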