Hi Dan,

Ok I see, that makes sense. I thought it might make things easier if there
was a way to define a strategy for handling certain exceptions (e.g. where
the strategy could be to skip processing the record). But I understand the
worry about making data loss easy. I could probably refactor my pipeline
and move the exception-throwing code from the tablespec function to a DoFn,
which only outputs an element if the table name is found successfully, and
then make the tablespec function something very simple. But I will hack it
like you described for now!
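
For reference, a minimal sketch of that workaround is below. It is only an
illustration under some assumptions: it uses plain java.util.function.Function
as a stand-in for Beam's SerializableFunction, table names are plain strings
rather than TableDestination objects, and the class name, the withFallback
helper and the "table:payload" record format are all hypothetical.

```java
import java.util.function.Function;

public class FallbackTableName {
    // Catch-all table name, following the "leftovers" suggestion in this thread.
    public static final String FALLBACK_TABLE = "leftovers";

    // Wraps a table-name function so that a runtime exception is logged and
    // the fallback table is returned instead of the exception propagating.
    public static <T> Function<T, String> withFallback(Function<T, String> extractor) {
        return input -> {
            try {
                return extractor.apply(input);
            } catch (RuntimeException e) {
                System.err.println("Could not extract table name for " + input + ": " + e);
                return FALLBACK_TABLE;
            }
        };
    }

    // Hypothetical extractor: expects records shaped like "table:payload"
    // and throws when the table prefix is missing.
    public static String extractTableName(String record) {
        int idx = record.indexOf(':');
        if (idx <= 0) {
            throw new IllegalArgumentException("no table prefix in record");
        }
        return record.substring(0, idx);
    }

    public static void main(String[] args) {
        Function<String, String> safe = withFallback(FallbackTableName::extractTableName);
        System.out.println(safe.apply("events:{\"id\":1}"));
        System.out.println(safe.apply("garbage"));
    }
}
```

In the real pipeline the same try/catch would sit inside ExtractTableName
itself, so bad records land in a separate table and can be inspected later
instead of being retried forever or silently dropped.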

Thanks,
Josh

On Tue, May 16, 2017 at 2:06 PM, Dan Halperin <[email protected]> wrote:

> Hey Josh,
>
> There isn't really generic functionality for this as we don't want to make
> "data loss" easy. There are some ongoing designs for specific transforms
> (e.g., BEAM-190 for BigQueryIO). One easy thing to do in this case might be
> to wrap the code in a try/catch and if you catch an exception then return
> some table name like "leftovers".
>
> Dan
>
> On Tue, May 16, 2017 at 8:02 AM, Josh <[email protected]> wrote:
>
>> Hi all,
>>
>> I am wondering if there is a way to make Beam skip certain failures
>> - for example I am using BigQueryIO to write to a table, where the table
>> name is chosen dynamically:
>>
>>
>> ```
>> .apply(BigQueryIO.<TableRow>write()
>>     .to(new ExtractTableName()))
>> ```
>>
>>
>> I want to make it so that, if for some reason my ExtractTableName
>> instance (which is a SerializableFunction<ValueInSingleWindow<TableRow>,
>> TableDestination>) throws an exception, then the exception is logged and
>> the write is skipped.
>>
>>
>> Is it possible to achieve this behaviour without modifying the Beam
>> codebase/BigQueryIO retry logic?
>>
>> At the moment if my function throws an exception, the write is retried
>> indefinitely.
>>
>>
>> Thanks,
>>
>> Josh
>>
>
>
