Thank you, David.

In the case we have in mind, it should only happen on a very rare
exception, i.e. if an uncaught exception somehow occurs, we want to send
the record to a DLQ and handle the retry manually, rather than
checkpointing and restarting.
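
To make the idea concrete, here is a minimal plain-Java sketch of that
pattern, with no Flink dependencies. The names DlqFilterSketch and
withDeadLetter, and the in-memory list standing in for the DLQ, are
hypothetical and only for illustration. In a real job the dead-letter hook
would presumably be a side output or a dedicated sink, and the write to the
DLQ would itself need to avoid blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Plain-Java illustration only; not a Flink API. */
final class DlqFilterSketch {

    /**
     * Wraps a predicate that may throw. If the delegate throws, the record
     * is handed to the dead-letter consumer and filtered out, so the
     * exception does not propagate to the caller.
     */
    static <T> Predicate<T> withDeadLetter(Predicate<T> delegate, Consumer<T> deadLetter) {
        return record -> {
            try {
                return delegate.test(record);
            } catch (RuntimeException e) {
                deadLetter.accept(record); // hypothetical DLQ hook
                return false;              // drop the record from the main flow
            }
        };
    }

    public static void main(String[] args) {
        List<String> dlq = new ArrayList<>(); // stand-in for a real DLQ

        // Simulated lookup that fails for one particular record.
        Predicate<String> lookup = s -> {
            if (s.equals("poison")) {
                throw new IllegalStateException("simulated DB failure");
            }
            return s.startsWith("a");
        };

        Predicate<String> safe = withDeadLetter(lookup, dlq::add);

        List<String> kept = new ArrayList<>();
        for (String s : List.of("apple", "poison", "banana")) {
            if (safe.test(s)) {
                kept.add(s);
            }
        }
        System.out.println("kept=" + kept + " dlq=" + dlq);
    }
}
```

The wrapper only catches RuntimeException, so errors such as
OutOfMemoryError would still fail the job and trigger the usual restart.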

Regards,

Tom.


On Sun, Jul 26, 2020 at 1:14 PM David Anderson <da...@alpinegizmo.com>
wrote:

> Every job is required to have a sink, but there's no requirement that all
> output be done via sinks. It's not uncommon, and doesn't have to cause
> problems, to have other operators that do I/O.
>
> What can be problematic, however, is doing blocking I/O. While your user
> function is blocked, the function will exert back pressure, and checkpoint
> barriers will be unable to make any progress. This sometimes leads to
> checkpoint timeouts and job failures. So it's recommended to make any I/O
> you do asynchronous, using an AsyncFunction [1] or something similar.
>
> Note that the asynchronous I/O function stores the records for in-flight
> asynchronous requests in checkpoints, and restores/re-triggers the requests
> when recovering from a failure. This might lead to duplicate results if you
> are using it to do non-idempotent database writes. If you need
> transactions, use a sink that offers them.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html
> <https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/stream/operators/asyncio.html>
>
> Best,
> David
>
> On Sun, Jul 26, 2020 at 11:08 AM Tom Fennelly <tfenne...@cloudbees.com>
> wrote:
>
>> Hi.
>>
>> What are the negative side effects of (for example) a filter function
>> occasionally making a call out to a DB? Is this a big no-no, and should all
>> outputs be done through sinks and side outputs, no exceptions?
>>
>> Regards,
>>
>> Tom.
>>
>
