[previous didn't cc list, sorry for dupes]

The classic connection pool pattern, where a few expensive connections are
created up front and borrowed by many transient short-lived tasks, each of
which returns its connection to the pool when done, would still be usable
here. But as Péter points out, you can't rely on a single global pool
instance to protect the backend from overload: each JVM in the job (and
possibly each classloader) will have its own pool.
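To make that concrete, here's a minimal sketch of the per-JVM pool in plain
Java (no Flink or pooling library involved; all names are made up). The
static field is shared by every task loaded by the same classloader, which
is exactly why the limit is per-JVM rather than global:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for an expensive backend connection.
class Connection {
    void close() { /* release resources */ }
}

// One pool per JVM/classloader: shared by all tasks in THIS JVM,
// but not by tasks on other TaskManagers in the job.
final class JvmLocalPool {
    private static final int MAX_CONNECTIONS = 4;
    private static final BlockingQueue<Connection> POOL =
            new ArrayBlockingQueue<>(MAX_CONNECTIONS);

    static {
        // Eagerly create; a real pool would likely create lazily.
        for (int i = 0; i < MAX_CONNECTIONS; i++) {
            POOL.add(new Connection());
        }
    }

    // Blocks until a connection is free; this is what bounds
    // concurrency within this one JVM.
    static Connection borrow() {
        try {
            return POOL.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while borrowing", e);
        }
    }

    static void give(Connection c) {
        POOL.offer(c);
    }
}
```

With N TaskManagers you get N of these pools, so the backend sees up to
N * MAX_CONNECTIONS connections — hence the need for an external limit.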

If you really need a global limit, the easiest thing I can think of, if the
backend supports it, is to configure a specific user/role for the Flink
job, configure the backend system to only allow a limited number of
concurrent connections by that user/role, and (importantly!) make sure the
client side reacts well to the no-connections-available condition, e.g. by
shunting data into a sink where it can get picked up and reprocessed, or
using an async operator... I'm not sure there's a great way to handle this:
you're adding backpressure to the entire job, and/or introducing additional
out-of-order processing, which some use cases are very unhappy with.
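One hedged sketch of the "react well" part, again in plain Java with
hypothetical names (the semaphore stands in for the backend rejecting
connections once the per-user limit is hit): try to acquire for a bounded
time, and divert the record to a retry path instead of failing the task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical enrichment client that tolerates "no connections available".
final class BoundedEnricher {
    private final Semaphore backendSlots;
    // Stand-in for a sink where records get picked up and reprocessed later.
    private final List<String> deadLetter = new ArrayList<>();

    BoundedEnricher(int maxConcurrent) {
        this.backendSlots = new Semaphore(maxConcurrent);
    }

    // Returns the enriched value, or empty if the record was shunted
    // to the dead-letter path because no backend slot was free in time.
    Optional<String> enrich(String record) {
        boolean acquired;
        try {
            acquired = backendSlots.tryAcquire(50, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            acquired = false;
        }
        if (!acquired) {
            deadLetter.add(record);
            return Optional.empty();
        }
        try {
            return Optional.of(record + ":enriched"); // stand-in for the real lookup
        } finally {
            backendSlots.release();
        }
    }

    List<String> pendingRetries() {
        return deadLetter;
    }
}
```

The dead-letter list is where the out-of-order processing sneaks in:
anything reprocessed from it arrives later than its neighbors did.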

If you can't rate-limit the backend directly and it's something REST-like,
you could try wrapping it in a rate-limiting reverse proxy service, or use
a distributed token-bucket kind of architecture where runtimes try to grab
a token before opening a connection... That wouldn't help with the
backpressure/out-of-order problems, though.
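For illustration, here's the token-bucket logic in a single-JVM form (the
distributed variant would keep the token count and refill timestamp in
shared storage, e.g. Redis, updated by an atomic script; everything here is
a hypothetical sketch, not a recommendation of a specific library):

```java
// Single-JVM token bucket. The distributed version keeps `tokens` and
// `lastRefillNanos` in shared storage so all runtimes draw from one budget.
final class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    // A runtime calls this before opening a connection; false means
    // "back off and retry later" rather than hammering the backend.
    synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
    }
}
```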

Stepping back a _little_: if these connections are ultimately being used as
KV-like lookups for enrichment purposes, and your use case can tolerate
stale values under heavy load, you can add transient, optimistic
"near-line" caching using something like Caffeine to read-through on a
miss. That won't do anything about maintaining global limits, but it can
significantly reduce the pressure, depending on the hit rate, which depends
on cache sizing, which depends on locality... 😅
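As a sketch of the read-through pattern, here's a toy version built on
java.util.concurrent only (Caffeine adds the parts that matter in
production: maximum size, TTL-based eviction, refresh-ahead; check its docs
for the builder API). All names here are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Toy read-through cache: a miss invokes the (expensive) loader once,
// a hit skips the backend entirely. No eviction, so sizing is unbounded —
// which is precisely what Caffeine would handle for you.
final class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the KV-style backend lookup
    final AtomicInteger backendCalls = new AtomicInteger(); // for illustration

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            backendCalls.incrementAndGet();
            return loader.apply(k);
        });
    }
}
```

The hit rate is what determines how much backend pressure this removes, and
serving hits from the cache is also where the stale-under-load values come
from.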

HTH!

-0xe1a


On Thu, Mar 21, 2024 at 1:21 PM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Hi Jacob,
>
> Flink jobs' tasks typically run on multiple nodes/servers, so it is not
> possible to share a connection at the job level.
>
> You can read about the architecture in more detail in the docs. [1]
>
> I hope this helps,
> Péter
>
> [1] -
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/concepts/flink-architecture/
>
> On Thu, Mar 21, 2024, 13:10 Jacob Rollings <jacobrolling...@gmail.com>
> wrote:
>
>> Hello,
>>
>> Is there a way in Flink to instantiate or open connections (to a cache/db)
>> at a global level, so that they can be reused across many process functions
>> rather than being opened in each operator's open()? Along with opening, I
>> also wanted to know if there is a way to close them at job stop, such that
>> they are closed at the very end, after each operator's close() method is
>> complete. Basically the idea is to maintain a single instance at the global
>> level and close its session as a last step after each operator's close() is
>> complete.
>>
>>
>> Regards,
>> JB
>>
>
