This sounds about right to me. When I was working on getting Flink running,
this was one of the biggest pain points: the upload/download process is
painfully slow. We implemented a jar cache in Flink to improve this
(and stopped using fat jars):
https://github.com/twitter-forks/flink/commit/01f26ab7124b722ec767c8e01d10395afbbb0dc5
but I don't think it ever made it upstream.
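
For readers unfamiliar with the idea, a jar cache can be sketched roughly as
below. This is only an illustration of the general technique (store each jar
once, keyed by a hash of its contents, and skip the transfer when that hash is
already present). It is not the code from the commit linked above, and the
names JarCache, put, and file_sha256 are hypothetical. The local file copy
stands in for the slow upload step.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash the file contents in chunks so large jars need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


class JarCache:
    """Hypothetical content-addressed cache: one stored copy per distinct jar."""

    def __init__(self, cache_dir: Path):
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.uploads = 0  # counts real copies, for illustration only

    def put(self, jar: Path) -> Path:
        """Return the cached path, copying only if this content is new."""
        target = self.cache_dir / f"{file_sha256(jar)}.jar"
        if not target.exists():
            shutil.copyfile(jar, target)  # stands in for the slow upload
            self.uploads += 1
        return target
```

With a scheme like this, resubmitting a job whose dependencies have not
changed would skip the transfer entirely, which is also why splitting a fat
jar into separate dependency jars helps: only the jars that actually changed
need to move.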

On Tue, May 7, 2024 at 10:32 AM Balogh, György <bog...@ultinous.com> wrote:

> Hi,
> We are switching from the embedded Flink runner to a Flink server. The
> embedded Flink runner works fine, and jobs start instantly. However, when we
> use the Flink server we see about 30 seconds of idle time (no significant
> CPU or IO load) before the job appears in Flink. We are targeting at most
> 2 seconds of overhead to run ad-hoc query pipelines.
> We did some investigation and saw that the fat jar sent from Beam to the
> Flink cluster is about 150MB. Still, this does not justify the 30 seconds.
> We are trying to deploy dependencies to the Flink workers and decrease
> the jar size. It's not clear at the moment what causes this delay.
>
> Flink version: 1.16
> Beam version: 2.55.1
>
> Thank you,
> Gyorgy
>
> --
>
> György Balogh
> CTO
> E gyorgy.bal...@ultinous.com <zsolt.sala...@ultinous.com>
> M +36 30 270 8342
> A HU, 1117 Budapest, Budafoki út 209.
> W www.ultinous.com
>
