Hi Prakhar,

Have you enabled HA for your cluster? If yes, then Flink will try to store
the job graph in the configured high-availability.storageDir in order to be
able to recover it. If this operation takes long, then the slow part is either
the filesystem or storing the pointer in ZooKeeper. If it is the
filesystem, then I would suggest checking whether you have some read/write
quotas which might slow the operation down.

If you haven't enabled HA, or persisting the job graph is not what takes
long, then the next most likely candidate is the recovery from a previous
checkpoint. Here again, Flink needs to read from the remote storage (in
your case GCS). Depending on the size of the checkpoint and the read
bandwidth, this can be faster or slower. The best way to figure out what
takes long is to share the logs with us so that we can confirm it.
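
As a rough back-of-the-envelope check, transfer time is approximately size
divided by bandwidth. The sketch below uses the figures from your mail
(around 200 MB jar, around 10 seconds for the copy) and assumes that the copy
is dominated by a jar-sized payload, which I have not verified. The 500 MB
checkpoint size and the function names are purely hypothetical.

```python
def effective_bandwidth(size_mb: float, seconds: float) -> float:
    """Observed throughput in MB/s for a transfer of size_mb that took seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


def transfer_seconds(size_mb: float, bandwidth_mb_per_s: float) -> float:
    """Rough lower bound on transfer time; ignores latency and per-request overhead."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_mb / bandwidth_mb_per_s


# Figures from the original mail: ~200 MB, ~10 s to copy to the remote folder.
# Assumption: the copy time is dominated by the jar-sized payload.
bandwidth = effective_bandwidth(200, 10)
print(f"effective throughput: {bandwidth} MB/s")

# Hypothetical 500 MB checkpoint read at the same throughput.
print(f"estimated restore read: {transfer_seconds(500, bandwidth)} s")
```

If the throughput you compute this way is far below what you expect from
GCS, that would point towards throttling or quotas.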

To sum it up, the job submission is most likely slow because of the
interplay of Flink with the external system (most likely your configured
filesystem). If the filesystem is somewhat throttled, then Flink cannot do
much about it.

What you could try is to check whether your jar contains dependencies
which are not needed (e.g. Flink dependencies which are usually provided by
the system). That way you could decrease the size of the jar a bit.
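
One way to see what takes up space in the jar is to sum the entry sizes per
package prefix. This is only a minimal sketch using Python's standard zipfile
module; the function name, the grouping depth of 3, and the idea that
anything under org/apache/flink is a candidate for removal are my
assumptions, so please check against your build before excluding anything.

```python
import sys
import zipfile
from collections import Counter


def size_by_prefix(jar_path, depth=3):
    """Sum compressed entry sizes, grouped by the first `depth` directories."""
    totals = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            # Drop the file name, keep at most `depth` leading directories.
            dirs = info.filename.split("/")[:-1]
            prefix = "/".join(dirs[:depth]) or "(root)"
            totals[prefix] += info.compress_size
    return totals


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python jar_sizes.py path/to/your-job.jar
    for prefix, size in size_by_prefix(sys.argv[1]).most_common(15):
        print(f"{size / 1e6:8.2f} MB  {prefix}")
```

Large entries under prefixes such as org/apache/flink would be worth a
closer look.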

Cheers,
Till

On Wed, Sep 2, 2020 at 9:48 AM Prakhar Mathur <prakha...@gojek.com> wrote:

> Hi,
>
> We are currently running Flink 1.9.0. We see a delay of around 20 seconds
> in order to start a job on a session Flink cluster. We start the job using
> Flink's monitoring REST API where our jar is already uploaded on Job
> Manager. Our jar file size is around 200 MB. We are using memory state
> backend having GCS as remote storage.
>
> On running the cluster in debug mode, we observed that generating the plan
> itself takes around 6 seconds and copying job graph from local to the
> remote folder takes around 10 seconds.
>
> We were wondering whether this delay is expected or if it can be reduced
> via tweaking any configuration?
>
> Thank you. Regards
> Prakhar Mathur
>
