Hi Prakhar,

Have you enabled HA for your cluster? If yes, then Flink will try to store the job graph in the configured high-availability.storageDir in order to be able to recover it. If this operation takes a long time, then either the filesystem is slow or storing the pointer in ZooKeeper is. If it is the filesystem, then I would suggest checking whether you have some read/write quotas which might be slowing the operation down.
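For context, an HA setup with a remote storageDir would look roughly like this in flink-conf.yaml (a sketch with placeholder values; the ZooKeeper quorum and bucket path are hypothetical and must match your environment):

```yaml
# Hypothetical example values; substitute your own ZooKeeper quorum and bucket.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
# Job graphs and other HA metadata are persisted here during submission;
# slow reads/writes against this filesystem delay submission and recovery.
high-availability.storageDir: gs://my-flink-bucket/ha
```

If submission is slow only when HA is on, the storageDir filesystem is the first place to look.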
If you haven't enabled HA, or persisting the job graph is not what takes long, then the next most likely candidate is the recovery from a previous checkpoint. Here again, Flink needs to read from the remote storage (in your case GCS). Depending on the size of the checkpoint and the read bandwidth, this can be faster or slower. The best way to pinpoint the slow step is to share the logs with us so that we can confirm what takes long.

To sum it up, the job submission is most likely slow because of the interplay between Flink and an external system (most likely your configured filesystem). If the filesystem is throttled, then Flink cannot do much about it. What you could try is to check whether your jar contains dependencies which are not needed (e.g. Flink dependencies which are usually provided by the system). That way you could decrease the size of the jar a bit.

Cheers,
Till

On Wed, Sep 2, 2020 at 9:48 AM Prakhar Mathur <prakha...@gojek.com> wrote:
> Hi,
>
> We are currently running Flink 1.9.0. We see a delay of around 20 seconds
> in order to start a job on a session Flink cluster. We start the job using
> Flink's monitoring REST API where our jar is already uploaded on the Job
> Manager. Our jar file size is around 200 MB. We are using the memory state
> backend with GCS as remote storage.
>
> On running the cluster in debug mode, we observed that generating the plan
> itself takes around 6 seconds and copying the job graph from local to the
> remote folder takes around 10 seconds.
>
> We were wondering whether this delay is expected or if it can be reduced
> by tweaking any configuration?
>
> Thank you.
>
> Regards,
> Prakhar Mathur
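P.S. Regarding the jar-trimming suggestion: with Maven you would typically mark the core Flink dependencies as provided so they are not bundled into the fat jar that gets submitted (a sketch with the 1.9.0 version from your mail; artifact IDs depend on your Scala version and build):

```xml
<!-- The Flink runtime is already on the session cluster's classpath,
     so core Flink APIs should not be packaged into the submitted fat jar. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.9.0</version>
  <scope>provided</scope>
</dependency>
```

Afterwards you can inspect the jar contents (e.g. with `jar tf your-job.jar`) to verify that Flink's own classes are no longer included; a 200 MB jar often shrinks considerably this way, which reduces the upload and copy times you observed.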