Thanks for proposing this design document, Shuiqiang. It is a very interesting approach to the problem of running multiple Flink jobs as part of a single application. I like the idea because it requires few runtime changes apart from a session concept on the Dispatcher, and it would work in all environments.
One thing which is not fully clear to me is the failover behavior in case of driver job faults. Would it be possible to recover the driver job or would a driver job fault directly transition the application into a globally terminal state? Apart from that, I left some minor comments in the design document.

Cheers,
Till

On Tue, Apr 23, 2019 at 10:04 AM Shuiqiang Chen <acqua....@gmail.com> wrote:

> Hi All,
>
> We would like to start a discussion thread about a new feature called Flink
> Driver. A brief summary is following.
>
> As mentioned in the discussion of Interactive Programming, user
> applications might consist of multiple jobs and take long to finish.
> Currently, when Flink runs applications with multiple jobs, the application
> will run in a local process which is responsible for submitting the jobs.
> That local process will not exit until the whole application has finished.
> Users have to keep eyes on the local process in case it is killed due to
> connection lost, session timeout, local operating system problem, etc.
>
> To solve the problem, we would like to introduce the Flink Driver. Users
> can submit applications using driver mode. A Flink driver job will be
> submitted to take care of the job submissions in the user application.
>
> For more details about flink driver, please refer to the doc:
>
> https://docs.google.com/document/d/1dJnDOgRae9FzoBzXcsGoutGB1YjTi1iqG6d1Ht61EnY
>
> Any comments and suggestions will be highly appreciated.
>
> Best Regards,
> Shuiqiang
>