Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-10 Thread tison
Thanks for your update Klou! Best, tison. Kostas Kloudas wrote on Wed, Mar 11, 2020 at 2:05 AM: > Hi all, > > The FLIP was updated under the section "First Version Deliverables". > > Cheers, > Kostas > > On Tue, Mar 10, 2020 at 4:10 PM Kostas Kloudas wrote: > > > > Hi all, > > > > Yes I will do that. From t

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-10 Thread Kostas Kloudas
Hi all, The FLIP was updated under the section "First Version Deliverables". Cheers, Kostas On Tue, Mar 10, 2020 at 4:10 PM Kostas Kloudas wrote: > > Hi all, > > Yes I will do that. From the discussion, I will add that: > 1) for the cli, we are planning to add a "run-application" command > 2) f

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-10 Thread Kostas Kloudas
Hi all, Yes I will do that. From the discussion, I will add that: 1) for the cli, we are planning to add a "run-application" command 2) for deployment in Yarn we are planning to use LocalResources to let Yarn do the jar transfer 3) for Standalone/containers, we assume that dependencies/jars are bu

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-10 Thread Yang Wang
Thanks for your response. @Kostas Kloudas Could we add the cli changes and how to fetch the user jars to the FLIP document? I think other devs or users may have similar questions. Best, Yang Aljoscha Krettek wrote on Tue, Mar 10, 2020 at 9:03 PM: > On 10.03.20 03:31, Yang Wang wrote: > > For the "run-job

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-10 Thread Aljoscha Krettek
On 10.03.20 03:31, Yang Wang wrote: For the "run-job", do you mean to submit a Flink job to an existing session or just like the current per-job to start a dedicated Flink cluster? Then will "flink run" be deprecated? I was talking about the per-job mode that starts a dedicated Flink cluster.

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-09 Thread Yang Wang
Hi Aljoscha, Kostas, I would be in favour of something like "bin/flink run-application", > maybe we should even have "run-job" in the future to differentiate. I have no preference between the "-R/--remote-deploy" option of "flink run" and a newly introduced "flink run-application". If we always bind the

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-09 Thread Aljoscha Krettek
> For the -R flag, this was in the PoC that I published just as a quick > implementation, so that I can move fast to the entrypoint part. > Personally, I would not even be against having a separate command in > the CLI for this, sth like run-on-cluster or something along those > lines. > What do y

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-09 Thread Kostas Kloudas
Hi all, And thanks for the discussion topics. For the cluster lifecycle, it is the Entrypoint that will tear down the cluster when the application finishes. Probably we should emphasise it a bit more in the FLIP. For the -R flag, this was in the PoC that I published just as a quick implementatio
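A minimal sketch of the lifecycle point made above, that the Entrypoint tears down the cluster when the application finishes. The names `Cluster` and `run_application` are hypothetical and are not Flink classes or APIs; tearing the cluster down on failure as well as on success is an assumption of this sketch, not something stated in the thread.

```python
from typing import Callable, List


class Cluster:
    """Hypothetical stand-in for a dedicated cluster; not a Flink class."""

    def __init__(self) -> None:
        self.running = True
        self.events: List[str] = []

    def shutdown(self) -> None:
        # Mark the cluster as stopped and record the teardown.
        self.running = False
        self.events.append("cluster torn down")


def run_application(cluster: Cluster, user_main: Callable[[], None]) -> None:
    """Hypothetical entrypoint: run the user's main(), then tear down.

    Assumption: the cluster is torn down whether main() returns normally
    or raises, which is why the shutdown sits in a finally block.
    """
    try:
        user_main()
    finally:
        cluster.shutdown()
```

In this model the cluster lifetime is bound to the call to `user_main`, so nothing outside the entrypoint has to remember to stop the cluster.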

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-09 Thread Becket Qin
Thanks Yang, That would be very helpful! Jiangjie (Becket) Qin On Mon, Mar 9, 2020 at 3:31 PM Yang Wang wrote: > Hi Becket, > > Thanks for your suggestion. We will update the FLIP to add/enrich the > following parts. > * User cli option change, use "-R/--remote" to apply the cluster deploy > m

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-09 Thread Yang Wang
Hi Becket, Thanks for your suggestion. We will update the FLIP to add/enrich the following parts. * User cli option change, use "-R/--remote" to apply the cluster deploy mode * Configuration change, how to specify remote user jars and dependencies * The whole story about how "application mode" wor

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-08 Thread Becket Qin
Thanks for the reply, tison and Yang, Regarding the public interface, is the "-R/--remote" option the only change? Will the users also need to provide a remote location to upload and store the jars, and a list of jars as dependencies to be uploaded? It would be important that the public interface sec

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-08 Thread Yang Wang
Hi Becket, Thanks for jumping in and sharing your concerns. I second tison's answer and just make some additions. > job submission interface This FLIP will introduce an interface for running user `main()` on cluster, named “ProgramDeployer”. However, it is not a public interface. It will be

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-08 Thread tison
Hi Becket, Thanks for your attention to FLIP-85! I answered your question inline. 1. What exactly the job submission interface will look like after this FLIP? The FLIP template has a Public Interface section but was removed from this FLIP. As Yang mentioned in this thread above: >From user pers

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-08 Thread Becket Qin
Hi Peter and Kostas, Thanks for creating this FLIP. Moving the JobGraph compilation to the cluster makes a lot of sense to me. FLIP-40 had exactly the same idea, but is currently dormant and can probably be superseded by this FLIP. After reading the FLIP, I still have a few questions. 1. What exa

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-05 Thread Kostas Kloudas
Also from my side +1 to start voting. Cheers, Kostas On Thu, Mar 5, 2020 at 7:45 AM tison wrote: > > +1 to start voting. > > Best, > tison. > > > Yang Wang wrote on Thu, Mar 5, 2020 at 2:29 PM: >> >> Hi Peter, >> Thanks a lot for your response. >> >> Hi all @Kostas Kloudas @Zili Chen @Peter Huang @Rong Rong

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-04 Thread tison
+1 to start voting. Best, tison. Yang Wang wrote on Thu, Mar 5, 2020 at 2:29 PM: > Hi Peter, > Thanks a lot for your response. > > Hi all @Kostas Kloudas @Zili Chen > @Peter Huang @Rong > Rong > It seems that we have reached an agreement. The “application mode” > is regarded as the enhanced “per-job”. It

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-04 Thread Yang Wang
Hi Peter, Thanks a lot for your response. Hi all @Kostas Kloudas @Zili Chen @Peter Huang @Rong Rong It seems that we have reached an agreement. The “application mode” is regarded as the enhanced “per-job”. It is orthogonal to “cluster deploy”. Currently, we bind the “per-job” to `run-user-m

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-04 Thread Peter Huang
Hi Yang and Kostas, Thanks for the clarification. It makes more sense to me if the long-term goal is to replace per-job mode with application mode in the future (once multiple execute calls can be supported). Before that, it will be better to keep the concept of application mode internally.

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-03 Thread Yang Wang
Hi Peter, Having the application mode does not mean we will drop the cluster-deploy option. I just want to share some thoughts about “Application Mode”. 1. The application mode could cover the per-job semantics. Its lifecycle is bound to the user `main()`. And all the jobs in the user main will be

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-03 Thread Kostas Kloudas
Hi Peter, I understand your point. This is why I was also a bit torn about the name and my proposal was a bit aligned with yours (something along the lines of "cluster deploy" mode). But many of the other participants in the discussion suggested the "Application Mode". I think that the reasoning

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-02 Thread Peter Huang
Hi Kostas, Thanks for updating the wiki. We have aligned with the implementations in the doc. But I feel the naming is still a little confusing from a user's perspective. It is well known that Flink supports per-job clusters and session clusters. The concept is in the layer of how a job is

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-02 Thread Kostas Kloudas
Hi Yang, The difference between per-job and application mode is that, as you described, in the per-job mode the main is executed on the client while in the application mode, the main is executed on the cluster. I do not think we have to offer "application mode" with running the main on the client
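A minimal sketch of the distinction described above, assuming only what the message states: in per-job mode the user's main is executed on the client, while in application mode it is executed on the cluster. `Mode` and `main_runs_on` are hypothetical names used for illustration and do not correspond to Flink classes or configuration keys.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the two modes discussed in the thread."""

    PER_JOB = "per-job"
    APPLICATION = "application"


def main_runs_on(mode: Mode) -> str:
    """Return where the user's main() is executed for the given mode."""
    # Per-job: main() runs on the client.
    # Application: main() runs on the cluster.
    return "client" if mode is Mode.PER_JOB else "cluster"
```

For example, `main_runs_on(Mode.APPLICATION)` evaluates to `"cluster"`, which is the property the thread uses to tell the two modes apart.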

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-02 Thread Yang Wang
Hi Kostas, Thanks a lot for your conclusion and updating the FLIP-85 WIKI. Currently, I have no more questions about motivation, approach, fault tolerance and the first phase implementation. I think the new title "Flink Application Mode" makes a lot of sense to me. Especially for the containerized

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-03-02 Thread Kostas Kloudas
Hi all, I updated https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode based on the discussion we had here: https://docs.google.com/document/d/1ji72s3FD9DYUyGuKnJoO4ApzV-nSsZa0-bceGXW7Ocw/edit# Please let me know what you think and please keep the discussion in the ML

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-15 Thread Yang Wang
Hi all, Thanks a lot for the feedback from @Kostas Kloudas. All your concerns are on point. The FLIP-85 is mainly focused on supporting cluster mode for per-job, since it is more urgent and has many more use cases in both Yarn and Kubernetes deployments. For session cluster, we could have more dis

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-15 Thread Peter Huang
Hi Kostas, Thanks for this feedback. I couldn't agree more. The cluster mode should be added first for the per-job cluster. 1) For job cluster implementation 1. Job graph recovery from configuration or store as static job graph as session cluster. I think the static one will be better f

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-15 Thread Kostas Kloudas
Hi all, I am writing here as the discussion on the Google Doc seems to be a bit difficult to follow. I think that in order to be able to make progress, it would be helpful to focus on per-job mode for now. The reason is that: 1) making the (unique) JobSubmitHandler responsible for creating the j

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-08 Thread tison
Not always; Yang Wang is also not yet a committer but he can join the channel. I cannot find the id by clicking “Add new member in channel”, so I came to you and asked you to try out the link. Possibly I will find other ways but the original purpose is that the slack channel is a public area we discuss abo

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-08 Thread Peter Huang
Hi Tison, I am not a Flink committer yet, so I think I can't join it either. Best Regards Peter Huang On Wed, Jan 8, 2020 at 9:39 AM tison wrote: > Hi Peter, > > Could you try out this link? https://the-asf.slack.com/messages/CNA3ADZPH > > Best, > tison. > > > Peter Huang on Thu, Jan 9, 2020 at 1:22 AM

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-08 Thread tison
Hi Peter, Could you try out this link? https://the-asf.slack.com/messages/CNA3ADZPH Best, tison. Peter Huang wrote on Thu, Jan 9, 2020 at 1:22 AM: > Hi Tison, > > I can't join the group with the shared link. Would you please add me to the > group? My slack account is huangzhenqiu0825. > Thank you in advance.

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-08 Thread Peter Huang
Hi Tison, I can't join the group with the shared link. Would you please add me to the group? My slack account is huangzhenqiu0825. Thank you in advance. Best Regards Peter Huang On Wed, Jan 8, 2020 at 12:02 AM tison wrote: > Hi Peter, > > As described above, this effort should get attention fro

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-08 Thread tison
Hi Peter, As described above, this effort should get attention from people developing FLIP-73 a.k.a. Executor abstractions. I recommend that you join the public slack channel[1] for Flink Client API Enhancement and you can try to share your detailed thoughts there. It possibly gets more concrete atte

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-06 Thread Peter Huang
Dear All, Happy new year! According to existing feedback from the community, we revised the doc with the consideration of session cluster support, and concrete interface changes needed and execution plan. Please take one more round of review at your most convenient time. https://docs.google.com/d

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-02 Thread Peter Huang
Hi Dian, Thanks for giving us valuable feedback. 1) It's better to have a whole design for this feature For the suggestion of enabling the cluster mode for session clusters as well, I think Flink already supports it. WebSubmissionExtension already allows users to start a job with the specified jar by us

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-02 Thread Peter Huang
Hi Yang, I understand your point. As for the Kubernetes per-job cluster, users only have the image path for starting the job. The user code is inaccessible. I think it is a common question for containerized deployment (for example Yarn with a docker image) after FLIP-73. Let's get some feedback from < a

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-30 Thread Yang Wang
Hi Peter, Certainly, we could add an 'if-else' in `AbstractJobClusterExecutor` to handle different deploy modes. However, I think we need to avoid executing any user program code in cluster deploy-mode including in the `ExecutionEnvironment`. Let's wait for some feedback from FLIP-73's author @Aljosc

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-29 Thread Dian Fu
Hi all, Sorry to jump into this discussion. Thanks everyone for the discussion. I'm very interested in this topic although I'm not an expert in this part. So I'm glad to share my thoughts as follows: 1) It's better to have a whole design for this feature As we know, there are two deployment m

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-26 Thread Peter Huang
Hi Yang, I can't agree more. The effort definitely needs to align with the final goal of FLIP-73. I am thinking about whether we can achieve the goal in two phases. 1) Phase I As the CliFrontend will not be deprecated soon, we can still use the deployMode flag there, pass the program info thro

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-25 Thread Yang Wang
Hi Peter, I think we need to reconsider tison's suggestion seriously. After FLIP-73, the deployJobCluster has been moved into `JobClusterExecutor#execute`. It should not be perceived for `CliFrontend`. That means the user program will *ALWAYS* be executed on client side. This is the by design behav

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-24 Thread Peter Huang
Hi Jingjing, The improvement proposed is a deployment option for CLI. For SQL based Flink applications, it is more convenient to use the existing model in SqlClient in which the job graph is generated within SqlClient. After adding the delayed job graph generation, I think there is no change nee

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-20 Thread Yang Wang
Hi tison, I think we are heading in the same direction: running the user program on the jobmanager side. Maybe I could call it cluster deploy-mode for per-job. The separation of deployment and submission is exactly what we need to do. After separation, we could choose where to run the user program. For examp

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-19 Thread tison
Hi Yang, Your ideas go in the same direction as mine. There is one question I'd like to sync with you. As mentioned above, we now always run the user program on the client side and do the deployment in env.execute. It is less than awesome if you want to absolutely run the program on master side. A pos

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-19 Thread tison
Hi Peter, I'm afraid that FLIP-73 also changes how per-job works. Please check the work first. You can search AbstractJobClusterExecutor and its call graph. For how it influences your proposal FLIP-85, I already mentioned above that >user program is designed to ALWAYS run on the client side. Spe

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-18 Thread Peter Huang
Hi Yang, Thanks for your input, I can see the master side job graph generation is a common requirement for per job mode. I think FLIP-73 is mainly for session mode. I think the proposal is a valid improvement for existing CLI and per job mode. Best Regards Peter Huang On Wed, Dec 18, 2019 at 3:

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-18 Thread Peter Huang
Hi Tison, Sorry for the late reply. I was busy with some urgent internal work last week. I tried to read FLIP-73. From my limited understanding, the scope of this FLIP is to unify the deployment process of each type of cluster management system. As the implementation details are unclear, I have

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-18 Thread jingjing bai
Hi Peter, we have extended SqlClient to support SQL job submission on the web, based on Flink 1.9. We support submitting to Yarn in per-job mode too. In this case, the job graph is generated on the client side. I think this discussion is mainly about improving API programs, but in my case there is no jar to upload

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-18 Thread Yang Wang
I just want to revive this discussion. Recently, I have been thinking about how to natively run a Flink per-job cluster on Kubernetes. The per-job mode on Kubernetes is very different from on Yarn. And we will have the same deployment requirements to the client and entry point. 1. Flink client not always

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-12 Thread tison
A quick idea is that we separate the deployment from the user program so that it is always done outside the program. When the user program is executed there is always a ClusterClient that communicates with an existing cluster, remote or local. It will be another thread so just for your information. Best, ti

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-12 Thread tison
Hi Peter, Another concern I realized recently is that with the current Executors abstraction (FLIP-73) I'm afraid that the user program is designed to ALWAYS run on the client side. Specifically, we deploy the job in the executor when env.execute is called. This abstraction possibly prevents Flink from running user progr

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-09 Thread Peter Huang
Hi Tison, Yes, you are right. I think I made the wrong argument in the doc. Basically, the packaging jar problem is only for platform users. In our internal deploy service, we further optimized the deployment latency by letting users package flink-runtime together with the uber jar, so that w

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-09 Thread Peter Huang
Hi Yang, Thanks for the suggestion. Actually I forgot to share the original google doc with you. Feel free to comment directly on it, so that I may revise it based on people's feedback before syncing to confluence. https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edi

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-09 Thread tison
> 3. What do you mean about the package? Do users need to compile their jars including flink-clients, flink-optimizer, flink-table codes? The answer should be no because they exist in system classpath. Best, tison. Yang Wang wrote on Tue, Dec 10, 2019 at 12:18 PM: > Hi Peter, > > Thanks a lot for starting

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-09 Thread Yang Wang
Hi Peter, Thanks a lot for starting this discussion. I think this is a very useful feature. It is not only relevant for Yarn: I am focused on the Flink-on-Kubernetes integration and have come across the same problem. I do not want the job graph generated on client side. Instead, the user jars are built in a user-defined

[DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-09 Thread Peter Huang
Dear All, Recently, the Flink community has started to improve the yarn cluster descriptor to make job jar and config files configurable from CLI. It improves the flexibility of Flink deployment in Yarn per-job mode. For platform users who manage tens or hundreds of streaming pipelines for the whole org