[ 
https://issues.apache.org/jira/browse/FLINK-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105785#comment-17105785
 ] 

John Lonergan edited comment on FLINK-17469 at 5/12/20, 10:39 PM:
------------------------------------------------------------------

What I'm really after is the ability to provide the job name from the CLI rather 
than from inside the code.
We wanted to take the job name out of the "application code" that the developer 
writes and move this naming logic into our standard control wrapper scripts, so 
that we can enforce name derivation rules and impose those names from the 
outside.
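
To make the intent concrete, here is a sketch of the workaround we have today. The property name "job.name" and the helper class are my own illustration, not part of Flink - this is exactly the per-job boilerplate we would like the CLI to absorb:

{code:java}
// Hypothetical sketch: the wrapper script passes -Djob.name=... at submission
// time, and the application resolves it, falling back to a default name.
public final class JobNameResolver {

    static final String FALLBACK_JOB_NAME = "Flink Streaming Job";

    public static String resolveJobName() {
        // The wrapper script supplies the name from outside the application code.
        return System.getProperty("job.name", FALLBACK_JOB_NAME);
    }
}
{code}

The application would then call env.execute(JobNameResolver.resolveJobName()) - which is the code we want to stop writing in every job.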

So, my original suggestion to modify the framework by introducing 
getDefaultJobName was really just a means of providing a programmatic hook that 
I could bend to this need.
However, if you think about it, I'm not really trying to change the default job 
name to some other constant; I'm actually trying to inject the actual distinct 
job name on a job-by-job basis.

So an explicit means to set the job name from the CLI is really what we want.

Changing flink-conf.yaml is really only applicable if one is trying to 
override the default name used when no name is provided - but as I mentioned 
above, that is not really what I am after. (UPDATE: I can see that if one were 
using Kubernetes with one job per "cluster" then putting the job name in 
flink-conf.yaml would work fine - however, for a shared cluster running 
multiple jobs I need to set the job name at the CLI at the point of submission.)

As long as we provide a command line option "--jobname" that I can set when I 
submit a job, then that is probably good enough for my use case.

If in addition we remove DEFAULT_JOB_NAME from the code and move it into 
flink-conf.yaml, then that also seems sensible for other use cases - and I hate 
constants like that.

========

Additional context.

I would also like to be able to cancel/stop (etc.) a job using the job name 
rather than the application ID.

Flink doesn't +necessarily+ need to add complexity by concerning itself with 
job names being unique; as a minimum, for example, just document "stop 
--jobname MyJob" as stopping all jobs called "MyJob".
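
A minimal sketch of that non-unique semantics (the running-jobs map and the method are illustrative, not a real Flink API):

{code:java}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: "stop --jobname MyJob" with non-unique names simply
// selects every running job whose name matches, then stops each one by id.
public final class StopByName {

    /** Given running jobs (id -> name), return the ids of all jobs to stop. */
    public static List<String> jobIdsToStop(Map<String, String> runningJobs, String jobName) {
        return runningJobs.entrySet().stream()
                .filter(e -> e.getValue().equals(jobName))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
{code}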

:) If you really want to make me happy then DO concern yourself with making job 
names unique. Uniqueness could be an optional feature; perhaps selected in 
flink-conf.yaml, affecting the whole cluster, OR as an additional optional 
"--unique" flag on the CLI when submitting a job.

BTW I have a uniqueness mechanism working at the moment, but it relies on our 
custom wrapper script taking out a mutex on ZooKeeper so that there is a 
critical section around asking the cluster which jobs are currently running and 
then running the new job if it is not already running. So if two users tried to 
start the same job at the same time, only one would succeed. This works, but a 
built-in feature would certainly be preferable. However, we're still left 
unsure which version of MyJob is currently running.
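
The shape of that critical section, sketched with a local lock standing in for the ZooKeeper mutex (all names here are illustrative, not our actual script):

{code:java}
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the wrapper's critical section. A local ReentrantLock
// stands in for the ZooKeeper mutex; the real script holds a ZK lock so that
// "list running jobs" and "submit if absent" happen atomically cluster-wide.
public final class GuardedSubmit {

    private final ReentrantLock mutex = new ReentrantLock(); // stand-in for the ZK mutex
    private final Set<String> runningJobNames = new HashSet<>();

    /** Returns true if submitted, false if a job with that name is already running. */
    public boolean submitIfAbsent(String jobName, Runnable submit) {
        mutex.lock();
        try {
            if (runningJobNames.contains(jobName)) {
                return false; // the second of two racing submitters loses
            }
            submit.run(); // the actual `flink run` happens here
            runningJobNames.add(jobName);
            return true;
        } finally {
            mutex.unlock();
        }
    }
}
{code}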

---

Finally (probably not), it would be useful if the CLI allowed me to enquire 
about runtime properties submitted with the job. In general, I'd like to be 
able to provide a custom property (or properties) at submit time and then pull 
those properties back some time later, also using the CLI.

Specifically, in my case "-set-property version=a.b.c". Then at some later time 
I could do "-get-property version" to check whether the cluster is running the 
version of the job that I think it ought to be. I would then use this facility 
to simplify my operations by making the submit operation idempotent, i.e. 
"Don't tear down the job and submit version X if version X is already running."
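
The decision rule for that idempotent submit is tiny. A sketch (the "-get-property" facility it depends on does not exist yet, so the lookup result is modelled as an Optional):

{code:java}
import java.util.Optional;

// Hypothetical sketch: decide whether to tear down and resubmit, based on the
// version property read back from the running job (empty = no job, or no
// version property was ever set on it).
public final class IdempotentSubmit {

    /** Redeploy only when no version is reported or it differs from the desired one. */
    public static boolean needsRedeploy(Optional<String> runningVersion, String desiredVersion) {
        return runningVersion.map(v -> !v.equals(desiredVersion)).orElse(true);
    }
}
{code}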

This "property" facility could be a general-purpose thing facilitating other 
innovations - who knows.

Again I want this from the CLI and not from within the app code.



> Support override of DEFAULT_JOB_NAME with system property for 
> StreamExecutionEnvironment
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-17469
>                 URL: https://issues.apache.org/jira/browse/FLINK-17469
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataSet, API / DataStream
>    Affects Versions: 1.10.0
>            Reporter: John Lonergan
>            Priority: Trivial
>
> We are running multiple jobs on a shared standalone HA Cluster.
> We want to be able to provide the job name via the submitting shell script 
> using a system property; for example "job.name".
> We could of course write Java application code in each job to achieve this by 
> passing the system property value ourselves to the execute(name)  method, 
> however we want to do this from the env.
> ---
> However, there exists already default job name in 
> StreamExecutionEnvironment.DEFAULT_JOB_NAME.
> Our proposed change is to add a method to StreamExecutionEnvironment...
> {code:java}
> String getDefaultJobName() {
>     return System.getProperty("default.job.name", StreamExecutionEnvironment.DEFAULT_JOB_NAME);
> }
> {code}
> ... and call that method rather than directly accessing 
> StreamExecutionEnvironment.DEFAULT_JOB_NAME.
> This change is backwards compatible.
> We need this method to evaluate on a job-by-job basis, so, for example, the 
> following small amendment to the existing DEFAULT_JOB_NAME value will NOT 
> work, because it does not allow us to vary the value job by job.
> {code:java}
> class StreamExecutionEnvironment {
>     static final String DEFAULT_JOB_NAME =
>         System.getProperty("default.job.name", "Flink Streaming Job");
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
