I think a short FLIP would be awesome. I guess this feature hasn't been implemented yet because it has not been implemented yet ;-) I agree that this feature will improve configuration ergonomics big time :-)
Cheers, Till On Tue, Jan 19, 2021 at 12:28 PM Ufuk Celebi <u...@apache.org> wrote: > Hey all, > > I think that approach 2 is more idiomatic for container deployments where > it can be cumbersome to manually map flink-conf.yaml contents to env vars > [1]. The precedence order outlined by Till would also cover Steven's > hierarchical overwrite requirement. > > I'm really excited about this feature as it will make Flink deployments a > lot more ergonomic. The implementation seems to be not too complicated > (which makes we wonder why we didn't tackle this earlier or whether I'm > missing something). > > I'd also be happy to shepherd this contribution if there is consensus on > the need for it and the approach. Does it make sense to formalize this > decision a bit with a short FLIP? > > – Ufuk > > [1] In Ververica Platform, we support approach 1, because the Flink > configuration is part of the specification for a single Deployment and it's > minimally more convenient to have something like > > flinkConfiguration: > foo: ${BAR} > > for us. I don't think this approach would feel natural when manually > deploying Flink. There would be a clear migration path for our customers, > so I'm not concerned about this too much. > > On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote: > > Hi everyone, > > > > Thanks for starting this discussion Ingo. I think being able to use env > > variables to change Flink's configuration will be a very useful feature. > > > > Concerning the two approaches I would be in favour of the second approach > > ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to > > prepare a special flink-conf.yaml where he inserts env variables for > every > > config value he wants to configure. Since this is not required with the > > second approach, I think it is more general and easier to use. Also, the > > user does not have to remember a second set of names (env names) which he > > has to/can set. > > > > For how to substitute the values, I think it should happen when we load > the > > Flink configuration. First we read the file and then overwrite values > > specified via an env variable or dynamic properties in some defined > order. > > For env.java.opts and other options which are used for starting the JVM > we > > might need special handling in the bash scripts. > > > > Cheers, > > Till > > > > On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk <i...@ververica.com> wrote: > > > > > Hi Yang, > > > > > > 1. As you said I think this doesn't affect Ververica Platform, really, > so > > > I'm more than happy to hear and follow the thoughts of people more > > > experienced with Flink than me. > > > 2. I wasn't aware of env.java.opts, but that's definitely a candidate > where > > > a user may want to "escape" it so it doesn't get substituted > immediately, I > > > agree. > > > > > > > > > Regards > > > Ingo > > > > > > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang <danrtsey...@gmail.com> > wrote: > > > > > > > Hi Ingo, > > > > > > > > Thanks for your response. > > > > > > > > 1. Not distinguishing JM/TM is reasonable, but what about the client > > > side. > > > > For Yarn/K8s deployment, > > > > the local flink-conf.yaml will be shipped to JM/TM. So I am just > confused > > > > about where should the environment > > > > variables be replaced? IIUC, it is not an issue for Ververica > Platform > > > > since it is always done in the JM/TM side. > > > > > > > > 2. I believe we should support not do the substitution for specific > key. > > > A > > > > typical use case is "env.java.opts". If the > > > > value contains environment variables, they are expected to be > replaced > > > > exactly when the java command is executed, > > > > not after the java process is started. Maybe escaping with single > quote > > > is > > > > enough. > > > > > > > > 3. The substitution only takes effects on the value makes sense to > me. > > > > > > > > > > > > Best, > > > > Yang > > > > > > > > Steven Wu <stevenz...@gmail.com> 于2021年1月19日周二 上午12:36写道: > > > > > > > > > Variable substitution (proposed here) is definitely useful. > > > > > > > > > > For us, hierarchical override is more useful. E.g., we may have > the > > > > > default value of "state.checkpoints.dir=path1" defined in > > > > flink-conf.yaml. > > > > > But maybe we want to override it to "state.checkpoints.dir=path2" > via > > > > > environment variable in some scenarios. Otherwise, we have to > define a > > > > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the > Flink > > > > > config, which is annoying. > > > > > > > > > > As Ingo pointed, it is also annoying to handle Java property key > naming > > > > > convention (dots separated), as dots aren't allowed in shell env > var > > > > naming > > > > > (All caps, separated with underscore). Shell will complain. We > have to > > > > > bundle all env var overrides (k-v pairs) in a single property value > > > (JSON > > > > > and base64 encode) to avoid it. > > > > > > > > > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk <i...@ververica.com> > wrote: > > > > > > > > > > > Hi Yang, > > > > > > > > > > > > thanks for your questions! I'm glad to see this feature is being > > > > received > > > > > > positively. > > > > > > > > > > > > ad 1) We don't distinguish JM/TM, and I can't think of a good > reason > > > > why > > > > > a > > > > > > user would want to do so. I'm not very experienced with Flink, > > > however, > > > > > so > > > > > > please excuse me if I'm overlooking some obvious reason here. :-) > > > > > > ad 2) Admittedly I don't have a good overview on all the > > > configuration > > > > > > options that exist, but from those that I do know I can't imagine > > > > someone > > > > > > wanting to pass a value like "${MY_VAR}" verbatim. In Ververica > > > > Platform > > > > > as > > > > > > of now we ignore this problem. If, however, this needs to be > > > > addressed, a > > > > > > possible solution could be to allow escaping syntax such as > > > > "\${MY_VAR}". > > > > > > > > > > > > Another point to consider here is when exactly the substitution > takes > > > > > > place: on the "raw" file, or on the parsed key / value > separately, > > > and > > > > if > > > > > > so, should it support both key and value? My current thinking is > that > > > > > > substituting only the value of the parsed entry should be > sufficient. > > > > > > > > > > > > > > > > > > Regards > > > > > > Ingo > > > > > > > > > > > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang <danrtsey...@gmail.com > > > > > > wrote: > > > > > > > > > > > > > Thanks for kicking off the discussion. > > > > > > > > > > > > > > I think supporting environment variables rendering in the Flink > > > > > > > configuration yaml file is a good idea. Especially for > > > > > > > the Kubernetes environment since we are using the secret > resource > > > to > > > > > > store > > > > > > > the authentication information. > > > > > > > > > > > > > > But I have some questions for how to do it? > > > > > > > 1. The environments in Flink configuration yaml will be > replaced in > > > > > > client, > > > > > > > JobManager, TaskManager or all of them? > > > > > > > 2. If users do not want some config options to be replaced, > how to > > > > > > > achieve that? > > > > > > > > > > > > > > Best, > > > > > > > Yang > > > > > > > > > > > > > > Khachatryan Roman <khachatryan.ro...@gmail.com> 于2021年1月18日周一 > > > > > 下午8:55写道: > > > > > > > > > > > > > > > Hi Ingo, > > > > > > > > > > > > > > > > Thanks a lot for this proposal! > > > > > > > > > > > > > > > > We had a related discussion recently in the context of > > > FLINK-19520 > > > > > > > > (randomizing tests configuration) [1]. > > > > > > > > I believe other scenarios will benefit as well. > > > > > > > > > > > > > > > > For the end users, I think substitution in configuration > files is > > > > > > > > preferable over parsing env vars in Flink code. > > > > > > > > And for cases without such a file, we could have a default > one on > > > > the > > > > > > > > classpath with all substitutions defined (and then merge > > > everything > > > > > > from > > > > > > > > the user-supplied file). > > > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-19520 > > > > > > > > > > > > > > > > Regards, > > > > > > > > Roman > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk < > i...@ververica.com> > > > > > wrote: > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > in Ververica Platform we offer a feature to use environment > > > > > variables > > > > > > > in > > > > > > > > > the Flink configuration¹, e.g. > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > s3.access-key: ${S3_ACCESS_KEY} > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > We've been discussing internally whether contributing such > a > > > > > feature > > > > > > to > > > > > > > > > Flink directly would make sense and wanted to start a > > > discussion > > > > on > > > > > > > this > > > > > > > > > topic. > > > > > > > > > > > > > > > > > > An alternative way to do so from the above would be parsing > > > those > > > > > > > > directly > > > > > > > > > based on their name, so instead of having it defined in the > > > Flink > > > > > > > > > configuration as above, it would get automatically set if > > > > something > > > > > > > like > > > > > > > > > $FLINK_CONFIG_S3_ACCESS_KEY was set in the environment. > This is > > > > > > > somewhat > > > > > > > > > similar to what e.g. Spring does, and faces similar > challenges > > > > > > (dealing > > > > > > > > > with "."s etc.) > > > > > > > > > > > > > > > > > > Although I view both of these approaches as mostly > orthogonal, > > > > > > > supporting > > > > > > > > > both very likely wouldn't make sense, of course. So I was > > > > wondering > > > > > > > what > > > > > > > > > your opinion is in terms of whether the project would > benefit > > > > from > > > > > > > > > environment variable support for the Flink configuration, > and > > > > > whether > > > > > > > > there > > > > > > > > > are tendencies as to which approach to go with. > > > > > > > > > > > > > > > > > > ¹ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables > > > > > > > > > > > > > > > > > > Best regards > > > > > > > > > Ingo > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >