Hi everyone,

Thanks for starting this discussion Ingo. I think being able to use env
variables to change Flink's configuration will be a very useful feature.

Concerning the two approaches I would be in favour of the second approach
($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user to
prepare a special flink-conf.yaml where he inserts env variables for every
config value he wants to configure. Since this is not required with the
second approach, I think it is more general and easier to use. Also, the
user does not have to remember a second set of names (env names) which he
has to/can set.

For how to substitute the values, I think it should happen when we load the
Flink configuration. First we read the file and then overwrite values
specified via an env variable or dynamic properties in some defined order.
For env.java.opts and other options which are used for starting the JVM we
might need special handling in the bash scripts.

Cheers,
Till

On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk <i...@ververica.com> wrote:

> Hi Yang,
>
> 1. As you said I think this doesn't affect Ververica Platform, really, so
> I'm more than happy to hear and follow the thoughts of people more
> experienced with Flink than me.
> 2. I wasn't aware of env.java.opts, but that's definitely a candidate where
> a user may want to "escape" it so it doesn't get substituted immediately, I
> agree.
>
>
> Regards
> Ingo
>
> On Tue, Jan 19, 2021 at 4:47 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
> > Hi Ingo,
> >
> > Thanks for your response.
> >
> > 1. Not distinguishing JM/TM is reasonable, but what about the client
> side.
> > For Yarn/K8s deployment,
> > the local flink-conf.yaml will be shipped to JM/TM. So I am just confused
> > about where should the environment
> > variables be replaced? IIUC, it is not an issue for Ververica Platform
> > since it is always done in the JM/TM side.
> >
> > 2. I believe we should support not do the substitution for specific key.
> A
> > typical use case is "env.java.opts". If the
> > value contains environment variables, they are expected to be replaced
> > exactly when the java command is executed,
> > not after the java process is started. Maybe escaping with single quote
> is
> > enough.
> >
> > 3. The substitution only takes effects on the value makes sense to me.
> >
> >
> > Best,
> > Yang
> >
> > Steven Wu <stevenz...@gmail.com> 于2021年1月19日周二 上午12:36写道:
> >
> > > Variable substitution (proposed here) is definitely useful.
> > >
> > > For us, hierarchical override is more useful.  E.g., we may have the
> > > default value of "state.checkpoints.dir=path1" defined in
> > flink-conf.yaml.
> > > But maybe we want to override it to "state.checkpoints.dir=path2" via
> > > environment variable in some scenarios. Otherwise, we have to define a
> > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for the Flink
> > > config, which is annoying.
> > >
> > > As Ingo pointed, it is also annoying to handle Java property key naming
> > > convention (dots separated), as dots aren't allowed in shell env var
> > naming
> > > (All caps, separated with underscore). Shell will complain. We have to
> > > bundle all env var overrides (k-v pairs) in a single property value
> (JSON
> > > and base64 encode) to avoid it.
> > >
> > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk <i...@ververica.com> wrote:
> > >
> > > > Hi Yang,
> > > >
> > > > thanks for your questions! I'm glad to see this feature is being
> > received
> > > > positively.
> > > >
> > > > ad 1) We don't distinguish JM/TM, and I can't think of a good reason
> > why
> > > a
> > > > user would want to do so. I'm not very experienced with Flink,
> however,
> > > so
> > > > please excuse me if I'm overlooking some obvious reason here. :-)
> > > > ad 2) Admittedly I don't have a good overview on all the
> configuration
> > > > options that exist, but from those that I do know I can't imagine
> > someone
> > > > wanting to pass a value like "${MY_VAR}" verbatim. In Ververica
> > Platform
> > > as
> > > > of now we ignore this problem. If, however, this needs to be
> > addressed, a
> > > > possible solution could be to allow escaping syntax such as
> > "\${MY_VAR}".
> > > >
> > > > Another point to consider here is when exactly the substitution takes
> > > > place: on the "raw" file, or on the parsed key / value separately,
> and
> > if
> > > > so, should it support both key and value? My current thinking is that
> > > > substituting only the value of the parsed entry should be sufficient.
> > > >
> > > >
> > > > Regards
> > > > Ingo
> > > >
> > > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang <danrtsey...@gmail.com>
> > wrote:
> > > >
> > > > > Thanks for kicking off the discussion.
> > > > >
> > > > > I think supporting environment variables rendering in the Flink
> > > > > configuration yaml file is a good idea. Especially for
> > > > > the Kubernetes environment since we are using the secret resource
> to
> > > > store
> > > > > the authentication information.
> > > > >
> > > > > But I have some questions for how to do it?
> > > > > 1. The environments in Flink configuration yaml will be replaced in
> > > > client,
> > > > > JobManager, TaskManager or all of them?
> > > > > 2. If users do not want some config options to be replaced, how to
> > > > > achieve that?
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Khachatryan Roman <khachatryan.ro...@gmail.com> 于2021年1月18日周一
> > > 下午8:55写道:
> > > > >
> > > > > > Hi Ingo,
> > > > > >
> > > > > > Thanks a lot for this proposal!
> > > > > >
> > > > > > We had a related discussion recently in the context of
> FLINK-19520
> > > > > > (randomizing tests configuration) [1].
> > > > > > I believe other scenarios will benefit as well.
> > > > > >
> > > > > > For the end users, I think substitution in configuration files is
> > > > > > preferable over parsing env vars in Flink code.
> > > > > > And for cases without such a file, we could have a default one on
> > the
> > > > > > classpath with all substitutions defined (and then merge
> everything
> > > > from
> > > > > > the user-supplied file).
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-19520
> > > > > >
> > > > > > Regards,
> > > > > > Roman
> > > > > >
> > > > > >
> > > > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk <i...@ververica.com>
> > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > in Ververica Platform we offer a feature to use environment
> > > variables
> > > > > in
> > > > > > > the Flink configuration¹, e.g.
> > > > > > >
> > > > > > > ```
> > > > > > > s3.access-key: ${S3_ACCESS_KEY}
> > > > > > > ```
> > > > > > >
> > > > > > > We've been discussing internally whether contributing such a
> > > feature
> > > > to
> > > > > > > Flink directly would make sense and wanted to start a
> discussion
> > on
> > > > > this
> > > > > > > topic.
> > > > > > >
> > > > > > > An alternative way to do so from the above would be parsing
> those
> > > > > > directly
> > > > > > > based on their name, so instead of having it defined in the
> Flink
> > > > > > > configuration as above, it would get automatically set if
> > something
> > > > > like
> > > > > > > $FLINK_CONFIG_S3_ACCESS_KEY was set in the environment. This is
> > > > > somewhat
> > > > > > > similar to what e.g. Spring does, and faces similar challenges
> > > > (dealing
> > > > > > > with "."s etc.)
> > > > > > >
> > > > > > > Although I view both of these approaches as mostly orthogonal,
> > > > > supporting
> > > > > > > both very likely wouldn't make sense, of course. So I was
> > wondering
> > > > > what
> > > > > > > your opinion is in terms of whether the project would benefit
> > from
> > > > > > > environment variable support for the Flink configuration, and
> > > whether
> > > > > > there
> > > > > > > are tendencies as to which approach to go with.
> > > > > > >
> > > > > > > ¹
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables
> > > > > > >
> > > > > > > Best regards
> > > > > > > Ingo
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to