Hi,

sorry, I almost forgot, so just to update this thread: I now started a new
thread for the actual FLIP:

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-161-Configuration-through-environment-variables-td48094.html


Ingo

On Wed, Jan 20, 2021 at 11:10 AM Ingo Bürk <i...@ververica.com> wrote:

> Hi Steven,
>
> understood, thanks for expanding on your point. I will prepare a FLIP now
> going for the solution of S3_ACCESS_KEY style environment variables since
> overall this fits Flink use-cases better. I think your point that users
> have to know how to convert s3.access-key to S3_ACCESS_KEY is valid, but it
> would follow a logical pattern also used by a big project like Spring.
>
> I'll update this thread with the FLIP number once I have written the first
> draft. Thanks everyone!
>
>
> Regards
> Ingo
>
> On Tue, Jan 19, 2021 at 6:28 PM Steven Wu <stevenz...@gmail.com> wrote:
>
>> Ingo,
>>
>> regarding "state.checkpoints.dir: ${CHECKPOINTS_DIR:-path1}", it
>> definitely
>> can work. but now users need to know that we can use "CHECKPOINTS_DIR" env
>> var to override "state.checkpoints.dir". That is the inconvenience that I
>> am trying to avoid. "state.checkpoints.dir" is well documented in the
>> Flink
>> website. Now, we need to document "CHECKPOINTS_DIR" separately.
>>
>> Ideally, we want to just let the user define a env var like
>> "state.checkpoints.dir=some/overriden/path". But it suffers the problem
>> that dot chars are invalid characters for shell env var names. That is the
>> dilemma.
>>
>> That is why we bundled all the non-conformning env var overrides (where
>> variable names containing non-conforming chars) into a single base64
>> encoded string (like "NON_CONFORMING_OVERRIDES_BASE64=<base64 encoded json
>> string>"  in our deployment infrastructure and unpack them in the
>> container
>> startup. It is hacky. I am hoping that there is a more elegant solution.
>>
>> If the configuration key is conforming to shell standard (like
>> S3_ACCESS_KEY), then we don't have a problem. but it will be deviating
>> from
>> the Flink config naming convention (Java property style, dot separated).
>>
>> Thanks,
>> Steven
>>
>>
>> On Tue, Jan 19, 2021 at 6:56 AM Till Rohrmann <trohrm...@apache.org>
>> wrote:
>>
>> > I think a short FLIP would be awesome.
>> >
>> > I guess this feature hasn't been implemented yet because it has not been
>> > implemented yet ;-) I agree that this feature will improve configuration
>> > ergonomics big time :-)
>> >
>> > Cheers,
>> > Till
>> >
>> > On Tue, Jan 19, 2021 at 12:28 PM Ufuk Celebi <u...@apache.org> wrote:
>> >
>> > > Hey all,
>> > >
>> > > I think that approach 2 is more idiomatic for container deployments
>> where
>> > > it can be cumbersome to manually map flink-conf.yaml contents to env
>> vars
>> > > [1]. The precedence order outlined by Till would also cover Steven's
>> > > hierarchical overwrite requirement.
>> > >
>> > > I'm really excited about this feature as it will make Flink
>> deployments a
>> > > lot more ergonomic. The implementation seems to be not too complicated
>> > > (which makes we wonder why we didn't tackle this earlier or whether
>> I'm
>> > > missing something).
>> > >
>> > > I'd also be happy to shepherd this contribution if there is consensus
>> on
>> > > the need for it and the approach. Does it make sense to formalize this
>> > > decision a bit with a short FLIP?
>> > >
>> > > – Ufuk
>> > >
>> > > [1] In Ververica Platform, we support approach 1, because the Flink
>> > > configuration is part of the specification for a single Deployment and
>> > it's
>> > > minimally more convenient to have something like
>> > >
>> > > flinkConfiguration:
>> > >   foo: ${BAR}
>> > >
>> > > for us. I don't think this approach would feel natural when manually
>> > > deploying Flink. There would be a clear migration path for our
>> customers,
>> > > so I'm not concerned about this too much.
>> > >
>> > > On Tue, Jan 19, 2021, at 10:01 AM, Till Rohrmann wrote:
>> > > > Hi everyone,
>> > > >
>> > > > Thanks for starting this discussion Ingo. I think being able to use
>> env
>> > > > variables to change Flink's configuration will be a very useful
>> > feature.
>> > > >
>> > > > Concerning the two approaches I would be in favour of the second
>> > approach
>> > > > ($FLINK_CONFIG_S3_ACCESS_KEY) because it does not require the user
>> to
>> > > > prepare a special flink-conf.yaml where he inserts env variables for
>> > > every
>> > > > config value he wants to configure. Since this is not required with
>> the
>> > > > second approach, I think it is more general and easier to use. Also,
>> > the
>> > > > user does not have to remember a second set of names (env names)
>> which
>> > he
>> > > > has to/can set.
>> > > >
>> > > > For how to substitute the values, I think it should happen when we
>> load
>> > > the
>> > > > Flink configuration. First we read the file and then overwrite
>> values
>> > > > specified via an env variable or dynamic properties in some defined
>> > > order.
>> > > > For env.java.opts and other options which are used for starting the
>> JVM
>> > > we
>> > > > might need special handling in the bash scripts.
>> > > >
>> > > > Cheers,
>> > > > Till
>> > > >
>> > > > On Tue, Jan 19, 2021 at 9:46 AM Ingo Bürk <i...@ververica.com>
>> wrote:
>> > > >
>> > > > > Hi Yang,
>> > > > >
>> > > > > 1. As you said I think this doesn't affect Ververica Platform,
>> > really,
>> > > so
>> > > > > I'm more than happy to hear and follow the thoughts of people more
>> > > > > experienced with Flink than me.
>> > > > > 2. I wasn't aware of env.java.opts, but that's definitely a
>> candidate
>> > > where
>> > > > > a user may want to "escape" it so it doesn't get substituted
>> > > immediately, I
>> > > > > agree.
>> > > > >
>> > > > >
>> > > > > Regards
>> > > > > Ingo
>> > > > >
>> > > > > On Tue, Jan 19, 2021 at 4:47 AM Yang Wang <danrtsey...@gmail.com>
>> > > wrote:
>> > > > >
>> > > > > > Hi Ingo,
>> > > > > >
>> > > > > > Thanks for your response.
>> > > > > >
>> > > > > > 1. Not distinguishing JM/TM is reasonable, but what about the
>> > client
>> > > > > side.
>> > > > > > For Yarn/K8s deployment,
>> > > > > > the local flink-conf.yaml will be shipped to JM/TM. So I am just
>> > > confused
>> > > > > > about where should the environment
>> > > > > > variables be replaced? IIUC, it is not an issue for Ververica
>> > > Platform
>> > > > > > since it is always done in the JM/TM side.
>> > > > > >
>> > > > > > 2. I believe we should support not do the substitution for
>> specific
>> > > key.
>> > > > > A
>> > > > > > typical use case is "env.java.opts". If the
>> > > > > > value contains environment variables, they are expected to be
>> > > replaced
>> > > > > > exactly when the java command is executed,
>> > > > > > not after the java process is started. Maybe escaping with
>> single
>> > > quote
>> > > > > is
>> > > > > > enough.
>> > > > > >
>> > > > > > 3. The substitution only takes effects on the value makes sense
>> to
>> > > me.
>> > > > > >
>> > > > > >
>> > > > > > Best,
>> > > > > > Yang
>> > > > > >
>> > > > > > Steven Wu <stevenz...@gmail.com> 于2021年1月19日周二 上午12:36写道:
>> > > > > >
>> > > > > > > Variable substitution (proposed here) is definitely useful.
>> > > > > > >
>> > > > > > > For us, hierarchical override is more useful.  E.g., we may
>> have
>> > > the
>> > > > > > > default value of "state.checkpoints.dir=path1" defined in
>> > > > > > flink-conf.yaml.
>> > > > > > > But maybe we want to override it to
>> "state.checkpoints.dir=path2"
>> > > via
>> > > > > > > environment variable in some scenarios. Otherwise, we have to
>> > > define a
>> > > > > > > corresponding shell variable (like STATE_CHECKPOINTS_DIR) for
>> the
>> > > Flink
>> > > > > > > config, which is annoying.
>> > > > > > >
>> > > > > > > As Ingo pointed, it is also annoying to handle Java property
>> key
>> > > naming
>> > > > > > > convention (dots separated), as dots aren't allowed in shell
>> env
>> > > var
>> > > > > > naming
>> > > > > > > (All caps, separated with underscore). Shell will complain. We
>> > > have to
>> > > > > > > bundle all env var overrides (k-v pairs) in a single property
>> > value
>> > > > > (JSON
>> > > > > > > and base64 encode) to avoid it.
>> > > > > > >
>> > > > > > > On Mon, Jan 18, 2021 at 8:15 AM Ingo Bürk <i...@ververica.com
>> >
>> > > wrote:
>> > > > > > >
>> > > > > > > > Hi Yang,
>> > > > > > > >
>> > > > > > > > thanks for your questions! I'm glad to see this feature is
>> > being
>> > > > > > received
>> > > > > > > > positively.
>> > > > > > > >
>> > > > > > > > ad 1) We don't distinguish JM/TM, and I can't think of a
>> good
>> > > reason
>> > > > > > why
>> > > > > > > a
>> > > > > > > > user would want to do so. I'm not very experienced with
>> Flink,
>> > > > > however,
>> > > > > > > so
>> > > > > > > > please excuse me if I'm overlooking some obvious reason
>> here.
>> > :-)
>> > > > > > > > ad 2) Admittedly I don't have a good overview on all the
>> > > > > configuration
>> > > > > > > > options that exist, but from those that I do know I can't
>> > imagine
>> > > > > > someone
>> > > > > > > > wanting to pass a value like "${MY_VAR}" verbatim. In
>> Ververica
>> > > > > > Platform
>> > > > > > > as
>> > > > > > > > of now we ignore this problem. If, however, this needs to be
>> > > > > > addressed, a
>> > > > > > > > possible solution could be to allow escaping syntax such as
>> > > > > > "\${MY_VAR}".
>> > > > > > > >
>> > > > > > > > Another point to consider here is when exactly the
>> substitution
>> > > takes
>> > > > > > > > place: on the "raw" file, or on the parsed key / value
>> > > separately,
>> > > > > and
>> > > > > > if
>> > > > > > > > so, should it support both key and value? My current
>> thinking
>> > is
>> > > that
>> > > > > > > > substituting only the value of the parsed entry should be
>> > > sufficient.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Regards
>> > > > > > > > Ingo
>> > > > > > > >
>> > > > > > > > On Mon, Jan 18, 2021 at 3:48 PM Yang Wang <
>> > danrtsey...@gmail.com
>> > > >
>> > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Thanks for kicking off the discussion.
>> > > > > > > > >
>> > > > > > > > > I think supporting environment variables rendering in the
>> > Flink
>> > > > > > > > > configuration yaml file is a good idea. Especially for
>> > > > > > > > > the Kubernetes environment since we are using the secret
>> > > resource
>> > > > > to
>> > > > > > > > store
>> > > > > > > > > the authentication information.
>> > > > > > > > >
>> > > > > > > > > But I have some questions for how to do it?
>> > > > > > > > > 1. The environments in Flink configuration yaml will be
>> > > replaced in
>> > > > > > > > client,
>> > > > > > > > > JobManager, TaskManager or all of them?
>> > > > > > > > > 2. If users do not want some config options to be
>> replaced,
>> > > how to
>> > > > > > > > > achieve that?
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > Yang
>> > > > > > > > >
>> > > > > > > > > Khachatryan Roman <khachatryan.ro...@gmail.com>
>> > 于2021年1月18日周一
>> > > > > > > 下午8:55写道:
>> > > > > > > > >
>> > > > > > > > > > Hi Ingo,
>> > > > > > > > > >
>> > > > > > > > > > Thanks a lot for this proposal!
>> > > > > > > > > >
>> > > > > > > > > > We had a related discussion recently in the context of
>> > > > > FLINK-19520
>> > > > > > > > > > (randomizing tests configuration) [1].
>> > > > > > > > > > I believe other scenarios will benefit as well.
>> > > > > > > > > >
>> > > > > > > > > > For the end users, I think substitution in configuration
>> > > files is
>> > > > > > > > > > preferable over parsing env vars in Flink code.
>> > > > > > > > > > And for cases without such a file, we could have a
>> default
>> > > one on
>> > > > > > the
>> > > > > > > > > > classpath with all substitutions defined (and then merge
>> > > > > everything
>> > > > > > > > from
>> > > > > > > > > > the user-supplied file).
>> > > > > > > > > >
>> > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-19520
>> > > > > > > > > >
>> > > > > > > > > > Regards,
>> > > > > > > > > > Roman
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Mon, Jan 18, 2021 at 11:11 AM Ingo Bürk <
>> > > i...@ververica.com>
>> > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi everyone,
>> > > > > > > > > > >
>> > > > > > > > > > > in Ververica Platform we offer a feature to use
>> > environment
>> > > > > > > variables
>> > > > > > > > > in
>> > > > > > > > > > > the Flink configuration¹, e.g.
>> > > > > > > > > > >
>> > > > > > > > > > > ```
>> > > > > > > > > > > s3.access-key: ${S3_ACCESS_KEY}
>> > > > > > > > > > > ```
>> > > > > > > > > > >
>> > > > > > > > > > > We've been discussing internally whether contributing
>> > such
>> > > a
>> > > > > > > feature
>> > > > > > > > to
>> > > > > > > > > > > Flink directly would make sense and wanted to start a
>> > > > > discussion
>> > > > > > on
>> > > > > > > > > this
>> > > > > > > > > > > topic.
>> > > > > > > > > > >
>> > > > > > > > > > > An alternative way to do so from the above would be
>> > parsing
>> > > > > those
>> > > > > > > > > > directly
>> > > > > > > > > > > based on their name, so instead of having it defined
>> in
>> > the
>> > > > > Flink
>> > > > > > > > > > > configuration as above, it would get automatically
>> set if
>> > > > > > something
>> > > > > > > > > like
>> > > > > > > > > > > $FLINK_CONFIG_S3_ACCESS_KEY was set in the
>> environment.
>> > > This is
>> > > > > > > > > somewhat
>> > > > > > > > > > > similar to what e.g. Spring does, and faces similar
>> > > challenges
>> > > > > > > > (dealing
>> > > > > > > > > > > with "."s etc.)
>> > > > > > > > > > >
>> > > > > > > > > > > Although I view both of these approaches as mostly
>> > > orthogonal,
>> > > > > > > > > supporting
>> > > > > > > > > > > both very likely wouldn't make sense, of course. So I
>> was
>> > > > > > wondering
>> > > > > > > > > what
>> > > > > > > > > > > your opinion is in terms of whether the project would
>> > > benefit
>> > > > > > from
>> > > > > > > > > > > environment variable support for the Flink
>> configuration,
>> > > and
>> > > > > > > whether
>> > > > > > > > > > there
>> > > > > > > > > > > are tendencies as to which approach to go with.
>> > > > > > > > > > >
>> > > > > > > > > > > ¹
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> >
>> https://docs.ververica.com/user_guide/application_operations/deployments/configure_flink.html#environment-variables
>> > > > > > > > > > >
>> > > > > > > > > > > Best regards
>> > > > > > > > > > > Ingo
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Reply via email to