Gerk Elznik created FLINK-39791:
-----------------------------------
Summary: Helm chart silently ignores
defaultConfiguration.config.yaml; always mounts flink-conf.yaml from chart
defaults
Key: FLINK-39791
URL: https://issues.apache.org/jira/browse/FLINK-39791
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.15.0, kubernetes-operator-1.14.0
Reporter: Gerk Elznik
h2. Summary
The Helm chart at {{helm/flink-kubernetes-operator/}} exposes a
{{defaultConfiguration.config.yaml}} parameter (showcased in {{values.yaml}}
per FLINK-38409 / PR #1022), but a downstream Helm consumer who sets it cannot
actually get {{config.yaml}} mounted into the operator pod. The chart silently
continues to mount {{{}flink-conf.yaml{}}}, populated only with the chart's
hardcoded defaults. The user's {{config.yaml}} content lands in the
{{flink-operator-config}} ConfigMap under the {{config.yaml}} key but is never
read by the operator JVM.
h2. Root cause
Two design choices interact:
*1. {{helm/flink-kubernetes-operator/values.yaml}} hardcodes
{{defaultConfiguration.flink-conf.yaml}}* as a non-empty string (Flink-1.x
StatsD/SLF4J reporter config):
{code:yaml}
defaultConfiguration: ...
flink-conf.yaml: |+
# Flink Config Overrides
kubernetes.operator.metrics.reporter.slf4j.factory.class:
org.apache.flink.metrics.slf4j.Slf4jReporterFactory
kubernetes.operator.metrics.reporter.slf4j.interval: 5 MINUTE
kubernetes.operator.reconcile.interval: 15 s
kubernetes.operator.observer.progress-check.interval: 5 s
{code}
*2. {{helm/flink-kubernetes-operator/templates/controller/deployment.yaml}}
selects the mounted file with:*
{code:java}
{{- if hasKey .Values.defaultConfiguration "flink-conf.yaml" }}
- key: flink-conf.yaml
path: flink-conf.yaml
{{- else }}
- key: config.yaml
path: config.yaml
{{- end }}
{code}
Helm value merging is additive on maps – a downstream {{-f my-values.yaml-}} or
{{-set}} cannot remove a key that exists in the chart's own
{{{}values.yaml{}}}. Setting {{{}defaultConfiguration.flink-conf.yaml:
null{}}}, {{{}~{}}}, or {{""}} in a downstream
values file still leaves the key present in {{.Values.defaultConfiguration}}
for {{hasKey}} purposes. So {{hasKey}} is always true and the {{config.yaml}}
branch of the conditional is unreachable for any normal consumer of the
published chart.
h3. Compounding issue
{{templates/controller/configmap.yaml}} unconditionally emits *both*
{{config.yaml}} and {{flink-conf.yaml}} keys into the rendered ConfigMap. So
even if a user works around the mount selection, the ConfigMap contains two
divergent files and the
wrong one is authoritative based on what the Deployment volume's {{items}}
references.
h3. Misleading showcase
FLINK-38409 / PR #1022 added this comment to {{{}values.yaml{}}}:
{quote} # Uncomment to use the new config.yaml format, but also comment out the
flink-config.yaml key.{quote}
The instruction "comment out the flink-conf.yaml key" silently requires editing
the *chart's* own {{{}values.yaml{}}}, which downstream Helm consumers cannot
do without forking. The PR landed an undocumented workaround as a documented
feature.
h2. Reproduction
# Install or {{helm template}} the published 1.15.0 chart with downstream
values that set {{defaultConfiguration.config.yaml}} to some Flink-2.x
structured-YAML content. Do *not* set
{{{}defaultConfiguration.flink-conf.yaml{}}}.
# Inspect the rendered Deployment: the {{{}flink-operator-config-volume{}}}'s
{{items}} references {{{}flink-conf.yaml{}}}, not {{{}config.yaml{}}}.
# Inspect the rendered ConfigMap {{{}flink-operator-config{}}}: both
{{data.config.yaml}} and {{data.flink-conf.yaml}} are present. The user's
content is under {{{}config.yaml{}}}; the upstream defaults are under
{{{}flink-conf.yaml{}}}.
# The pod reads {{/opt/flink/conf/flink-conf.yaml}} and runs on upstream
defaults; the user's overrides are silently inert.
h2. Expected behavior
A user who provides {{defaultConfiguration.config.yaml}} and nothing for
{{defaultConfiguration.flink-conf.yaml}} should get:
* A Deployment that mounts {{{}config.yaml{}}}.
* A ConfigMap that does not include a stale, ignored {{flink-conf.yaml}} key
(or, if it must for back-compat, an explicitly empty one).
h2. Suggested fixes (any one of)
# *Invert priority* in {{templates/controller/deployment.yaml}} – prefer
{{config.yaml}} when the user has explicitly populated it, fall back to
{{flink-conf.yaml}} only otherwise. Lossless for legacy users; gives precedence
to the explicit
modern-format opt-in. Mirror the same priority in
{{templates/controller/configmap.yaml}} so only one key is emitted.
# *Drop the hardcoded default* for {{defaultConfiguration.flink-conf.yaml}} in
{{{}values.yaml{}}}. The same content is loaded from {{conf/flink-conf.yaml}}
via {{{}defaultConfiguration.append: true{}}}, so nothing is lost; {{hasKey}}
then actually
discriminates based on user intent.
# *Treat empty values as absent* – replace {{hasKey ...}} with a check on the
resolved value (e.g. {{{}(index .Values.defaultConfiguration
"flink-conf.yaml"){}}}). Less ergonomic than #1 but the smallest template
change; lets downstream users
effectively unset via {{{}flink-conf.yaml: ""{}}}.
h2. Workaround
Apply Kustomize patches downstream after {{{}helm template{}}}:
# {{op: remove}} the {{/data/flink-conf.yaml}} key from the rendered ConfigMap.
# Strategic-merge the Deployment volume's {{items}} to reference
{{config.yaml}} plus the log4j (or logback) files.
h2. Related tickets
* FLINK-35744 ("Support flink 1.19 config.yaml format", fixed in 1.12.0) –
added {{config.yaml}} support generally; did not address downstream override.
* FLINK-38409 ("Showcase in Helm chart values.yaml that config.yaml can be
used", merged via PR #1022) – added the misleading showcase comment.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)