Gerk Elznik created FLINK-39791:
-----------------------------------

             Summary: Helm chart silently ignores 
defaultConfiguration.config.yaml; always mounts flink-conf.yaml from chart 
defaults
                 Key: FLINK-39791
                 URL: https://issues.apache.org/jira/browse/FLINK-39791
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: kubernetes-operator-1.15.0, kubernetes-operator-1.14.0
            Reporter: Gerk Elznik


h2. Summary

The Helm chart at {{helm/flink-kubernetes-operator/}} exposes a 
{{defaultConfiguration.config.yaml}} parameter (showcased in {{values.yaml}} 
per FLINK-38409 / PR #1022), but a downstream Helm consumer who sets it cannot 
actually get {{config.yaml}} mounted into the operator pod. The chart silently 
continues to mount {{{}flink-conf.yaml{}}}, populated only with the chart's 
hardcoded defaults. The user's {{config.yaml}} content lands in the 
{{flink-operator-config}} ConfigMap under the {{config.yaml}} key but is never 
read by the operator JVM.
h2. Root cause

Two design choices interact:

*1. {{helm/flink-kubernetes-operator/values.yaml}} hardcodes 
{{defaultConfiguration.flink-conf.yaml}}* as a non-empty string (Flink-1.x 
StatsD/SLF4J reporter config):
{code:yaml}
  defaultConfiguration:     ...
    flink-conf.yaml: |+
      # Flink Config Overrides
      kubernetes.operator.metrics.reporter.slf4j.factory.class: 
org.apache.flink.metrics.slf4j.Slf4jReporterFactory
      kubernetes.operator.metrics.reporter.slf4j.interval: 5 MINUTE
      kubernetes.operator.reconcile.interval: 15 s
      kubernetes.operator.observer.progress-check.interval: 5 s
  {code}
*2. {{helm/flink-kubernetes-operator/templates/controller/deployment.yaml}} 
selects the mounted file with:*
{code:java}
  {{- if hasKey .Values.defaultConfiguration "flink-conf.yaml" }}
    - key: flink-conf.yaml
      path: flink-conf.yaml
  {{- else }}
    - key: config.yaml
      path: config.yaml
  {{- end }}
  {code}
Helm value merging is additive on maps – a downstream {{-f my-values.yaml-}} or 
{{-set}} cannot remove a key that exists in the chart's own 
{{{}values.yaml{}}}. Setting {{{}defaultConfiguration.flink-conf.yaml: 
null{}}}, {{{}~{}}}, or {{""}} in a downstream
values file still leaves the key present in {{.Values.defaultConfiguration}} 
for {{hasKey}} purposes. So {{hasKey}} is always true and the {{config.yaml}} 
branch of the conditional is unreachable for any normal consumer of the 
published chart.
h3. Compounding issue

{{templates/controller/configmap.yaml}} unconditionally emits *both* 
{{config.yaml}} and {{flink-conf.yaml}} keys into the rendered ConfigMap. So 
even if a user works around the mount selection, the ConfigMap contains two 
divergent files and the
wrong one is authoritative based on what the Deployment volume's {{items}} 
references.
h3. Misleading showcase

FLINK-38409 / PR #1022 added this comment to {{{}values.yaml{}}}:
{quote} # Uncomment to use the new config.yaml format, but also comment out the 
flink-config.yaml key.{quote}
The instruction "comment out the flink-conf.yaml key" silently requires editing 
the *chart's* own {{{}values.yaml{}}}, which downstream Helm consumers cannot 
do without forking. The PR landed an undocumented workaround as a documented 
feature.
h2. Reproduction
 # Install or {{helm template}} the published 1.15.0 chart with downstream 
values that set {{defaultConfiguration.config.yaml}} to some Flink-2.x 
structured-YAML content. Do *not* set 
{{{}defaultConfiguration.flink-conf.yaml{}}}.
 # Inspect the rendered Deployment: the {{{}flink-operator-config-volume{}}}'s 
{{items}} references {{{}flink-conf.yaml{}}}, not {{{}config.yaml{}}}.
 # Inspect the rendered ConfigMap {{{}flink-operator-config{}}}: both 
{{data.config.yaml}} and {{data.flink-conf.yaml}} are present. The user's 
content is under {{{}config.yaml{}}}; the upstream defaults are under 
{{{}flink-conf.yaml{}}}.
 # The pod reads {{/opt/flink/conf/flink-conf.yaml}} and runs on upstream 
defaults; the user's overrides are silently inert.

h2. Expected behavior

A user who provides {{defaultConfiguration.config.yaml}} and nothing for 
{{defaultConfiguration.flink-conf.yaml}} should get:
 * A Deployment that mounts {{{}config.yaml{}}}.
 * A ConfigMap that does not include a stale, ignored {{flink-conf.yaml}} key 
(or, if it must for back-compat, an explicitly empty one).

h2. Suggested fixes (any one of)
 # *Invert priority* in {{templates/controller/deployment.yaml}} – prefer 
{{config.yaml}} when the user has explicitly populated it, fall back to 
{{flink-conf.yaml}} only otherwise. Lossless for legacy users; gives precedence 
to the explicit
modern-format opt-in. Mirror the same priority in 
{{templates/controller/configmap.yaml}} so only one key is emitted.
 # *Drop the hardcoded default* for {{defaultConfiguration.flink-conf.yaml}} in 
{{{}values.yaml{}}}. The same content is loaded from {{conf/flink-conf.yaml}} 
via {{{}defaultConfiguration.append: true{}}}, so nothing is lost; {{hasKey}} 
then actually
discriminates based on user intent.
 # *Treat empty values as absent* – replace {{hasKey ...}} with a check on the 
resolved value (e.g. {{{}(index .Values.defaultConfiguration 
"flink-conf.yaml"){}}}). Less ergonomic than #1 but the smallest template 
change; lets downstream users
effectively unset via {{{}flink-conf.yaml: ""{}}}.

h2. Workaround

Apply Kustomize patches downstream after {{{}helm template{}}}:
 # {{op: remove}} the {{/data/flink-conf.yaml}} key from the rendered ConfigMap.
 # Strategic-merge the Deployment volume's {{items}} to reference 
{{config.yaml}} plus the log4j (or logback) files.

h2. Related tickets
 * FLINK-35744 ("Support flink 1.19 config.yaml format", fixed in 1.12.0) – 
added {{config.yaml}} support generally; did not address downstream override.
 * FLINK-38409 ("Showcase in Helm chart values.yaml that config.yaml can be 
used", merged via PR #1022) – added the misleading showcase comment.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to