gerkElznik commented on PR #1126:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/1126#issuecomment-4771813362

   > @gerkElznik the e2es seem to be failing
   
   ## Heads-up before going further: this regresses Flink 1.x application 
deployments
   
   While tracking down the failing `test_application_operations.sh` smoke test, 
I found that switching the operator to mount `config.yaml` breaks **Flink 1.x 
application-mode** deployments. I think it reshapes this PR, so I wanted to lay 
it out.
   
   ### Symptom
   A `v1_xx` application deployment's JobManager crashloops at startup:
   
   ```
   ERROR ClusterEntrypoint - Could not create application program.
   java.net.URISyntaxException: Illegal character in scheme name at index 0:
       ['local:///opt/flink/usrlib/myjob.jar']
     at KubernetesApplicationClusterEntrypoint.fetchArtifacts(…) 
[flink-dist-1.20.4.jar]
   ```
   
   ### Isolated with an A/B
   Same operator image (built from this branch), same kind cluster, same `v1_20`
   `StateMachineExample` — the only difference is the chart's mounted config 
file:
   
   |  | mounts `config.yaml` (this PR) | mounts `flink-conf.yaml` (main) |
   |---|---|---|
   | operator config mode | standard-YAML | legacy |
   | operator writes to the JM ConfigMap | `pipeline.jars: 
['local:///…/myjob.jar']` | `pipeline.jars: local:///…/myjob.jar` |
   | `v1_20` JobManager | URISyntaxException → CrashLoopBackOff | READY / 
RUNNING |
   
   ### Root cause
   `FlinkConfMountDecorator#getClusterSideConfData` (the operator's copy, from 
FLINK-37236):
   
   ```java
   // … otherwise it would simply inherit it from the base config (which would 
always be false currently).
   Configuration clusterSideConfig = new 
Configuration(useStandardYamlConfig()); // false for v1_x → flink-conf.yaml ✓
   clusterSideConfig.addAll(flinkConfig);                                       
   // copies the operator's global-config values
   return ConfigurationUtils.convertConfigToWritableLines(clusterSideConfig, 
false);
   ```
   
   useStandardYamlConfig() correctly keys off the deployment's FlinkVersion, 
but addAll(flinkConfig) copies values from the operator's global config — and 
the comment assumes that global config is "always false" (legacy). Once the 
operator reads config.yaml, its global config is standardYaml=true, so list 
options (pipeline.jars, pipeline.classpaths, …) carry standard-YAML form and 
get written into the v1_x flink-conf.yaml as ['local://…'], which the legacy 
parser rejects. So it's a pre-existing operator assumption that this PR is 
simply the first to expose — and it affects any list-valued option on any 
Flink-1.x deployment, not only pipeline.jars.
   
   How would you like to sequence this @gyfora?  Tagging @Dennis-Mircea as well 
since he's mentioned planning for the overall 2.x upgrade.
   
   1. Fix FlinkConfMountDecorator to serialize per the target deployment's YAML 
mode (downgrade list values when useStandardYamlConfig() is false) as a 
prerequisite, then land this chart change; or
   2. Hold the config.yaml mount (this change) until the operator no longer 
manages Flink-1.x deployments.
   
   Happy to help with either, including the operator-side code change.
   
   _Two smaller notes, same root (single config.yaml key):_
   _- Re your inline question on the removed lines: the 
watchNamespaces/operatorHealth injections aren't dropped, they were emitted 
once per config-file key (so twice), and the dedup keeps one copy (still just 
below the resolver; helm template shows them rendering once). The hasKey → with 
swap is the actual fix, and the flink-conf.yaml key is removed to emit a single 
file._
   _- e2e-tests/test_dynamic_flink_conf.sh patches the flink-conf.yaml 
ConfigMap key, which is no longer mounted; its twin test_dynamic_config.sh 
already covers the same path via config.yaml, so it'd need updating/removing — 
but that's moot until the above is settled._


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to