jason810496 opened a new pull request, #68694:
URL: https://github.com/apache/airflow/pull/68694

   ## Why
   
   While verifying the KubernetesExecutor setup for the Multi-Lang feature, I found that 
we need a new config mapping to represent `coordinator -> pod_template_file`. 
For example, KubernetesExecutor needs an image with a JVM runtime to run Java tasks.
   
   ## How
   
   Here are the directions I considered:
   
   1.  (**Current choice**) A generic `extra` object on the `[sdk] coordinators` 
entry. For the field name I also considered `executor`, `deployment`, etc., 
but I prefer `extra` because it is more extensible: users might add other 
optional fields to `extra` for non-deployment purposes (e.g. referencing a 
`conn_id`).
   
   ```json
   {
     "jdk-17": {
       "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
       "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jvm_args": ["-Xmx1024m"]},
       "extra": {"pod_template_file": "/opt/airflow/pod_templates/java.yaml"}
     }
   }
   ```
   
   2.  An optional `pod_template_file` field at the first level -- Cons: it couples 
an optional, provider-level field to the first level of the entry, and every 
new field needs a coordinator code change.
   
   ```json
   {
     "jdk-17": {
       "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
       "kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java", "jvm_args": ["-Xmx1024m"]},
       "pod_template_file": "/opt/airflow/pod_templates/java.yaml"
     }
   }
   ```
   
   3. An optional `pod_template_file` inside `kwargs` -- Same cons as option 2.
   
   ```json
   {
     "jdk-17": {
       "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
       "kwargs": {
         "java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java",
         "jvm_args": ["-Xmx1024m"],
         "pod_template_file": "/opt/airflow/pod_templates/java.yaml"
       }
     }
   }
   ```
   
   4. A fully decoupled config setup with a separate 
`AIRFLOW__KUBERNETES__COORDINATOR_TO_POD_TEMPLATE_FILE` setting:
   ```bash
   AIRFLOW__SDK__COORDINATORS='{"jdk-17": {"classpath": "airflow.sdk.coordinators.java.JavaCoordinator", "kwargs": {"jars_root": ["/files/java-bundles"]}}}'
   AIRFLOW__SDK__QUEUE_TO_COORDINATOR='{"java": "jdk-17"}'
   AIRFLOW__KUBERNETES__COORDINATOR_TO_POD_TEMPLATE_FILE='{"jdk-17": "/opt/airflow/pod_templates/java.yaml"}'
   ```
   The drawback is clear: users have to understand the `queue -> coordinator 
-> pod_template` relationship to set up Multi-Lang with KubernetesExecutor 
properly. IMO that is too complicated; it is fine to store the optional 
metadata in core while parsing it at the provider level. 
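
To make the difference between options 1 and 4 concrete, here is a minimal sketch of the two lookup paths. It is not the PR's code: the function names `template_decoupled` and `template_from_extra` are hypothetical, and the settings are modelled as plain JSON strings rather than read through Airflow's config layer.

```python
from __future__ import annotations

import json

# Shared by both options: which coordinator serves which queue.
QUEUE_TO_COORDINATOR = '{"java": "jdk-17"}'

# Option 4: a third, separate setting keyed by coordinator name.
COORDINATOR_TO_POD_TEMPLATE_FILE = (
    '{"jdk-17": "/opt/airflow/pod_templates/java.yaml"}'
)

# Option 1: the template path travels inside the coordinator entry itself.
SDK_COORDINATORS_WITH_EXTRA = json.dumps(
    {
        "jdk-17": {
            "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
            "kwargs": {"jvm_args": ["-Xmx1024m"]},
            "extra": {"pod_template_file": "/opt/airflow/pod_templates/java.yaml"},
        }
    }
)


def template_decoupled(queue: str) -> str | None:
    """Option 4 (hypothetical helper): the name must match across settings."""
    coordinator = json.loads(QUEUE_TO_COORDINATOR).get(queue)
    if coordinator is None:
        return None
    return json.loads(COORDINATOR_TO_POD_TEMPLATE_FILE).get(coordinator)


def template_from_extra(queue: str) -> str | None:
    """Option 1 (hypothetical helper): read `extra` from the entry itself."""
    coordinator = json.loads(QUEUE_TO_COORDINATOR).get(queue)
    if coordinator is None:
        return None
    entry = json.loads(SDK_COORDINATORS_WITH_EXTRA).get(coordinator, {})
    return entry.get("extra", {}).get("pod_template_file")
```

Both helpers resolve the `java` queue to the same template path. With option 1 the coordinator name only has to agree between two settings instead of three, which is the simplification argued for above.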
   
   ## What
   
   - Add an `extra` field to `_CoordinatorSpec` (kept separate from `kwargs`) 
and `CoordinatorManager.extra_for_queue`.
     - `CoordinatorManager.extra_for_queue` **does not instantiate the 
coordinator**.
   - Add `KubernetesExecutor._coordinator_pod_template_file` (reads 
`pod_template_file` from `extra`, import-guarded for 3.3+) and apply it with 
**precedence over `executor_config`** in `execute_async`.
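
The two behaviours listed above can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the code in this PR: `CoordinatorSpec`, `CoordinatorRegistry` and `resolve_pod_template_file` are hypothetical names, and `executor_config` is assumed to be a plain dict that may carry its own `pod_template_file` key.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class CoordinatorSpec:
    """Hypothetical stand-in for `_CoordinatorSpec`: `extra` is kept apart from `kwargs`."""

    classpath: str
    kwargs: dict[str, Any] = field(default_factory=dict)
    extra: dict[str, Any] = field(default_factory=dict)


class CoordinatorRegistry:
    """Hypothetical stand-in for `CoordinatorManager`; it only parses config."""

    def __init__(self, coordinators_json: str, queue_to_coordinator_json: str) -> None:
        self._specs = {
            name: CoordinatorSpec(
                classpath=entry["classpath"],
                kwargs=entry.get("kwargs", {}),
                extra=entry.get("extra", {}),
            )
            for name, entry in json.loads(coordinators_json).items()
        }
        self._queue_to_coordinator = json.loads(queue_to_coordinator_json)

    def extra_for_queue(self, queue: str | None) -> dict[str, Any]:
        # Pure dictionary lookups: the classpath is never imported or called.
        spec = self._specs.get(self._queue_to_coordinator.get(queue))
        return dict(spec.extra) if spec is not None else {}


def resolve_pod_template_file(
    coordinator_extra: dict[str, Any],
    executor_config: dict[str, Any] | None,
) -> str | None:
    """The coordinator's template wins; `executor_config` is the fallback."""
    from_coordinator = coordinator_extra.get("pod_template_file")
    if from_coordinator:
        return from_coordinator
    return (executor_config or {}).get("pod_template_file")


# The classpath below is deliberately not importable, to show that reading
# `extra` does not depend on loading the coordinator class.
registry = CoordinatorRegistry(
    '{"jdk-17": {"classpath": "no.such.module.Coordinator",'
    ' "extra": {"pod_template_file": "/opt/airflow/pod_templates/java.yaml"}}}',
    '{"java": "jdk-17"}',
)
java_extra = registry.extra_for_queue("java")
chosen = resolve_pod_template_file(java_extra, {"pod_template_file": "/tmp/task.yaml"})
```

In this sketch `chosen` is the coordinator's template, and a queue with no coordinator mapping falls back to whatever `executor_config` specifies.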
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [x] Yes, with help of Claude Code (Opus 4.8) following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
