jason810496 commented on code in PR #68603:
URL: https://github.com/apache/airflow/pull/68603#discussion_r3433493587


##########
.agents/skills/airflow-java-sdk/SKILL.md:
##########
@@ -0,0 +1,86 @@
+---
+name: airflow-java-sdk
+description: >
+  Guide for contributing to the Airflow Java SDK (AIP-108). Use this skill
+  whenever a contributor is working in the `java-sdk/` directory or on the Java
+  coordinator in `task-sdk/src/airflow/sdk/coordinators/java/` — whether they
+  want to add a feature, write tests, fix a bug, understand the architecture, or
+  prepare a PR. Trigger on phrases like "Java SDK", "JavaCoordinator",
+  "java-sdk", "annotation processor", "Builder.Task", "BundleBuilder", or
+  anything about running JVM tasks in Airflow.
+---
+
+<!-- SPDX-License-Identifier: Apache-2.0
+     https://www.apache.org/licenses/LICENSE-2.0 -->
+
+# Airflow Java SDK contributor guide
+
+The Java SDK lets Airflow tasks execute JVM code (Java, Kotlin, or any JVM language). You are helping
+a contributor work in one or both of these locations:
+
+- **`java-sdk/`** — the JVM-side library (Kotlin source, published to Maven)
+- **`task-sdk/src/airflow/sdk/coordinators/java/`** — the Python coordinator that launches the JVM subprocess
+

Review Comment:
   Perhaps it is worth mentioning that
   `java-sdk/sdk/src/main/kotlin/org/apache/airflow/sdk` is the user-facing
   interface, while
   `java-sdk/sdk/src/main/kotlin/org/apache/airflow/sdk/execution` is the
   runtime implementation that is not exposed to users?



##########
.agents/skills/airflow-new-sdk/SKILL.md:
##########
@@ -0,0 +1,137 @@
+---
+name: airflow-new-sdk
+description: >
+  Guide for implementing a brand-new language SDK for Airflow (AIP-108). Use
+  this skill when a contributor wants to add support for a new programming
+  language — designing the Python coordinator, implementing the wire protocol in
+  the target language, writing the bundle format, and structuring the PR. Trigger
+  on phrases like "new language SDK", "new SDK", "add support for [language]",
+  "implement coordinator for", "SubprocessCoordinator", "BaseCoordinator",
+  "new runtime", "AFBNDL01", "supervisor schema", or anything about bringing a
+  new language into the Airflow executor ecosystem.
+---
+
+<!-- SPDX-License-Identifier: Apache-2.0
+     https://www.apache.org/licenses/LICENSE-2.0 -->
+
+# Implementing a new language SDK for Airflow
+
+## Start here
+
+**Read `contributing-docs/30_new_language_sdk.rst` first.** It is the
+authoritative contributor guide for this topic — coordinator base class choices,
+wire protocol spec, bundle footer format, and testing requirements. Everything
+in this skill builds on top of it, not alongside it.
+
+---
+
+## Repository layout
+
+Every new SDK needs two things. The coordinator (Python) goes here:
+
+```
+task-sdk/src/airflow/sdk/coordinators/<language>/
+    __init__.py        # re-export + module docstring
+    coordinator.py     # subclass of SubprocessCoordinator or BaseCoordinator
+task-sdk/tests/coordinators/<language>/
+    test_coordinator.py
+task-sdk/tests/integration/coordinators/<language>/
+    test_integration.py   # requires Breeze
+```
+
+The language SDK itself lives in a top-level `<language>-sdk/` directory (like
+`java-sdk/` and `go-sdk/`). For native-executable languages using
+`ExecutableCoordinator`, no coordinator code is needed at all.
+
+---
+
+## Choosing the right base class — quick guide
+
+```
+Does the runtime compile to a self-contained native executable?
+  YES → Use ExecutableCoordinator (zero Python to write).
+        Append an AFBNDL01 footer with a packer tool (see go-sdk reference).
+  NO  →
+    Does it start via a single shell command (node, ruby, dotnet, …)?
+      YES → Subclass SubprocessCoordinator.
+            Implement _build_execute_task_command only (see 30_new_language_sdk.rst).
+      NO  →
+        Subclass BaseCoordinator and implement execute_task from scratch.
+        (Rare: gRPC daemons, shared memory, persistent processes.)
+```
+
+The full rationale, method signature, and socket lifecycle for each path are in
+`30_new_language_sdk.rst`. Read that section before writing any code.
+
+---
+
+## Reference implementations to study
+
+| What to study | Where |
+|---|---|
+| SubprocessCoordinator base class | `task-sdk/src/airflow/sdk/coordinators/_subprocess.py` |
+| Java coordinator (SubprocessCoordinator subclass) | `task-sdk/src/airflow/sdk/coordinators/java/coordinator.py` |
+| ExecutableCoordinator (native bundles) | `task-sdk/src/airflow/sdk/coordinators/executable/coordinator.py` |
+| Wire protocol in Kotlin | `java-sdk/sdk/src/main/kotlin/org/apache/airflow/sdk/execution/` |
+| Wire protocol in Go | `go-sdk/pkg/execution/` |
+| AFBNDL01 footer (Go reference) | `go-sdk/internal/bundlefooter/` |

Review Comment:
   ```suggestion
   | AFBNDL01 footer (Go reference) | `go-sdk/internal/bundlefooter/`, `task-sdk/docs/executable-bundle-spec.rst` |
   ```
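
The base-class decision tree quoted in this hunk can be restated as a small function, which may be handy as a checklist when reviewing a new coordinator. This is only an illustrative sketch: `choose_coordinator_base` and its two boolean parameters are hypothetical names, not part of the Airflow code base; only the three class names come from the quoted text.

```python
def choose_coordinator_base(native_executable: bool, single_shell_command: bool) -> str:
    """Return the base class name suggested by the quoted decision tree.

    Hypothetical helper for illustration only; it mirrors the tree's
    order of questions and nothing more.
    """
    if native_executable:
        # Self-contained native binary: no Python coordinator code to write.
        return "ExecutableCoordinator"
    if single_shell_command:
        # Runtime starts via one shell command (node, ruby, dotnet, ...).
        return "SubprocessCoordinator"
    # Everything else (gRPC daemons, shared memory, persistent processes).
    return "BaseCoordinator"
```

Note that the first question takes priority: a native executable maps to `ExecutableCoordinator` even if it could also be started with a single command.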



##########
java-sdk/README.md:
##########
@@ -181,31 +181,151 @@ newlines, which does not work well in a Gradle properties file.
 credentials instead: `ASF_NEXUS_USERNAME`, `ASF_NEXUS_PASSWORD`, `SIGNING_KEY`,
 and `SIGNING_PASSWORD`. This is especially useful on e.g. CI.
 
-## Technical Details
+## Contributing
+
+The user implements a Java application containing task methods annotated (or
+registered) with the SDK. The application is packaged as a bundle and placed
+where Airflow can find it.
+
+When the Airflow supervisor identifies that a task should run with Java, it
+launches the JVM application as a subprocess. The flow is:
+
+1. `JavaCoordinator.execute_task()` (Python) scans `jars_root`, builds the
+   classpath, and spawns `java -cp <jars> <MainClass> --comm=<host>:<port>
+   --logs=<host>:<port>`.
+2. `Server.kt` connects to both sockets immediately on startup.
+3. The supervisor sends a `StartupDetails` MessagePack message; the JVM reads
+   it, looks up the matching task by `dag_id` + `task_id`, and calls the
+   user's task method.
+4. During execution the JVM sends requests to the supervisor (GetVariable,
+   GetConnection, GetXCom, SetXCom, etc.) and the supervisor responds. All
+   frames are a 4-byte big-endian length prefix followed by a MessagePack
+   payload.
+5. On completion (or exception) the JVM sends a `TaskState` message and closes
+   the socket. The JVM process then exits.
+
+Log messages produced by the SDK (not by user code) are forwarded over the
+`--logs` socket so the supervisor can append them to Airflow's log store.
+
+The wire protocol is defined in
+`task-sdk/src/airflow/sdk/execution_time/schema/schema.json`.
+`execution/Comm.kt` implements the framing layer. Adding a new message type
+requires changes in **both** `schema.json` (Python side) and
+`execution/Comm.kt` + `execution/Client.kt` (JVM side).
+
+See [Architectural Design Records](./adr) in the `adr` directory to learn more.
+

Review Comment:
   It seems the JAR discovery method is not mentioned here: how does the
   coordinator determine which JARs are valid and should be spawned? And how
   does that relate to the Gradle extension we support?
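
For context on this question, step 1 of the quoted flow (scan `jars_root`, build the classpath, spawn `java -cp ...`) could look roughly like the sketch below. The discovery rule used here (every `*.jar` file directly under `jars_root`) is an assumption made for illustration; the quoted README does not state how valid JARs are identified. `build_java_command` and its parameters are hypothetical names, not the coordinator's actual API.

```python
import os
from pathlib import Path


def build_java_command(jars_root: str, main_class: str, comm_addr: str, logs_addr: str) -> list[str]:
    """Build the argv for the JVM subprocess described in the quoted flow.

    Assumption: every *.jar directly under jars_root goes on the classpath.
    The real discovery rule may differ.
    """
    jars = sorted(str(p) for p in Path(jars_root).glob("*.jar"))
    if not jars:
        raise FileNotFoundError(f"no JAR files found under {jars_root}")
    # The classpath separator is platform dependent (':' on POSIX, ';' on Windows).
    classpath = os.pathsep.join(jars)
    return [
        "java",
        "-cp",
        classpath,
        main_class,
        f"--comm={comm_addr}",
        f"--logs={logs_addr}",
    ]
```

Returning an argument list rather than a single shell string avoids quoting problems when the result is handed to `subprocess.Popen`.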


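The framing described in step 4 of the quoted flow (a 4-byte big-endian length prefix followed by a MessagePack payload) can be sketched as follows. This is a minimal illustration, not the code in `execution/Comm.kt` or the supervisor: the function names are hypothetical, and the payload is treated as opaque bytes so that no MessagePack library is required.

```python
import struct
from typing import BinaryIO


def encode_frame(payload: bytes) -> bytes:
    """Prefix an already-serialized payload with its 4-byte big-endian length."""
    return struct.pack(">I", len(payload)) + payload


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes, looping because a read may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError(f"stream closed after {len(buf)} of {n} bytes")
        buf.extend(chunk)
    return bytes(buf)


def decode_frame(stream: BinaryIO) -> bytes:
    """Read one frame and return its payload bytes (still MessagePack-encoded)."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length)
```

For example, the 4-byte MessagePack encoding of `{"a": 1}` is `b"\x81\xa1a\x01"`, so `encode_frame` prepends `b"\x00\x00\x00\x04"` to it, and `decode_frame` on an `io.BytesIO` wrapping that frame returns the original 4 bytes.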

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
