On 14 May 2026, at 02:21, Jarek Potiuk <[email protected]> wrote:
Well, in this case you have just one class, one coordinator, and even for other (non-Java)
coordinators "classpath" is the wrong thing to say. You could achieve the
same by not stating the classpath, but simply stating which Java interpreters to use:
[sdk]
jdk_bridge = {
    "jdk-11": {
        "kwargs": {
            "java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java",
            "jars_root": ["/files/old/lib"]
        }
    },
    "jdk-17": {
        "kwargs": {
            "java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java",
            "jars_root": ["/files/new/lib"],
            "jvm_args": ["-Xmx1024m"]
        }
    }
}
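To make the idea concrete, here is a minimal sketch of how a single built-in bridge could resolve one named entry from such a config, with no classpath involved. All names here (`JdkBridgeConfig`, `resolve_bridge`) are hypothetical, not part of any actual Airflow API; only the config keys mirror the example above.

```python
# Sketch only: the config keys ("jdk_bridge", "java_executable", "jars_root",
# "jvm_args") mirror the example config above; the class and function names
# are hypothetical illustrations, not real Airflow APIs.
from dataclasses import dataclass, field


@dataclass
class JdkBridgeConfig:
    java_executable: str
    jars_root: list[str]
    jvm_args: list[str] = field(default_factory=list)


def resolve_bridge(config: dict, name: str) -> JdkBridgeConfig:
    """Pick one named JDK entry from the [sdk] jdk_bridge mapping."""
    try:
        kwargs = config["jdk_bridge"][name]["kwargs"]
    except KeyError as exc:
        raise ValueError(f"No bridge named {name!r} configured") from exc
    return JdkBridgeConfig(**kwargs)


cfg = {
    "jdk_bridge": {
        "jdk-11": {"kwargs": {"java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java",
                              "jars_root": ["/files/old/lib"]}},
        "jdk-17": {"kwargs": {"java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java",
                              "jars_root": ["/files/new/lib"],
                              "jvm_args": ["-Xmx1024m"]}},
    }
}
bridge = resolve_bridge(cfg, "jdk-17")
print(bridge.java_executable)  # /usr/lib/jvm/java-17-openjdk/bin/java
```

The point being: the only user-facing contract here is the config key name, not any importable class path.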
I really don’t understand the desire to have the Java coordinator inside the
Task SDK distribution in the first go. The coordinator class must be public in
the worker (at least the import path), and putting it in the SDK does not
provide any more freedom to change it faster. It’s the contrary because Task
SDK releases require significantly more testing since the distribution contains
many things, while providers (in a similar position to Airflow Core as
coordinators to Task SDK) are released more frequently, and can have major
version bumps on their own if needed.
If we agree that you want to release bugfixes for the Task SDK independently
and faster, then yes, a separate distribution might be a good reason. But you
need to solve the SDK's version coupling issue to make it happen.
This introduces operational complexities, depending on what kind of version
coupling you choose between the SDK and coordinators.
What is the versioning and compatibility scheme you see? That will significantly impact testing complexity and the release
schedule, because we will have to maintain a parallel release "train" for the "coordinator". For example,
when a new SDK coordinator is released, it must work with existing SDKs. Imagine we have SDK 1.2.*, 1.3.*, 1.4.*, 1.5.*. Will the
new version of task-sdk be compatible? Should we add back-compat tests for all those versions? And I am not even talking about
intentionally breaking the APIs, but unintentional bugs. Also, what if someone uses the new version of the "SDK" but doesn't
update the old version of the "Java Coordinator"? Will that continue to work? How do we ensure that? Are we going to
test all SDK versions with all "coordinator" versions?
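Just to illustrate how fast this grows: with independent versioning, every released SDK line potentially has to be tested against every released coordinator line, so the test matrix grows multiplicatively. The version lists below are made up for illustration.

```python
# Sketch: the "test everything against everything" compatibility matrix
# grows multiplicatively with every new release line on either side.
# Version lists are hypothetical.
sdk_versions = ["1.2", "1.3", "1.4", "1.5"]
coordinator_versions = ["1.0", "1.1", "1.2"]

# Every (sdk, coordinator) pair is a CI job we would have to run and maintain.
matrix = [(s, c) for s in sdk_versions for c in coordinator_versions]
print(len(matrix))  # 12 combinations to test
```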
This is the operational complexity I am talking about. We already have this for providers,
and it only works because we intentionally limited back-compatibility, we run all
those tests for older Airflow versions, and we have a stable and proven
BaseHook and BaseOperator API that has not changed for years after it stabilized.
And we could limit that operational complexity, for example by coupling minor
versions. Say SDK 1.2.* only works with coordinator 1.2.*, and SDK
1.3.* only works with coordinator 1.3.*, assuming only bugfixes are done in
each. That also means that coordinator changes from main will need to be cherry-picked
to v3_N_test, and the faster releases of coordinators will have to be
made from the v3_N_stable branch. That limits back-compat tests but
increases the development complexity, because you will have to cherry-pick
changes and have, potentially, independent releases of coordinator 1_N from
that branch, where it will be tested with Airflow 3.N and SDK 3.N.
So we have those trade-offs:
1) Strict coupling (pinning), SDK version == coordinator version: slower bugfix
cycle, but no back-compat testing needed.
2) Coupling SDK MAJOR.MINOR == coordinator MAJOR.MINOR: faster bugfix cycles,
but increased development/release complexity, leading to cherry-picking to the
v3 branch and separate coordinator releases from that branch;
back-compat testing is limited to that v3_N_test branch.
3) Free-for-all, any SDK works with any coordinator: faster bugfix cycles, simpler
releases and development (releases done from main), but a hugely complex matrix of
compatibility tests that might slow down testing even more.
There is also a fourth option, which is what we do for providers: "limited free-for-all". We
deliberately set a "min_version" in providers and bump it regularly to reduce the size
of our compatibility matrix.
Those are basically the four choices we have. I personally think option 1 is
best at this stage. We release the task-sdk with Airflow every month, and if
we find a critical bug we can do an ad-hoc release when needed.
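Option 2 can be made mechanical: a startup check that refuses to pair an SDK with a coordinator from a different MAJOR.MINOR line. The function below is a hypothetical sketch of such a check, not an existing Airflow API.

```python
# Sketch only: illustrates option 2 (MAJOR.MINOR coupling) as a simple
# startup compatibility check. The function name is hypothetical.
def compatible(sdk_version: str, coordinator_version: str) -> bool:
    """Option 2: versions are compatible iff MAJOR.MINOR match exactly."""
    sdk_mm = tuple(sdk_version.split(".")[:2])
    coord_mm = tuple(coordinator_version.split(".")[:2])
    return sdk_mm == coord_mm


# SDK 1.3.* pairs only with coordinator 1.3.*; bugfix versions may differ.
assert compatible("1.3.2", "1.3.7")
assert not compatible("1.3.2", "1.4.0")
```

Option 1 is the degenerate case where the full version strings must match; option 3 is the same function always returning True (with the compatibility burden moved into the test matrix instead).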
Which one would you prefer - and do you also want to commit to maintaining the
associated development/testing complexity if it is not 1) ?
J.
On Wed, May 13, 2026 at 7:07 PM Tzu-ping Chung via dev <[email protected]> wrote:
You can do the same if it’s in the task sdk, but
1. You need to use the same import path, but then you need to separately teach
users to install a new package before moving it out. Not a very good user
experience.
2. Or you use a different import path. You need to keep the old path working in
the distribution for a long time *and* have users change their configs to fix
the deprecation warning (and eventual breaking change). Unnecessary mental
gymnastics on both sides.
I really don’t understand the desire to have the Java coordinator inside the
Task SDK distribution in the first go. The coordinator class must be public in
the worker (at least the import path), and putting it in the SDK does not
provide any more freedom to change it faster. It’s entirely the contrary, as
Task SDK releases require a lot more testing since the distribution contains
many things, while providers (in a similar position to Airflow Core as
coordinators to Task SDK) are released more frequently, and can have major
version bumps on their own if needed.
On 14 May 2026, at 00:54, Jarek Potiuk <[email protected]> wrote:
You can create multiple instances of the same coordinator class. Pass
appropriate arguments to suit your needs. This is in the AIP.

Yes. And you can do exactly the same 1-1 if it's part of a package and embedded in
the "airflow-sdk" distribution? Or am I wrong? Why do you think it would not be
possible if it's part of task_sdk?
On Wed, May 13, 2026 at 6:43 PM Tzu-ping Chung via dev <[email protected]> wrote:
You can create multiple instances of the same coordinator class. Pass
appropriate arguments to suit your needs. This is in the AIP.
[sdk]
coordinators = {
    "jdk-11": {
        "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
        "kwargs": {
            "java_executable": "/usr/lib/jvm/java-11-openjdk/bin/java",
            "jars_root": ["/files/old/lib"]
        }
    },
    "jdk-17": {
        "classpath": "airflow.sdk.coordinators.java.JavaCoordinator",
        "kwargs": {
            "java_executable": "/usr/lib/jvm/java-17-openjdk/bin/java",
            "jars_root": ["/files/new/lib"],
            "jvm_args": ["-Xmx1024m"]
        }
    }
}
The problem is, classpath points to a class, so whatever this string is needs
to be kept compatible in future releases.
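To spell out why the string is a contract: a config value like this is typically resolved with a dotted-path import at runtime, so the moment the class is renamed or moved, every existing user config breaks with an import error. The resolver below is a generic sketch of that pattern (the `import_coordinator` name and the stdlib stand-in are mine, not Airflow code).

```python
# Sketch only: shows why the "classpath" string in the config is effectively
# public API. It is resolved via a dotted-path import, so renaming or moving
# the class breaks every user config that contains the old string.
import importlib


def import_coordinator(classpath: str):
    """Resolve 'package.module.ClassName' to the class object."""
    module_path, _, class_name = classpath.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# e.g. import_coordinator("airflow.sdk.coordinators.java.JavaCoordinator")
# would fail with ModuleNotFoundError the moment that class moves.
# Using a stdlib class as a stand-in so this sketch runs anywhere:
cls = import_coordinator("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```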
On 13 May 2026, at 23:59, Jarek Potiuk <[email protected]> wrote:
How do you make it changeable any time? User needs to be able to specify what
coordinator to use in the config, and you can’t break that later.
We have only a *jdk* coordinator now. So I will turn the question around: how are you going to configure two
"jdk" coordinators when you have separate distributions running Java? Are you
planning to install two "coordinator-jdk" packages? That isn't possible in Python unless you build
almost the same package twice, with jdk-11 and jdk-19 built in.

My understanding is that you will have configuration options to choose between "jdk-11" and
"jdk-19". This "jdk" package of yours will simply have a list of JDKs linking to the
Java interpreters.

So it doesn't matter whether it's a single "coordinator-jdk" package, an "airflow.sdk._coordinator.jdk"
package, or an "airflow.sdk._bridge.jdk" package in the task-sdk. Regardless, you cannot install two "jdk" packages, whether
they are separate distributions or the package is in "task-sdk", and you have to configure which of the "jdk" bridges you
want to use.
Yes. Sometime later, when we also have Go/TypeScript or other languages, we might decide
to centralize some APIs, create a "true" coordinator package, and separate
distributions. Splitting into different packages will be absolutely no problem then.
Nothing will stop us from doing it.

It will save a lot of time on the whole "distribution" issue, including
releases, packaging, CI, and everything connected, and it absolutely does not block us
from a further split later.
J.
On Wed, May 13, 2026 at 3:47 PM Tzu-ping Chung via dev <[email protected]> wrote:
On 13 May 2026, at 20:42, Jarek Potiuk <[email protected]> wrote:
Not really. I proposed an internal package that can be changed **any time**. Users aren't supposed to use
those items. We can clearly mark them with "_" and also describe that thoroughly in the public API
documentation. And no, initially providers were **not** in Airflow at all: you started from step 2. Step 1 is
that they were added at some point in time long before my time. Hooks and operators as an "API" were
created quite early in the life of Airflow, and the first implementations were added then. Then, after
common patterns emerged, those hooks and operators were grouped into providers (they were not initially) and
only moved out after quite some time. As I see it, you even admit yourself that things will look different
for different languages, and maybe we will not need bridges for some of them at all. So why should we
introduce a new concept if we know currently that it applies only to "JDK"? I fail to see why we
should proceed if we already know the patterns are unlikely to be reusable in their current form.
How do you make it changeable any time? User needs to be able to specify what
coordinator to use in the config, and you can’t break that later.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]