I’m replying to the 5th point separately since IMO the answer is already 
quite clear in the AIP. I also made some edits to the document so it is 
ABUNDANTLY clear.

Coordinators are Python. They are imported into Airflow; they are not 
separate processes. Tasks run in their own language, and the coordinator 
knows how to talk to them. How the messages are exchanged (not the messages 
themselves) is purely between the coordinator (Python) and the Java SDK, 
and the same goes for every other language's coordinator-SDK pair. That 
channel is not public and thus does not need to be specified.
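
To make that concrete, here is a rough sketch of what a coordinator could 
look like as a plain Python class imported into Airflow. The class name, 
the parse_dag method, and the subprocess/JSON transport (including the 
"coordinator-bridge.jar" name) are assumptions for illustration only; the 
AIP itself only fixes the messages, not this plumbing:

import json
import subprocess

class JavaCoordinator:
    """Hypothetical coordinator: plain Python, imported into Airflow.

    It runs inside whatever Airflow component needs it (dag processor,
    executor, ...); only the channel it opens to the Java side is
    private to this coordinator-SDK pair.
    """

    def can_handle_dag(self, dag_file: str) -> bool:
        # Pure Python check -- no Java code runs in the scheduler.
        return dag_file.endswith(".jar")

    def parse_dag(self, dag_file: str) -> dict:
        # How we talk to Java (here: a one-shot subprocess speaking
        # JSON on stdout) is an internal detail of this coordinator.
        result = subprocess.run(
            ["java", "-jar", "coordinator-bridge.jar", "parse", dag_file],
            capture_output=True, text=True, check=True,
        )
        return json.loads(result.stdout)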

Once a message is in the coordinator, it is a Pydantic object (this is why 
we need to standardise the messages themselves, as discussed previously) 
that the dag processor or executor can use in memory.
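
As an illustration of what such a standardised message could look like 
(the model name and fields are invented here; only the "Pydantic object 
the dag processor or executor can use in memory" part is from the AIP):

from typing import Optional

from pydantic import BaseModel

class TaskResult(BaseModel):
    # Hypothetical standardised message exchanged with a language SDK.
    task_id: str
    state: str
    xcom: Optional[dict] = None

# Whatever bytes the coordinator receives from the SDK are validated
# into a typed object that lives purely in memory from here on:
msg = TaskResult.model_validate_json('{"task_id": "t1", "state": "success"}')
assert msg.state == "success"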


> 5. Distribution of Jars and Java processes running -> We should specify how 
> we envision the distribution of code Jars working. As I understand 
> coordinators have two sides: "can_handle_dag" (Python implementation so we do 
> not run Java code in the scheduler?) and is it continuously running? Started 
> on demand? How? Process runs on Java: one for DagFileProcessor (so that it 
> can potentially parse the Dag Definition from the Java definition if the 
> whole Dag is defined in Java?). Can this Java process live elsewhere and 
> should the Dag Processor communicate with it? Or will it run as a subprocess 
> of DagProcessor? Will it be one process or many processes? Does it start per 
> DAG or run continuously? Similarly, consider the workers. Will those be the 
> same? What jars does the DagFile processor use? Or different? How will 
> they relate to the Dag Bundle? Are jars always present in the DagBundle 
> and distributed? I think at least a rough outline of the deployment "process" 
> distributed? I think at least a rough outline of the deployment "process" 
> assumption is needed. Maybe it's already there and I badly missed it - but 
> those are the questions that immediately come to mind when I see the 
> proposal. 
