Hi All,

I would like to discuss a new AIP aimed at enhancing the DAG loading
mechanism to support reading DAGs from ephemeral storage solutions. This
proposal is intended to supersede AIP-5 Remote DAG Fetcher, to provide a
more flexible and scalable approach, and to prepare for AIP-63.

https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-71+Generalizing+DAG+Loader+and+Processor+for+Ephemeral+Storage

*Abstract*
This proposal aims to generalize the DAG loader and processor to use
pathlib.Path for file operations instead of assuming direct OS filesystem
access. It includes implementing a custom module loader that supports
loading from ObjectStoragePath locations and other Path-like abstractions,
with caching capabilities provided by fsspec. Furthermore, while this AIP
does not directly implement DAG versioning, it creates a foundational layer
that can be extended to support DAG versioning as outlined in AIP-63.
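
To make the loader idea concrete, here is a minimal sketch. It is not the
code from the PR; the names PathSourceLoader and load_module_from_path are
hypothetical and used only for illustration. It assumes the only thing we
need from a location is a Path-like read_bytes() method, so a plain
pathlib.Path stands in for an ObjectStoragePath here.

```python
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class PathSourceLoader(importlib.abc.SourceLoader):
    """Hypothetical loader that reads module source through a Path-like object."""

    def __init__(self, fullname, path):
        self.fullname = fullname
        self.path = path  # assumption: any object exposing read_bytes()

    def get_filename(self, fullname):
        # Used by importlib for __file__ and for tracebacks.
        return str(self.path)

    def get_data(self, path):
        # Source is fetched via the Path-like API, not via open() on a local file.
        return self.path.read_bytes()


def load_module_from_path(fullname, path):
    """Import the source found at `path` as a module named `fullname`."""
    loader = PathSourceLoader(fullname, path)
    spec = importlib.util.spec_from_loader(fullname, loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[fullname] = module
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dag_file = Path(tmp) / "example_dag.py"
        dag_file.write_text("DAG_ID = 'example'\n")
        mod = load_module_from_path("example_dag", dag_file)
        print(mod.DAG_ID)
```

In this sketch no bytecode is written next to the source, because the
loader does not report a modification time. Caching would have to come from
another layer, for example fsspec as proposed above.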

A work in progress PR can be found here:
https://github.com/apache/airflow/pull/39647

*Key points for discussion*

Previous proposals, like AIP-5, suggested using a Fetcher mechanism,
somewhat like an in-process git-sync. This proposal aims to make that
redundant by fully supporting object storage locations through
ObjectStoragePath and fsspec caching mechanisms.

Earlier feedback on AIP-5 was that an additional Fetcher process was out
of scope for the project. With the transparent integration of pathlib.Path
and ObjectStoragePath, I think this argument no longer holds, and the
demand is there. In addition, the added flexibility makes AIP-63 easier to
implement (what that looks like remains to be seen).

Airflow scans DAGs often, so this very likely requires a caching mechanism
for both the DAGs and their modules. fsspec implements caching, and the
plan is to leverage it.

Non-DAG, non-module assets that live in the DAG folder are out of scope.
For example, if you include a GIF in the DAG folder, it will not
automatically be available without changes to your code.

I kindly request your thoughts :-).

Bolke

-- 

--
Bolke de Bruin
bdbr...@gmail.com