I'd like to propose adding a new provider, Apache OpenDAL [1], to the collection of Apache providers in Airflow.
OpenDAL (Open Data Access Layer) is a unified abstraction layer that simplifies interactions with various storage backends, including AWS S3, GCS, Azure, and key-value storage systems. Its guiding theme is: "One Layer, All Storage."

The key advantage of this provider is that a single operator can support multiple operations (read, write, copy, list, and so on) across both storage and key-value systems.

We currently have many dedicated filesystem transfer operators, which work well for their specific use cases:

- S3CreateObjectOperator
- LocalFilesystemToS3Operator
- GCSToS3Operator
- S3ToGCSOperator

and many more like them. OpenDAL fits well in these scenarios, offering users a unified way to interact with diverse storage systems. With a single OpenDAL operator, we can handle all supported operations [2] using Airflow connection IDs and minimal configuration. You can find a list of all storage services OpenDAL supports in [3].

IMHO, this aligns with other common-area solutions like the common messaging provider, and would be a great addition to the common providers area.

I've drafted a PR [4] to introduce the OpenDAL provider. The current version is limited to synchronous read/write/copy operations; async support can be added in future updates.

I have been discussing this provider with Jarek and Xuanwo (OpenDAL PMC). A big thanks to both of them for their help. @Xuanwo, please feel free to add any further points.

I'd be happy to hear any feedback or questions from you all :)

Note: OpenDAL only supports Python 3.10 and later, so we may need some tweaks in CI; hopefully it just works without any issues.

[1] https://opendal.apache.org/
[2] https://github.com/apache/opendal/blob/main/bindings/python/python/opendal/__init__.pyi#L30
[3] https://github.com/apache/opendal/blob/main/bindings/python/python/opendal/__base.pyi
[4] https://github.com/apache/airflow/pull/50728

Pavan