Hi everyone,

I've been experimenting with `ObjectStoragePath` and recently opened a
[PR](https://github.com/apache/airflow/pull/52002) aiming to simplify
its construction using Airflow connections — especially in cases where
environments (e.g., dev, staging, prod) differ primarily in object
storage provider (e.g., S3, GCS, file) and base path.

The goal was to construct a reusable root path from a connection like this:

```python
storage = ObjectStoragePath.from_conn(BaseHook.get_connection("storage"))
path = storage / "my_file.txt"
```

...without needing to hardcode schemes like `s3://` or `gs://` and
base paths (usually "buckets") into the DAG code. The idea was to
infer provider and base path from connection `extra` fields (e.g.,
`provider`, `base_path`), allowing the same DAG code to work across
environments by simply reconfiguring the connection.
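For context, something close to this can already be approximated today, without the proposed `from_conn` constructor, by reading the connection's `extra` fields manually. The sketch below is only my own illustration of the idea (the `storage_root` helper and the exact `provider`/`base_path` extra keys are assumptions, not part of the PR):

```python
from airflow.hooks.base import BaseHook
from airflow.io.path import ObjectStoragePath


def storage_root(conn_id: str) -> ObjectStoragePath:
    # Hypothetical helper: build a reusable root path from a connection
    # whose `extra` JSON carries `provider` (scheme) and `base_path`
    # (bucket or base directory).
    extra = BaseHook.get_connection(conn_id).extra_dejson
    scheme = extra["provider"]      # e.g. "s3", "gs", or "file"
    base_path = extra["base_path"]  # e.g. a bucket name or local directory
    return ObjectStoragePath(f"{scheme}://{base_path}", conn_id=conn_id)


# The same DAG code then works across environments; only the connection changes.
root = storage_root("storage")
path = root / "my_file.txt"
```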

The PR sparked a great discussion (linked above), and I realized this
might be a good opportunity to collect **broader community
experience** around the use of `ObjectStoragePath` and object storage
in general.

A few questions I'd like to raise:

* How are you configuring access to object storage across environments?
* Do you find it useful to extract `scheme` and `base_path` from
connections (or any other configuration)?
* Are there existing best practices or patterns for making
`ObjectStoragePath` construction generic and environment-agnostic?
* Would it make sense to define a common utility or convention (e.g.,
  via connection extras, `get_fs`, a provider's `filesystems` module, or
  a connection helper)?

I’m primarily looking for the best pattern—if any exists—or hoping we
can come together to define and document one as a community.

Best regards,
Josef Šimánek (https://github.com/simi)
