Hi team,

As a follow-up to the discussion in
https://lists.apache.org/thread/vff0q9oogxrsp0w1nzco2lqhy9b02pwz, I want to
call for lazy consensus on what we document about metadata database access
approaches in the Airflow 3 upgrade guide for Airflow 2.x users.

We will be documenting:

Approach 1: Airflow Python Client (REST API)
- Use the `apache-airflow-client` package to interact with Airflow metadata
via the REST API
- Document as the primary/recommended solution
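
To illustrate the kind of access Approach 1 covers, here is a minimal sketch
of listing recent runs of a DAG over the REST API. To stay dependency-free it
uses plain `urllib` instead of `apache-airflow-client`, which wraps the same
endpoints. The base URL, the credentials, the `/auth/token` login endpoint,
the `/api/v2` prefix and the response field names are assumptions about a
default Airflow 3 deployment, not something agreed in this thread.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local Airflow 3 API server; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def fetch_token(base_url, username, password):
    """Request a JWT from the (assumed) /auth/token endpoint."""
    body = json.dumps({"username": username, "password": password}).encode()
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]


def build_dag_runs_request(base_url, dag_id, token, limit=5):
    """Build a GET request for up to `limit` runs of one DAG."""
    query = urllib.parse.urlencode({"limit": limit})
    dag = urllib.parse.quote(dag_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v2/dags/{dag}/dagRuns?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def list_dag_runs(base_url, dag_id, token, limit=5):
    """Return (run id, state) pairs; assumes a `dag_runs` list in the response."""
    req = build_dag_runs_request(base_url, dag_id, token, limit)
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return [(run["dag_run_id"], run["state"]) for run in payload["dag_runs"]]


# Usage (needs a running API server and valid credentials):
#   token = fetch_token(BASE_URL, "admin", "admin")
#   print(list_dag_runs(BASE_URL, "example_dag", token))
```

Because the task only talks to the API server, nothing here depends on the
database schema.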

Approach 2: Database Hooks (PostgresHook/MySqlHook)
- Create a connection to metadata DB and use DB hooks to execute SQL
directly
- Document as a "known workaround" (NOT recommended) with explicit warnings
based on suggestions from the other mail thread:
  - Can break in future versions (especially 3.2+)
  - Users are responsible for adapting when schema changes occur
  - Database schema is NOT a public API and can change without notice
  - Breaks task isolation (one of Airflow 3's core features)
  - No support guarantees
  - Performance implications
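
For reference, the workaround in Approach 2 would look roughly like the
sketch below. The connection id `airflow_metadata_db` is hypothetical (users
would create it themselves), and the `dag_run` table with its `dag_id` and
`state` columns reflects the current schema only. Since the schema is not a
public API, this query can stop working in any release.

```python
# Hypothetical connection id pointing at the metadata database.
METADATA_CONN_ID = "airflow_metadata_db"

# Table and column names are assumptions about the current schema.
# %s placeholders match the drivers behind PostgresHook and MySqlHook.
COUNT_RUNS_SQL = "SELECT COUNT(*) FROM dag_run WHERE dag_id = %s AND state = %s"


def count_dag_runs(hook, dag_id, state="success"):
    """Count runs of a DAG in a given state using any DB hook with get_first()."""
    row = hook.get_first(COUNT_RUNS_SQL, parameters=(dag_id, state))
    return int(row[0]) if row else 0


def count_with_postgres_hook(dag_id, state="success"):
    """Run the query through PostgresHook against the hypothetical connection."""
    # Imported lazily so this module loads without the Postgres provider.
    from airflow.providers.postgres.hooks.postgres import PostgresHook

    hook = PostgresHook(postgres_conn_id=METADATA_CONN_ID)
    return count_dag_runs(hook, dag_id, state)
```

Keeping the SQL in one helper at least limits how much has to be adapted
when the schema changes.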

Both approaches will be documented with pros and cons.

Approach 3 (Direct SQLAlchemy Access) will NOT be documented at all as
discussed.

The lazy consensus will end on Friday (14 Nov) at 4 PM UTC (~72 hours from
now).

If there are no objections or significant concerns by that time, I will
proceed with updating the documentation accordingly in this PR:
https://github.com/apache/airflow/pull/57479


Thanks & Regards,
Amogh Desai
