Dev-iL opened a new pull request, #68496:
URL: https://github.com/apache/airflow/pull/68496

   closes: #67801
   
   ## What
   
   When `[database] sql_alchemy_conn_async` is not set, Airflow derives the 
async metadata-database URL from `sql_alchemy_conn`. For PostgreSQL, the 
derived URL now uses **psycopg3** (`postgresql+psycopg_async://`) instead of 
asyncpg (`postgresql+asyncpg://`).
   
   Packaging follows: `apache-airflow-providers-postgres` now installs 
`psycopg[binary]` by default, and `asyncpg` moves to a new opt-in `asyncpg` 
extra. `psycopg2-binary` is unchanged — the sync engine still uses psycopg2.
   
   ## Why
   
   Airflow recommends running PgBouncer in front of PostgreSQL in production. 
asyncpg uses named server-side prepared statements, which break under 
transaction-mode PgBouncer unless prepared-statement caching is explicitly 
disabled. psycopg3 is safe behind transaction-mode PgBouncer with **zero 
configuration**, so the default async engine now works out of the box in 
recommended production deployments. A safe default beats a documentation note 
operators miss.
   
   asyncpg was originally chosen only because Airflow was pinned to SQLAlchemy 
1.4; that constraint is gone (`airflow-core` requires 
`sqlalchemy[asyncio]>=2.0.48`), and psycopg3 serves both sync and async from a 
single driver.
   
   ## Keeping asyncpg
   
   asyncpg remains fully supported as a throughput opt-in:
   
   ```bash
   pip install 'apache-airflow-providers-postgres[asyncpg]'
   ```
   
   ```ini
   [database]
   sql_alchemy_conn_async = postgresql+asyncpg://<user>:<password>@<host>/<db>
   ```
   
   Behind transaction-mode PgBouncer, also disable asyncpg's prepared-statement 
caching via `sql_alchemy_connect_args_async` (a dict defined in 
`airflow_local_settings.py`):
   
   ```python
   # airflow_local_settings.py
   connect_args_async = {"statement_cache_size": 0, 
"prepared_statement_cache_size": 0}
   ```
   
   ## Notes for reviewers
   
   - The derived async dialect is `postgresql+psycopg_async://` (explicit 
form), not the `postgresql+psycopg` shorthand mentioned in the issue body — 
chosen deliberately for explicitness, matching the explicit-sync-driver 
direction of #68314.
   - **Scope:** SQLite (`aiosqlite`) and MySQL (`aiomysql`) async derivations 
are unchanged, and an explicitly configured `sql_alchemy_conn_async` is never 
rewritten.
   - **psycopg2 is intentionally kept** as the sync hard dependency in this PR 
to keep it reviewable. Removing psycopg2 and migrating the sync engine to 
psycopg3 (aligned with the SQLAlchemy 2.1 upgrade, where the `postgresql://` 
default flips to psycopg3) is tracked in 
https://github.com/apache/airflow/issues/68453.
   - **Merge ordering:** depends on #68314 (explicit sync 
`postgresql+psycopg2://`). #67800 (async heartbeat) will rebase its docs onto 
this change.
   - **Bulk-write caveat:** psycopg3-async is markedly slower on batch 
`INSERT`/`COPY` in localhost benchmarks. Today's async routes are single-row 
OLTP, so this does not affect the switch, but bulk paths must be validated 
before they migrate to async (#67799).
   
   ## Validation
   
   Validated end to end against a real transaction-mode PgBouncer in front of 
Postgres (`dev/pgbouncer_e2e/`): the default-derived `psycopg_async` engine 
runs repeated single-row and row-returning queries with **no named prepared 
statements left on the backend**, and the documented asyncpg opt-in recipe 
behaves the same.
   
   ---
   
   
   ##### Was generative AI tooling used to co-author this PR?
   
   <!--
   If generative AI tooling has been used in the process of authoring this PR, 
please
   change below checkbox to `[X]` followed by the name of the tool, uncomment 
the "Generated-by".
   -->
   
   - [x] Yes (please specify the tool below)
   
   Generated-by: Claude Code (Fable 5) following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
   
   ---
   
   * Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#pull-request-guidelines)**
 for more information. Note: commit author/co-author name and email in commits 
become permanently public when merged.
   * For fundamental code changes, an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals))
 is needed.
   * When adding dependency, check compliance with the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   * For significant user-facing changes create newsfragment: 
`{pr_number}.significant.rst`, in 
[airflow-core/newsfragments](https://github.com/apache/airflow/tree/main/airflow-core/newsfragments).
 You can add this file in a follow-up commit after the PR is created so you 
know the PR number.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to