Dev-iL commented on code in PR #67800:
URL: https://github.com/apache/airflow/pull/67800#discussion_r3355447094


##########
airflow-core/docs/howto/set-up-database.rst:
##########
@@ -233,6 +233,13 @@ For more information regarding setup of the PostgreSQL 
connection, see `PostgreS
 
    See also :ref:`Helm Chart production guide <production-guide:pgbouncer>`
 
+   Some Airflow database routes use an async engine (the Execution API, for 
example). The asyncpg

Review Comment:
   I added a no-prepared-statements variant to the benchmark, and below are the 
results (values are in **queries/sec**, concurrency 10, 30s each, breeze 
postgres 14):
   
   | Query           | asyncpg 0.31.0 | asyncpg-no-prepare | psqlpy 0.12.0 | 
psycopg3 3.3.4 | psycopg3-async |
   | --------------- | -------------: | -----------------: | ------------: | 
-------------: | -------------: |
   | pg_type         | **1,743**      | 1,466              | 1,307         | 
607            | 764            |
   | generate_series | **2,632**      | 2,368              | 2,104         | 
910            | 1,605          |
   | large_object    | **6,916**      | 4,918              | 3,067         | 
1,565          | 1,012          |
   | arrays          | **2,753**      | 2,213              | 1,957         | 
722            | 1,216          |
   | copyfrom        | 130            | **133**            | 7             | 27 
            | 43             |
   | batch           | 278            | **282**            | 10            | 17 
            | 9              |
   | oneplusone      | **20,078**     | 9,945              | 10,162        | 
9,114          | 8,836          |
   
   Several things to note:
   - While the no-prepare variant is indeed slower, it still beats the other 
drivers on most tests. However, since these tests ran on localhost, I expect 
network hops to remove most of this advantage (though I haven't tested it). 
   - batch is ~30x slower and copyfrom ~3x on psycopg3. This is something to 
keep in mind, but shouldn't affect the decision to migrate, since heartbeat and 
today's async routes are single-row OLTP.
   
   This leaves us with the pgbouncer safety argument - here the benefit of 
switching to psycopg3 is unquestionable. So here's my plan:
   - Keep this PR focused on the endpoint migration + update the docs.
   - Reframe https://github.com/apache/airflow/issues/67801 from "make asyncpg 
PgBouncer-safe" to "switch the default async Postgres driver to psycopg3 
(postgresql+psycopg)." and open a followup PR to do that.
   - asyncpg stays as a documented opt-in for users favoring throughput.
   
   How does that sound?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to