a-meledin commented on issue #35267: URL: https://github.com/apache/airflow/issues/35267#issuecomment-1787027170

> I think you can limit it with [config](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#sql-alchemy-pool-size)

1. First, I installed pgbouncer with a pool of 20, then increased it to 150. There were ~104 active connections waiting for a lock and no waiting processes in pgbouncer (so a pool of 120-130 connections was enough). A pgbouncer sketch follows at the end of this comment.
2. I disabled the "minischeduling" feature by setting `schedule_after_task_execution = False`. It can also be disabled with `use_row_level_locking = False`, but then only a single scheduler can run. These settings gave roughly a 2x speedup, but it seems that the more new mapped task instances this DAG gets, the slower processing becomes. (See the config sketch at the end of this comment.)

> * Is it only with mapped tasks in that number you see the problem?

As I mentioned above, when running with the `MappedNumber=108` param, execution of 3460 task instances takes 20 minutes. Celery Flower shows a runtime between 1.7 and 4.5 sec per task instance. I used one Celery worker with 8 threads on a Core i5 11th-gen (12 CPU) machine with 32 GB memory, running Ubuntu under WSL2.

> * Have you checked if you have enough resources to run your database with this setup (i.e. handling 150 opened connections)?

I've used pgbouncer. And with `schedule_after_task_execution = False`, Airflow doesn't use more than 30-35 connections, as far as I observed.

> * Have you tried setting lower/higher values? What happens then?

If I set lower pgbouncer pool limits (e.g. 20), then there are processes waiting for a connection (in the situation where `schedule_after_task_execution = True` and `use_row_level_locking = True`). Sure, Airflow and Postgres consumed fewer resources, but the DAG's execution time slowed down even more.

> Generally speaking, when you allow PGBouncer to open 150 connections to your database, it will open all 150 connections, and Postgres creates a separate process for each connection. So if your Postgres does not have enough resources (for example memory, but it can be CPU or other resources) to run all those processes, it will slow down to a crawl as the processes compete for those resources.

I observed this. The swap file was minimal. The problem was with locks; see the explanation above.

> * Did you see a regression comparing to other Airflow versions, or is it the only version of Airflow you tested it with? Can you try with 2.7.1 and the upcoming release (likely 2.7.3rc, to be released tomorrow) and see if there are any differences? There were some changes in 2.7.2 that **should** decrease the number of connections used compared to 2.7.1, but I wonder if this did not have some side effects, so if you can check those, that would be great.

I tested only 2.7.2, with LocalExecutor and with CeleryExecutor + Redis + PG backend. It is a bit problematic to test under 2.7.1.
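For reference, here is a minimal sketch of the scheduler and database settings discussed above, as they would appear in `airflow.cfg`. The pool size value is illustrative, not a recommendation:

```ini
# airflow.cfg -- sketch of the settings discussed above (values are illustrative)

[database]
# Upper bound on SQLAlchemy connections opened by each Airflow component;
# keep the sum across components below the pgbouncer pool size.
sql_alchemy_pool_size = 20

[scheduler]
# Disables the "minischeduling" (look-ahead scheduling after each task),
# which was the main source of lock contention here.
schedule_after_task_execution = False

# Alternative way to avoid the row-level locks, but then only a single
# scheduler instance may run:
# use_row_level_locking = False
```

The same options can also be set through environment variables, e.g. `AIRFLOW__SCHEDULER__SCHEDULE_AFTER_TASK_EXECUTION=False`.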
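And a sketch of the pgbouncer pool sizing used in the test. The hostnames, ports and database names are placeholders, and auth settings are omitted:

```ini
; pgbouncer.ini -- sketch of the pool sizing used in the tests above
; (hostnames, ports and database names are placeholders; auth settings omitted)

[databases]
airflow = host=127.0.0.1 port=5432 dbname=airflow

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; transaction pooling lets many Airflow client connections share
; a smaller number of server connections
pool_mode = transaction
; 150 removed all client waiting; ~120-130 would also have been enough
default_pool_size = 150
max_client_conn = 300
```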
> I think you can limit it with [config](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#sql-alchemy-pool-size) I. First, I installed pgbouncer and added pool 20, then increased to 150. There were ~104 active connections waiting for lock and no waiting processes in pgbouncer (so 120-130 connection pool was enough). II. I removed "minischeduling" feature by setting: - schedule_after_task_execution = False - it also can be removed by "use_row_level_locking = False", but then just single Scheduler can work. This settings added 2 times speed up, but it seems as this DAG gets more new mapped taskinstances, the slower procesing becomes. > * Is it only with mapped tasks in that number you see the problem? As I mentioned above, if running with MappedNumber=108 param, execution of 3460 task instances takes 20 minutes. Celery flower shows runtime between 1.7 - 4.5 sec for task instance. I used one Celery Worker with 8 threads on Core i5 11 (12 CPU) machine and 32 Gb Mem, Ubuntu under WSL2. > * Have you checked if you have enough resources to run your database with this setup (I.e. handling 150 opened connections) I've used pgbouncer. And with schedule_after_task_execution = False airflow doesn't use more than 30-35 connections as I observed. > * Have you try to set lower/higher values? What happens them? If set pgbouncer lower pool limits (e.g. 20) then there are processes waiting for connection (For situation when schedule_after_task_execution = True and use_row_level_locking = True).. Sure, airflow and postgres consumed less resouces, but DAG's execution time slowed down even more. > Generally speaking when you allow PGBouncer to allow 150 connections to your database, then it will open all 150 connections and Postgres creates a separate process for each connection. So if your postress does not have enough resources (for example memory - but it can be CPU or other resources ) to run all your processes, it will slow down to a crawl as those process will compete for those resources. I observed this. Swap file was minimal. The problem was with locks. See above explanation. > * Did you see a regression comparing to other airflow versions or is it the only version of Airflow you tested it with? Can you try with 2.7.1 and upcoming (will be released tomorrow likely 2.7.3rc) and see if there are any differences. There were some changes in 2.7.2 that **should** decrease the number of connections used comparing to 2.7.1 but I wonder if this did not have some side effects, so if you can check those, that would be great. Tested only 2.7.2. with LocalExecutor and CeleryExecutor+Reddis+PG backend. A bit problematic to test under 2.7.1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org