a-meledin commented on issue #35267:
URL: https://github.com/apache/airflow/issues/35267#issuecomment-1787027170

   > I think you can limit it with 
[config](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#sql-alchemy-pool-size)
   
   I. First, I installed pgbouncer with a pool of 20 connections, then 
increased it to 150. At that point there were ~104 active connections waiting 
on locks and no clients waiting in pgbouncer (so a pool of 120-130 connections 
was enough).
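   For reference, a minimal sketch of the pgbouncer.ini side of this (the 
database name, ports, paths and credentials are placeholders, not my real 
config):
   
   ```ini
   ; pgbouncer.ini sketch; names, ports and paths are placeholders
   [databases]
   ; route the "airflow" database through pgbouncer to the real Postgres
   airflow = host=127.0.0.1 port=5432 dbname=airflow
   
   [pgbouncer]
   listen_addr = 127.0.0.1
   listen_port = 6432
   auth_type = md5
   auth_file = /etc/pgbouncer/userlist.txt
   pool_mode = transaction
   ; started with 20 here, then raised to 150
   default_pool_size = 150
   max_client_conn = 300
   ```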
   II. I removed the "mini scheduler" feature by setting (a config sketch 
follows this list): 
   - schedule_after_task_execution = False
   - it can also be removed via use_row_level_locking = False, but then only a 
single Scheduler can run.
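   Concretely, a minimal sketch of the relevant airflow.cfg section (the same 
values can also be set through AIRFLOW__SCHEDULER__* environment variables):
   
   ```ini
   # airflow.cfg sketch of the settings discussed above
   [scheduler]
   # disable the "mini scheduler" pass that runs after each task instance
   schedule_after_task_execution = False
   # alternative: disable row-level locking instead, but then only a
   # single scheduler instance may run
   # use_row_level_locking = False
   ```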
   
   These settings gave roughly a 2x speed-up, but it seems that the more new 
mapped task instances this DAG accumulates, the slower processing becomes. 
   
   > * Is it only with that number of mapped tasks that you see the problem?
   
   As I mentioned above, when running with the MappedNumber=108 parameter, 
execution of 3460 task instances takes 20 minutes. Celery Flower shows a 
runtime of 1.7-4.5 sec per task instance. I used one Celery worker with 8 
threads on a Core i5 11th gen (12 CPU) machine with 32 GB RAM, Ubuntu under 
WSL2.
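   To make the shape concrete, a minimal sketch of a DAG along these lines. 
This is not my real DAG (which has several mapped tasks that together produce 
the ~3460 task instances); the task names and the per-task sleep are 
placeholders:
   
   ```python
   # Minimal sketch of a dynamically mapped DAG; not the real one.
   # Task names and the sleep duration are placeholders.
   import time
   from datetime import datetime
   
   from airflow.decorators import dag, task
   
   MAPPED_NUMBER = 108  # the MappedNumber parameter mentioned above
   
   
   @dag(start_date=datetime(2023, 1, 1), schedule=None, catchup=False)
   def mapped_load_test():
       @task
       def make_inputs():
           return list(range(MAPPED_NUMBER))
   
       @task
       def work(i):
           time.sleep(2)  # Flower reports ~1.7-4.5 s per task instance
           return i
   
       # expand() creates one mapped task instance per element of the input
       work.expand(i=make_inputs())
   
   
   mapped_load_test()
   ```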
   
   > * Have you checked if you have enough resources to run your database with 
this setup (i.e. handling 150 open connections)?
   
   I've used pgbouncer. And with schedule_after_task_execution = False, 
Airflow doesn't use more than 30-35 connections, as far as I observed.
   
   > * Have you tried to set lower/higher values? What happens then?
   
   If pgbouncer's pool limit is set lower (e.g. 20), then there are processes 
waiting for a connection (in the situation where schedule_after_task_execution 
= True and use_row_level_locking = True); this can be watched with the sketch 
below. Sure, Airflow and Postgres consumed fewer resources, but the DAG's 
execution time slowed down even more.
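   For the record, this is how the waiting clients can be watched: a minimal 
sketch that polls pgbouncer's admin console (host, port, user and password are 
placeholders for your own setup):
   
   ```python
   # Minimal sketch: poll pgbouncer's admin console for waiting clients.
   import time
   
   import psycopg2
   
   # connection parameters are placeholders for your own setup
   conn = psycopg2.connect(
       host="127.0.0.1", port=6432, dbname="pgbouncer",
       user="pgbouncer", password="secret",
   )
   conn.autocommit = True  # the admin console does not support transactions
   
   with conn.cursor() as cur:
       for _ in range(12):  # poll for about a minute
           cur.execute("SHOW POOLS")
           cols = [c.name for c in cur.description]
           for row in cur.fetchall():
               pool = dict(zip(cols, row))
               # cl_waiting > 0 means clients are queued for a connection
               print(pool["database"], "cl_waiting:", pool["cl_waiting"])
           time.sleep(5)
   ```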
   
   > Generally speaking when you allow PGBouncer to allow 150 connections to 
your database, then it will open all 150 connections and Postgres creates a 
separate process for each connection. So if your Postgres does not have enough 
resources (for example memory, but it can be CPU or other resources) to run 
all your processes, it will slow down to a crawl as those processes will 
compete for those resources.
   
   I checked for this: swap usage was minimal, so memory was not the 
bottleneck. The problem was with the locks; see the explanation above.
   
   > * Did you see a regression compared to other Airflow versions, or is it 
the only version of Airflow you tested it with? Can you try with 2.7.1 and the 
upcoming release (likely tomorrow, 2.7.3rc) and see if there are any 
differences. There were some changes in 2.7.2 that **should** decrease the 
number of connections used compared to 2.7.1, but I wonder if this did not 
have some side effects, so if you can check those, that would be great.
   
   Tested only 2.7.2, with LocalExecutor and with CeleryExecutor + Redis + PG 
backend. It is a bit problematic for me to test under 2.7.1.
   
   

