xBis7 commented on PR #54103:
URL: https://github.com/apache/airflow/pull/54103#issuecomment-3275278733

   I ran some tests and gathered metrics from the scheduler, with and without 
this patch. The green lines are with the patch and the yellow lines are with 
the original code. The exception is `Airflow Running and Queued Tasks`, which 
doesn't distinguish between the two runs, but the timestamps still show where 
the metrics for the original code start.
   
   The test runs dags with the following numbers of tasks
   * 1200 tasks
   * 1100 tasks
   * 1000 tasks
   * 470 tasks
   * 250 tasks
   * 45 tasks
   
   all concurrently.
   
   I've configured the scheduler with strict limits to make the difference 
clearer. With the same configs and setup for both runs, we get these results
   
   * number of dags that we are concurrently queuing tasks from
       * with the patch, up to 6
       * original code, max was 3
   * number of tasks queued at any given moment
       * with the patch, up to 24
       * original code, max was 8
   * number of tasks examined at any given moment
       * with the patch, it examines as many as it can queue, e.g. examines 
24 and queues 24
       * original code, as many as 62 but only queuing 8
   * total number of scheduler iterations to finish all the above dags
       * with the patch, 528 iterations
       * original code, 1486 iterations
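
   As a rough sanity check, the figures in the list above can be turned into 
ratios. This is only a sketch: the dictionary keys are illustrative names, not 
Airflow metric names, and the queue counts are the peak values quoted above, so 
the ratios compare peaks rather than averages.

```python
# Figures copied from the list above; the key names are illustrative only.
results = {
    "dags_queued_concurrently": {"patch": 6, "original": 3},
    "tasks_queued_at_once": {"patch": 24, "original": 8},
    "scheduler_iterations": {"patch": 528, "original": 1486},
}


def ratio(metric):
    """Return patch / original for one metric."""
    values = results[metric]
    return values["patch"] / values["original"]


# Share of examined tasks that were actually queued in the quoted examples.
queue_efficiency = {"patch": 24 / 24, "original": 8 / 62}

for name in results:
    print(f"{name}: {ratio(name):.2f}x")
print(f"original code queued {queue_efficiency['original']:.1%} of examined tasks")
```

   Under these numbers the patch queues from twice as many dags, queues three 
times as many tasks at peak, and needs roughly a third of the iterations.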
   
   There was also a timing difference, although the exact numbers depend on the 
system. With the patch, the test took 516.36s (8.6 mins), while without it, it 
took 852.00s (14.2 mins).
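
   The timing difference can be expressed as a speedup and an approximate 
throughput. This is a minimal sketch that assumes there is one dag per task 
count listed earlier, that each dag ran once, and that every task completed. 
None of that is stated explicitly, so the tasks-per-second figures are 
estimates.

```python
# Task counts of the test dags, as listed in the comment.
task_counts = [1200, 1100, 1000, 470, 250, 45]
# Assumption: one run per dag, every task completed.
total_tasks = sum(task_counts)

patched_s = 516.36   # wall-clock time with the patch
original_s = 852.00  # wall-clock time with the original code

speedup = original_s / patched_s
time_saved_pct = (original_s - patched_s) / original_s * 100

# Estimated end-to-end throughput in tasks per second.
patched_tps = total_tasks / patched_s
original_tps = total_tasks / original_s

print(f"total tasks: {total_tasks}")
print(f"speedup: {speedup:.2f}x ({time_saved_pct:.1f}% less wall-clock time)")
print(f"throughput: {patched_tps:.2f} vs {original_tps:.2f} tasks/s")
```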
   
   <img width="2496" height="1024" alt="perf1" src="https://github.com/user-attachments/assets/001df0e3-f5cc-4e72-87d9-06f911a4b1d2" />
   
   The improvement with the patch in the scheduler throughput is noticeable. 
Please let me know if there is something else that makes sense to quantify here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

