Asquator commented on PR #53492:
URL: https://github.com/apache/airflow/pull/53492#issuecomment-3293688742

   Problems with this approach:
   - It may be slow, especially in MySQL, where this query might take 15x more 
time than the optimistic one.
   - Window functions don't really fit the use case, as we have different 
orthogonal limits that have to be enforced. Big thanks to @dstandish for 
paying attention and pushing us to find more flaws.
   - Some starvation cases still persist, since in certain cases we're at the 
mercy of the query planner. Consider the following example:
   
   > Pool A: 16/32
   > Pool B: 16/32
   > Task A: 32 mapped TIs in Pool A
   > Task B: 32 mapped TIs in Pool B
   > Task A sorts first: 1-32
   > Task B sorts second: 33-64
   > Task A gets 16 scheduled TIs
   > Task B: 16 potential TIs are starved
   
   So we don't schedule 32 tasks, although we technically could.
   
   Take it further:
   > Pools A, B, C, D: 8/32 each
   > Tasks A, B, C, D accordingly, 32 TIs each, sorted in this order:
   > A: 1-32
   > B: 33-64
   > C: 65-96
   > D: 97-128
   > Dagrun max_active_tasks: 32
   > Ideally we want to schedule 32 tasks (8 of A, 8 of B, 8 of C, 8 of D)
   > In reality we schedule 8 tasks of A and starve the other ready tasks
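
   A rough Python sketch of the example with four pools, for illustration only. It assumes each limit is enforced as an independent rank cutoff over the sorted candidates (a simplification, not the actual query in this PR) and reads "8/32" as 8 open slots per pool. `rank_filter` and `greedy_scan` are hypothetical names, not Airflow functions, and the in-order scan is only a baseline, not necessarily what any other PR implements.

```python
from collections import Counter

# Assumption: "8/32" means 8 open slots in each pool.
FREE_SLOTS = {"A": 8, "B": 8, "C": 8, "D": 8}
MAX_ACTIVE_TASKS = 32  # dag run limit from the example

# Candidates in sort order: A is 1-32, B is 33-64, C is 65-96, D is 97-128.
candidates = [pool for pool in "ABCD" for _ in range(32)]


def rank_filter(cands):
    """Keep rows that pass every limit, with each rank computed independently
    over all candidates (roughly how stacked ROW_NUMBER() filters behave)."""
    rank_in_pool = Counter()
    picked = []
    for overall_rank, pool in enumerate(cands, start=1):
        rank_in_pool[pool] += 1
        within_dag_run = overall_rank <= MAX_ACTIVE_TASKS
        within_pool = rank_in_pool[pool] <= FREE_SLOTS[pool]
        if within_dag_run and within_pool:
            picked.append(pool)
    return picked


def greedy_scan(cands):
    """Walk candidates in order and count only the rows actually picked."""
    used = Counter()
    picked = []
    for pool in cands:
        if len(picked) >= MAX_ACTIVE_TASKS:
            break
        if used[pool] < FREE_SLOTS[pool]:
            used[pool] += 1
            picked.append(pool)
    return picked


if __name__ == "__main__":
    print("rank filter:", dict(Counter(rank_filter(candidates))))
    print("greedy scan:", dict(Counter(greedy_scan(candidates))))
```

   Under these assumptions `rank_filter` keeps only 8 TIs of A, because ranks 1-32 all belong to Task A and the pool cutoff then trims them to 8, while `greedy_scan` picks 8 from each pool for a total of 32.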
   
   Some may not consider it starvation, but I'm personally not ready to accept 
this approach knowing it's both slower and doesn't completely solve the issue.
   
   For any future developments, I should note that performance isn't completely 
terrible: we often saw only a 3x-6x slowdown. 
   The thing that bothers me the most is maintainability and the unsolved cases. 
IMO window functions (WFs) just aren't a good fit for task selection with 
multidimensional limits. I'd rather try an exhaustive scan approach, see: 
https://github.com/apache/airflow/pull/55537


