Nataneljpwd opened a new pull request, #55797:
URL: https://github.com/apache/airflow/pull/55797

   This PR fixes an issue we noticed where our metrics reported a negative number of open slots for our executors.
   
   As can be seen here, the metric is calculated as the parallelism minus the length of `self.running`.
   <img width="2258" height="246" alt="image" src="https://github.com/user-attachments/assets/5384bd42-faf4-4cbc-9b58-12fbb299e78c" />
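A minimal sketch (not the actual Airflow source, names simplified) of how the executor derives the open-slots metric from its parallelism and its `running` set:

```python
# Sketch of the open-slots calculation described above.
# Assumption: `running` holds the task keys the executor counts as in flight.
class ExecutorSketch:
    def __init__(self, parallelism: int):
        self.parallelism = parallelism
        self.running: set[str] = set()  # task keys currently tracked as running

    @property
    def open_slots(self) -> int:
        # parallelism minus the number of tracked running tasks
        return self.parallelism - len(self.running)
```

Anything that lands in `running` without leaving it, or that was never truly occupying a slot, skews this number directly.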
   
   In the K8S Executor, `self.running` is updated in only a few places: where tasks are created, and in the adoption methods `self.adopt_launched_tasks` and `self._adopt_completed_tasks`. The first adds tasks to `running` when they are started, as they were just set to running. The latter, however, adopts tasks that have already completed, solely so that the K8S watchers in `AirflowKubernetesScheduler` can delete their pods. Because completed tasks end up in the same `running` set, the executor can report a negative number of open slots, and it may also drop tasks that it could have run, just because completed tasks occupied their slots (this is the check done by the `SchedulerJobRunner`; see the images below).
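The failure mode above can be reproduced with a few lines (a sketch under the assumption that both live and completed task keys share one `running` set, as described):

```python
# Sketch of the bug: completed pods adopted into the same `running` set
# as live tasks push the open-slots metric below zero.
parallelism = 2

running: set[str] = set()
running.update({"live-1", "live-2"})             # tasks actually executing
running.update({"done-1", "done-2", "done-3"})   # completed pods, adopted only for cleanup

open_slots = parallelism - len(running)
print(open_slots)  # negative: the scheduler will queue nothing to this executor
```

With a negative `open_slots`, the `SchedulerJobRunner` check treats the executor as full even though only two of its five tracked tasks are actually running.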
   
   Here is where the `SchedulerJobRunner` uses the `open_slots` property of the 
executor.
   <img width="2434" height="520" alt="image" src="https://github.com/user-attachments/assets/40aff00b-b7dc-478b-8e59-6818373b5588" />
   
   <img width="2312" height="880" alt="image" src="https://github.com/user-attachments/assets/15ca0d29-8aaa-4af5-91c4-7ef3e1759b8c" />
   
   
   This PR resolves the issue by adopting completed tasks into a separate set, called `completed`, which fixes the negative open-slots metric and the cases where we run fewer tasks than we actually could.
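A hypothetical sketch of the fix (method and attribute names here are illustrative, not the PR's exact code): completed tasks are tracked in their own set so pod cleanup still happens, but they no longer consume open slots.

```python
# Sketch of the fix: separate bookkeeping for live vs. completed adopted tasks.
class K8sExecutorSketch:
    def __init__(self, parallelism: int):
        self.parallelism = parallelism
        self.running: set[str] = set()    # tasks actually executing
        self.completed: set[str] = set()  # adopted only so the watcher can delete pods

    def adopt_launched_task(self, key: str) -> None:
        self.running.add(key)

    def adopt_completed_task(self, key: str) -> None:
        # Previously this key would have gone into self.running,
        # skewing the open-slots metric.
        self.completed.add(key)

    @property
    def open_slots(self) -> int:
        # Only live tasks count against parallelism.
        return self.parallelism - len(self.running)
```

With the same workload as the bug reproduction above, the metric stays non-negative and the scheduler keeps handing the executor work it can actually run.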


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
