droppoint commented on issue #32928:
URL: https://github.com/apache/airflow/issues/32928#issuecomment-1820413530

   Hi, everyone!
   
   I think I found the root cause of the problem.
   
   Short answer: The 
[KubernetesExecutor._adopt_completed_pods](https://github.com/apache/airflow/blob/main/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py#L645-L676)
 function is not compatible with concurrently running schedulers.
   
   Long answer: I encountered the issue described here after updating my Airflow 
instance from 2.4.3 to 2.7.3. After the update, the executor.open_slots metric 
started decreasing, as shown in the picture below.
   
   ![Screenshot from 2023-11-21 
10-04-04](https://github.com/apache/airflow/assets/619008/d4c3357a-ce03-4d56-8bba-45886f65fad1)
   
   After restarting the schedulers, the open_slots metric resets, but then it 
starts to decline again. After some investigation, I discovered two things:
   1. My KubernetesExecutor.running set has TaskInstances that were completed a 
while ago.
   2. These TaskInstances should not be in this scheduler because they were 
completed by my other scheduler.
   
   After digging through the logs, I found that pods belonging to those task 
instances were adopted. The log entry "Attempting to adopt pod" (with a capital 
"A") corresponds to this 
[line](https://github.com/apache/airflow/blob/main/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py#L664)
 in the code.
   
   ![Screenshot from 2023-11-20 
17-40-21](https://github.com/apache/airflow/assets/619008/b1f0217c-dff9-4e6e-b690-d11fe09b27e2)
   
   The first error in the function occurs 
[here](https://github.com/apache/airflow/blob/main/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py#L673-L676).
 Even if the scheduler fails to adopt a pod, the pod is still added to the 
KubernetesExecutor.running set. This piece of code was changed in #28871 and 
was merged into 
[2.5.2](https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html#airflow-2-5-2-2023-03-15).
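
   As a rough illustration of the behaviour described above, here is a small 
self-contained model. It is not the Airflow source: `PodNotFound`, `patch_pod`, 
`adopt_completed_pods` and the `skip_failed` flag are hypothetical names made up 
for this sketch. With `skip_failed=False` a failed adoption still ends up in the 
running set; with `skip_failed=True` (one possible change) it is skipped.

```python
class PodNotFound(Exception):
    """Hypothetical stand-in for a Kubernetes API 404 error."""


def patch_pod(pod_name, existing_pods):
    # Hypothetical stand-in for the label patch performed during adoption.
    if pod_name not in existing_pods:
        raise PodNotFound(pod_name)


def adopt_completed_pods(completed, existing_pods, running, skip_failed=False):
    """Toy adoption loop over (pod_name, task_instance_key) pairs."""
    for pod_name, ti_key in completed:
        try:
            patch_pod(pod_name, existing_pods)
        except PodNotFound:
            if skip_failed:
                # Adoption failed, so do not track this task instance.
                continue
        # Without the `continue`, this line runs even after a failed patch.
        running.add(ti_key)
    return running


completed = [("pod-a", "ti-a"), ("pod-b", "ti-b")]
existing = {"pod-a"}  # pod-b has already been deleted

buggy = adopt_completed_pods(completed, existing, set())
fixed = adopt_completed_pods(completed, existing, set(), skip_failed=True)
print(sorted(buggy), sorted(fixed))
```

   In this model the first call tracks both task instances even though pod-b no 
longer exists, while the second call tracks only the one whose pod was patched.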
   
   In my case, the pod failed to be adopted because it had already been deleted 
(error 404). I decided to fix it by simply adding `continue` in the except block. 
After the fix, the situation improved a lot, but I still saw some TaskInstances 
in the KubernetesExecutor.running set that didn't belong to that scheduler. 
   
   Then I found the second problem with this function. The 
[KubernetesExecutor._adopt_completed_pods](https://github.com/apache/airflow/blame/main/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py#L562)
 function is called unconditionally in the 
KubernetesExecutor.try_adopt_task_instances function, which is called by 
SchedulerJobRunner. SchedulerJobRunner sends a list of TaskInstances that need 
to be adopted because their Job.last_healthcheck missed the timeout. 
KubernetesExecutor.try_adopt_task_instances iterates through that list and 
tries to adopt (patch one of the labels) pods that belong to these 
TaskInstances. If adoption is successful, it adds the TaskInstance to the 
KubernetesExecutor.running set. However, 
KubernetesExecutor._adopt_completed_pods, which is called during 
try_adopt_task_instances, does no such thing - it just gets the list of all 
completed pods and tries to adopt all completed pods that are not bound to the 
current scheduler. This results in the constant adoption of completed pods 
between schedulers. Scheduler 1 adopts completed pods of scheduler 2 and vice 
versa.
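
   The ping-pong described above can be sketched with a toy simulation. This is 
a simplified model under my own assumptions, not Airflow code: the `Pod` and 
`Scheduler` classes, the `owner` field (standing in for the pod label that 
records the owning scheduler) and the `parallelism` value of 4 are all 
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    owner: str  # simplified stand-in for the label holding the scheduler id
    completed: bool = True


@dataclass
class Scheduler:
    job_id: str
    parallelism: int = 4  # hypothetical value for the sketch
    running: set = field(default_factory=set)

    @property
    def open_slots(self):
        return self.parallelism - len(self.running)

    def adopt_completed_pods(self, pods):
        # Adopts every completed pod not bound to this scheduler,
        # regardless of whether its previous owner is still healthy.
        for pod in pods:
            if pod.completed and pod.owner != self.job_id:
                pod.owner = self.job_id
                self.running.add(pod.name)


pods = [Pod("pod-1", owner="s1"), Pod("pod-2", owner="s2")]
s1, s2 = Scheduler("s1"), Scheduler("s2")

owners = []
for _ in range(3):
    s1.adopt_completed_pods(pods)
    s2.adopt_completed_pods(pods)
    owners.append([p.owner for p in pods])

print(s1.open_slots, s2.open_slots, owners[-1])
```

   In this model both schedulers end up tracking both completed pods, so each 
loses two open slots, and ownership keeps flipping on every round.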
   
   So, how can we fix this?
   
   I think that the _adopt_completed_pods function needs to be removed because 
the presence of completed pods after scheduler failure is pretty harmless, and 
the [airflow 
cleanup-pods](https://github.com/apache/airflow/blob/main/airflow/cli/commands/kubernetes_command.py#L77-L150)
 CLI command can take care of that. Plus, this function has already caused 
problems before (#26778). But I might be wrong, so if a maintainer could give 
advice on this situation, that would be great. 

