e-galan commented on code in PR #39329: URL: https://github.com/apache/airflow/pull/39329#discussion_r1598601888
########## airflow/providers/cncf/kubernetes/operators/pod.py: ##########
@@ -1129,6 +1142,36 @@ def dry_run(self) -> None:

     def execute_complete(self, context: Context, event: dict, **kwargs):
         return self.trigger_reentry(context=context, event=event)

+    def process_duplicate_label_pods(self, pod_list: list[k8s.V1Pod]) -> k8s.V1Pod:
+        """
+        Patch or delete the existing pod with duplicate labels.
+
+        This handles an edge case that can happen only if the reattach_on_restart
+        flag is False and the previous run attempt failed because the task
+        process was killed externally by the cluster or another process.
+
+        If the task process is killed externally, it breaks the code execution and
+        immediately exits the task. As a result, the pod created in the previous
+        attempt will not be properly deleted or patched by the cleanup() method.
+
+        Return the newly created pod to be used for the next run attempt.
+        """
+        new_pod = pod_list.pop(self._get_most_recent_pod_index(pod_list))

Review Comment:
   @romsharon98 The last created pod is supposed to run the next attempt of the same task, but it can't because of the exception that forbids having more than one pod with the same labels. That exception is raised in the scenario described in the method's docstring and in the PR description.
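For context, a minimal sketch of what a helper like `_get_most_recent_pod_index` could do: select the index of the pod with the latest `metadata.creation_timestamp`, so the newest pod can be popped and reused for the next attempt while the stale duplicates are left for patching or deletion. This is an illustration only; the names `get_most_recent_pod_index` and `make_pod` below are hypothetical stand-ins, and the actual implementation in the PR may differ.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def get_most_recent_pod_index(pod_list):
    """Return the index of the pod with the latest creation timestamp.

    Hypothetical sketch of a helper like ``_get_most_recent_pod_index``;
    not the actual Airflow implementation.
    """
    return max(
        range(len(pod_list)),
        key=lambda i: pod_list[i].metadata.creation_timestamp,
    )


def make_pod(name, created):
    # Stand-in for a k8s.V1Pod: only metadata fields used above are modeled.
    return SimpleNamespace(
        metadata=SimpleNamespace(name=name, creation_timestamp=created)
    )


pods = [
    make_pod("attempt-1", datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc)),
    make_pod("attempt-2", datetime(2024, 5, 1, 11, 0, tzinfo=timezone.utc)),
]

# Pop the newest pod to reuse for the next run attempt; the remaining
# stale pods with duplicate labels would then be patched or deleted.
new_pod = pods.pop(get_most_recent_pod_index(pods))
print(new_pod.metadata.name)  # attempt-2
```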