Vasu-Madaan opened a new issue, #53472:
URL: https://github.com/apache/airflow/issues/53472

   ### Apache Airflow Provider(s)
   
   cncf-kubernetes
   
   ### Versions of Apache Airflow Providers
   
   8.4.1
   
   ### Apache Airflow version
   
   Airflow 2
   
   ### Operating System
   
   linux
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   The `KubernetesPodOperator` does not clean up pods that were created with a 
label whose value is `None`. The pods are left orphaned in the Kubernetes 
namespace after the task has finished, even when `on_finish_action` is set to 
`delete_pod`.
   
   ### What you think should happen instead
   
   Root Cause Analysis:
   
   The issue stems from an inconsistency between how labels are handled during 
pod creation versus pod deletion.
   
   Pod Creation: When a pod is created, the `_get_ti_pod_labels` method 
iterates through the labels and uses `str(label)` to process each value. In 
Python, `str(None)` evaluates to the string `"None"`. Consequently, the pod is 
created with a valid Kubernetes label such as `my-label="None"`.
   
   Pod Deletion: When the task finishes, the cleanup method attempts to find 
the pod in order to delete it. It calls `_build_find_pod_label_selector` to 
construct a query for the Kubernetes API. This method, however, does not apply 
the same `str()` conversion. It uses the raw `None` object from the operator's 
`self.labels` dictionary.
   
   This inconsistency produces a label selector that does not match the label 
stored on the pod, so the Kubernetes API returns no matching pods. Because the 
operator cannot find the pod it created, it cannot delete it, and the pod is 
left orphaned.
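   
   The mismatch described above can be illustrated without Airflow or a 
cluster. The following is a minimal sketch, not the provider's code: 
`normalize_labels` and `matches` are hypothetical helpers, and `matches` is a 
simplified equality check standing in for the Kubernetes label selector. It only 
shows why a raw `None` cannot match a label that was stored as the string 
`"None"`, and that applying the same `str()` conversion on both paths restores 
the match.

```python
# Hypothetical stand-ins for illustration only; not Airflow or Kubernetes code.

def normalize_labels(labels):
    """Convert every label value with str(), as the creation path is described as doing."""
    return {key: str(value) for key, value in labels.items()}


def matches(pod_labels, wanted):
    """Simplified equality-based matcher standing in for a label selector."""
    return all(pod_labels.get(key) == value for key, value in wanted.items())


operator_labels = {"custom-label-with-none": None}

# Creation path: the pod ends up carrying the string "None" as the label value.
pod_labels = normalize_labels(operator_labels)

# Lookup with the raw values: the None object never equals the string "None".
found_with_raw = matches(pod_labels, operator_labels)

# Lookup with the same conversion applied on both sides: the pod is found.
found_with_normalized = matches(pod_labels, normalize_labels(operator_labels))

print(pod_labels)
print(found_with_raw, found_with_normalized)
```

   Under these assumptions, normalizing the label values once and using the 
result for both creation and lookup would keep the two paths consistent.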
   
   ### How to reproduce
   
   Example DAG to Reproduce:
   
   ```
   from __future__ import annotations
   
   import pendulum
   
   from airflow.models.dag import DAG
   from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
   
   with DAG(
       dag_id="kpo_none_label_bug_report",
       start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
       catchup=False,
       schedule=None,
       tags=["k8s", "bug"],
   ) as dag:
       kpo_task = KubernetesPodOperator(
           task_id="kpo_with_none_label",
           namespace="default",
           image="alpine",
           cmds=["echo"],
           arguments=["'Task finished successfully'"],
           # This label with a `None` value triggers the bug
           labels={"custom-label-with-none": None},
           name="kpo-none-label-test",
           on_finish_action="delete_pod",
           # Ensure a new pod is created each time to reliably test deletion
           reattach_on_restart=False,
       )
   ```
   
   Expected Behavior: After the task kpo_with_none_label completes, the 
corresponding pod (kpo-none-label-test-*) should be deleted from the Kubernetes 
namespace.
   
   Actual Behavior: The pod is not deleted and remains in the namespace with a 
Completed status.
   
   
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

