crabio opened a new issue, #35354:
URL: https://github.com/apache/airflow/issues/35354

   ### Apache Airflow version
   
   2.7.2
   
   ### What happened
   
   1. The Airflow scheduler periodically logs this error:
   ```log
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 111, in run
       self.resource_version = self._run(
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 167, in _run
       for event in self._pod_events(kube_client=kube_client, query_kwargs=kwargs):
     File "/home/airflow/.local/lib/python3.8/site-packages/kubernetes/watch/watch.py", line 182, in stream
       raise client.rest.ApiException(
   kubernetes.client.exceptions.ApiException: (410)
   ```
   
   2. At startup the scheduler has many open slots, as its metrics show (airflow_executor_running_tasks, airflow_executor_open_slots). After some time and several of these errors, it has no open slots left and no tasks are scheduled. All of them stay stuck in the queued state.
   
   
   After some analysis, this is confusing, because the [kubernetes provider v7.8.0](https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/_modules/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.html) already has code to handle this error:
   ```
   except ApiException as e:
       if e.status == 410:  # Resource version is too old
           if self.namespace == ALL_NAMESPACES:
               pods = kube_client.list_pod_for_all_namespaces(watch=False)
           else:
               pods = kube_client.list_namespaced_pod(namespace=self.namespace, watch=False)
           resource_version = pods.metadata.resource_version
           query_kwargs["resource_version"] = resource_version
           return self._pod_events(kube_client=kube_client, query_kwargs=query_kwargs)
       else:
           raise
   ```
   
   But it is not clear why we still get the error.
   I have also seen this error reported many times for other versions of Airflow.
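
One possible explanation, offered as an assumption and not a confirmed diagnosis: the traceback shows the 410 being raised inside `stream` while the `for` loop in `_run` iterates over the result of `_pod_events`. If the `except ApiException` handler quoted above wraps a `return watcher.stream(...)` call, it may never see the error, because a Python generator runs its body only when iterated, after the `try` block has already been left. The minimal sketch below shows that behaviour with plain Python. `Gone`, `stream`, `pod_events` and `run` are hypothetical stand-ins, not the provider's or the kubernetes client's code.

```python
class Gone(Exception):
    """Hypothetical stand-in for an ApiException with status 410."""
    status = 410


def stream():
    """Generator standing in for a watch stream; its body runs only on iteration."""
    yield "event-1"
    raise Gone()


def pod_events():
    try:
        # Calling a generator function only creates the generator object.
        # Nothing inside stream() has executed yet, so nothing can be caught here.
        return stream()
    except Gone:
        # Not reached for errors raised later, during iteration.
        return iter(())


def run():
    seen = []
    try:
        for event in pod_events():
            seen.append(event)
    except Gone as e:
        # The error surfaces in the caller's loop, outside pod_events().
        return seen, e.status
    return seen, None


print(run())  # → (['event-1'], 410)
```

If this is what happens, the handler would have to sit around the iteration itself (or the caller would have to catch the 410 and re-list) for the recovery code to take effect.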
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   1. Build an Airflow Docker image with these libraries:
   ```
   apache-airflow == 2.7.2
   dbt-core == 1.6.6
   dbt-snowflake == 1.6.4
   apache-airflow-providers-snowflake
   apache-airflow[statsd]
   facebook-business == 16.0.2
   google-ads == 21.1.0
   twitter-ads == 11.0.0
   acryl-datahub-airflow-plugin
   acryl-datahub[dbt]
   checksumdir
   filelock
   openpyxl
   cronsim
   apache-airflow-providers-cncf-kubernetes==7.8.0
   kubernetes
   ```
   
   2. Deploy it via the Community Helm chart with the Kubernetes Executor
   3. Add more than 50 tasks in 2-3 DAGs
   
   
   ### Operating System
   
   Docker based on apache/airflow:2.7.2
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   apache-airflow-providers-cncf-kubernetes==7.8.0
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

