ecerulm commented on issue #21087: URL: https://github.com/apache/airflow/issues/21087#issuecomment-1119081074
> #15500 (comment)
>
> The executor is trying to watch from (history) revision n, which has rolled off of history on the k8s side. n+2 might be the oldest available now.

In Airflow 2.3.0 (`kubernetes==23.3.0`) the executor tries to watch from revision n, where n is the last received revision. At least in my testing on EKS, the last received revision is not necessarily the **highest** revision number. That is, the watch can return events with revision numbers 1, 2, 3, **99**, 4, 5, and on the retry the executor will try to watch from revision 5, which gives `resource too old`. It should try to watch from revision 99.

At least in my two EKS clusters this scenario is easy to reproduce (see my post on [stackoverflow](https://stackoverflow.com/questions/72133783/how-to-avoid-resource-too-old-when-retrying-a-watch-with-kubernetes-python-cli/72133784#72133784)). Below I start a watch that ends after 5 seconds and then immediately start another watch from the last received resource version (just like Airflow's kubernetes_executor.py does), and that **always** raises `resource too old` for me. I guess in EKS you can't really ask for any revision other than `0` or the actual latest.
```
# python3 -m venv venv
# source venv/bin/activate
# pip install 'kubernetes==23.3.0'
from kubernetes import client, config, watch

config.load_kube_config(context='my-eks-context')
v1 = client.CoreV1Api()
watcher = watch.Watch()
namespace = 'kube-system'
last_resource_version = 0

# this watch will time out in 5s, a fast way to simulate a watch that needs to be retried
for i in watcher.stream(v1.list_namespaced_pod, namespace, resource_version=last_resource_version, timeout_seconds=5):
    print(i['object'].metadata.resource_version)
    last_resource_version = i['object'].metadata.resource_version

# we retry the watch starting from the last resource version known
# but this ALWAYS raises ApiException: (410) Reason: Expired: too old resource version: 379140622 (380367990) for me
for i in watcher.stream(v1.list_namespaced_pod, namespace, resource_version=last_resource_version, timeout_seconds=5):
    print('second loop', i['object'].metadata.resource_version)
    last_resource_version = i['object'].metadata.resource_version
```

As soon as I changed it to keep track of the actual **highest** revision number with `last_resource_version = max(last_resource_version, int(i['object'].metadata.resource_version))`, I stopped getting those `resource too old` errors.

My PR #23504 tackles the aforementioned cause of `resource too old` in EKS (I guess there could be other scenarios that lead to `resource too old`).
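The difference between the last received and the highest revision can be illustrated without a cluster. The following is a minimal sketch, not the code from the PR: `track_highest_resource_version` is a hypothetical helper, and the input list simply mirrors the 1, 2, 3, 99, 4, 5 ordering described above. It assumes resource versions can be compared as integers, which is what the `max(...)` workaround relies on but is not something the Kubernetes API guarantees in general.

```python
from typing import Iterable


def track_highest_resource_version(versions: Iterable[str], start: int = 0) -> int:
    """Return the highest resource version seen, not the last one received.

    Hypothetical helper for illustration only.
    """
    highest = start
    for raw in versions:
        # The client returns resource versions as strings; compare them as
        # integers so that "99" ranks above "5" (assumption: they are numeric).
        highest = max(highest, int(raw))
    return highest


# Event order from the scenario above: 99 arrives before 4 and 5.
received = ["1", "2", "3", "99", "4", "5"]

last_received = int(received[-1])                   # what a "last seen" tracker would retry from
highest = track_highest_resource_version(received)  # what the retry should use instead

print(last_received, highest)  # prints: 5 99
```

In the reproduction script, the same idea would replace the plain assignment inside each `for` loop, so the second `watcher.stream(...)` call starts from the highest version seen instead of the last one.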
