ecerulm commented on issue #21087:
URL: https://github.com/apache/airflow/issues/21087#issuecomment-1121165082
@snjypl , let me see if I can address you comments (in no particular order,
sorry)
### allow_watch_bookmark=True does not prevent 410
> it is because of this bug in kubernetes python client that you were
getting 410 error even with allow_watch_bookmark enabled.
The PR kubernetes-client/python#1796 you mention as "the bug" does NOT solve
the 410 error, if you do `allow_watch_bookmark` and
`resource_version=someexpiredresourceversion` I still get a 410 GONE using your
PR code or am I missing something?:
```
git clone [email protected]:snjypl/kubernetes-client-python.git
cd kubernetes-client-python
git checkout bugfix/1729-bookmarkevents
pip install -e .
# Successfully installed kubernetes-23.0.1-snapshot
cd ..
python main6.py # the code that I shared earlier
Try to start a new watch with allow_watch_bookmarks=True and
resource_version=1
Traceback (most recent call last):
File "main6.py", line 24, in <module>
for i in watcher.stream(v1.list_namespaced_pod, namespace,
resource_version=1, allow_watch_bookmarks=True,timeout_seconds=2):
File
"/Users/rubelagu/git/python-kubernetes-client/kubernetes-client-python/kubernetes/watch/watch.py",
line 182, in stream
raise client.rest.ApiException(
kubernetes.client.exceptions.ApiException: (410)
Reason: Expired: too old resource version: 1 (384756203)
```
> shouldn't it be resource_version = last_resource_version. resource_version
= 1 will mostly always fail.
thats the point to simulate an already expired resource version, like it
would happen if there was a watch disconnect due to some network issue, there
was 5 minutes network disconnect from airflow to k8s api, until the next watch
could actually be performed, then the resource_version will be already expired.
In order to simulate that quickly I use `resource_version=1`
This was explicitly mentioned in the example comments :
```
# assume that last_resource_version has already expired by the time we are
able to reconnect to kubernetes api
# to simulate that scenario I send resource_version=1 which is for sure
expired
```
> the code you shared is not really a good way to demonstrate BOOKMARK.
It demonstrates that today if you use a `allow_watch_bookmarks=True` and
`resource_version=some_already_expired_resource_version` you will still get a
410 GONE.
This is true in `kubernetes==20.3.0` and **also true in your PR
kubernetes-client/python#1796 because I tested with your PR code**
> because of the bookmark event you will always have a resourceVersion which
is not old and is in the etcd cache.
I don't think that is true, the resourceVersion can be too old by the time
you get to use it. Two things to keep in mind:
* As per kubernetes documentation [watch
events](https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks):
> you shouldn't assume bookmarks are returned at any specific interval, nor
can clients assume that the API server will send any BOOKMARK event even when
requested.
* It's possible that some amount of time happens between the disconnect and
the next successful watch connection. So you can get a perfection accurate
resource_version X at the time of disconnection, and when the network
connectivity is restored then that resource_version X is already expired.
> my view is, there is no need airflow to keep track of the
last_resource_version or even have ResourceVersion singleton. it is not airflow
job to do the actual 'watching' of resource.
I fully agree with that statement but I guess that's a hefty change. But I
do really thing Airflow kubernetes_executor.py is trying too hard to track
resource_version and getting little to nothing by doing so.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]