Xintong Song created FLINK-21667:
------------------------------------

             Summary: Standby RM might remove resources from Kubernetes
                 Key: FLINK-21667
                 URL: https://issues.apache.org/jira/browse/FLINK-21667
             Project: Flink
          Issue Type: Bug
          Components: Deployment / Kubernetes
    Affects Versions: 1.12.2
            Reporter: Xintong Song
             Fix For: 1.13.0, 1.12.3


Currently, on initialization, {{KubernetesResourceManagerDriver}} starts a
watch for receiving pod events. As a result, it may start receiving events
before obtaining leadership. Consequently, a standby RM may remove terminated
pods from Kubernetes while handling these events.

This is not very damaging at the moment, since the removed pods are already
terminated anyway. However, it would still be good for a standby RM to strictly
follow the contract and make no modifications before obtaining leadership. We
might consider postponing the start of the watch until leadership is granted.
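
A minimal sketch of the proposed ordering, for illustration only. The class
{{LeaderGatedPodWatch}} and all of its methods are hypothetical names and do
not correspond to the actual {{KubernetesResourceManagerDriver}} API; a list
stands in for the pod deletions that would be sent to Kubernetes. The idea is
that initialization does not start the watch, so pod events seen before
leadership is granted cause no modifications.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a driver that only watches pods while leader. */
class LeaderGatedPodWatch {
    // Records pods this instance would have removed from Kubernetes.
    private final List<String> removedPods = new ArrayList<>();
    // True only between a leadership grant and a leadership revoke.
    private boolean watching = false;

    /** Initialization deliberately does NOT start the pod watch. */
    void initialize() {
        // Set up clients, configuration, etc. (omitted in this sketch).
    }

    /** The watch starts only once leadership has been granted. */
    void onGrantLeadership() {
        watching = true;
    }

    /** The watch stops again when leadership is lost. */
    void onRevokeLeadership() {
        watching = false;
    }

    /** Pod event callback: a standby instance ignores the event entirely. */
    void onPodTerminated(String podName) {
        if (!watching) {
            return; // standby RM: make no modifications
        }
        removedPods.add(podName); // leader: clean up the terminated pod
    }

    List<String> removedPods() {
        return removedPods;
    }
}
```

With this ordering, a terminated-pod event that arrives before
{{onGrantLeadership()}} is dropped, while the same event after the grant
results in a removal.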



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
