Thomas Weise created FLINK-29100:
------------------------------------

             Summary: Deployment with last-state upgrade mode stuck after initial error
                 Key: FLINK-29100
                 URL: https://issues.apache.org/jira/browse/FLINK-29100
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: kubernetes-operator-1.1.0
            Reporter: Thomas Weise
            Assignee: Thomas Weise

A deployment with the last-state upgrade mode that never succeeds gets stuck in the deploying state because no HA metadata exists. This can be reproduced by creating a deployment with an invalid image or with an exception thrown in the entry point. A later update to the CR that corrects the issue is not reconciled; the operator only logs:

    o.a.f.k.o.r.d.ApplicationReconciler [INFO ] [default.basic-checkpoint-ha-example] Job is not running yet and HA metadata is not available, waiting for upgradeable state

This forces manual intervention to delete the CR. Instead, the operator should check whether this is the initial deployment and, if so, skip the HA metadata check.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
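
The proposed change to the last-state check could be sketched roughly as follows. This is an illustration only, not the operator's actual code: the class UpgradeGate, the enum Action, and the method decide are hypothetical names, and how the operator would detect an "initial deployment" is not stated in this issue, so it is passed in as a plain boolean.

```java
/**
 * Hypothetical sketch of the decision described in the issue.
 * None of these names come from the Flink Kubernetes Operator code base.
 */
final class UpgradeGate {

    /** What the reconciler would do with an updated spec in last-state mode. */
    enum Action {
        DEPLOY,
        WAIT_FOR_UPGRADEABLE_STATE
    }

    private UpgradeGate() {
    }

    /**
     * @param initialDeployment   assumption: true if the job has never been
     *                            deployed successfully, so no state can exist
     * @param jobRunning          true if the job is currently running
     * @param haMetadataAvailable true if HA metadata exists for the job
     */
    static Action decide(boolean initialDeployment,
                         boolean jobRunning,
                         boolean haMetadataAvailable) {
        // Proposed behavior: nothing to restore on the first deployment,
        // so skip the HA metadata check and deploy the corrected spec.
        if (initialDeployment) {
            return Action.DEPLOY;
        }
        // Behavior implied by the log message: wait only when the job is
        // not running and HA metadata is not available.
        if (jobRunning || haMetadataAvailable) {
            return Action.DEPLOY;
        }
        return Action.WAIT_FOR_UPGRADEABLE_STATE;
    }
}
```

With this sketch, a CR whose first deployment failed (invalid image, no HA metadata) maps to decide(true, false, false), which returns DEPLOY rather than waiting indefinitely, while a previously deployed job without HA metadata still waits.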