[ 
https://issues.apache.org/jira/browse/FLINK-32012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17723156#comment-17723156
 ] 

Nicolas Fraison commented on FLINK-32012:
-----------------------------------------

I'd like to get your thoughts on the first approach I'd like to take to merge 
the upgrade/rollback logic.

 

Currently, to do a rollback we:
 * First check whether the spec has changed
 * If not, check whether we should roll back
 * If yes, initiate the rollback: set the status to ROLLING_BACK and wait for 
the next reconcile loop
 * The next reconcile loop runs the same checks
 * Since it should roll back and the state is already up to date, it rolls back 
relying on the last stable spec, but it doesn't update the spec (see the sketch 
after this list)
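
To make the two paths concrete, here is a minimal Java sketch of the current 
flow. The types and method names (ReconciliationState, reconcileSpecChange, 
rollback) are simplified stand-ins I made up for illustration, not the 
operator's actual API:

{code:java}
// Minimal sketch of the current flow with hypothetical, simplified types;
// the real reconciler classes are considerably more involved.
enum ReconciliationState { DEPLOYED, ROLLING_BACK }

class CurrentFlow {
    ReconciliationState state = ReconciliationState.DEPLOYED;

    void reconcile(String desiredSpec, String lastReconciledSpec,
                   String lastStableSpec, boolean shouldRollBack) {
        if (!desiredSpec.equals(lastReconciledSpec)) {
            // spec changed: normal upgrade path
            reconcileSpecChange(desiredSpec);
        } else if (state == ReconciliationState.ROLLING_BACK) {
            // second loop: roll back through a dedicated code path,
            // relying on the last stable spec; the spec is never updated
            rollback(lastStableSpec);
        } else if (shouldRollBack) {
            // first loop: only flag the rollback, then wait
            state = ReconciliationState.ROLLING_BACK;
        }
    }

    void reconcileSpecChange(String spec) { /* upgrade logic */ }
    void rollback(String spec) { /* separate rollback logic */ }
}
{code}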

The idea would be to actually roll back the spec, so that both mechanisms rely 
on reconcileSpecChange:
 * Still keep the check for a spec change
 * Then check whether we should roll back
 * When initiating the rollback, set the status to ROLLING_BACK, restore the 
last stable spec, and wait for the next reconcile loop
 * The next reconcile loop will check whether the spec has changed
 * That will be the case, so it will run reconcileSpecChange, but with the 
status ROLLING_BACK, so the rollback specifics can be applied there (sketched 
below)
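
A sketch of the proposed flow under the same simplified assumptions (reusing 
the hypothetical ReconciliationState enum from the sketch above). The point is 
that upgrades and rollbacks both funnel through the single reconcileSpecChange 
path:

{code:java}
// Hypothetical sketch of the proposed flow: initiating a rollback restores
// the last stable spec, so the next loop sees a spec change and goes through
// reconcileSpecChange just like an upgrade would.
class ProposedFlow {
    ReconciliationState state = ReconciliationState.DEPLOYED;

    void reconcile(String desiredSpec, String lastReconciledSpec,
                   String lastStableSpec, boolean shouldRollBack) {
        if (!desiredSpec.equals(lastReconciledSpec)) {
            // upgrades and rollbacks share a single path; the status
            // tells reconcileSpecChange to apply rollback specifics
            reconcileSpecChange(desiredSpec,
                    state == ReconciliationState.ROLLING_BACK);
        } else if (shouldRollBack) {
            // initiate: flag the rollback AND restore the last stable
            // spec so the next loop detects a spec change
            state = ReconciliationState.ROLLING_BACK;
            restoreSpec(lastStableSpec);
        }
    }

    void reconcileSpecChange(String spec, boolean rollingBack) { /* shared path */ }
    void restoreSpec(String spec) { /* write the last stable spec back */ }
}
{code}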

This would also simplify operations that need to reapply the working spec, like 
[resubmitJob|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractJobReconciler.java#L309]

Still, I'm wondering whether we will really be able to simplify things since, 
from your 
[comment|https://github.com/apache/flink-kubernetes-operator/pull/590#discussion_r1189847280],
 the rollback doesn't seem to be aligned with the upgrade?

For example, for savepoint upgrade mode (see the sketch after this list):
 * upgrade
 ** delete HA metadata
 ** restore from the savepoint
 * rollback
 ** keep HA metadata if it exists
 ** restore from it if it exists
 ** restore from the savepoint if it doesn't exist and the JM pod never started
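
A sketch of that asymmetry, using illustrative flags (haMetadataExists, 
jmPodEverStarted) rather than the operator's actual checks; the last branch is 
where the failure reported in this ticket comes from:

{code:java}
// Illustrative decision table for savepoint upgrade mode; names are
// hypothetical, not the operator's actual API.
class SavepointModeRestoreDecision {
    String chooseRestoreSource(boolean isRollback, boolean haMetadataExists,
                               boolean jmPodEverStarted, String lastSavepoint) {
        if (!isRollback) {
            // upgrade: HA metadata was deleted on suspend, so always
            // restore from the savepoint taken during the upgrade
            return lastSavepoint;
        }
        if (haMetadataExists) {
            // rollback keeps the HA metadata and restores from it
            return "ha-metadata";
        }
        if (!jmPodEverStarted) {
            // fall back to the savepoint only if no JM pod ever started
            return lastSavepoint;
        }
        throw new IllegalStateException(
                "Rollback is not possible due to missing HA metadata");
    }
}
{code}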

> Operator failed to rollback due to missing HA metadata
> ------------------------------------------------------
>
>                 Key: FLINK-32012
>                 URL: https://issues.apache.org/jira/browse/FLINK-32012
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.4.0
>            Reporter: Nicolas Fraison
>            Priority: Major
>              Labels: pull-request-available
>
> The operator correctly detected that the job was failing and initiated the 
> rollback, but the rollback failed with `Rollback is not possible due to 
> missing HA metadata`.
> We are relying on savepoint upgrade mode and ZooKeeper HA.
> The operator performs a set of actions that also delete this HA data in 
> savepoint upgrade mode:
>  * [flink-kubernetes-operator/AbstractFlinkService.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L346]
>  : Suspend the job with a savepoint and deleteClusterDeployment
>  * [flink-kubernetes-operator/StandaloneFlinkService.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/StandaloneFlinkService.java#L158]
>  : Remove the JM and TM deployments and delete HA data
>  * [flink-kubernetes-operator/AbstractFlinkService.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L1008]
>  : Wait for cluster shutdown and delete ZooKeeper HA data
>  * [flink-kubernetes-operator/FlinkUtils.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/FlinkUtils.java#L155]
>  : Remove all child znodes
> Then, when running the rollback, the operator looks for HA data even though 
> we rely on savepoint upgrade mode:
>  * [flink-kubernetes-operator/AbstractFlinkResourceReconciler.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractFlinkResourceReconciler.java#L164]
>  Perform the rollback reconcile if it should roll back
>  * [flink-kubernetes-operator/AbstractFlinkResourceReconciler.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/AbstractFlinkResourceReconciler.java#L387]
>  The rollback fails as the HA data is not available
>  * [flink-kubernetes-operator/FlinkUtils.java at main · 
> apache/flink-kubernetes-operator|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/FlinkUtils.java#L220]
>  Check if some child znodes are available
> For both steps the pattern looks to be the same for Kubernetes HA, so this 
> doesn't look to be linked to a bug with ZooKeeper.
>  
> From https://issues.apache.org/jira/browse/FLINK-30305 it looks to be 
> expected that the HA data has been deleted (as it is also deleted by Flink 
> when relying on savepoint upgrade mode).
> Still, the use case seems to differ from 
> https://issues.apache.org/jira/browse/FLINK-30305 as the operator is aware of 
> the failure and treats it as a specific rollback event.
> So I'm wondering why we enforce such a check when performing a rollback if we 
> rely on savepoint upgrade mode. Would it be fine not to rely on the HA data 
> and to roll back from the last savepoint (the one used in the deployment step)?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
