Hi Minh,

Good catch. I will update in V3.

-----Original Message-----
From: Minh Hon Chau <minh.c...@dektech.com.au>
Sent: Tuesday, May 18, 2021 9:44 AM
To: Thang Duc Nguyen <thang.d.ngu...@dektech.com.au>; Thanh Nguyen <thanh.ngu...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1/1] smf: enhance smf to handle timeout in one step upgrade [#3262]

Hi Thang,

Please see comment inline.

Thanks
Minh

On 18/5/21 12:37 pm, thang.d.nguyen wrote:
> In a one-step upgrade, a timeout can happen while locking the node group,
> which causes the upgrade to fail. Retry on timeout; if
> saImmOmAdminOperationInvoke_2() then returns SA_AIS_ERR_NO_OP, the lock
> is considered successful.
> ---
>  src/smf/smfd/SmfAdminState.cc | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/src/smf/smfd/SmfAdminState.cc b/src/smf/smfd/SmfAdminState.cc
> index 90ae093c4..a75bed798 100755
> --- a/src/smf/smfd/SmfAdminState.cc
> +++ b/src/smf/smfd/SmfAdminState.cc
> @@ -930,6 +930,16 @@ bool SmfAdminStateHandler::nodeGroupAdminOperation(
>            (imm_rc == SA_AIS_OK && oi_rc == SA_AIS_ERR_TRY_AGAIN)) {
>          base::Sleep(base::MillisToTimespec(2000));
>          continue;
> +      } else if (imm_rc == SA_AIS_ERR_TIMEOUT) {
> +        // Retry
> +        continue;
> +      } else if (imm_rc == SA_AIS_ERR_NO_OP) {
> +        // If an admin operation is already performed SA_AIS_ERR_NO_OP
> +        // is returned. Treat this as OK, just log it and return
> +        // operation success
> +        LOG_NO("Admin op [%d] on [%s], return SA_AIS_ERR_NO_OP,"
> +               "treated as OK", adminOp, nodeGroupName_s.c_str());
> +        break;
>        } else if (imm_rc != SA_AIS_OK) {
>          LOG_NO(
>              "%s: saImmOmAdminOperationInvoke_2 Fail %s",
> @@ -948,7 +958,8 @@ bool SmfAdminStateHandler::nodeGroupAdminOperation(
>        }
>      }
>      if (adminOpTimer.is_timeout()) {
> -      if ((imm_rc == SA_AIS_OK) && (oi_rc == SA_AIS_OK)) {
> +      if (((imm_rc == SA_AIS_OK) && (oi_rc == SA_AIS_OK)) ||
> +          (imm_rc == SA_AIS_ERR_NO_OP)) {
>          // Timeout is passed but operation is ok. This is Ok
>          method_rc = true;

[M] What if adminOpTimer has not timed out and you get ERR_NO_OP? Then
method_rc is not set to true.

>        } else if (imm_rc != SA_AIS_OK) {

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
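
To make the [M] comment above concrete, here is a small standalone model of the retry loop. It is only a sketch under assumptions: `Rc`, `RunAdminOp`, `CheckScenarios` and the `set_rc_in_no_op_branch` flag are hypothetical names, the enum values are stand-ins and not the real SA_AIS_* constants, oi_rc is left out, and the success path (setting method_rc on SA_AIS_OK inside the loop) is assumed because the patch context does not show it. Setting method_rc in the NO_OP branch is one possible way to close the gap, not necessarily what V3 will do.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the SA_AIS_* return codes (not the real values).
enum class Rc { kOk, kTryAgain, kTimeout, kNoOp, kFailed };

// Simplified model of the loop in nodeGroupAdminOperation().
// replies: the imm_rc values a mocked invoke call returns, in order.
// timer_expired: models adminOpTimer.is_timeout() after the loop.
// set_rc_in_no_op_branch: when true, applies one possible fix.
bool RunAdminOp(const std::vector<Rc>& replies, bool timer_expired,
                bool set_rc_in_no_op_branch) {
  Rc imm_rc = Rc::kFailed;
  bool method_rc = false;
  for (Rc reply : replies) {
    imm_rc = reply;
    if (imm_rc == Rc::kTryAgain || imm_rc == Rc::kTimeout) {
      continue;  // Retry, as in the patch.
    } else if (imm_rc == Rc::kNoOp) {
      // The operation was already performed; the patch only breaks here.
      if (set_rc_in_no_op_branch) method_rc = true;
      break;
    } else if (imm_rc != Rc::kOk) {
      return false;  // Any other error fails the operation.
    }
    method_rc = true;  // Assumed success path, not shown in the patch.
    break;
  }
  if (timer_expired) {
    // Check as written in the patch: NO_OP counts as OK only on timeout.
    if (imm_rc == Rc::kOk || imm_rc == Rc::kNoOp) method_rc = true;
  }
  return method_rc;
}

// Walks through the scenario raised in the review comment.
void CheckScenarios() {
  const std::vector<Rc> replies = {Rc::kTimeout, Rc::kNoOp};
  assert(!RunAdminOp(replies, /*timer_expired=*/false, false));  // the gap
  assert(RunAdminOp(replies, /*timer_expired=*/true, false));
  assert(RunAdminOp(replies, /*timer_expired=*/false, true));    // with fix
}
```

In this model, the first call in CheckScenarios() is the case from the comment: the retry returns NO_OP before the timer expires, the loop breaks, and the function reports failure even though the lock is in place.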