Hi Neel,

I can see a problem here. If a timeout occurs there are two possibilities: the SI-swap has been done, or the SI-swap has not been done.

From the AIS spec:
"SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred, or the timeout, specified by the timeout parameter, occurred before the call could complete. It is unspecified whether the call succeeded or whether it did not."

If the SI-swap did succeed, SMF must also have answered the CSI set callback that handles it. This means that this process is no longer ACTIVE and cannot be allowed to handle the campaign any longer, e.g. to fail the campaign. Is this handled correctly as is?

It should be possible to check the smfd_cb->ha_state enum to determine whether we are still ACTIVE. If not, this process must end all campaign handling and become standby. A problem here is that I don't think this is handled. Also, if this check is done, it would be possible to try again after a timeout if we are still ACTIVE.

So even if your solution is implemented, a timeout error still requires a check of whether we are active or not: if we are still active the campaign should fail, but if we are standby nothing should be done.

Thanks
Lennart

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: 27 September 2016 09:21
> To: Lennart Lund <[email protected]>; Rafael Odzakow <[email protected]>
> Cc: [email protected]
> Subject: [PATCH 1 of 1] smf:retry for TIMEOUT in si-swap to be avoided [#2069]
>
>  osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc |  4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
>
>
> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> @@ -4247,7 +4247,6 @@ SmfSwapThread::main(void)
>  		int rc = admOp.execute(0);
>  		while ((rc == SA_AIS_ERR_TRY_AGAIN) ||
>  		       (rc == SA_AIS_ERR_BUSY) ||
> -		       (rc == SA_AIS_ERR_TIMEOUT) ||
>  		       (rc == SA_AIS_ERR_FAILED_OPERATION)) {
>
>  			if (retryCnt > max_swap_retry) {
> @@ -4255,8 +4254,7 @@ SmfSwapThread::main(void)
>  				goto exit_error;
>  			}
>
> -			if ((rc == SA_AIS_ERR_TIMEOUT) ||
> -			    (rc == SA_AIS_ERR_FAILED_OPERATION)) {
> +			if (rc == SA_AIS_ERR_FAILED_OPERATION) {
>  				//A timeout or failed operation occur. It is undefined if the operation was successful or not.
>  				//We wait for maximum two minutes to see if the campaign thread is terminated (which it is in a successful swap)
>  				//If not terminated, retry the SWAP operation.
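
For reference, the timeout handling suggested in the reply above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual SMF code: HaState, TimeoutAction, onSwapTimeout and exampleUsage are hypothetical names, and a real implementation would read the state from smfd_cb->ha_state and act on the SA_AIS_ERR_TIMEOUT returned by admOp.execute().

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only. The real code would use
// the SaAisErrorT return code and the HA state kept in smfd_cb->ha_state.
enum class HaState { Active, Standby };

enum class TimeoutAction {
  FailCampaign,         // still ACTIVE: this process still owns the campaign
  StopCampaignHandling  // no longer ACTIVE: leave the campaign untouched
};

// Decide what to do after the SI-swap admin operation returned
// SA_AIS_ERR_TIMEOUT, where it is unspecified whether the swap succeeded.
constexpr TimeoutAction onSwapTimeout(HaState ha) {
  return ha == HaState::Active ? TimeoutAction::FailCampaign
                               : TimeoutAction::StopCampaignHandling;
}

// Small usage example covering both outcomes of the check.
inline void exampleUsage() {
  assert(onSwapTimeout(HaState::Active) == TimeoutAction::FailCampaign);
  assert(onSwapTimeout(HaState::Standby) == TimeoutAction::StopCampaignHandling);
}
```

If a retry after timeout is preferred while still ACTIVE, as also mentioned above, the Active branch would return a retry action instead of failing the campaign.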
