Hi Neel

I can see a problem here. If a timeout occurs there are two possibilities: the 
SI-swap has been done or the SI-swap has not been done.
From the AIS spec:
"SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred, or the
timeout, specified by the timeout parameter, occurred before the call could 
complete.
It is unspecified whether the call succeeded or whether it did not."

If the SI-swap did succeed, SMF must also have answered the CSI set callback 
that handles it. This means that this process is no longer ACTIVE and cannot be 
allowed to handle the campaign any longer, e.g. to fail the campaign. Is this 
handled correctly as is?

It should be possible to check the smfd_cb->ha_state enum to determine if we 
are still ACTIVE. If not, this process must end all campaign handling and become 
standby. A problem here is that I don't think this is handled today. Also, if 
this check is done, it would be possible to try again after a timeout as long 
as we are still ACTIVE.

So even if your solution is implemented, a timeout error still means we must 
check whether we are active or not: if we are still active the campaign should 
fail, but if we are standby nothing should be done.
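
To make the idea concrete, here is a rough sketch of the decision I have in 
mind. It is not SMF code: the enums are local stand-ins for the AIS return 
codes and for the state read from smfd_cb->ha_state, and decideAfterSwap(), 
SwapAction and the retryOnTimeout flag are hypothetical names used only for 
illustration.

```cpp
// Sketch only. Local stand-ins so the example compiles without OpenSAF
// headers; real code would use the SA_AIS_ERR_* codes and smfd_cb->ha_state.
enum class SwapRc { Ok, TryAgain, Busy, Timeout, FailedOperation };
enum class HaState { Active, Standby };

// Hypothetical set of outcomes for the swap thread (not part of SMF).
enum class SwapAction { Done, Retry, FailCampaign, StopCampaignHandling };

// Decide what to do after the SI-swap admin operation has returned.
// retryOnTimeout is an assumed policy switch: false matches the patch
// (no retry on TIMEOUT), true allows a retry while we are still ACTIVE.
constexpr SwapAction decideAfterSwap(SwapRc rc, HaState ha, int retryCnt,
                                     int maxRetry, bool retryOnTimeout) {
  if (rc == SwapRc::Ok) {
    return SwapAction::Done;
  }
  if (rc == SwapRc::Timeout) {
    // The spec leaves the outcome unspecified, so look at our own HA state.
    if (ha != HaState::Active) {
      // The swap took effect: this process must not touch the campaign.
      return SwapAction::StopCampaignHandling;
    }
    // Still ACTIVE: the swap did not happen (yet).
    if (retryOnTimeout && retryCnt < maxRetry) {
      return SwapAction::Retry;
    }
    return SwapAction::FailCampaign;
  }
  // TRY_AGAIN, BUSY and FAILED_OPERATION stay in the retry loop, as in the
  // existing code, until the retry budget is used up.
  if (retryCnt < maxRetry) {
    return SwapAction::Retry;
  }
  return SwapAction::FailCampaign;
}
```

The point of keeping the HA check ahead of any retry or fail decision is that 
a process that has already become standby never fails or retries the campaign.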

Thanks
Lennart


> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: den 27 september 2016 09:21
> To: Lennart Lund <[email protected]>; Rafael Odzakow
> <[email protected]>
> Cc: [email protected]
> Subject: [PATCH 1 of 1] smf:retry for TIMEOUT in si-swap to be avoided
> [#2069]
> 
>  osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc |  4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
> @@ -4247,7 +4247,6 @@ SmfSwapThread::main(void)
>       int rc = admOp.execute(0);
>       while ((rc == SA_AIS_ERR_TRY_AGAIN) ||
>                (rc == SA_AIS_ERR_BUSY) ||
> -              (rc == SA_AIS_ERR_TIMEOUT) ||
>                (rc == SA_AIS_ERR_FAILED_OPERATION)) {
> 
>                  if (retryCnt > max_swap_retry) {
> @@ -4255,8 +4254,7 @@ SmfSwapThread::main(void)
>                          goto exit_error;
>                  }
> 
> -                if ((rc == SA_AIS_ERR_TIMEOUT) ||
> -                    (rc == SA_AIS_ERR_FAILED_OPERATION)) {
> +                if (rc == SA_AIS_ERR_FAILED_OPERATION) {
>                          //A timeout or failed operation occur. It is undefined if the operation was successful or not.
>                          //We wait for maximum two minutes to see if the campaign thread is terminated (which it is in a successful swap)
>                          //If not terminated, retry the SWAP operation.

