Hi Lennart,
On 2016/09/27 02:57 PM, Lennart Lund wrote:
> Hi Neel
>
> I can see a problem here. If a timeout occurs there are two possibilities:
> either the SI-swap has been done or the SI-swap has not been done.
> From AIS spec:
> "SA_AIS_ERR_TIMEOUT - An implementation-dependent timeout occurred, or the
> timeout, specified by the timeout parameter, occurred before the call could
> complete.
> It is unspecified whether the call succeeded or whether it did not."
>
> If the SI-swap did succeed, SMF must also have answered the CSI set callback
> for handling this. This means that this process is no longer ACTIVE and
> cannot be allowed to handle the campaign any longer, e.g. to fail the campaign.
> Is this handled correctly as is?
In the case of TIMEOUT, the following is checked:
    while (SmfCampaignThread::instance() != NULL) {
When SMF receives the Quiesced state, campaign_oi_deactivate is called,
which terminates SmfCampaignThread.
However, the retry is still performed even after SmfCampaignThread has terminated.
The call rc = admOp.execute(0); must therefore be guarded with a check of
SmfCampaignThread::instance().
I will resend the patch.
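Roughly what I have in mind is sketched below. It is a standalone
illustration and not the actual patch: SwapRc, SwapOutcome, swapWithGuard and
the two callbacks are made-up stand-ins for the SaAisErrorT codes,
admOp.execute(0) and SmfCampaignThread::instance() != NULL, and the
two-minute wait from the existing code is left out.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the SaAisErrorT return codes used in the patch.
enum class SwapRc { Ok, TryAgain, Busy, Timeout, FailedOperation, Other };

// Hypothetical result type, only used by this sketch.
enum class SwapOutcome { Done, CampaignThreadGone, RetriesExhausted, Failed };

static bool isRetryable(SwapRc rc) {
  return rc == SwapRc::TryAgain || rc == SwapRc::Busy ||
         rc == SwapRc::Timeout || rc == SwapRc::FailedOperation;
}

// execute:             stand-in for admOp.execute(0)
// campaignThreadAlive: stand-in for SmfCampaignThread::instance() != NULL
SwapOutcome swapWithGuard(const std::function<SwapRc()>& execute,
                          const std::function<bool()>& campaignThreadAlive,
                          int maxRetry) {
  assert(maxRetry >= 0);
  SwapRc rc = execute();
  int retryCnt = 0;
  while (isRetryable(rc)) {
    // Guard: if the campaign thread has been terminated, the swap has
    // taken effect on this node, so the operation must not be re-issued.
    if (!campaignThreadAlive()) {
      return SwapOutcome::CampaignThreadGone;
    }
    if (retryCnt >= maxRetry) {
      return SwapOutcome::RetriesExhausted;
    }
    ++retryCnt;
    rc = execute();
  }
  return rc == SwapRc::Ok ? SwapOutcome::Done : SwapOutcome::Failed;
}
```

The point is only that the campaign thread check comes before every re-try
of execute(), so a TIMEOUT after a successful swap does not trigger a second
swap.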
Thanks,
Neel.
> It should be possible to check the smfd_cb->ha_state enum to determine if we
> are still ACTIVE. If not this process must end all campaign handling and
> become standby. A problem here is that I don't think this is handled. Also if
> this check is done it would be possible to try again in case of timeout if we
> are still ACTIVE.
>
> So even if your solution is implemented it means that if there is a timeout
> error it must still be checked whether we are active or not and if we are
> still active the campaign should fail but if we are standby nothing should be
> done.
>
> Thanks
> Lennart
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> Sent: den 27 september 2016 09:21
>> To: Lennart Lund <[email protected]>; Rafael Odzakow
>> <[email protected]>
>> Cc: [email protected]
>> Subject: [PATCH 1 of 1] smf:retry for TIMEOUT in si-swap to be avoided
>> [#2069]
>>
>> osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc | 4 +---
>> 1 files changed, 1 insertions(+), 3 deletions(-)
>>
>>
>> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc
>> @@ -4247,7 +4247,6 @@ SmfSwapThread::main(void)
>> int rc = admOp.execute(0);
>> while ((rc == SA_AIS_ERR_TRY_AGAIN) ||
>> (rc == SA_AIS_ERR_BUSY) ||
>> - (rc == SA_AIS_ERR_TIMEOUT) ||
>> (rc == SA_AIS_ERR_FAILED_OPERATION)) {
>>
>> if (retryCnt > max_swap_retry) {
>> @@ -4255,8 +4254,7 @@ SmfSwapThread::main(void)
>> goto exit_error;
>> }
>>
>> - if ((rc == SA_AIS_ERR_TIMEOUT) ||
>> - (rc == SA_AIS_ERR_FAILED_OPERATION)) {
>> + if (rc == SA_AIS_ERR_FAILED_OPERATION) {
>> //A timeout or failed operation occur. It is undefined if the operation was successful or not.
>> //We wait for maximum two minutes to see if the campaign thread is terminated (which it is in a successful swap)
>> //If not terminated, retry the SWAP operation.