Hi, in ticket #1290 BAD_HANDLE, one comment was that preferably should immutil be updated to handle BAD_HANDLE. Probably immutil should also handle other errors, e.g. ERR_TIMEOUT, as for this ticket #1360.
/Thanks HansN -----Original Message----- From: Anders Björnerstedt Sent: den 27 maj 2015 09:52 To: Nagendra Kumar; Hans Nordebäck; Praveen Malviya Cc: [email protected] Subject: RE: [devel] [PATCH 1 of 1] amfd: try again if ImplementerClear times out Hi again, I understand that this case is apparently due to message loss IMMND->IMMD. Since change-set: changeset: 3523:5228e50660b4 user: Anders Bjornerstedt <[email protected]> date: Fri Apr 27 15:02:01 2012 +0200 summary: Immsv: IMMND subscribes (mds) in normal mode on IMMD (#2446) IMMND subscribes to IMMD sv-id in NORMAL mode, i.e. not REDUNDANT mode. MDS will then supposedly take care of buffering messages that can not be delivered due to absence of IMMD until the IMMD is back. The IMMD itself MBCPs fevs mesasages (which ImplementerCLear is) to standby before sending. So this message loss could only happen if the IMMD goes down after having received the request, but before having MBCP'ed the request. So this should be very rare indeed. The fix in itself is safe though. I worry about the followingthough. Assume that instead of ImplementerClear, the AMf simply closes the OI handle and re-opens it. This should have exactly the same effect. A close of handle will be registgered and acceptet by the local IMMND and the local IMMND will close the implementer by generating that message internally to the IMMND. If that message gets lost, then the AMFD would here actually get Ok, but we would possibly have an "orphaned" OI. I have not seen one such case though. /AndersBj -----Original Message----- From: Anders Björnerstedt [mailto:[email protected]] Sent: den 27 maj 2015 09:23 To: Nagendra Kumar; Hans Nordebäck; Praveen Malviya Cc: [email protected] Subject: Re: [devel] [PATCH 1 of 1] amfd: try again if ImplementerClear times out Ok, that should be a very rare error scenario, but in this case for this particular operation (ImplementerClear) which is an idempotent operation, then repeating the request should work and be safe. /AndersBj -----Original Message----- From: Nagendra Kumar [mailto:[email protected]] Sent: den 27 maj 2015 08:35 To: Hans Nordebäck; Praveen Malviya Cc: [email protected] Subject: Re: [devel] [PATCH 1 of 1] amfd: try again if ImplementerClear times out Hi Hans N/Anders Bj, Thanks for your review. >>> A log record can be written before the "goto try_again". I would like to get logs record for BAD_HANDLE and other types of errors also. That's why the log record is on top common for all the errors. >>> An alternative to try_again can be to change IMMA_SYNCR_TIMEOUT from the >>> default 10 seconds to a higher value. In the scenario of the ticket(specific), Immd has gone down and the packet has been lost along with Immd and hence increase of wait time will increase the HA unavailability.) So, my thought was it is ok for default wait time for 10 seconds, and try again so that local Immd will respond, hence quicker recovery. Thanks -Nagu > -----Original Message----- > From: Hans Nordebäck [mailto:[email protected]] > Sent: 26 May 2015 18:15 > To: Nagendra Kumar; Praveen Malviya > Cc: [email protected] > Subject: Re: [PATCH 1 of 1] amfd: try again if ImplementerClear times > out > > ack, code review only. Some comments below discussed with AndersBj. > > A log record can be written before the "goto try_again". An > alternative to try_again can be to change IMMA_SYNCR_TIMEOUT from the > default 10 seconds to a higher value. > > /Thanks HansN > > On 05/12/2015 10:44 AM, [email protected] wrote: > > osaf/services/saf/amf/amfd/role.cc | 9 ++++++--- > > 1 files changed, 6 insertions(+), 3 deletions(-) > > > > > > ImplementerClear returns SA_AIS_ERR_TIMEOUT because the message sent > > from immnd to immd got lost because of node down. > > So, Amf need to handle it as try_again. > > After Amf does try_again, Imm returns SA_AIS_ERR_BAD_HANDLE. > > Amf handles SA_AIS_ERR_BAD_HANDLE and reinitializes its interface > > with > Imm. > > In handling SA_AIS_ERR_BAD_HANDLE, Amf sets the implementer in > avd_imm_reinit_bg_thread, so > > when Amf proceeds and calls avd_imm_applier_set, Imm returns > > SA_AIS_ERR_INVALID_PARAM. Since, it is expected error from Imm, Amf > > ignores it. > > > > diff --git a/osaf/services/saf/amf/amfd/role.cc > b/osaf/services/saf/amf/amfd/role.cc > > --- a/osaf/services/saf/amf/amfd/role.cc > > +++ b/osaf/services/saf/amf/amfd/role.cc > > @@ -659,15 +659,17 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb, > > _exit(EXIT_FAILURE); // should never get here... > > } > > > > +try_again: > > /* Take mutex here to sync with imm reinit thread.*/ > > osaf_mutex_lock_ordie(&imm_reinit_mutex); > > - > > /* Give up IMM OI implementer role */ > > if ((rc = immutil_saImmOiImplementerClear(cb->immOiHandle)) != > SA_AIS_OK) { > > osaf_mutex_unlock_ordie(&imm_reinit_mutex); > > LOG_ER("FAILOVER Active --> Quiesced FAILED, > ImplementerClear failed %u", rc); > > if (rc == SA_AIS_ERR_BAD_HANDLE) { > > avd_imm_reinit_bg(); > > + } else if (rc == SA_AIS_ERR_TIMEOUT) { > > + goto try_again; > > } else > > osafassert(0); > > } else > > @@ -685,10 +687,11 @@ void avd_mds_qsd_role_evh(AVD_CL_CB *cb, > > LOG_ER("avd_imm_applier_set FAILED, %u", rc); > > if (rc == SA_AIS_ERR_BAD_HANDLE) { > > avd_imm_reinit_bg(); > > - } else if (rc == SA_AIS_ERR_EXIST) { > > + } else if ((rc == SA_AIS_ERR_EXIST) || (rc == > SA_AIS_ERR_INVALID_PARAM)) { > > /* This may arise if > immutil_saImmOiImplementerClear > > failed and amf reinitializes imm interface and > > - set applier in avd_imm_reinit_bg_thread.*/ > > + set applier in avd_imm_reinit_bg_thread. Imm may > > + return ERR_EXIST or INVALID_PARAM. */ > > } else > > osafassert(0); > > } else > ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/opensaf-devel
