Hi Minh,

>>If there's no any major problem, can we make SI Dep as last phase?
Yes, absolutely. There is no problem.

>>If I am right, I think you are testing @2.a) - and *fault* has just been as
>>node reboot/powered-off by user during headless.
Yes, you are right.

Thanks
-Nagu

> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 14 September 2016 17:54
> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya; gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if csi completes during headless [#1725 part 1] V1
>
> Hi Nagu,
>
> I have proposed to change the order on 28 Jul:
>
> ==============
>
> I would like to change the above orders of implementation:
> @0. We are here now: No admin op continuation, no recovery on faults during headless.
> Since componentRestart/suRestart has no impact on recovery after headless, faults during headless here mean: failover escalation, node reboot/powered-off by user during headless. Faults are different phenomenons but they all result in loss of SUSI. Having #1902 will remove the major impact of a node reboot due to immediate escalation and AMF also has to deal with the loss of SUSI the same as without #1902 plus failover escalation
>
> @1. Admin op continuation without required recovery on faults during headless
> @1.a) All CSI(s) callback completes during headless, but SUSI states are still QUIESCED/QUIESCING
> @1.b) One of CSI(s) callback is still ongoing after headless (AMFD would have to wait for it?)
>
> @2. Recovery on faults. (Doing fault recovery needs to consider admin op continuation which would have been implemented in step @1) Need #1902
> @2.a.) Faults in normal flow: No admin op continuation is required after headless, but fault did happen during headless
> @2.b.) Faults happen during admin operation while headless, after headless AMFD needs to consider a recovery on fault together with admin op continuation.
>
> @3. @1 + @2 + With SI Dep.
>
> ===============
> I thought we have followed the above order so far? Because part 1 was
> acked, which is "@1.
> Admin op continuation without required recovery on faults during headless"
> If there's no any major problem, can we make SI Dep as last phase?
> If I am right, I think you are testing @2.a) - and *fault* has just been as
> node reboot/powered-off by user during headless.
>
> Thanks,
> Minh
>
> On 14/09/16 21:48, Nagendra Kumar wrote:
> > Hi Minh,
> > If it is not tested, then it is fine. But, we had added (#1) the following in the ticket #1725 on 27 Jul :
> >
> > ===========================================
> > Nagendra Kumar - 2016-07-27
> >
> > For 2N red model, implementation can be done in the following phased manner.
> > It has advantages of being logically segregated and it continues from where we left in 5.0.
> > (Phases #1, #2 and #3 is more related to ticket #1725 and phases #4
> > and #5 are related to #1902)
> >
> > 1. Node restart escalation (with and without SI Dep).
> > 2. Without Si Dep : Admin op (no faults/escalations).
> > 3. Without Si Dep : Admin Op + node restart faults/escalations during headless.
> > 4. Without Si Dep :
> > a.) All faults in normal flows.
> > b.) All faults during admin operation(minus node reboot during headless as covered in #3).
> > 5. With Si Dep : #2, #3 and #4.
> >
> > Since 5.0 already has immediate escalation model (component and node restart/reboot), so #1, #2 and #3 completes left over portion of headless contribution in 5.0 with that model.
> > ======================================
> >
> > Thanks
> > -Nagu
> >
> >> -----Original Message-----
> >> From: minh chau [mailto:minh.c...@dektech.com.au]
> >> Sent: 14 September 2016 17:05
> >> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya; gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >> Cc: opensaf-devel@lists.sourceforge.net
> >> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if csi completes during headless [#1725 part 1] V1
> >>
> >> Hi Nagu,
> >>
> >> SI Dep is the last phase of implementation of headless recovery, its
> >> support is not included in all patches attached in ticket #1725.
> >>
> >> Thanks,
> >> Minh
> >>
> >> On 14/09/16 21:21, Nagendra Kumar wrote:
> >>> Hi Minh,
> >>> Have you tested Si Dep (2N Red model) for "node restart test cases" ? I can't see it in the test case doc.
> >>> Thanks
> >>> -Nagu
> >>>
> >>>> -----Original Message-----
> >>>> From: Nagendra Kumar
> >>>> Sent: 13 September 2016 11:20
> >>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya; gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if csi completes during headless [#1725 part 1] V1
> >>>>
> >>>> Hi Minh,
> >>>> I have tested these scenarios again and it works well.
> >>>>
> >>>> Thanks
> >>>> -Nagu
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>> Sent: 12 September 2016 11:53
> >>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya; gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>> Subject: Re: [PATCH 2 of 4] AMFND: Admin operation continuation if csi completes during headless [#1725 part 1] V1
> >>>>>
> >>>>> Hi Nagu,
> >>>>>
> >>>>> One bug get hit by your configuration, where the absent SUSIs are
> >>>>> found after headless but no real SUSIs are available also. In this
> >>>>> case I think that AMFD can do like a fresh assignment.
> >>>>> I attach the patch to ticket #1725, please help to test again.
> >>>>>
> >>>>> Thanks,
> >>>>> Minh
> >>>>>
> >>>>> On 12/09/16 11:09, minh chau wrote:
> >>>>>> Hi Nagu,
> >>>>>>
> >>>>>> I'm running the tests with this configuration and will get back to you.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Minh
> >>>>>>
> >>>>>> On 09/09/16 22:26, Nagendra Kumar wrote:
> >>>>>>> Hi Minh,
> >>>>>>> I am using 1725_pending_review.tgz
> >>>>>>> (1725_02_V2_bugfix_01_resend_buffer_in_set_leds.diff,
> >>>>>>> 1725_02_V2_bugfix_02_honor_clusterinit_nodesync_timer.diff,
> >>>>>>> 1725_02_V2_bugfix_03_restore_ng_admin.diff,
> >>>>>>> 1725_03_V4_failover_absent_susi_longDn.diff,
> >>>>>>> 1725_04_V2_headless_validation.diff,
> >>>>>>> 1725_05_V2_resend_oper_state.diff,
> >>>>>>> 1725_06a_fullscope_escalation_headless.diff).
> >>>>>>>
> >>>>>>> I am doing basic node reboot validation testing with no faults.
> >>>>>>>
> >>>>>>> Configuration: SU1(act) and SU2(stanby) both on PL-3.
> >>>>>>>
> >>>>>>> TC #1: Start SC-1, PL-3 and PL-5: Unlock SU1 and SU2. Stop SC-1
> >>>>>>> and stop PL-3, start PL-3 and start SC-1.
> >>>>>>> After SC-1 and PL-3 comes back, ideally SU1 and SU2 should get
> >>>>>>> assignments as Act and Std, but no assignment are being given to
> >>>>>>> SUs on PL-3 and it shows following in status:
> >>>>>>>
> >>>>>>> Only Su2 has Std assignment.
> >>>>>>>
> >>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> >>>>>>> saAmfSISUHAState=STANDBY(2)
> >>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>>
> >>>>>>> TC #2: Configuration same as TC#1. Stop PL-3 and don't start.
> >>>>>>> The same issue:
> >>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> >>>>>>> saAmfSISUHAState=STANDBY(2)
> >>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >>>>>>> saAmfSISUHAState=ACTIVE(1)
> >>>>>>>
> >>>>>>> TC #3: Configured SU1(Act) on PL-3 and SU2(Std) on PL-4.
> >>>>>>> Stop SC-1, stop PL-3 and PL-4, but PL-5 is running. start SC-1,
> >>>>>>> the same issue.
> >>>>>>>
> >>>>>>> TC #4: Same as TC #3, but SU3 configured on PL-5 as spare.
> >>>>>>> SU3 doesn't get any assignment and Sg is unstable.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> -Nagu
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> >>>>>>>> Sent: 18 August 2016 05:46
> >>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya; gary....@dektech.com.au; long.hb.ngu...@dektech.com.au; minh.c...@dektech.com.au
> >>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>> Subject: [PATCH 2 of 4] AMFND: Admin operation continuation if csi completes during headless [#1725 part 1] V1
> >>>>>>>>
> >>>>>>>> osaf/services/saf/amf/amfnd/di.cc | 199 +++++++++++++++++--------
> >>>>>>>> osaf/services/saf/amf/amfnd/include/avnd_di.h | 1 +
> >>>>>>>> 2 files changed, 134 insertions(+), 66 deletions(-)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> There're two options basically that AMFD can continue admin
> >>>>>>>> operation wih completed csi(s)
> >>>>>>>>
> >>>>>>>> First: AMFD can use the sync SUSI fsm state as latest, AMFD
> >>>>>>>> then has to explore its SUSI assignments with adminStates of
> >>>>>>>> relevant entities to determine which SU should be on call of susi_success().
> >>>>>>>> Deeper level of exploration for csi addition. It also depends
> >>>>>>>> on SG Fsm state which is being used variously in different SG types.
> >>>>>>>>
> >>>>>>>> Second: AMFD uses the SUSI fsm state read from IMM as latest,
> >>>>>>>> and AMFND needs to resend susi_resp messages which were
> >>>>>>>> deferred during headless so that AMFD can continue the admin
> >>>>>>>> operation sequence.
> >>>>>>>> Both cases of csi completion [during or after] headless can run
> >>>>>>>> in the same code flow.
> >>>>>>>>
> >>>>>>>> The patch buffers susi_resp_msg during headless stage and
> >>>>>>>> resend it to AMFD after headless.
There could be a chance that > >>>>>>>> AMFND sent out susi response message but AMFD could not > receive > >>>>>>>> or process it. This case could be seen as a defect, which can > >>>>>>>> be fixed by securing the result of sending susi_resp message > >>>>>>>> from AMFND toward > >>>> AMFD. > >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc > >>>>>>>> b/osaf/services/saf/amf/amfnd/di.cc > >>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc > >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc > >>>>>>>> @@ -805,11 +805,6 @@ uint32_t > avnd_di_susi_resp_send(AVND_CB > >>>>>>>> if (cb->term_state == > >>>>>>>> AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED) > >>>>>>>> return rc; > >>>>>>>> > >>>>>>>> - if (cb->is_avd_down == true) { > >>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); > >>>>>>>> - return rc; > >>>>>>>> - } > >>>>>>>> - > >>>>>>>> // should be in assignment pending state to be here > >>>>>>>> osafassert(m_AVND_SU_IS_ASSIGN_PEND(su)); > >>>>>>>> > >>>>>>>> @@ -820,64 +815,76 @@ uint32_t > >> avnd_di_susi_resp_send(AVND_CB > >>>>>>>> TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, > >>>>>>>> prv_state=%u", su->name.value, curr_si->name.value,curr_si- > >>>>>>>>> curr_state,curr_si->prv_state); > >>>>>>>> /* populate the susi resp msg */ > >>>>>>>> msg.info.avd = new AVSV_DND_MSG(); > >>>>>>>> - msg.type = AVND_MSG_AVD; > >>>>>>>> - msg.info.avd->msg_type = > >>>> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb- > >>>>>>>>> snd_msg_id); > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- > >>>>>>>>> node_info.nodeId; > >>>>>>>> - if (si) { > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.single_csi = > >>>>>>>> - ((si->single_csi_add_rem_in_si == > >>>>>>>> AVSV_SUSI_ACT_BASE) ? 
> >>>>>>>> false : true); > >>>>>>>> - } > >>>>>>>> - TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > >>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || > >>>>>>>> - m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? > >>>>>>>> - ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN : > >>>>>>>> AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL; > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.su_name = su- > >>> name; > >>>>>>>> - if (si) { > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name = si->name; > >>>>>>>> - if (AVSV_SUSI_ACT_ASGN == > >>>>>>>> si->single_csi_add_rem_in_si) { > >>>>>>>> - TRACE("si->curr_assign_state '%u'", curr_si- > >>>>>>>>> curr_assign_state); > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > >>>>>>>> - > >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || > >>>>>>>> - > >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ? > >>>>>>>> - AVSV_SUSI_ACT_ASGN : > >>>>>>>> AVSV_SUSI_ACT_DEL; > >>>>>>>> - } > >>>>>>>> - } > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.ha_state = > >>>>>>>> - (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? > >>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error = > >>>>>>>> - (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) || > >>>>>>>> - m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ? > >>>>>>>> NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; > >>>>>>>> + msg.type = AVND_MSG_AVD; > >>>>>>>> + msg.info.avd->msg_type = > >> AVSV_N2D_INFO_SU_SI_ASSIGN_MSG; > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb- > >>>>>>>>> node_info.nodeId; > >>>>>>>> + if (si) { > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.single_csi = > >>>>>>>> + ((si->single_csi_add_rem_in_si == > >>>>>>>> AVSV_SUSI_ACT_BASE) ? 
false : true); > >>>>>>>> + } > >>>>>>>> + TRACE("curr_assign_state '%u'", curr_si->curr_assign_state); > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act = > >>>>>>>> + > >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > || > >>>>>>>> + > >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) > ? > >>>>>>>> + ((!curr_si->prv_state) ? > >>>>>>>> AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : > >>>> AVSV_SUSI_ACT_DEL; > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.su_name = su- > >name; > >>>>>>>> + if (si) { > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name = si- > >>>>>>>>> name; > >>>>>>>> + if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) { > >>>>>>>> + TRACE("si->curr_assign_state '%u'", curr_si- > >>>>>>>>> curr_assign_state); > >>>>>>>> + msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.msg_act = > >>>>>>>> + > >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > || > >>>>>>>> + > >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) > ? > >>>>>>>> + AVSV_SUSI_ACT_ASGN : > >>>>>>>> AVSV_SUSI_ACT_DEL; > >>>>>>>> + } > >>>>>>>> + } > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.ha_state = > >>>>>>>> + (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? > >>>>>>>> SA_AMF_HA_QUIESCED : curr_si->curr_state; > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error = > >>>>>>>> + > >>>>>>>> (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) > || > >>>>>>>> + > >>>>>>>> m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) > ? > >>>>>>>> +NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE; > >>>>>>>> > >>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == > >>>>>>>> AVSV_SUSI_ACT_ASGN) > >>>>>>>> - osafassert(si); > >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == > >>>>>>>> AVSV_SUSI_ACT_ASGN) > >>>>>>>> + osafassert(si); > >>>>>>>> > >>>>>>>> - /* send the msg to AvD */ > >>>>>>>> - TRACE("Sending. 
msg_id'%u', node_id'%u', msg_act'%u', > >>>>>>>> su'%s', si'%s', > >>>>>>>> ha_state'%u', error'%u', single_csi'%u'", > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_id, msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.node_id, > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.msg_act, > >>>>>>>> msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, > >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, > >>>>>>>> - msg.info.avd->msg_info.n2d_su_si_assign.error, msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.single_csi); > >>>>>>>> + /* send the msg to AvD */ > >>>>>>>> + TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', > >>>>>>>> + su'%s', > >>>>>>>> si'%s', ha_state'%u', error'%u', single_csi'%u'", > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id, > >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.node_id, > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_act, > >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name.value, > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, > >>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.ha_state, > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.error, > >>>>>>>> +msg.info.avd->msg_info.n2d_su_si_assign.single_csi); > >>>>>>>> > >>>>>>>> - if ((su->si_list.n_nodes > 1) && (si == nullptr)) { > >>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > >>>>>>>> == > >>>>>>>> AVSV_SUSI_ACT_DEL) > >>>>>>>> - LOG_NO("Removed 'all SIs' from '%s'", > >>>>>>>> su->name.value); > >>>>>>>> + if ((su->si_list.n_nodes > 1) && (si == nullptr)) { > >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == > >>>>>>>> AVSV_SUSI_ACT_DEL) > >>>>>>>> + LOG_NO("Removed 'all SIs' from '%s'", su- > >>>>>>>>> name.value); > >>>>>>>> - if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act > >>>>>>>> == > >>>>>>>> AVSV_SUSI_ACT_MOD) > >>>>>>>> - LOG_NO("Assigned 'all SIs' %s of '%s'", > 
>>>>>>>> - ha_state[msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.ha_state], > >>>>>>>> - su->name.value); > >>>>>>>> - } > >>>>>>>> + if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == > >>>>>>>> AVSV_SUSI_ACT_MOD) > >>>>>>>> + LOG_NO("Assigned 'all SIs' %s of '%s'", > >>>>>>>> + ha_state[msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.ha_state], > >>>>>>>> + su->name.value); > >>>>>>>> + } > >>>>>>>> > >>>>>>>> - rc = avnd_di_msg_send(cb, &msg); > >>>>>>>> - if (NCSCC_RC_SUCCESS == rc) > >>>>>>>> - msg.info.avd = 0; > >>>>>>>> - > >>>>>>>> - /* we have completed the SU SI msg processing */ > >>>>>>>> - if (su_assign_state_is_stable(su)) > >>>>>>>> - m_AVND_SU_ASSIGN_PEND_RESET(su); > >>>>>>>> - m_AVND_SU_ALL_SI_RESET(su); > >>>>>>>> + if (cb->is_avd_down == true) { > >>>>>>>> + // We are in headless, buffer this msg > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0; > >>>>>>>> + if (avnd_diq_rec_add(cb, &msg) == nullptr) { > >>>>>>>> + rc = NCSCC_RC_FAILURE; > >>>>>>>> + } > >>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); > >>>>>>>> + LOG_NO("avnd_di_susi_resp_send() deferred as AMF > >>>>>>>> director is offline"); > >>>>>>>> + } else { > >>>>>>>> + // We are in normal cluster, send msg to director > >>>>>>>> + msg.info.avd->msg_info.n2d_su_si_assign.msg_id = > >>>>>>>> + ++(cb- > >>>>>>>>> snd_msg_id); > >>>>>>>> + /* send the msg to AvD */ > >>>>>>>> + rc = avnd_di_msg_send(cb, &msg); > >>>>>>>> + if (NCSCC_RC_SUCCESS == rc) > >>>>>>>> + msg.info.avd = 0; > >>>>>>>> + /* we have completed the SU SI msg processing */ > >>>>>>>> + if (su_assign_state_is_stable(su)) { > >>>>>>>> + m_AVND_SU_ASSIGN_PEND_RESET(su); > >>>>>>>> + } > >>>>>>>> + m_AVND_SU_ALL_SI_RESET(su); > >>>>>>>> + } > >>>>>>>> > >>>>>>>> /* free the contents of avnd message */ > >>>>>>>> avnd_msg_content_free(cb, &msg); @@ -1256,14 +1263,7 > @@ > >>>> void > >>>>>>>> avnd_diq_rec_del(AVND_CB *cb, AVND_ > >>>>>>>> /* stop the AvD msg response timer */ > >>>>>>>> if 
(m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) { > >>>>>>>> m_AVND_TMR_MSG_RESP_STOP(cb, *rec); > >>>>>>>> - // Resend msgs from queue because amfd dropped during > >>>>>>>> sync > >>>>>>>> - if ((cb->dnd_list.head != nullptr)) { > >>>>>>>> - TRACE("retransmit message to amfd"); > >>>>>>>> - AVND_DND_MSG_LIST *pending_rec = 0; > >>>>>>>> - for (pending_rec = cb->dnd_list.head; pending_rec != > >>>>>>>> nullptr; pending_rec = pending_rec->next) { > >>>>>>>> - avnd_diq_rec_send(cb, pending_rec); > >>>>>>>> - } > >>>>>>>> - } > >>>>>>>> + avnd_diq_rec_send_buffered_msg(cb); > >>>>>>>> /* resend pg start track */ > >>>>>>>> avnd_di_resend_pg_start_track(cb); > >>>>>>>> } > >>>>>>>> @@ -1276,6 +1276,73 @@ void avnd_diq_rec_del(AVND_CB *cb, > >>>>> AVND_ > >>>>>>>> TRACE_LEAVE(); > >>>>>>>> return; > >>>>>>>> } > >>>>>>>> > >> > +/************************************************************ > >>>>>>>> **************** > >>>>>>>> + Name : avnd_diq_rec_send_buffered_msg > >>>>>>>> + > >>>>>>>> + Description : Resend buffered msg > >>>>>>>> + > >>>>>>>> + Arguments : cb - ptr to the AvND control block > >>>>>>>> + > >>>>>>>> + Return Values : None. > >>>>>>>> + > >>>>>>>> + Notes : None. 
> >>>>>>>> > >> > +************************************************************* > >>>>>>>> ********** > >>>>>>>> +*******/ void avnd_diq_rec_send_buffered_msg(AVND_CB *cb) > { > >>>>>>>> + TRACE_ENTER(); > >>>>>>>> + // Resend msgs from queue because amfnd dropped during > >>>> headless > >>>>>>>> + // or headless-synchronization > >>>>>>>> + if ((cb->dnd_list.head != nullptr)) { > >>>>>>>> + AVND_DND_MSG_LIST *pending_rec = 0; > >>>>>>>> + TRACE("Attach msg_id of buffered msg"); > >>>>>>>> + bool found = true; > >>>>>>>> + while (found) { > >>>>>>>> + found = false; > >>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec > >>>>>>>> + != > >>>>>>>> nullptr; pending_rec = pending_rec->next) { > >>>>>>>> + if (pending_rec->msg.type == > >>>>>>>> AVND_MSG_AVD) { > >>>>>>>> + // At this moment, only oper_state > >>>>>>>> msg needs to report to director > >>>>>>>> + if (pending_rec->msg.info.avd- > >>>>>>>>> msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG && > >>>>>>>> + pending_rec->msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.msg_id == 0) { > >>>>>>>> + m_AVND_DIQ_REC_POP(cb, > >>>>>>>> pending_rec); #if 0 > >>>>>>>> + // only resend if this SUSI > >>>>>>>> does exist > >>>>>>>> + AVND_SU *su = > >>>>>>>> m_AVND_SUDB_REC_GET(cb->sudb, > >>>>>>>> + pending_rec- > >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.su_name); > >>>>>>>> + if (su != nullptr && su- > >>>>>>>>> si_list.n_nodes > 0) { #endif > >>>>>>>> + pending_rec- > >>>>>>>>> msg.info.avd->msg_info.n2d_su_si_assign.msg_id = > >>>>>>>>> ++(cb->snd_msg_id); > >>>>>>>> + > >>>>>>>> m_AVND_DIQ_REC_PUSH(cb, pending_rec); > >>>>>>>> + LOG_NO("Found and > >>>>>>>> resend buffered su_si_assign msg for SU:'%s', " > >>>>>>>> + > >>>>>>>> "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', " > >>>>>>>> + > >>>>>>>> "error:'%u', msg_id:'%u'", > >>>>>>>> + > >>>>>>>> pending_rec->msg.info.avd- > >>>>>>>>> msg_info.n2d_su_si_assign.su_name.value, > >>>>>>>> + > >>>>>>>> pending_rec->msg.info.avd- > 
>>>>>>>>> msg_info.n2d_su_si_assign.si_name.value, > >>>>>>>> + > >>>>>>>> > >>>>>>>> pending_rec->msg.info.avd- > >msg_info.n2d_su_si_assign.ha_state, > >>>>>>>> + > >>>>>>>> > >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act, > >>>>>>>> + > >>>>>>>> > >>>>>>>> pending_rec->msg.info.avd- > >msg_info.n2d_su_si_assign.single_csi > >>>>>>>> , > >>>>>>>> + > >>>>>>>> > >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error, > >>>>>>>> + > >>>>>>>> > >>>>>>>> pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id); > >>>>>>>> + > >>>>>>>> +#if 0 > >>>>>>>> + } else { > >>>>>>>> + > >>>>>>>> avnd_msg_content_free(cb, &pending_rec->msg); > >>>>>>>> + delete pending_rec; > >>>>>>>> + pending_rec = cb- > >>>>>>>>> dnd_list.head; > >>>>>>>> + } > >>>>>>>> +#endif > >>>>>>>> + found = true; > >>>>>>>> + } > >>>>>>>> + } > >>>>>>>> + } > >>>>>>>> + } > >>>>>>>> + TRACE("retransmit message to amfd"); > >>>>>>>> + for (pending_rec = cb->dnd_list.head; pending_rec != > >>>>>>>> +nullptr; > >>>>>>>> pending_rec = pending_rec->next) { > >>>>>>>> + avnd_diq_rec_send(cb, pending_rec); > >>>>>>>> + } > >>>>>>>> + } > >>>>>>>> + TRACE_LEAVE(); > >>>>>>>> + return; > >>>>>>>> +} > >>>>>>>> > >>>>>>>> > >>>>>>>> > >> > /************************************************************* > >>>>>>>> *************** > >>>>>>>> Name : avnd_diq_rec_send > >>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h > >>>>>>>> b/osaf/services/saf/amf/amfnd/include/avnd_di.h > >>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h > >>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h > >>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd > >> void > >>>>>>>> avnd_diq_del(struct avnd_cb_tag *); AVND_DND_MSG_LIST > >>>>>>>> *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg); > void > >>>>>>>> avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST > >>>> *rec); > >>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct 
avnd_cb_tag *cb); > >>>>>>>> uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, > >>>>>>>> AVND_DND_MSG_LIST *rec); uint32_t > >> avnd_di_reg_su_rsp_snd(struct > >>>>>>>> avnd_cb_tag *cb, SaNameT *su_name, uint32_t ret_code); > >>>>>>>> uint32_t avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, > >>>>>>>> uint32_t rcv_id, uint32_t view_num); > >>>> ------------------------------------------------------------------- > >>>> -- > >>>> --------- _______________________________________________ > >>>> Opensaf-devel mailing list > >>>> Opensaf-devel@lists.sourceforge.net > >>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
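
For readers following the thread, the second option in the original patch description (AMFND holds susi_resp messages while the director is down, marks them with msg_id 0, then assigns real message ids and resends them once the director is back) can be illustrated with a small standalone sketch. This is not the OpenSAF code: `SusiResp`, `NodeDirector`, `wire`, `send_susi_resp` and `on_director_up` are hypothetical names made up for illustration, and the sketch assumes buffered messages are simply dropped from the queue once resent, whereas the real queue in di.cc keeps records until they are acknowledged.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an SU-SI response message.
struct SusiResp {
  uint32_t msg_id;  // 0 means "buffered while headless, no id assigned yet"
  std::string su;
  std::string si;
};

// Hypothetical stand-in for the node director's control block.
struct NodeDirector {
  bool director_down = false;    // true while the cluster is headless
  uint32_t snd_msg_id = 0;       // last message id handed out
  std::deque<SusiResp> pending;  // responses deferred during headless
  std::vector<SusiResp> wire;    // what has been "sent" (illustration only)

  // Every message that reaches the director must carry a real id.
  void transmit(const SusiResp& msg) {
    assert(msg.msg_id != 0);
    wire.push_back(msg);
  }

  // Send a response now, or buffer it with msg_id 0 if headless.
  void send_susi_resp(const std::string& su, const std::string& si) {
    SusiResp msg{0, su, si};
    if (director_down) {
      pending.push_back(msg);
      return;
    }
    msg.msg_id = ++snd_msg_id;
    transmit(msg);
  }

  // Director is back: give each buffered response an id and resend in order.
  void on_director_up() {
    director_down = false;
    while (!pending.empty()) {
      SusiResp msg = pending.front();
      pending.pop_front();
      msg.msg_id = ++snd_msg_id;
      transmit(msg);
    }
  }
};
```

Deferring the id assignment until the director returns keeps the ids contiguous from the director's point of view, which appears to be why the patch writes msg_id 0 into buffered records and only increments snd_msg_id at resend time.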