Hi Minh,

You can float the patch in parallel.

Thanks
-Nagu

> -----Original Message-----
> From: minh chau [mailto:minh.c...@dektech.com.au]
> Sent: 23 September 2016 14:54
> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation continuation if
> csi completes during headless [#1725 part 1] V1
>
> Hi Nagu,
>
> Thanks for your update, I hope I could float the patch so everyone can do
> code review, but let's wait for your testing first.
>
> Thanks,
> Minh
>
> On 22/09/16 21:22, Nagendra Kumar wrote:
> > Hi Minh,
> >
> > I have tested following scenarios till now and works well:
> > 1. Faults under standalone system. Act and Std SUs are there and cluster went headless. Faults occurred in Act and Standby separately and together with recovery as restart, su f/o, node f/o, node s/o. Faults are also mixed with two SGs.
> > 2. During headless, the escalations from comp restart->su restart->su f/o->node f/o. Two SG was used for testings. When cluster recovers, the system works well as expected.
> >
> > SG and node Auto repair was enabled and disabled in many test cases.
> >
> > Further testing will continue:
> > 1. Faults during assignments.
> > 2. Faults during admin operations.
> >
> > Thanks
> > -Nagu
> >
> >> -----Original Message-----
> >> From: minh chau [mailto:minh.c...@dektech.com.au]
> >> Sent: 15 September 2016 12:27
> >> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >> Cc: opensaf-devel@lists.sourceforge.net
> >> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> >> continuation if csi completes during headless [#1725 part 1] V1
> >>
> >> Hi Nagu,
> >>
> >> Yes you are right. The "component failover" is unstable, I think I
> >> will post my analysis of component failover problem to #1902 after
> >> you have verified the other recoveries.
> >>
> >> Thanks,
> >> Minh
> >>
> >> On 15/09/16 16:51, Nagendra Kumar wrote:
> >>> Hi Minh,
> >>> @2.a.) and @2.b.) are working except "Component Failover" as recovery. Other recovery like SU Failover, etc are working fine with 1725_pending_review.tgz and 07_no_recovery_if_no_pending_susi.diff.
> >>> Please confirm.
> >>>
> >>> Thanks
> >>> -Nagu
> >>>
> >>>> -----Original Message-----
> >>>> From: Nagendra Kumar
> >>>> Sent: 15 September 2016 12:13
> >>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> >>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>
> >>>> Hi Minh,
> >>>>>> If there's no any major problem, can we make SI Dep as last phase?
> >>>> Yes, absolutely. There is no problem.
> >>>>>> If I am right, I think you are testing @2.a) - and *fault* has
> >>>>>> just been as node reboot/powered-off by user during headless.
> >>>> Yes, you are right.
> >>>>
> >>>> Thanks
> >>>> -Nagu
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>> Sent: 14 September 2016 17:54
> >>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> >>>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>>
> >>>>> Hi Nagu,
> >>>>>
> >>>>> I have proposed to change the order on 28 Jul:
> >>>>>
> >>>>> ==============
> >>>>>
> >>>>> I would like to change the above orders of implementation:
> >>>>> @0. We are here now: No admin op continuation, no recovery on
> >>>>> faults during headless.
> >>>>> Since componentRestart/suRestart has no impact on recovery after
> >>>>> headless, faults during headless here mean: failover escalation,
> >>>>> node reboot/powered-off by user during headless. Faults are
> >>>>> different phenomenons but they all result in loss of SUSI. Having
> >>>>> #1902 will remove the major impact of a node reboot due to
> >>>>> immediate escalation and AMF also has to deal with the loss of
> >>>>> SUSI the same as without #1902 plus failover escalation
> >>>>>
> >>>>> @1. Admin op continuation without required recovery on faults
> >>>>> during headless
> >>>>> @1.a) All CSI(s) callback completes during headless, but SUSI
> >>>>> states are still QUIESCED/QUIESCING
> >>>>> @1.b) One of CSI(s) callback is still ongoing after headless (AMFD
> >>>>> would have to wait for it?)
> >>>>>
> >>>>> @2. Recovery on faults. (Doing fault recovery needs to consider
> >>>>> admin op continuation which would have been implemented in step
> >>>>> @1) Need #1902
> >>>>> @2.a.) Faults in normal flow: No admin op continuation is required
> >>>>> after headless, but fault did happen during headless
> >>>>> @2.b.) Faults happen during admin operation while headless, after
> >>>>> headless AMFD needs to consider a recovery on fault together with
> >>>>> admin op continuation.
> >>>>>
> >>>>> @3. @1 + @2 + With SI Dep.
> >>>>>
> >>>>> ===============
> >>>>> I thought we have followed the above order so far? Because part 1
> >>>>> was acked, which is "@1. Admin op continuation without required
> >>>>> recovery on faults during headless"
> >>>>> If there's no any major problem, can we make SI Dep as last phase?
> >>>>> If I am right, I think you are testing @2.a) - and *fault* has
> >>>>> just been as node reboot/powered-off by user during headless.
> >>>>>
> >>>>> Thanks,
> >>>>> Minh
> >>>>>
> >>>>> On 14/09/16 21:48, Nagendra Kumar wrote:
> >>>>>> Hi Minh,
> >>>>>> If it is not tested, then it is fine.
> >>>>>> But, we had added (#1) the following in the ticket #1725 on 27 Jul :
> >>>>>> ===========================================
> >>>>>> Nagendra Kumar - 2016-07-27
> >>>>>>
> >>>>>> For 2N red model, implementation can be done in the following phased manner.
> >>>>>> It has advantages of being logically segregated and it continues from where we left in 5.0.
> >>>>>> (Phases #1, #2 and #3 is more related to ticket #1725 and phases
> >>>>>> #4 and #5 are related to #1902)
> >>>>>>
> >>>>>> 1. Node restart escalation (with and without SI Dep).
> >>>>>> 2. Without Si Dep : Admin op (no faults/escalations).
> >>>>>> 3. Without Si Dep : Admin Op + node restart faults/escalations during headless.
> >>>>>> 4. Without Si Dep :
> >>>>>>    a.) All faults in normal flows.
> >>>>>>    b.) All faults during admin operation(minus node reboot during headless as covered in #3).
> >>>>>> 5. With Si Dep : #2, #3 and #4.
> >>>>>>
> >>>>>> Since 5.0 already has immediate escalation model (component and node restart/reboot), so #1, #2 and #3 completes left over portion of headless contribution in 5.0 with that model.
> >>>>>> ======================================
> >>>>>>
> >>>>>> Thanks
> >>>>>> -Nagu
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>>>> Sent: 14 September 2016 17:05
> >>>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> >>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>>>>
> >>>>>>> Hi Nagu,
> >>>>>>>
> >>>>>>> SI Dep is the last phase of implementation of headless recovery,
> >>>>>>> its support is not included in all patches attached in ticket #1725.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Minh
> >>>>>>>
> >>>>>>> On 14/09/16 21:21, Nagendra Kumar wrote:
> >>>>>>>> Hi Minh,
> >>>>>>>> Have you tested Si Dep (2N Red model) for "node restart test cases" ? I can't see it in the test case doc.
> >>>>>>>> Thanks
> >>>>>>>> -Nagu
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Nagendra Kumar
> >>>>>>>>> Sent: 13 September 2016 11:20
> >>>>>>>>> To: minh chau; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>>> Subject: Re: [devel] [PATCH 2 of 4] AMFND: Admin operation
> >>>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>>>>>>
> >>>>>>>>> Hi Minh,
> >>>>>>>>> I have tested these scenarios again and it works well.
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>> -Nagu
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: minh chau [mailto:minh.c...@dektech.com.au]
> >>>>>>>>>> Sent: 12 September 2016 11:53
> >>>>>>>>>> To: Nagendra Kumar; hans.nordeb...@ericsson.com; Praveen Malviya;
> >>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au
> >>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>>>> Subject: Re: [PATCH 2 of 4] AMFND: Admin operation
> >>>>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>>>>>>>
> >>>>>>>>>> Hi Nagu,
> >>>>>>>>>>
> >>>>>>>>>> One bug get hit by your configuration, where the absent SUSIs
> >>>>>>>>>> are found after headless but no real SUSIs are available also.
> >>>>>>>>>> In this case I think that AMFD can do like a fresh assignment.
> >>>>>>>>>> I attach the patch to ticket #1725, please help to test again.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Minh
> >>>>>>>>>>
> >>>>>>>>>> On 12/09/16 11:09, minh chau wrote:
> >>>>>>>>>>> Hi Nagu,
> >>>>>>>>>>>
> >>>>>>>>>>> I'm running the tests with this configuration and will get back to you.
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Minh
> >>>>>>>>>>>
> >>>>>>>>>>> On 09/09/16 22:26, Nagendra Kumar wrote:
> >>>>>>>>>>>> Hi Minh,
> >>>>>>>>>>>> I am using 1725_pending_review.tgz
> >>>>>>>>>>>> (1725_02_V2_bugfix_01_resend_buffer_in_set_leds.diff,
> >>>>>>>>>>>> 1725_02_V2_bugfix_02_honor_clusterinit_nodesync_timer.diff,
> >>>>>>>>>>>> 1725_02_V2_bugfix_03_restore_ng_admin.diff,
> >>>>>>>>>>>> 1725_03_V4_failover_absent_susi_longDn.diff,
> >>>>>>>>>>>> 1725_04_V2_headless_validation.diff,
> >>>>>>>>>>>> 1725_05_V2_resend_oper_state.diff,
> >>>>>>>>>>>> 1725_06a_fullscope_escalation_headless.diff).
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am doing basic node reboot validation testing with no faults.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Configuration: SU1(act) and SU2(standby) both on PL-3.
> >>>>>>>>>>>>
> >>>>>>>>>>>> TC #1: Start SC-1, PL-3 and PL-5: Unlock SU1 and SU2. Stop
> >>>>>>>>>>>> SC-1 and stop PL-3, start PL-3 and start SC-1.
> >>>>>>>>>>>> After SC-1 and PL-3 comes back, ideally SU1 and SU2 should
> >>>>>>>>>>>> get assignments as Act and Std, but no assignment are being
> >>>>>>>>>>>> given to SUs on PL-3 and it shows following in status:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Only Su2 has Std assignment.
> >>>>>>>>>>>>
> >>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed1,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> >>>>>>>>>>>>         saAmfSISUHAState=STANDBY(2)
> >>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>> safSISU=safSu=PL-3\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>>
> >>>>>>>>>>>> TC #2: Configuration same as TC#1. Stop PL-3 and don't start.
> >>>>>>>>>>>> The same issue:
> >>>>>>>>>>>> safSISU=safSu=PL-5\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed3,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>> safSISU=safSu=SU2\,safSg=AmfDemo_2N\,safApp=AmfDemo1,safSi=AmfDemo1,safApp=AmfDemo1
> >>>>>>>>>>>>         saAmfSISUHAState=STANDBY(2)
> >>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=NoRed\,safApp=OpenSAF,safSi=NoRed2,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>> safSISU=safSu=SC-1\,safSg=2N\,safApp=OpenSAF,safSi=SC-2N,safApp=OpenSAF
> >>>>>>>>>>>>         saAmfSISUHAState=ACTIVE(1)
> >>>>>>>>>>>>
> >>>>>>>>>>>> TC #3: Configured SU1(Act) on PL-3 and SU2(Std) on PL-4.
> >>>>>>>>>>>> Stop SC-1, stop PL-3 and PL-4, but PL-5 is running. start
> >>>>>>>>>>>> SC-1, the same issue.
> >>>>>>>>>>>>
> >>>>>>>>>>>> TC #4: Same as TC #3, but SU3 configured on PL-5 as spare.
> >>>>>>>>>>>> SU3 doesn't get any assignment and Sg is unstable.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> -Nagu
> >>>>>>>>>>>>
> >>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>> From: Minh Hon Chau [mailto:minh.c...@dektech.com.au]
> >>>>>>>>>>>>> Sent: 18 August 2016 05:46
> >>>>>>>>>>>>> To: hans.nordeb...@ericsson.com; Nagendra Kumar; Praveen Malviya;
> >>>>>>>>>>>>> gary....@dektech.com.au; long.hb.ngu...@dektech.com.au;
> >>>>>>>>>>>>> minh.c...@dektech.com.au
> >>>>>>>>>>>>> Cc: opensaf-devel@lists.sourceforge.net
> >>>>>>>>>>>>> Subject: [PATCH 2 of 4] AMFND: Admin operation
> >>>>>>>>>>>>> continuation if csi completes during headless [#1725 part 1] V1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  osaf/services/saf/amf/amfnd/di.cc             |  199 +++++++++++++++++--------
> >>>>>>>>>>>>>  osaf/services/saf/amf/amfnd/include/avnd_di.h |    1 +
> >>>>>>>>>>>>>  2 files changed, 134 insertions(+), 66 deletions(-)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> There're two options basically that AMFD can continue
> >>>>>>>>>>>>> admin operation with completed csi(s)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> First: AMFD can use the sync SUSI fsm state as latest,
> >>>>>>>>>>>>> AMFD then has to explore its SUSI assignments with
> >>>>>>>>>>>>> adminStates of relevant entities to determine which SU
> >>>>>>>>>>>>> should be on call of susi_success().
> >>>>>>>>>>>>> Deeper level of exploration for csi addition. It also
> >>>>>>>>>>>>> depends on SG Fsm state which is being used variously in
> >>>>>>>>>>>>> different SG types.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Second: AMFD uses the SUSI fsm state read from IMM as
> >>>>>>>>>>>>> latest, and AMFND needs to resend susi_resp messages which
> >>>>>>>>>>>>> were deferred during headless so that AMFD can continue
> >>>>>>>>>>>>> the admin operation sequence.
> >>>>>>>>>>>>> Both cases of csi completion [during or after] headless
> >>>>>>>>>>>>> can run in the same code flow.
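
[Editor's sketch] The second option above (buffer SU-SI responses while the director is unreachable, then resend them once it is back) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `SusiResp`, `NodeSender`, `set_director_up` and the `transport` callback are hypothetical names invented for the illustration and are not part of AMFND. The only ideas borrowed from the patch are using `msg_id == 0` to mark a buffered message and stamping a fresh, increasing `msg_id` just before the message is actually sent.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for an SU-SI assignment response.
struct SusiResp {
  std::string su_name;
  std::string si_name;
  uint32_t msg_id;  // 0 means "buffered while headless, no id assigned yet"
};

// Hypothetical node-side sender; not the real AMFND control block.
class NodeSender {
 public:
  explicit NodeSender(std::function<void(const SusiResp&)> transport)
      : transport_(std::move(transport)), snd_msg_id_(0), director_up_(true) {}

  void set_director_up(bool up) { director_up_ = up; }

  // Send at once when the director is reachable, otherwise buffer the
  // response with msg_id 0 so it can be stamped and sent later.
  void send_susi_resp(SusiResp resp) {
    if (!director_up_) {
      resp.msg_id = 0;
      pending_.push_back(std::move(resp));
      return;
    }
    resp.msg_id = ++snd_msg_id_;
    transport_(resp);
  }

  // After headless: stamp ids in FIFO order and transmit each message.
  // Does nothing while the director is still down.
  void resend_buffered() {
    while (director_up_ && !pending_.empty()) {
      SusiResp resp = std::move(pending_.front());
      pending_.pop_front();
      resp.msg_id = ++snd_msg_id_;
      transport_(resp);
    }
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  std::function<void(const SusiResp&)> transport_;
  std::deque<SusiResp> pending_;
  uint32_t snd_msg_id_;
  bool director_up_;
};
```

Deferring the id assignment until resend keeps the sequence numbers seen by the receiver contiguous. The real patch works on the existing AMFND message queue rather than on a separate container, so the sketch shows only the ordering idea.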
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The patch buffers susi_resp_msg during headless stage and
> >>>>>>>>>>>>> resend it to AMFD after headless. There could be a chance
> >>>>>>>>>>>>> that AMFND sent out susi response message but AMFD could not receive
> >>>>>>>>>>>>> or process it. This case could be seen as a defect, which
> >>>>>>>>>>>>> can be fixed by securing the result of sending susi_resp
> >>>>>>>>>>>>> message from AMFND toward AMFD.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/di.cc b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/di.cc
> >>>>>>>>>>>>> @@ -805,11 +805,6 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> >>>>>>>>>>>>>    if (cb->term_state == AVND_TERM_STATE_OPENSAF_SHUTDOWN_STARTED)
> >>>>>>>>>>>>>      return rc;
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -  if (cb->is_avd_down == true) {
> >>>>>>>>>>>>> -    m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>>>>>>> -    return rc;
> >>>>>>>>>>>>> -  }
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>>    // should be in assignment pending state to be here
> >>>>>>>>>>>>>    osafassert(m_AVND_SU_IS_ASSIGN_PEND(su));
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> @@ -820,64 +815,76 @@ uint32_t avnd_di_susi_resp_send(AVND_CB
> >>>>>>>>>>>>>    TRACE_ENTER2("Sending Resp su=%s, si=%s, curr_state=%u, prv_state=%u", su->name.value, curr_si->name.value,curr_si->curr_state,curr_si->prv_state);
> >>>>>>>>>>>>>    /* populate the susi resp msg */
> >>>>>>>>>>>>>    msg.info.avd = new AVSV_DND_MSG();
> >>>>>>>>>>>>> -  msg.type = AVND_MSG_AVD;
> >>>>>>>>>>>>> -  msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG;
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb->node_info.nodeId;
> >>>>>>>>>>>>> -  if (si) {
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.single_csi =
> >>>>>>>>>>>>> -      ((si->single_csi_add_rem_in_si == AVSV_SUSI_ACT_BASE) ? false : true);
> >>>>>>>>>>>>> -  }
> >>>>>>>>>>>>> -  TRACE("curr_assign_state '%u'", curr_si->curr_assign_state);
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
> >>>>>>>>>>>>> -    (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> -     m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ?
> >>>>>>>>>>>>> -    ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL;
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name;
> >>>>>>>>>>>>> -  if (si) {
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.si_name = si->name;
> >>>>>>>>>>>>> -    if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) {
> >>>>>>>>>>>>> -      TRACE("si->curr_assign_state '%u'", curr_si->curr_assign_state);
> >>>>>>>>>>>>> -      msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
> >>>>>>>>>>>>> -        (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> -         m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ?
> >>>>>>>>>>>>> -        AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_DEL;
> >>>>>>>>>>>>> -    }
> >>>>>>>>>>>>> -  }
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.ha_state =
> >>>>>>>>>>>>> -    (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? SA_AMF_HA_QUIESCED : curr_si->curr_state;
> >>>>>>>>>>>>> -  msg.info.avd->msg_info.n2d_su_si_assign.error =
> >>>>>>>>>>>>> -    (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> -     m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ?
> >>>>>>>>>>>>> -    NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE;
> >>>>>>>>>>>>> +  msg.type = AVND_MSG_AVD;
> >>>>>>>>>>>>> +  msg.info.avd->msg_type = AVSV_N2D_INFO_SU_SI_ASSIGN_MSG;
> >>>>>>>>>>>>> +  msg.info.avd->msg_info.n2d_su_si_assign.node_id = cb->node_info.nodeId;
> >>>>>>>>>>>>> +  if (si) {
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.single_csi =
> >>>>>>>>>>>>> +      ((si->single_csi_add_rem_in_si == AVSV_SUSI_ACT_BASE) ? false : true);
> >>>>>>>>>>>>> +  }
> >>>>>>>>>>>>> +  TRACE("curr_assign_state '%u'", curr_si->curr_assign_state);
> >>>>>>>>>>>>> +  msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
> >>>>>>>>>>>>> +    (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> +     m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ?
> >>>>>>>>>>>>> +    ((!curr_si->prv_state) ? AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_MOD) : AVSV_SUSI_ACT_DEL;
> >>>>>>>>>>>>> +  msg.info.avd->msg_info.n2d_su_si_assign.su_name = su->name;
> >>>>>>>>>>>>> +  if (si) {
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.si_name = si->name;
> >>>>>>>>>>>>> +    if (AVSV_SUSI_ACT_ASGN == si->single_csi_add_rem_in_si) {
> >>>>>>>>>>>>> +      TRACE("si->curr_assign_state '%u'", curr_si->curr_assign_state);
> >>>>>>>>>>>>> +      msg.info.avd->msg_info.n2d_su_si_assign.msg_act =
> >>>>>>>>>>>>> +        (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> +         m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNING(curr_si)) ?
> >>>>>>>>>>>>> +        AVSV_SUSI_ACT_ASGN : AVSV_SUSI_ACT_DEL;
> >>>>>>>>>>>>> +    }
> >>>>>>>>>>>>> +  }
> >>>>>>>>>>>>> +  msg.info.avd->msg_info.n2d_su_si_assign.ha_state =
> >>>>>>>>>>>>> +    (SA_AMF_HA_QUIESCING == curr_si->curr_state) ? SA_AMF_HA_QUIESCED : curr_si->curr_state;
> >>>>>>>>>>>>> +  msg.info.avd->msg_info.n2d_su_si_assign.error =
> >>>>>>>>>>>>> +    (m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_ASSIGNED(curr_si) ||
> >>>>>>>>>>>>> +     m_AVND_SU_SI_CURR_ASSIGN_STATE_IS_REMOVED(curr_si)) ?
> >>>>>>>>>>>>> +    NCSCC_RC_SUCCESS : NCSCC_RC_FAILURE;
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -  if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_ASGN)
> >>>>>>>>>>>>> -    osafassert(si);
> >>>>>>>>>>>>> +  if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_ASGN)
> >>>>>>>>>>>>> +    osafassert(si);
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -  /* send the msg to AvD */
> >>>>>>>>>>>>> -  TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', su'%s', si'%s', ha_state'%u', error'%u', single_csi'%u'",
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.msg_id, msg.info.avd->msg_info.n2d_su_si_assign.node_id,
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.msg_act, msg.info.avd->msg_info.n2d_su_si_assign.su_name.value,
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
> >>>>>>>>>>>>> -    msg.info.avd->msg_info.n2d_su_si_assign.error, msg.info.avd->msg_info.n2d_su_si_assign.single_csi);
> >>>>>>>>>>>>> +  /* send the msg to AvD */
> >>>>>>>>>>>>> +  TRACE("Sending. msg_id'%u', node_id'%u', msg_act'%u', su'%s', si'%s', ha_state'%u', error'%u', single_csi'%u'",
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.msg_id, msg.info.avd->msg_info.n2d_su_si_assign.node_id,
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.msg_act, msg.info.avd->msg_info.n2d_su_si_assign.su_name.value,
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.si_name.value, msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.error, msg.info.avd->msg_info.n2d_su_si_assign.single_csi);
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -  if ((su->si_list.n_nodes > 1) && (si == nullptr)) {
> >>>>>>>>>>>>> -    if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_DEL)
> >>>>>>>>>>>>> -      LOG_NO("Removed 'all SIs' from '%s'", su->name.value);
> >>>>>>>>>>>>> +  if ((su->si_list.n_nodes > 1) && (si == nullptr)) {
> >>>>>>>>>>>>> +    if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_DEL)
> >>>>>>>>>>>>> +      LOG_NO("Removed 'all SIs' from '%s'", su->name.value);
> >>>>>>>>>>>>> -    if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_MOD)
> >>>>>>>>>>>>> -      LOG_NO("Assigned 'all SIs' %s of '%s'",
> >>>>>>>>>>>>> -        ha_state[msg.info.avd->msg_info.n2d_su_si_assign.ha_state],
> >>>>>>>>>>>>> -        su->name.value);
> >>>>>>>>>>>>> -  }
> >>>>>>>>>>>>> +    if (msg.info.avd->msg_info.n2d_su_si_assign.msg_act == AVSV_SUSI_ACT_MOD)
> >>>>>>>>>>>>> +      LOG_NO("Assigned 'all SIs' %s of '%s'",
> >>>>>>>>>>>>> +        ha_state[msg.info.avd->msg_info.n2d_su_si_assign.ha_state],
> >>>>>>>>>>>>> +        su->name.value);
> >>>>>>>>>>>>> +  }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -  rc = avnd_di_msg_send(cb, &msg);
> >>>>>>>>>>>>> -  if (NCSCC_RC_SUCCESS == rc)
> >>>>>>>>>>>>> -    msg.info.avd = 0;
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>> -  /* we have completed the SU SI msg processing */
> >>>>>>>>>>>>> -  if (su_assign_state_is_stable(su))
> >>>>>>>>>>>>> -    m_AVND_SU_ASSIGN_PEND_RESET(su);
> >>>>>>>>>>>>> -  m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>>>>>>> +  if (cb->is_avd_down == true) {
> >>>>>>>>>>>>> +    // We are in headless, buffer this msg
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.msg_id = 0;
> >>>>>>>>>>>>> +    if (avnd_diq_rec_add(cb, &msg) == nullptr) {
> >>>>>>>>>>>>> +      rc = NCSCC_RC_FAILURE;
> >>>>>>>>>>>>> +    }
> >>>>>>>>>>>>> +    m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>>>>>>> +    LOG_NO("avnd_di_susi_resp_send() deferred as AMF director is offline");
> >>>>>>>>>>>>> +  } else {
> >>>>>>>>>>>>> +    // We are in normal cluster, send msg to director
> >>>>>>>>>>>>> +    msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >>>>>>>>>>>>> +    /* send the msg to AvD */
> >>>>>>>>>>>>> +    rc = avnd_di_msg_send(cb, &msg);
> >>>>>>>>>>>>> +    if (NCSCC_RC_SUCCESS == rc)
> >>>>>>>>>>>>> +      msg.info.avd = 0;
> >>>>>>>>>>>>> +    /* we have completed the SU SI msg processing */
> >>>>>>>>>>>>> +    if (su_assign_state_is_stable(su)) {
> >>>>>>>>>>>>> +      m_AVND_SU_ASSIGN_PEND_RESET(su);
> >>>>>>>>>>>>> +    }
> >>>>>>>>>>>>> +    m_AVND_SU_ALL_SI_RESET(su);
> >>>>>>>>>>>>> +  }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>    /* free the contents of avnd message */
> >>>>>>>>>>>>>    avnd_msg_content_free(cb, &msg);
> >>>>>>>>>>>>> @@ -1256,14 +1263,7 @@ void avnd_diq_rec_del(AVND_CB *cb, AVND_
> >>>>>>>>>>>>>    /* stop the AvD msg response timer */
> >>>>>>>>>>>>>    if (m_AVND_TMR_IS_ACTIVE(rec->resp_tmr)) {
> >>>>>>>>>>>>>      m_AVND_TMR_MSG_RESP_STOP(cb, *rec);
> >>>>>>>>>>>>> -    // Resend msgs from queue because amfd dropped during sync
> >>>>>>>>>>>>> -    if ((cb->dnd_list.head != nullptr)) {
> >>>>>>>>>>>>> -      TRACE("retransmit message to amfd");
> >>>>>>>>>>>>> -      AVND_DND_MSG_LIST *pending_rec = 0;
> >>>>>>>>>>>>> -      for (pending_rec = cb->dnd_list.head; pending_rec != nullptr; pending_rec = pending_rec->next) {
> >>>>>>>>>>>>> -        avnd_diq_rec_send(cb, pending_rec);
> >>>>>>>>>>>>> -      }
> >>>>>>>>>>>>> -    }
> >>>>>>>>>>>>> +    avnd_diq_rec_send_buffered_msg(cb);
> >>>>>>>>>>>>>      /* resend pg start track */
> >>>>>>>>>>>>>      avnd_di_resend_pg_start_track(cb);
> >>>>>>>>>>>>>    }
> >>>>>>>>>>>>> @@ -1276,6 +1276,73 @@ void avnd_diq_rec_del(AVND_CB *cb, AVND_
> >>>>>>>>>>>>>    TRACE_LEAVE();
> >>>>>>>>>>>>>    return;
> >>>>>>>>>>>>>  }
> >>>>>>>>>>>>> +/****************************************************************************
> >>>>>>>>>>>>> +  Name          : avnd_diq_rec_send_buffered_msg
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +  Description   : Resend buffered msg
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +  Arguments     : cb - ptr to the AvND control block
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +  Return Values : None.
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +  Notes         : None.
> >>>>>>>>>>>>> +******************************************************************************/
> >>>>>>>>>>>>> +void avnd_diq_rec_send_buffered_msg(AVND_CB *cb)
> >>>>>>>>>>>>> +{
> >>>>>>>>>>>>> +  TRACE_ENTER();
> >>>>>>>>>>>>> +  // Resend msgs from queue because amfnd dropped during headless
> >>>>>>>>>>>>> +  // or headless-synchronization
> >>>>>>>>>>>>> +  if ((cb->dnd_list.head != nullptr)) {
> >>>>>>>>>>>>> +    AVND_DND_MSG_LIST *pending_rec = 0;
> >>>>>>>>>>>>> +    TRACE("Attach msg_id of buffered msg");
> >>>>>>>>>>>>> +    bool found = true;
> >>>>>>>>>>>>> +    while (found) {
> >>>>>>>>>>>>> +      found = false;
> >>>>>>>>>>>>> +      for (pending_rec = cb->dnd_list.head; pending_rec != nullptr; pending_rec = pending_rec->next) {
> >>>>>>>>>>>>> +        if (pending_rec->msg.type == AVND_MSG_AVD) {
> >>>>>>>>>>>>> +          // At this moment, only oper_state msg needs to report to director
> >>>>>>>>>>>>> +          if (pending_rec->msg.info.avd->msg_type == AVSV_N2D_INFO_SU_SI_ASSIGN_MSG &&
> >>>>>>>>>>>>> +              pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id == 0) {
> >>>>>>>>>>>>> +            m_AVND_DIQ_REC_POP(cb, pending_rec);
> >>>>>>>>>>>>> +#if 0
> >>>>>>>>>>>>> +            // only resend if this SUSI does exist
> >>>>>>>>>>>>> +            AVND_SU *su = m_AVND_SUDB_REC_GET(cb->sudb,
> >>>>>>>>>>>>> +              pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.su_name);
> >>>>>>>>>>>>> +            if (su != nullptr && su->si_list.n_nodes > 0) {
> >>>>>>>>>>>>> +#endif
> >>>>>>>>>>>>> +              pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id = ++(cb->snd_msg_id);
> >>>>>>>>>>>>> +              m_AVND_DIQ_REC_PUSH(cb, pending_rec);
> >>>>>>>>>>>>> +              LOG_NO("Found and resend buffered su_si_assign msg for SU:'%s', "
> >>>>>>>>>>>>> +                "SI:'%s', ha_state:'%u', msg_act:'%u', single_csi:'%u', "
> >>>>>>>>>>>>> +                "error:'%u', msg_id:'%u'",
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.su_name.value,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.si_name.value,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.ha_state,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_act,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.single_csi,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.error,
> >>>>>>>>>>>>> +                pending_rec->msg.info.avd->msg_info.n2d_su_si_assign.msg_id);
> >>>>>>>>>>>>> +
> >>>>>>>>>>>>> +#if 0
> >>>>>>>>>>>>> +            } else {
> >>>>>>>>>>>>> +              avnd_msg_content_free(cb, &pending_rec->msg);
> >>>>>>>>>>>>> +              delete pending_rec;
> >>>>>>>>>>>>> +              pending_rec = cb->dnd_list.head;
> >>>>>>>>>>>>> +            }
> >>>>>>>>>>>>> +#endif
> >>>>>>>>>>>>> +            found = true;
> >>>>>>>>>>>>> +          }
> >>>>>>>>>>>>> +        }
> >>>>>>>>>>>>> +      }
> >>>>>>>>>>>>> +    }
> >>>>>>>>>>>>> +    TRACE("retransmit message to amfd");
> >>>>>>>>>>>>> +    for (pending_rec = cb->dnd_list.head; pending_rec != nullptr; pending_rec = pending_rec->next) {
> >>>>>>>>>>>>> +      avnd_diq_rec_send(cb, pending_rec);
> >>>>>>>>>>>>> +    }
> >>>>>>>>>>>>> +  }
> >>>>>>>>>>>>> +  TRACE_LEAVE();
> >>>>>>>>>>>>> +  return;
> >>>>>>>>>>>>> +}
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>  /****************************************************************************
> >>>>>>>>>>>>>    Name          : avnd_diq_rec_send
> >>>>>>>>>>>>> diff --git a/osaf/services/saf/amf/amfnd/include/avnd_di.h b/osaf/services/saf/amf/amfnd/include/avnd_di.h
> >>>>>>>>>>>>> --- a/osaf/services/saf/amf/amfnd/include/avnd_di.h
> >>>>>>>>>>>>> +++ b/osaf/services/saf/amf/amfnd/include/avnd_di.h
> >>>>>>>>>>>>> @@ -79,6 +79,7 @@ void avnd_di_msg_ack_process(struct avnd
> >>>>>>>>>>>>>  void avnd_diq_del(struct avnd_cb_tag *);
> >>>>>>>>>>>>>  AVND_DND_MSG_LIST *avnd_diq_rec_add(struct avnd_cb_tag *cb, AVND_MSG *msg);
> >>>>>>>>>>>>>  void avnd_diq_rec_del(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
> >>>>>>>>>>>>> +void avnd_diq_rec_send_buffered_msg(struct avnd_cb_tag *cb);
> >>>>>>>>>>>>>  uint32_t avnd_diq_rec_send(struct avnd_cb_tag *cb, AVND_DND_MSG_LIST *rec);
> >>>>>>>>>>>>>  uint32_t avnd_di_reg_su_rsp_snd(struct avnd_cb_tag *cb, SaNameT *su_name, uint32_t ret_code);
> >>>>>>>>>>>>>  uint32_t avnd_di_ack_nack_msg_send(struct avnd_cb_tag *cb, uint32_t rcv_id, uint32_t view_num);
> >>>>>>>>> ------------------------------------------------------------------------------
> >>>>>>>>> _______________________________________________
> >>>>>>>>> Opensaf-devel mailing list
> >>>>>>>>> Opensaf-devel@lists.sourceforge.net
> >>>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel