Some comments: 1) In the patch with the change: su_ha_state = avd_su_state_determine(su); - if (su_ha_state == SA_AMF_HA_QUIESCED) { - if (su->su_switch == AVSV_SI_TOGGLE_SWITCH) { - AVD_SU_SI_REL *temp_susi; - for (temp_susi = su->list_of_susi; temp_susi != NULL; temp_susi = temp_susi->su_next) { - if (temp_susi->si->invocation != 0) { - avd_saImmOiAdminOperationResult(cb->immOiHandle, - temp_susi->si->invocation, SA_AIS_ERR_BAD_OPERATION); - temp_susi->si->invocation = 0; - } - } - } - m_AVD_SET_SU_SWITCH(cb, su, AVSV_SI_TOGGLE_STABLE); - } else if (su_ha_state == SA_AMF_HA_QUIESCING) { + if (su_ha_state == SA_AMF_HA_QUIESCING) {
The else part is executed for quiesced assignments, but it is coded for only two cases mentioned already there. /* This may happen 1. when Act SU is locked and it is transitioning into Quisced and the same time stdby SU faulted. 2. Durin SI-SWAP when Act SU goes to Quiesced, Std SU goes to Act and Quiesced SU goes to Std and if it fails, then we seed to delete the assignments. */ But the case of #823 is different. In #823 quiesced assignment has faulted which is not the above two cases. So the removed "if " statement with " m_AVD_SET_SU_SWITCH(cb, su, AVSV_SI_TOGGLE_STABLE);" should be there. 2)In the change @@ -2382,9 +2390,6 @@ static uint32_t avd_sg_2n_susi_sucss_su_ if (flag == true) { node_admin_state_set(su_node_ptr, SA_AMF_ADMIN_LOCKED); } - } else if (su->su_switch == AVSV_SI_TOGGLE_SWITCH) { - /* this SU switch state is true change to false. */ - m_AVD_SET_SU_SWITCH(cb, su, AVSV_SI_TOGGLE_STABLE); } The code is removed, but there is no replacement with complete_siswap(). Not sure, which use case it will violate. So let us keep this. Thanks Praveen On 02-Apr-14 2:20 PM, Hans Feldt wrote: > Please check and comment the attached patch. > Thanks, > Hans > >> -----Original Message----- >> From: Nagendra Kumar [mailto:nagendr...@oracle.com] >> Sent: den 2 april 2014 09:01 >> To: Hans Feldt; Praveen Malviya >> Cc: opensaf-devel@lists.sourceforge.net >> Subject: RE: [devel] [PATCH 1 of 1] amfd: return AIS_OK instead of BAD_OP >> for si-swap admin op [#823] >> >> If the recovery of failed Su doesn't happen i.e. goes into instantiatin >> failed state/termination failed state, then are you going to rerun >> failed operation ? >> >> Thanks >> -Nagu >>> -----Original Message----- >>> From: Hans Feldt [mailto:hans.fe...@ericsson.com] >>> Sent: 02 April 2014 12:14 >>> To: praveen malviya >>> Cc: opensaf-devel@lists.sourceforge.net >>> Subject: Re: [devel] [PATCH 1 of 1] amfd: return AIS_OK instead of BAD_OP >>> for >>> si-swap admin op [#823] >>> >>> Well my view point still stands. By separating the su_switch from >>> invocation_id >>> we introduce even more sub-states and we already have enough of those! >>> Basically the same patch as you have with a small addition causes the admin >>> response to be sent after recovery of the failed SU. Will send a patch. >>> /Hans >>> >>> On 04/02/2014 08:28 AM, praveen malviya wrote: >>>> Any update/comment on this. >>>> >>>> Thanks, >>>> Praveen >>>> On 01-Apr-14 2:27 PM, praveen malviya wrote: >>>>> On 01-Apr-14 1:27 PM, Hans Feldt wrote: >>>>>> On 03/28/2014 02:14 PM, praveen.malv...@oracle.com wrote: >>>>>>> osaf/services/saf/amf/amfd/sg_2n_fsm.cc | 23 +++++++++++++--------- >>> - >>>>>>> 1 files changed, 13 insertions(+), 10 deletions(-) >>>>>>> >>>>>>> >>>>>>> Problem: During si-swap if quiesced assignment faults, AMF returns >>>>>>> BAD_OPERATION even though si-swap completes successfuly after >>> recovery. >>>>>>> Reason: During si-swap, AMF sends quiesced assignment to active SU. >>>>>>> During quiesced assignments, one of the components faults. >>>>>>> AMF performs cleanup of this failed component and perfroms the >>>>>>> failover of assignments to standby SU. For the faulted SU >>>>>>> assignments are deleted and when it gets sucessfully repaired, AMF >>>>>>> assigns it with standby assignments. >>>>>>> Thus si-swap operation eventually gets successful, but AMF returns >>>>>>> BAD_OPERATION for the operation. >>>>>>> >>>>>>> Fix: Since AMF performs si-swap successfuly despite fault in >>>>>>> quiesced state, this patch ensures that AMF returns SA_AIS_OK for >>>>>>> the operation. >>>>>>> >>>>>>> diff --git a/osaf/services/saf/amf/amfd/sg_2n_fsm.cc >>>>>>> b/osaf/services/saf/amf/amfd/sg_2n_fsm.cc >>>>>>> --- a/osaf/services/saf/amf/amfd/sg_2n_fsm.cc >>>>>>> +++ b/osaf/services/saf/amf/amfd/sg_2n_fsm.cc >>>>>>> @@ -888,16 +888,6 @@ static uint32_t avd_sg_2n_su_fault_su_op >>>>>>> if (su->sg_of_su->su_oper_list.su == su) { >>>>>>> su_ha_state = avd_su_state_determine(su); >>>>>>> if (su_ha_state == SA_AMF_HA_QUIESCED) { >>>>>>> - if (su->su_switch == AVSV_SI_TOGGLE_SWITCH) { >>>>>>> - AVD_SU_SI_REL *temp_susi; >>>>>>> - for (temp_susi = su->list_of_susi; temp_susi != >>>>>>> NULL; temp_susi = temp_susi->su_next) { >>>>>>> - if (temp_susi->si->invocation != 0) { >>>>>>> - avd_saImmOiAdminOperationResult(cb->immOiHandle, >>>>>>> - temp_susi->si->invocation, SA_AIS_ERR_BAD_OPERATION); >>>>>>> - temp_susi->si->invocation = 0; >>>>>>> - } >>>>>>> - } >>>>>>> - } >>>>>>> m_AVD_SET_SU_SWITCH(cb, su, AVSV_SI_TOGGLE_STABLE); >>>>>> Why isn't the above line removed also? >>>>>> >>>>>> This state should be kept until the admin op has been responded, >>>>>> then we forget all about it. >>>>>> >>>>> In the AMF code nowhere we are checking su->su_switch and >>>>> si->invocation simultaneously which means they are kept very much >>>>> independent. So responding for the admin operation is not very much >>>>> dependent on the toggling of su_switch. So we can keep the above >>>>> line. >>>>>>> } else if (su_ha_state == SA_AMF_HA_QUIESCING) { >>>>>>> if (avd_sidep_si_dependency_exists_within_su(su)) { >>>>>>> @@ -2095,6 +2085,19 @@ static uint32_t avd_sg_2n_susi_sucss_su_ >>>>>>> } >>>>>>> >>>>>>> m_AVD_SET_SG_FSM(cb, (su->sg_of_su), >>>>>>> AVD_SG_FSM_SG_REALIGN); >>>>>>> + >>>>>>> + if (su->su_switch == AVSV_SI_TOGGLE_STABLE) { >>>>>> if the above line is removed, this code won't get triggered. The >>>>>> same needs to be placed somewhere else. >>>>>> >>>>> I think if "if statement" is removed, the code inside it will always >>>>> be executed which we do not want, just to not break existing >>>>> functionality. >>>>> I did not get what do you mean by removing and placing it somewhere else. >>>>> >>>>> Thanks >>>>> Praveen >>>>> >>>>>>> + for (AVD_SU_SI_REL *temp_susi = >>>>>>> su->list_of_susi; >>>>>>> + temp_susi != NULL; >>>>>>> + temp_susi = temp_susi->su_next) { >>>>>>> + if (temp_susi->si->invocation != 0) { >>>>>>> + avd_saImmOiAdminOperationResult(cb->immOiHandle, >>>>>>> + temp_susi->si->invocation, SA_AIS_OK); >>>>>>> + temp_susi->si->invocation = 0; >>>>>>> + } >>>>>>> + } >>>>>>> + } >>>>>>> + >>>>>>> } >>>>>>> } else if ((act == AVSV_SUSI_ACT_MOD) && (state == >>>>>>> SA_AMF_HA_STANDBY) && >>>>>>> (su->sg_of_su->su_oper_list.su == su)) { >>>>>>> >>>>>>> >>>>> --------------------------------------------------------------------- >>>>> --------- _______________________________________________ >>>>> Opensaf-devel mailing list >>>>> Opensaf-devel@lists.sourceforge.net >>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>>> >>>> >>> ------------------------------------------------------------------------------ >>> _______________________________________________ >>> Opensaf-devel mailing list >>> Opensaf-devel@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel ------------------------------------------------------------------------------ _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel