ack, code review only/Thanks HansN

On 10/05/2015 12:47 PM, [email protected] wrote:
>  osaf/services/saf/amf/amfd/sgproc.cc |  21 +++++++++++++++++++++
>  1 files changed, 21 insertions(+), 0 deletions(-)
>
>
> NG gets stuck in SHUTTING_DOWN state during shutdown op and controller failover.
>
> During SHUTDOWN admin operation on NG, initial admin state is set to SHUTTING_DOWN and
> it is checkpointed to standby AMFD. On decoding it, standby AMFD sets node->admin_ng
> and it clears it when active AMFD checkpoints the LOCKED state. Now after fail-over when
> AMFD gets quiescing success response from AMFND it clears this pointer in
> process_su_si_response_for_ng() assuming there is only one SU hosted on that node.
> After this when response for second SU comes, this response is not processed from NG
> perspective as AMFD has already cleared node->admin_ng. Issue does not occur when node hosts
> only one application SU.
>
> Patch fixes the problem by avoiding clearing of node->admin_ng when NG is in
> SHUTTING_DOWN state.
>
> diff --git a/osaf/services/saf/amf/amfd/sgproc.cc b/osaf/services/saf/amf/amfd/sgproc.cc
> --- a/osaf/services/saf/amf/amfd/sgproc.cc
> +++ b/osaf/services/saf/amf/amfd/sgproc.cc
> @@ -400,6 +400,27 @@ void process_su_si_response_for_ng(AVD_S
>        ng->node_oper_list.erase(Amf::to_string(&node->name));
>        TRACE("node_oper_list size:%u",ng->oper_list_size());
>      }
> +
> +  /*Handling for the case: There are pending assignments on more than one SUs
> +    on same node of nodegroup with atleast one quiescing assignment and controller
> +    failover occured.
> +    Below if block will be hit only when assignments for quiescing state are still pending
> +    on atleast one SU and on atleast one node of NG.
> +   */
> +  if ((ng->saAmfNGAdminState == SA_AMF_ADMIN_SHUTTING_DOWN) &&
> +      (ng->admin_ng_pend_cbk.admin_oper == 0) &&
> +      (ng->admin_ng_pend_cbk.invocation == 0)) {
> +    /*During SHUTDOWN admin operation on NG, initial admin state is set to SHUTTING_DOWN
> +      and it is checkpointed to standby AMFD. On decoding it, standby AMFD sets
> +      node->admin_ng and it clears it when active AMFD checkpoints the LOCKED state.
> +      In case active AMFD sends quiescing state and reboots after checkpointing only
> +      SHUTTING_DOWN state, standby AMFD will be able to mark NG LOCKED by processing
> +      response of assignments as it has set node->admin_ng. So this pointer should be
> +      cleared only when NG is marked LOCKED. And in that case we will not be in this if block.
> +     */
> +    TRACE_1("'%s' in shutting_down state after failover.",ng->name.value);
> +    goto done;
> +  }
>    /*If assignment changes are done on all the SUs on each node of nodegroup
>      then reply to IMM for status of admin operation.*/
>    if (ng->node_oper_list.empty())
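For readers following the thread, the failure mode described in the commit message can be sketched with a toy model. This is not the OpenSAF code: `NodeGroup`, `Node`, `on_quiescing_response()` and `simulate()` are hypothetical stand-ins, and the `pending_sus` counter is an assumption used here in place of the real `node_oper_list` and `admin_ng_pend_cbk` bookkeeping. It only illustrates why clearing `admin_ng` on the first response leaves the NG in SHUTTING_DOWN when the node hosts two SUs.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the AMFD types (not the real classes).
enum class AdminState { ShuttingDown, Locked };

struct NodeGroup {
    AdminState admin_state = AdminState::ShuttingDown;
};

struct Node {
    NodeGroup* admin_ng = nullptr;  // non-null while an NG admin op is in progress
    int pending_sus = 0;            // assumed counter: SUs still awaiting a response
};

// Handle one quiescing response from an SU on this node.
// patched == false: clear admin_ng on the first response (the reported behaviour).
// patched == true:  keep admin_ng while the NG is still SHUTTING_DOWN and
//                   responses are outstanding (the idea behind the patch).
void on_quiescing_response(Node& node, bool patched) {
    NodeGroup* ng = node.admin_ng;
    --node.pending_sus;
    if (ng == nullptr) {
        return;  // response is not processed from the NG perspective
    }
    if (patched && ng->admin_state == AdminState::ShuttingDown &&
        node.pending_sus > 0) {
        return;  // keep the pointer until the NG can be marked LOCKED
    }
    if (node.pending_sus == 0) {
        ng->admin_state = AdminState::Locked;
    }
    node.admin_ng = nullptr;
}

// Run su_count responses through the handler and report the final NG state.
AdminState simulate(int su_count, bool patched) {
    assert(su_count > 0);
    NodeGroup ng;
    Node node;
    node.admin_ng = &ng;
    node.pending_sus = su_count;
    for (int i = 0; i < su_count; ++i) {
        on_quiescing_response(node, patched);
    }
    return ng.admin_state;
}
```

In this model, `simulate(2, false)` leaves the NG in `ShuttingDown`, while `simulate(2, true)` and the single-SU case `simulate(1, false)` both end in `Locked`, which matches the observation that the issue does not occur with only one application SU on the node.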
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
