Hi Praveen,

     The only admin operations in progress are the ones amfd runs 
internally because of CLM tracking.  No operator is issuing admin lock 
commands.

     Here is the admin lock start:

May 26 14:27:57.501241 osafamfd [22769:node.cc:0983] TR su_cnt_admin_oper:6
May 26 14:27:57.501246 osafamfd [22769:node.cc:1007] << avd_node_admin_lock_unlock_shutdown
May 26 14:27:57.501250 osafamfd [22769:clm.cc:0175] << clm_node_exit_start
May 26 14:27:57.501254 osafamfd [22769:clm.cc:0379] << clm_track_cb

     Here MDS notifies amfd that the node is down, and the node gets deleted:

May 26 14:28:00.218238 osafamfd [22769:ndfsm.cc:0324] >> avd_mds_avnd_down_evh: 2010f, 0x72d160
May 26 14:28:00.218253 osafamfd [22769:ndproc.cc:0923] >> avd_node_failover: 'safAmfNode=SC-1,safAmfCluster=Q50amfCluster'

     Then, when the SA_CLM_CHANGE_COMPLETED callback arrives, the node 
has already been deleted:

May 26 14:28:00.265919 osafamfd [22769:clm.cc:0213] >> clm_track_cb: '0' '4' '1'
May 26 14:28:00.265930 osafamfd [22769:clm.cc:0273] IN clm_track_cb: CLM node 'safNode=SC-1,safCluster=Q50clmCluster' is not an AMF cluster member
May 26 14:28:00.265938 osafamfd [22769:clm.cc:0379] << clm_track_cb
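
     As a rough illustration of the sequence in these traces, here is a 
minimal standalone sketch.  It is not amfd code: the Node and Amfd 
types, the method names, and the name-to-node map are hypothetical, and 
the idea that node failover drops the CLM-name lookup is only one 
possible reading of the "is not an AMF cluster member" trace.  Only 
su_cnt_admin_oper and clm_pend_inv are names taken from the trace and 
the patch quoted below.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for an AMF node record.
struct Node {
  int su_cnt_admin_oper = 0;       // SUs whose admin operation is still pending
  std::uint64_t clm_pend_inv = 0;  // CLM invocation still awaiting a response
  bool admin_locked = false;
};

// Hypothetical model of the director; not the real amfd data structures.
struct Amfd {
  std::map<std::string, Node> amf_nodes;          // keyed by AMF node name
  std::map<std::string, std::string> clm_to_amf;  // CLM node name -> AMF node name
  int responses_sent = 0;                         // stands in for saClmResponse_4()

  void add_node(const std::string& clm, const std::string& amf) {
    amf_nodes[amf] = Node{};
    clm_to_amf[clm] = amf;
  }

  Node* find_by_clm_name(const std::string& clm) {
    auto it = clm_to_amf.find(clm);
    if (it == clm_to_amf.end()) return nullptr;
    return &amf_nodes.at(it->second);
  }

  // START step: lock the node and defer the CLM response until all SUs finish.
  bool clm_start(const std::string& clm, std::uint64_t inv, int su_count) {
    Node* n = find_by_clm_name(clm);
    if (n == nullptr) return false;
    n->admin_locked = true;
    n->su_cnt_admin_oper = su_count;
    n->clm_pend_inv = inv;
    return true;
  }

  // One SU finished; the deferred response goes out after the last one.
  void su_oper_done(const std::string& clm) {
    Node* n = find_by_clm_name(clm);
    if (n == nullptr || n->su_cnt_admin_oper == 0) return;
    if (--n->su_cnt_admin_oper == 0 && n->clm_pend_inv != 0) {
      ++responses_sent;
      n->clm_pend_inv = 0;
    }
  }

  // Node down (assumption): only the CLM-name lookup is dropped, so the
  // node record keeps its pending invocation and locked state.
  void node_down(const std::string& clm) { clm_to_amf.erase(clm); }

  // COMPLETED step: cleanup runs only if the node can still be found.
  bool clm_completed(const std::string& clm) {
    Node* n = find_by_clm_name(clm);
    if (n == nullptr) return false;  // "is not an AMF cluster member"
    n->admin_locked = false;
    n->clm_pend_inv = 0;
    return true;
  }
};
```

     In this model, calling clm_start() with six SUs, then node_down() 
before any su_oper_done(), makes clm_completed() return false.  The 
node record is left with clm_pend_inv set and admin_locked true, and no 
response is ever counted, which matches the symptom described in my 
first mail.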

     Do we consider the CLM operation to have completed when the 
CHANGE_COMPLETED callback comes?

     How do we handle this case?

Alex

On 05/27/2015 09:09 AM, praveen malviya wrote:
>
>
> On 27-May-15 2:58 AM, Alex Jones wrote:
>> Praveen/Nagu,
>>
>>       I'm seeing an issue where the node admin state is different between
>> IMM and amfd.  I can reproduce this very consistently.
>>
>>       If I power down the standby controller (which is also hosting other
>> standby SUs), when it comes back up amfd still thinks the admin state is
>> locked, even though IMM does not.  When I am in this state, if I try to
>> force the admin change, I see:
>>
>> imm.cc:1756] >> report_admin_op_error: inv:124554051585, res:6, Error String: 'Clm lock operation going on'
>>
> Before bringing down the node, did the admin issue a lock on the CLM node?
>
> I think the node was powered down before the CLM lock completed.
>
> Thanks
> Praveen
>>       After looking at the code and the traces, it appears that the
>> ClmResponse to clm_node_exit_start() is never sent.
>> node->su_cnt_admin_oper is 6 which is correct (the number of the SUs),
>> so it waits to send the clm response.
>>
>>       I thought maybe we needed to add this to the end of
>> avd_node_down_mw_susi_failover():
>>
>>          if (avnd->clm_pend_inv != 0) {
>>                  // send CLM response
>>                  LOG_NO("sending CLM response due to node fail");
>>                  saClmResponse_4(cb->clmHandle, avnd->clm_pend_inv,
>>                                  SA_CLM_CALLBACK_RESPONSE_OK);
>>                  avnd->clm_pend_inv = 0;
>>          }
>>
>>       Adding this code doesn't completely clear the problem.  I
>> still have to manually unlock the amf node when it comes back up.
>>
>>       How is this supposed to work?
>>
>> Alex
>>
>>
>> ------------------------------------------------------------------------------
>>  
>>
>> _______________________________________________
>> Opensaf-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/opensaf-users
>>
>


