Praveen,

     (Moving this to devel list.)

     I should also point out that this is triggered by a PLM blade
shutdown, so the shutdown propagates PLM->CLM->AMF.

     From the code, it doesn't look like this PLM case is handled
correctly.  Do we really want to change the admin state for this?  It's
not really an admin operation, though I guess there is no other way to
do it in the current code.  I also don't see where the admin state gets
changed back for this case.

 1. Does it make sense to check in clm.cc:clm_track_cb for "safEE="
    under NODE_LEFT || NODE_SHUTDOWN, and save the
    node->saAmfNodeAdminState, and then restore it when
    clm_node_exit_start finishes?
 2. And then add the code from my earlier mail (quoted below) to
    avd_node_down_mw_susi_failover?
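
To make the two items above a bit more concrete, here is a rough
standalone sketch of the save/restore idea.  It is not amfd code: Node,
AdminState, is_plm_triggered, on_clm_exit_start, on_clm_exit_done and
the saved_admin_state field are all made-up names for illustration, and
which DN actually carries "safEE=" in clm_track_cb is an assumption on
my part.  It also assumes the exit-start path is what moves the node to
locked.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for SaAmfAdminStateT; only the two states the sketch needs.
enum class AdminState { Unlocked, Locked };

// Hypothetical, stripped-down stand-in for amfd's per-node object.
struct Node {
  AdminState saAmfNodeAdminState = AdminState::Unlocked;
  std::uint64_t clm_pend_inv = 0;  // pending CLM invocation, 0 means none
  bool admin_state_saved = false;  // hypothetical new field
  AdminState saved_admin_state = AdminState::Unlocked;  // hypothetical new field
};

// Assumption: a PLM-driven change can be recognised by "safEE=" in a DN
// that the track callback has access to.
bool is_plm_triggered(const std::string& dn) {
  return dn.find("safEE=") != std::string::npos;
}

// Item 1, first half: on NODE_LEFT / NODE_SHUTDOWN, remember the admin
// state before the exit-start path overwrites it.
void on_clm_exit_start(Node& node, const std::string& dn, std::uint64_t inv) {
  assert(inv != 0);
  if (is_plm_triggered(dn)) {
    node.saved_admin_state = node.saAmfNodeAdminState;
    node.admin_state_saved = true;
  }
  node.saAmfNodeAdminState = AdminState::Locked;  // assumed current behaviour
  node.clm_pend_inv = inv;
}

// Item 1, second half, plus item 2: when exit-start finishes (or the node
// goes down first), clear the pending invocation and restore the saved
// state.  Returns the invocation the caller should answer (0 if none).
std::uint64_t on_clm_exit_done(Node& node) {
  const std::uint64_t inv = node.clm_pend_inv;
  node.clm_pend_inv = 0;
  if (node.admin_state_saved) {
    node.saAmfNodeAdminState = node.saved_admin_state;
    node.admin_state_saved = false;
  }
  return inv;
}
```

In the real code the value returned by on_clm_exit_done would be what
gets passed to saClmResponse_4, and a non-PLM lock would leave the node
locked, since nothing was saved for it.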


Alex


On 05/27/2015 09:09 AM, praveen malviya wrote:
>
>
> On 27-May-15 2:58 AM, Alex Jones wrote:
>> Praveen/Nagu,
>>
>>       I'm seeing an issue where the node admin state is different
>> between IMM and amfd.  I can reproduce this very consistently.
>>
>>       If I power down the standby controller (which is also hosting
>> other standby SUs), when it comes back up amfd still thinks the admin
>> state is locked, even though IMM does not.  When I am in this state, if
>> I try to force the admin change, I see:
>>
>> imm.cc:1756] >> report_admin_op_error: inv:124554051585, res:6, Error String: 'Clm lock operation going on'
>>
> Before bringing down the node, did the admin issue a lock on the CLM node?
>
> I think the node was powered down before the CLM lock completed.
>
> Thanks
> Praveen
>>       After looking at the code and the traces, it appears that the
>> ClmResponse to clm_node_exit_start() is never sent.
>> node->su_cnt_admin_oper is 6, which is correct (the number of SUs),
>> so amfd keeps waiting and never sends the CLM response.
>>
>>       I thought maybe we needed to add this to the end of
>> avd_node_down_mw_susi_failover():
>>
>>          if (avnd->clm_pend_inv != 0) {
>>                  // send CLM response
>>                  LOG_NO("sending CLM response due to node fail");
>>                  saClmResponse_4(cb->clmHandle, avnd->clm_pend_inv,
>>                                  SA_CLM_CALLBACK_RESPONSE_OK);
>>                  avnd->clm_pend_inv = 0;
>>          }
>>
>>       Adding this code doesn't totally clear the problem.  I still
>> have to manually unlock the AMF node when it comes back up.
>>
>>       How is this supposed to work?
>>
>> Alex
>>
>>
>> ------------------------------------------------------------------------------
>>  
>>
>> _______________________________________________
>> Opensaf-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/opensaf-users
>>
>

------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
