- **Milestone**: 4.7.2 --> 5.0.2


---

**[tickets:#1935] amfnd: amfnd is not resetting escalations params.**

**Status:** review
**Milestone:** 5.0.2
**Created:** Thu Aug 04, 2016 12:51 PM UTC by Praveen
**Last Updated:** Fri Sep 02, 2016 05:36 AM UTC
**Owner:** Praveen
**Attachments:**

- [add_su.xml](https://sourceforge.net/p/opensaf/tickets/1935/attachment/add_su.xml) (2.4 kB; text/xml)
- [messages](https://sourceforge.net/p/opensaf/tickets/1935/attachment/messages) (79.5 kB; application/octet-stream)
- [nodeswitch.xml](https://sourceforge.net/p/opensaf/tickets/1935/attachment/nodeswitch.xml) (9.5 kB; text/xml)
- [osafamfd](https://sourceforge.net/p/opensaf/tickets/1935/attachment/osafamfd) (11.0 MB; application/octet-stream)
- [osafamfnd](https://sourceforge.net/p/opensaf/tickets/1935/attachment/osafamfnd) (3.7 MB; application/octet-stream)


When AMFND sends a node-switchover recovery request to AMFD, AMFD performs 
failover/switchover of the SUs on the failed node. After the assignments of 
all application SUs on the failed node have been removed, AMFD reboots the 
node if NodeAutoRepair is enabled. 
Consider the case where NodeAutoRepair is not enabled. In this case, AMFD will 
not reboot the failed node. 
So the failed node is still present and in the repair-pending state. Now a 
user performs the following operations:
1) Deletes the failed SU from the failed node, along with its comps, after 
lock and lock-in of the SU.
2) Adds the same SU and comp again.
3) Performs the unlock-in operation.
Because of the unlock-in operation, AMFND on the failed node will instantiate 
the SU. This acts as a trigger, and AMFND in avnd_su_pres_fsm_run() will try 
to inform AMFD of node-switchover again, since AMFND has not cleared some 
global variables related to escalation. Here AMFND may crash. In one case a 
crash was observed, and in another case AMFND successfully sent the recovery 
request to AMFD again.
In the crashed case, AMFND is accessing illegal memory. 
In the successful case, the SU had got the same address:
Aug  4 17:42:13.658513 osafamfnd [14409:susm.cc:1412] NO Informing director of Nodeswitchover
Aug  4 17:42:13.658519 osafamfnd [14409:di.cc:0724] >> avnd_di_oper_send: SU '0x2534350', recv '4'
Aug  4 17:42:13.658525 osafamfnd [14409:di.cc:0737] TR SU 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, su oper '2'

Aug  4 17:42:58.449067 osafamfnd [14409:susm.cc:1412] NO Informing director of Nodeswitchover
Aug  4 17:42:58.449080 osafamfnd [14409:di.cc:0724] >> avnd_di_oper_send: SU '0x2534350', recv '4'
Aug  4 17:42:58.449093 osafamfnd [14409:di.cc:0737] TR SU 'safSu=SU1,safSg=AmfDemo,safApp=AmfDemo1, su oper '1'
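
The stale-escalation problem described above can be pictured with a small model. This is only a sketch under assumptions, not the amfnd code: `Node`, `Su`, `NodeEscalation`, `EscalationLevel` and every member function below are hypothetical names invented for illustration. It also assumes that the escalation state is reset at the moment the SU is deleted; the real fix may reset it at a different point.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the real amfnd types.
enum class EscalationLevel { kNone, kNodeSwitchover };

struct Su {
  std::string name;
};

// Node-wide escalation bookkeeping, playing the role of the "global
// variables related to escalation" mentioned in the ticket.
struct NodeEscalation {
  EscalationLevel level = EscalationLevel::kNone;
  const Su* failed_su = nullptr;  // non-owning; must not outlive the SU
};

class Node {
 public:
  // Creates (or re-creates) an SU with the given name.
  Su* add_su(const std::string& name) {
    auto& slot = sus_[name];
    slot = std::make_unique<Su>();
    slot->name = name;
    return slot.get();
  }

  // Records that a fault in `su` escalated to node-switchover.
  void escalate_to_node_switchover(const Su* su) {
    esc_.level = EscalationLevel::kNodeSwitchover;
    esc_.failed_su = su;
  }

  // Deleting an SU also clears any escalation state that points at it, so
  // no pointer dangles and a later instantiation starts from a clean state.
  void delete_su(const std::string& name) {
    auto it = sus_.find(name);
    if (it == sus_.end()) return;
    if (esc_.failed_su == it->second.get()) esc_ = NodeEscalation{};
    sus_.erase(it);
  }

  // What a presence-FSM step might consult when an SU gets instantiated.
  bool would_report_node_switchover() const {
    return esc_.level == EscalationLevel::kNodeSwitchover;
  }

  const NodeEscalation& escalation() const { return esc_; }

 private:
  std::map<std::string, std::unique_ptr<Su>> sus_;
  NodeEscalation esc_;
};
```

In this model, omitting the reset in `delete_su()` would leave `failed_su` pointing at freed memory and the level still at node-switchover. That would match the two outcomes reported here: an illegal memory access, or a repeated recovery request when the re-added SU happens to be allocated at the same address.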

Attached are the configuration and traces. Steps to reproduce:
1) Bring up the configuration with the nodeautorepair flag disabled.
2) Kill a comp in SU1.
3) Lock and lock-in the SU, and then delete it.
4) Now add the SU again using the attached add_su.xml.
5) Unlock-in the SU.



