Given a simple start-up sequence after headless, as shown in 2162.seq, there 
are 3 major cases in which SC1 can go down between start-up and headless 
recovery:
1- SC1 goes down before SC2 is successfully indicated as standby controller, 
which happens before the standby assignment for the 2N OpenSAF SU
2- SC2 is standby controller; SC1 goes down before SC2 completes cold sync
3- SC2 completes cold sync; SC1 goes down before cluster initiation is 
done or times out
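
The three cases above can be sketched as a lookup keyed on how far SC2 has 
progressed when SC1 goes down. This is only an illustrative model: the phase 
names and the function `classify_sc1_failure` are hypothetical and do not 
exist in OpenSAF.

```python
from enum import Enum, auto


class StartupPhase(Enum):
    """Hypothetical milestones SC2 passes through, in order."""
    JOINING = auto()            # SC2 not yet indicated as standby controller
    STANDBY_ASSIGNED = auto()   # SC2 is standby, cold sync not complete
    COLD_SYNC_DONE = auto()     # cold sync complete, cluster init pending
    CLUSTER_INIT_DONE = auto()  # cluster initiation done or timed out


def classify_sc1_failure(sc2_phase):
    """Map SC2's phase at the moment SC1 goes down to a case number.

    Returns None when SC1 goes down after cluster initiation, which is
    outside the three cases listed above.
    """
    cases = {
        StartupPhase.JOINING: 1,
        StartupPhase.STANDBY_ASSIGNED: 2,
        StartupPhase.COLD_SYNC_DONE: 3,
    }
    return cases.get(sc2_phase)
```

For example, `classify_sc1_failure(StartupPhase.COLD_SYNC_DONE)` yields case 3.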

If case 2 happens, SC2 is rebooted, which is the expected result today.
Case 1 has been reproduced (see attached file c1.tgz): SC2 gets stuck 
receiving node_up msg from all nodes, because the 2N OpenSAF SU on SC2 
could not be assigned active.
Case 3 is the one primarily reported in this ticket.


Attachments:

- [2162.seq](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/f093e418/493c/attachment/2162.seq) (1.9 kB; application/octet-stream)


---

**[tickets:#2162] AMF: Headless recovery failed if SC failover during headless sync**

**Status:** assigned
**Milestone:** 5.2.FC
**Labels:** headless recovery 
**Created:** Thu Nov 03, 2016 11:01 AM UTC by Minh Hon Chau
**Last Updated:** Thu Nov 03, 2016 11:01 AM UTC
**Owner:** Minh Hon Chau
**Attachments:**

- [log.tgz](https://sourceforge.net/p/opensaf/tickets/2162/attachment/log.tgz) (1.4 MB; application/x-compressed)


Test steps:
- Set up 2N assignment: PL4 hosts SU4 (active assignment), PL5 hosts SU5 
(standby assignment)
- Stop SCs
- Stop PL4
- Restart SC1
- Restart SC2
- Since PL4 is stopped, headless sync will time out after 10 seconds. During 
these 10 seconds, reboot SC1 to trigger SC failover

Observation: SC2 becomes active controller and cold sync completes, but SU5 
still has the standby assignment.

When SC2 becomes active controller, the code that performs headless recovery 
(function failover_absent_assignment()) is not executed. Therefore, the 
transient assignments remain after SC failover.
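
The effect of skipping that step can be shown with a toy model. Everything 
below is an assumption made for illustration: `ToyDirector`, its fields and 
the `run_recovery` flag are hypothetical, and the body given to 
`failover_absent_assignment` here is a guess at what "headless recovery" 
means for a 2N pair (drop the absent SU's assignment, promote the standby). 
It is not the OpenSAF implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDirector:
    """Toy stand-in for the AMF director on SC2; not OpenSAF code."""
    # HA state per SU, as learned from the payloads after headless
    assignments: dict
    # SUs hosted on nodes that never came back (SU4 on PL4 in the test)
    absent_sus: set = field(default_factory=set)
    recovery_done: bool = False

    def failover_absent_assignment(self):
        """Drop assignments of absent SUs; promote a standby if no active is left."""
        for su in self.absent_sus:
            self.assignments.pop(su, None)
        if "active" not in self.assignments.values():
            for su, state in self.assignments.items():
                if state == "standby":
                    self.assignments[su] = "active"
                    break
        self.recovery_done = True

    def on_become_active(self, run_recovery):
        """Called when SC2 takes the active role after SC1 is rebooted."""
        if run_recovery and not self.recovery_done:
            self.failover_absent_assignment()


# Reported behaviour: recovery is skipped on failover, SU5 stays standby.
buggy = ToyDirector({"SU4": "active", "SU5": "standby"}, {"SU4"})
buggy.on_become_active(run_recovery=False)

# Behaviour if recovery also runs on the failover path.
fixed = ToyDirector({"SU4": "active", "SU5": "standby"}, {"SU4"})
fixed.on_become_active(run_recovery=True)
```

In this model `buggy.assignments["SU5"]` remains `"standby"`, matching the 
observation, while `fixed.assignments` ends up with SU5 as active.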

Logs and traces are attached.

