For now, only the excessive assignments in NwayActive and NoRed SGs are removed.
For 2N, once a 2N SG is detected to have any excessive assignment, every node
that has an assignment in that SG is rebooted. In the scenario from the ticket's
description, the six payloads together hold six 2N assignments, so all payloads
are rebooted. This aligns with the behavior of #2920, which reboots the nodes
that have duplicated 2N assignments.
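The 2N policy described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `Assignment`, `has_excessive_2n_assignments()` and `nodes_to_reboot()` are hypothetical names, not amfd types or functions, and the real check in amfd may be structured differently.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of one SU-SI assignment; not an amfd type.
enum class HaState { Active = 1, Standby = 2 };

struct Assignment {
  std::string su;    // e.g. "safSu=1,safSg=1,safApp=osaftest"
  std::string node;  // e.g. "PL-3"
  std::string si;    // e.g. "safSi=A,safApp=osaftest"
  HaState ha;
};

// Returns true if any SI has more than one active or more than one standby
// assignment, which a 2N SG must never have.
bool has_excessive_2n_assignments(const std::vector<Assignment>& sg) {
  std::map<std::string, int> active, standby;
  for (const auto& a : sg) {
    int& count = (a.ha == HaState::Active) ? active[a.si] : standby[a.si];
    if (++count > 1) return true;
  }
  return false;
}

// Once the SG is found to be over-assigned, every node that holds an
// assignment in that SG is selected for reboot (assumed policy, see above).
std::set<std::string> nodes_to_reboot(const std::vector<Assignment>& sg) {
  std::set<std::string> nodes;
  if (!has_excessive_2n_assignments(sg)) return nodes;
  for (const auto& a : sg) nodes.insert(a.node);
  return nodes;
}
```

With the six assignments from the ticket (one per payload, three active and three standby), `nodes_to_reboot()` selects all six payloads; with a single active/standby pair it selects none.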
---
**[tickets:#2929] amfd: Too many assignments after split brain**
**Status:** accepted
**Milestone:** 5.18.09
**Created:** Wed Sep 19, 2018 12:01 PM UTC by Minh Hon Chau
**Last Updated:** Wed Sep 26, 2018 07:32 PM UTC
**Owner:** Minh Hon Chau
**Attachments:**
- [amfd.tgz](https://sourceforge.net/p/opensaf/tickets/2929/attachment/amfd.tgz) (404.2 kB; application/x-compressed-tar)
With ticket #2920, amfd reboots the nodes that have duplicated 2N assignments
after a split brain. However, if spare 2N SUs are hosted on the payloads, amfd
does not reboot the nodes that have duplicated assignments, and those
assignments remain intact after the split brain.
Configuration:
- Start the cluster with 2 controllers and 6 payloads
- SC absence is enabled
- 6 SUs are hosted on the 6 payloads respectively, i.e. SU1 on PL3, SU2 on PL4,
...
- SU1 and SU2 are initially given the 2N Active/Standby assignments
- Split the network to separate PL3 and PL4 from the cluster. Since SC absence
is enabled, PL3 and PL4 don't reboot
- Now SU3 (PL5) and SU4 (PL6) are given the 2N Active/Standby assignments
- Split the network into two partitions, [SC1, PL5, PL6] and [SC2, PL7, PL8]
- In the second partition SC2 becomes active, so the SUs on PL7 and PL8 are
given assignments.
- Restart the SCs.
- After both SCs come back from reboot, there are 3 Active and 3 Standby
assignments in total
Synced from headless
~~~
2018-09-19 19:00:29.857 PL-3 osafamfnd[193]: NO Synced
SU:safSu=1,safSg=1,safApp=osaftest <0, 1, 3>
2018-09-19 19:00:29.857 PL-3 osafamfnd[193]: NO Synced
SISU:safSi=NoRed4,safApp=OpenSAF,safSu=PL-3,safSg=NoRed,safApp=OpenSAF <1, 3>
2018-09-19 19:00:29.857 PL-4 osafamfnd[193]: NO Synced
SISU:safSi=A,safApp=osaftest,safSu=2,safSg=1,safApp=osaftest <2, 3>
2018-09-19 19:00:29.857 PL-4 osafamfnd[193]: NO Synced
SU:safSu=2,safSg=1,safApp=osaftest <0, 1, 3>
2018-09-19 21:46:23.318 PL-5 osafamfnd[193]: NO Synced
SISU:safSi=A,safApp=osaftest,safSu=3,safSg=1,safApp=osaftest <1, 3>
2018-09-19 21:46:23.319 PL-5 osafamfnd[193]: NO Synced
SU:safSu=3,safSg=1,safApp=osaftest <0, 1, 3>
2018-09-19 21:46:23.319 PL-6 osafamfnd[193]: NO Synced
SISU:safSi=A,safApp=osaftest,safSu=4,safSg=1,safApp=osaftest <2, 3>
2018-09-19 21:46:23.319 PL-6 osafamfnd[193]: NO Synced
SU:safSu=4,safSg=1,safApp=osaftest <0, 1, 3>
2018-09-19 21:46:23.318 PL-8 osafamfnd[193]: NO Synced
SISU:safSi=A,safApp=osaftest,safSu=6,safSg=1,safApp=osaftest <1, 3>
2018-09-19 21:46:23.318 PL-8 osafamfnd[193]: NO Synced
SU:safSu=6,safSg=1,safApp=osaftest <0, 1, 3>
2018-09-19 21:46:23.320 PL-7 osafamfnd[193]: NO Synced
SISU:safSi=A,safApp=osaftest,safSu=5,safSg=1,safApp=osaftest <2, 3>
2018-09-19 21:46:23.320 PL-7 osafamfnd[193]: NO Synced
SU:safSu=5,safSg=1,safApp=osaftest <0, 1, 3>
~~~
In avd_sg_2n_act_susi(), amfd always picks the first two assignments, which are
correctly active and standby, so amfd did not reboot the nodes.
~~~
<143>1 2018-09-19T21:46:26.819976+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33675"] 262:amf/amfd/sg_2n_fsm.cc:677 >> avd_sg_2n_su_chose_asgn:
'safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.819982+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33676"] 262:amf/amfd/si_dep.cc:711 >>
avd_sidep_update_si_dep_state_for_all_sis: 'safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.819989+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33677"] 262:amf/amfd/si_dep.cc:718 <<
avd_sidep_update_si_dep_state_for_all_sis
<143>1 2018-09-19T21:46:26.819995+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33678"] 262:amf/amfd/sg_2n_fsm.cc:522 >> avd_sg_2n_act_susi:
'safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820001+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33679"] 262:amf/amfd/sg_2n_fsm.cc:532 TR
si'safSi=A,safApp=osaftest', su'safSu=1,safSg=1,safApp=osaftest',
si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820008+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33680"] 262:amf/amfd/sg_2n_fsm.cc:536 TR
si'safSi=A,safApp=osaftest', su'safSu=2,safSg=1,safApp=osaftest',
si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820014+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33681"] 262:amf/amfd/sg_2n_fsm.cc:550 TR
su_1'safSu=1,safSg=1,safApp=osaftest', su_2'safSu=2,safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820021+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33682"] 262:amf/amfd/sg_2n_fsm.cc:282 >> su_assigned_susi_find:
'safSu=1,safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820027+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33683"] 262:amf/amfd/sg_2n_fsm.cc:288 TR Act
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820033+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33684"] 262:amf/amfd/sg_2n_fsm.cc:303 TR act_found'1',
quisced_found'0', std_found'0'
<143>1 2018-09-19T21:46:26.82004+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33685"] 262:amf/amfd/sg_2n_fsm.cc:312 TR si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820046+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33686"] 262:amf/amfd/sg_2n_fsm.cc:317 TR
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820052+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33687"] 262:amf/amfd/sg_2n_fsm.cc:323 TR
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820058+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33688"] 262:amf/amfd/sg_2n_fsm.cc:325 TR Act
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820064+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33689"] 262:amf/amfd/sg_2n_fsm.cc:327 TR Std
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820071+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33690"] 262:amf/amfd/sg_2n_fsm.cc:346 TR 3. Act
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820077+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33691"] 262:amf/amfd/sg_2n_fsm.cc:348 TR 3. Std
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820083+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33692"] 262:amf/amfd/sg_2n_fsm.cc:493 << su_assigned_susi_find: act
su: 'safSu=1,safSg=1,safApp=osaftest', stdby su:
'safSu=2,safSg=1,safApp=osaftest', si: 'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.82009+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33693"] 262:amf/amfd/sg_2n_fsm.cc:282 >> su_assigned_susi_find:
'safSu=2,safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820097+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33694"] 262:amf/amfd/sg_2n_fsm.cc:297 TR Stdby
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820103+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33695"] 262:amf/amfd/sg_2n_fsm.cc:303 TR act_found'0',
quisced_found'0', std_found'1'
<143>1 2018-09-19T21:46:26.820109+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33696"] 262:amf/amfd/sg_2n_fsm.cc:446 TR si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820115+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33697"] 262:amf/amfd/sg_2n_fsm.cc:451 TR
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820121+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33698"] 262:amf/amfd/sg_2n_fsm.cc:466 TR
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820128+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33699"] 262:amf/amfd/sg_2n_fsm.cc:468 TR Act
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820134+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33700"] 262:amf/amfd/sg_2n_fsm.cc:470 TR Std
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.82014+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33701"] 262:amf/amfd/sg_2n_fsm.cc:480 TR 3. Act
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820146+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33702"] 262:amf/amfd/sg_2n_fsm.cc:482 TR 3. Std
su'safSu=2,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820151+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33703"] 262:amf/amfd/sg_2n_fsm.cc:493 << su_assigned_susi_find: act
su: 'safSu=1,safSg=1,safApp=osaftest', stdby su:
'safSu=2,safSg=1,safApp=osaftest', si: 'safSi=A,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820156+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33704"] 262:amf/amfd/sg_2n_fsm.cc:4127 >> avd_su_state_determine:
SU 'safSu=1,safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.82016+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33705"] 262:amf/amfd/sg_2n_fsm.cc:4152 TR Assigned
su'safSu=1,safSg=1,safApp=osaftest', si'safSi=A,safApp=osaftest', state'1'
<143>1 2018-09-19T21:46:26.820163+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33706"] 262:amf/amfd/sg_2n_fsm.cc:4158 TR act_found'0',
quisced_found'0', quiscing_found'0'
<143>1 2018-09-19T21:46:26.820167+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33707"] 262:amf/amfd/sg_2n_fsm.cc:4175 << avd_su_state_determine:
state '1'
<143>1 2018-09-19T21:46:26.82017+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33708"] 262:amf/amfd/sg_2n_fsm.cc:640 << avd_sg_2n_act_susi: act:
'safSu=1,safSg=1,safApp=osaftest', stdby: 'safSu=2,safSg=1,safApp=osaftest'
<143>1 2018-09-19T21:46:26.820173+10:00 SC-1 osafamfd 262 osafamfd [meta
sequenceId="33709"] 262:amf/amfd/sg_2n_fsm.cc:801 << avd_sg_2n_su_chose_asgn:
'(null)'
~~~
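The trace ends with avd_sg_2n_act_susi() returning SU1 as active and SU2 as standby, which looks like a healthy 2N pair. A minimal sketch of why inspecting only the first two assignments hides the duplicates is given below. The function names are hypothetical and this is not the amfd implementation; the numeric values follow the SAF HA states (1 = active, 2 = standby).

```cpp
#include <cstddef>
#include <vector>

// SAF HA state values: 1 = active, 2 = standby.
enum HaState { kActive = 1, kStandby = 2 };

// Inspects only the first two entries (hypothetical helper mirroring the
// behavior seen in the trace).
bool first_two_look_valid(const std::vector<HaState>& susis) {
  return susis.size() >= 2 && susis[0] == kActive && susis[1] == kStandby;
}

// Scans every entry and reports more than one active or standby.
bool has_duplicates(const std::vector<HaState>& susis) {
  std::size_t active = 0, standby = 0;
  for (HaState ha : susis) {
    if (ha == kActive) {
      ++active;
    } else if (ha == kStandby) {
      ++standby;
    }
  }
  return active > 1 || standby > 1;
}
```

For the synced states of SU1 to SU6 in this ticket (active, standby, active, standby, standby, active), the first check passes while the full scan reports duplicates.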