The attribute saAmfClusterStartupTimeout is currently set to 10 seconds by
default. The timer is only started once all NCS SUs of the active controller
have been assigned. In big clusters, if this timeout is left at 10 seconds,
many nodes have not yet joined the cluster and many SUs are still out of
service when it expires, so AMFD cannot start any assignment at cluster init
timeout:
Aug 19 12:32:05.923649 osafamfd [6705:timer.cc:0066] >> avd_start_tmr: 1
Aug 19 12:32:15.987858 osafamfd [6705:cluster.cc:0055] >> avd_cluster_tmr_init_evh

Aug 19 12:32:15.988226 osafamfd [6705:sg_2n_fsm.cc:2808] >> realign: 'safSg=2N,safApp=ABC-01'
Aug 19 12:32:15.988254 osafamfd [6705:sg_2n_fsm.cc:0606] TR No in service SUs available in the SG

Aug 19 12:32:15.988640 osafamfd [6705:sg_2n_fsm.cc:2808] >> realign: 'safSg=2N,safApp=ABC-02'
Aug 19 12:32:15.988661 osafamfd [6705:sg_2n_fsm.cc:0606] TR No in service SUs available in the SG

However, this does not cause any problem in the cluster start-up scenario,
because AMFD also starts assignment upon receiving avd_su_oper_state_evh(), by
calling su_insvc(). This happens after a node completes joining the cluster.
The earlier a node joins the cluster, the better the chance that its SU is
assigned active.
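
The start-up behaviour described above can be illustrated with a toy model. This
is only a sketch, not AMFD code: the function name, the SU names and the join
times are hypothetical, and the role ordering (first in-service SU active,
second standby, the rest spare) is a simplifying assumption for a 2N SG.

```python
def assign_2n_on_join(join_times, startup_timeout):
    """Toy model of a 2N SG during cluster start-up (not AMFD code).

    join_times: hypothetical SU name -> seconds (after the timer starts)
                at which the hosting node completes joining the cluster.
    Returns (SUs in service when the timer expires, role per SU).
    """
    # SUs that are already in service when the startup timer expires.
    in_service_at_expiry = sorted(
        su for su, t in join_times.items() if t <= startup_timeout
    )

    # Assumption: each SU is assigned as soon as its node has joined,
    # so roles are handed out in join order regardless of the timer.
    roles = {}
    for su, _ in sorted(join_times.items(), key=lambda kv: kv[1]):
        if "active" not in roles.values():
            roles[su] = "active"
        elif "standby" not in roles.values():
            roles[su] = "standby"
        else:
            roles[su] = "spare"
    return in_service_at_expiry, roles


if __name__ == "__main__":
    # Both nodes join after the 10 second timeout, as in the log above.
    print(assign_2n_on_join({"SU1": 25.0, "SU2": 14.0}, 10.0))
```

With these made-up join times no SU is in service when the timer expires, yet
both SUs still end up assigned once their nodes have joined, and the earlier
one gets the active role.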

Also, if not all NCS SUs of the active controller have been assigned, the cb
state is not INIT_DONE and AMFD will reject the node_up msg of all other nodes.

In admin operation continuation after headless, AMFD can't follow a similar
sequence, because the way an SU gets a fresh assignment (su_insvc) is
different from the way an SU continues its pending assignment (susi_success).
AMFD needs all nodes to have joined the cluster before performing a
continuation of an admin operation.
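
To make the timing problem concrete, here is a small sketch that checks which
nodes are still missing when the timer expires. It is a toy model rather than
anything taken from AMFD: the function names, node names and join times are
all hypothetical.

```python
def nodes_missing_at_expiry(join_times, startup_timeout):
    """Return the nodes that have not joined when the timer expires.

    join_times: hypothetical node name -> seconds (after the timer starts)
                at which the node completes joining the cluster.
    """
    return sorted(
        node for node, t in join_times.items() if t > startup_timeout
    )


def continuation_can_proceed(join_times, startup_timeout):
    """True only if every node has joined before the timer expires."""
    return not nodes_missing_at_expiry(join_times, startup_timeout)


if __name__ == "__main__":
    joins = {"PL-3": 8.0, "PL-4": 14.0, "PL-5": 31.0}
    print(nodes_missing_at_expiry(joins, 10.0))
    print(continuation_can_proceed(joins, 60.0))
```

In this model a 10 second timeout leaves two of the three nodes outside the
cluster at expiry, while a longer timeout lets all of them join first.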


---

** [tickets:#1988] AMF: Admin operation continuation does not work with short 
cluster init timeout**

**Status:** assigned
**Milestone:** 5.1.RC1
**Created:** Wed Aug 31, 2016 12:04 AM UTC by Minh Hon Chau
**Last Updated:** Wed Aug 31, 2016 12:04 AM UTC
**Owner:** Minh Hon Chau


In the scenario of admin continuation after headless, if
saAmfClusterStartupTimeout is configured with a short value, the admin
continuation is initiated when saAmfClusterStartupTimeout expires while the SU
is still OUT OF SERVICE. The eventual result is failure of the admin operation
after headless.

