Traces of clmd, amfd and immnd on both controllers, along with the syslog of 
all nodes, are attached.


Attachments:

- [1762.tgz](https://sourceforge.net/p/opensaf/tickets/_discuss/thread/e65ee23f/1931/attachment/1762.tgz) (3.9 MB; application/x-compressed-tar)


---

** [tickets:#1762] CLM : Healthy payloads are marked as Non-member nodes after 
failover**

**Status:** unassigned
**Milestone:** 5.0.RC2
**Created:** Thu Apr 14, 2016 11:08 AM UTC by Srikanth R
**Last Updated:** Thu Apr 14, 2016 11:08 AM UTC
**Owner:** nobody


Setup :
Changeset : 7436 5.0.FC
5-node cluster with the application deployed on PL-3 and PL-4.

Issue :
Healthy payloads are marked as Non-member nodes after failover

Steps performed :

 * Started opensaf on all the nodes, i.e. SC-1 to PL-5.
 * Initially brought up the AMF application deployed on PL-3 and PL-4.
 * Ran some tests on the setup, including switchovers, failovers and CLM lock 
operations on PL-3 and PL-4.
 * Restarted opensafd on PL-4. After the restart, AMF applications on PL-3 got 
the corresponding standby assignment, as expected.
          Below is the trace from osafclmd
 Apr 14 14:15:45.621396 osafclmd [6745:clms_ntf.c:0180] TR Notification for CLM 
node safNode=PL-4,safCluster=myClmCluster exit
 Apr 14 14:15:56.548867 osafclmd [6745:clms_ntf.c:0142] TR Notification for CLM 
node safNode=PL-4,safCluster=myClmCluster Join
 
 
 * Similarly restarted opensafd on PL-3 and the AMF application came up fine.
 Apr 14 14:16:00.890903 osafclmd [6745:clms_ntf.c:0180] TR Notification for CLM 
node safNode=PL-3,safCluster=myClmCluster exit
 Apr 14 14:21:41.602270 osafclmd [6745:clms_ntf.c:0142] TR Notification for CLM 
node safNode=PL-3,safCluster=myClmCluster Join
 
 
 * Now induced a failover by killing ckptd on the active controller SC-1.
 
 * SC-2 took active role.
 Apr 14 14:21:44 CONTROLLER-2 osafamfd[22600]: NO FAILOVER StandBy --> Active
 
 * But the two payloads PL-3 and PL-4 are marked as out of the cluster by AMF. 
PL-5 is still part of the cluster.

Apr 14 14:21:45 CONTROLLER-2 osafamfd[22600]: NO Node 'PL-4' left the cluster
Apr 14 14:21:45 CONTROLLER-2 osafamfd[22600]: NO Node 'PL-3' left the cluster
Apr 14 14:21:45 CONTROLLER-2 osafamfd[22600]: WA avd_msg_sanity_chk: invalid 
node ID (2030f)

 * Below is the trace from osafclmd showing the exit of PL-3 and PL-4, just 
after the promotion to active.
 Apr 14 14:21:45.009100 osafclmd [22590:clms_ntf.c:0180] TR Notification for 
CLM node safNode=PL-4,safCluster=myClmCluster exit
 Apr 14 14:21:45.136368 osafclmd [22590:clms_ntf.c:0180] TR Notification for 
CLM node safNode=PL-3,safCluster=myClmCluster exit
 
 * The AMF applications on PL-3 and PL-4 did not receive any CSI removal 
callback during failover, but the AMF nodes are marked as disabled and the 
attribute saClmNodeIsMember of the CLM objects PL-3 and PL-4 is set to 0. 
opensafd status doesn't show PL-3 and PL-4.
 
 * The CLM APIs on PL-3 and PL-4 failed with ERR_UNAVAILABLE, but the APIs of 
other services such as CKPT and MQSV did not.
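
To cross-check the sequence above against the attached traces, the CLM notification entries can be reduced to the last event seen per node. The following is only a rough sketch in Python, not an OpenSAF tool: the function name `last_clm_events` and the regular expression are illustrative, and the pattern assumes the osafclmd trace layout quoted in this ticket, with each wrapped trace entry joined back onto a single line and fed in chronological order.

```python
import re

# Assumption: trace entries look like the ones quoted in this ticket, with
# each entry on a single line (the mail wrapping undone).
EVENT_RE = re.compile(
    r"(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+) osafclmd "
    r"\[(?P<pid>\d+):[^\]]+\] TR Notification for CLM node "
    r"safNode=(?P<node>[^,]+),\S+ (?P<event>join|exit)",
    re.IGNORECASE,
)


def last_clm_events(lines):
    """Return {node: (timestamp, clmd pid, event)} for the last event per node.

    Lines are expected in chronological order; later entries overwrite
    earlier ones for the same node.
    """
    last = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            last[match.group("node")] = (
                match.group("ts"),
                int(match.group("pid")),
                match.group("event").lower(),
            )
    return last


# The six trace entries quoted in this ticket, unwrapped.
SAMPLE = [
    "Apr 14 14:15:45.621396 osafclmd [6745:clms_ntf.c:0180] TR Notification for CLM node safNode=PL-4,safCluster=myClmCluster exit",
    "Apr 14 14:15:56.548867 osafclmd [6745:clms_ntf.c:0142] TR Notification for CLM node safNode=PL-4,safCluster=myClmCluster Join",
    "Apr 14 14:16:00.890903 osafclmd [6745:clms_ntf.c:0180] TR Notification for CLM node safNode=PL-3,safCluster=myClmCluster exit",
    "Apr 14 14:21:41.602270 osafclmd [6745:clms_ntf.c:0142] TR Notification for CLM node safNode=PL-3,safCluster=myClmCluster Join",
    "Apr 14 14:21:45.009100 osafclmd [22590:clms_ntf.c:0180] TR Notification for CLM node safNode=PL-4,safCluster=myClmCluster exit",
    "Apr 14 14:21:45.136368 osafclmd [22590:clms_ntf.c:0180] TR Notification for CLM node safNode=PL-3,safCluster=myClmCluster exit",
]

if __name__ == "__main__":
    for node, (ts, pid, event) in sorted(last_clm_events(SAMPLE).items()):
        print(f"{node}: last event '{event}' at {ts} (osafclmd pid {pid})")
```

On the entries quoted here, the last event for both PL-3 and PL-4 is an exit logged by the osafclmd process with pid 22590, with no later Join, which matches the saClmNodeIsMember value of 0 reported above.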
 
 


