- Description has changed:

Diff:

~~~~

--- old
+++ new
@@ -60,3 +60,16 @@
 Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Initializing cgroup subsys cpuset
 Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Initializing cgroup subsys cpu
 Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Linux version 2.6.32.67 (root@SLOT-1) 
(gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Jul 29 
11:31:39 I
+
+
+
+-----
+
+When the sync server fails to send sync-finalize message, it keeps waiting for 
the sync-finalize message to come over fevs.
+
+Apr  6  7:55:35.719179 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, 
but waiting for confirmed finalizeSync. Out queue:0
+Apr  6  7:55:35.819531 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, 
but waiting for confirmed finalizeSync. Out queue:0
+Apr  6  7:55:35.919867 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, 
but waiting for confirmed finalizeSync. Out queue:0
+Apr  6  7:55:36.020219 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, 
but waiting for confirmed finalizeSync. Out queue:0
+
+So it's stuck in IMM_SERVER_SYNC_SERVER state and can't handle new sync 
request from newly joined nodes. That's why PL-4 couldn't join the cluster. The 
coordinator can only announce sync when it's in IMM_SERVER_READY state.

~~~~




---

**[tickets:#1735] imm: IMMND doesn't handle error when sending finalize-sync message over fevs**

**Status:** accepted
**Milestone:** 4.6.2
**Created:** Thu Apr 07, 2016 04:54 AM UTC by Madhurika Koppula
**Last Updated:** Wed Apr 13, 2016 08:10 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [immnd.tgz](https://sourceforge.net/p/opensaf/tickets/1735/attachment/immnd.tgz) (41.2 MB; application/octet-stream)


Setup:
Changeset: 7436
Version: OpenSAF 5.0
4 nodes configured with a single PBE

Issue observed: 

1) The issue was observed after the scenario described in ticket #1733.

2) After reboot, IMMND failed to come up, logging "NODE STATE-> IMM_NODE_ISOLATED", and could not be respawned properly.

3) The PL-4 node did not join the cluster despite multiple starts, until the cluster was reset.


The following is the syslog from payload PL-4:

Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: Started
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO SETTING COORD TO 0 CLOUD PROTO
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO NODE STATE-> IMM_NODE_ISOLATED


Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER Timed-out for response from IMMND
Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER
Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER Going for recovery
Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER Sending SIGABRT to IMMND, pid=1646, (origin parent pid=1636)
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: Started
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO Fevs count adjusted to 51312 preLoadPid: 0
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO SETTING COORD TO 0 CLOUD PROTO
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Apr  6 11:01:07 OEL_M-SLOT-4 osafimmnd[1808]: NO NODE STATE-> IMM_NODE_ISOLATED


Apr  6 11:09:07 OEL_M-SLOT-4 opensafd[1616]: ER Timed-out for response from IMMND
Apr  6 11:09:07 OEL_M-SLOT-4 opensafd[1616]: ER Could Not RESPAWN IMMND
Apr  6 11:09:07 OEL_M-SLOT-4 opensafd[1616]: ER
Apr  6 11:09:07 OEL_M-SLOT-4 opensafd[1616]: ER Trying To RESPAWN /usr/lib64/opensaf/clc-cli/osaf-immnd attempt #2
Apr  6 11:09:07 OEL_M-SLOT-4 opensafd[1616]: ER Sending SIGABRT to IMMND, pid=1808, (origin parent pid=1802)
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: Started
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO Persistent Back-End capability configured, Pbe file:imm.db (suffix may get added)
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO IMMD service is UP ... ScAbsenseAllowed?:0 introduced?:0
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO SETTING COORD TO 0 CLOUD PROTO
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Apr  6 11:09:22 OEL_M-SLOT-4 osafimmnd[1976]: NO NODE STATE-> IMM_NODE_ISOLATED


Apr  6 11:10:05 OEL_M-SLOT-4 kernel: imklog 5.8.10, log source = /proc/kmsg started.
Apr  6 11:10:05 OEL_M-SLOT-4 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1217" x-info="http://www.rsyslog.com";] start
Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Initializing cgroup subsys cpuset
Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Initializing cgroup subsys cpu
Apr  6 11:10:05 OEL_M-SLOT-4 kernel: Linux version 2.6.32.67 (root@SLOT-1) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Jul 29 11:31:39 I
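
For triage, the PL-4 syslog above can be condensed into one record per IMMND process showing the last server state it reached and whether it reported IMM_NODE_ISOLATED. The following is only a rough sketch: it assumes each log entry sits on a single line, the `summarize` helper and the `year` default are hypothetical (syslog timestamps carry no year), and the sample lines are copied from the log above.

```python
import re
from datetime import datetime

# One syslog entry: timestamp, host, process[pid], message.
ENTRY = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<proc>\w+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)
# "SERVER STATE: OLD --> NEW" as printed by osafimmnd.
TRANSITION = re.compile(r"SERVER STATE:\s*(\w+)\s*-->\s*(\w+)")

SAMPLE = [
    "Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: Started",
    "Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO SERVER STATE: "
    "IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING",
    "Apr  6 10:52:52 OEL_M-SLOT-4 osafimmnd[1646]: NO NODE STATE-> IMM_NODE_ISOLATED",
    "Apr  6 11:00:52 OEL_M-SLOT-4 opensafd[1616]: ER Timed-out for response from IMMND",
]


def summarize(lines, year=2016):
    """Map each osafimmnd pid to its start time, last server state and isolation flag."""
    runs = {}
    for line in lines:
        m = ENTRY.match(line)
        if not m or m["proc"] != "osafimmnd":
            continue  # skip kernel, rsyslogd and opensafd entries
        stamp = " ".join(m["ts"].split())  # collapse the padded day field
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        run = runs.setdefault(
            int(m["pid"]), {"start": ts, "state": None, "isolated": False}
        )
        t = TRANSITION.search(m["msg"])
        if t:
            run["state"] = t.group(2)  # keep only the most recent target state
        if "IMM_NODE_ISOLATED" in m["msg"]:
            run["isolated"] = True
    return runs


if __name__ == "__main__":
    for pid, run in summarize(SAMPLE).items():
        print(pid, run["start"], run["state"], run["isolated"])
```

Run against the full log, each of the three IMMND instances (pids 1646, 1808, 1976) should end in IMM_SERVER_SYNC_PENDING with the isolation flag set, which matches the respawn loop reported above.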



-----

When the sync server fails to send the sync-finalize message, it still keeps waiting for that message to arrive over fevs, even though it was never sent.

Apr  6  7:55:35.719179 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, but waiting for confirmed finalizeSync. Out queue:0
Apr  6  7:55:35.819531 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, but waiting for confirmed finalizeSync. Out queue:0
Apr  6  7:55:35.919867 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, but waiting for confirmed finalizeSync. Out queue:0
Apr  6  7:55:36.020219 osafimmnd [3728:immnd_proc.c:0840] TR Coord: Sync done, but waiting for confirmed finalizeSync. Out queue:0

So the coordinator is stuck in the IMM_SERVER_SYNC_SERVER state and can't handle new sync requests from newly joined nodes. That's why PL-4 couldn't join the cluster. The coordinator can only announce a sync when it's in the IMM_SERVER_READY state.
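
The stuck state can be illustrated with a toy model. This is a minimal sketch, not the IMMND code: the `Coordinator` class, its methods, the `send` callable and the `handle_send_error` flag are all hypothetical, and aborting the sync on a failed send is just one possible recovery (retrying the send would be another). Only the two state names come from the analysis above.

```python
from enum import Enum, auto


class ServerState(Enum):
    READY = auto()        # stands in for IMM_SERVER_READY
    SYNC_SERVER = auto()  # stands in for IMM_SERVER_SYNC_SERVER


class Coordinator:
    """Toy model of the sync coordinator (hypothetical, for illustration only)."""

    def __init__(self, send, handle_send_error):
        self.state = ServerState.READY
        self._send = send  # hypothetical fevs send; returns True on success
        self._handle_send_error = handle_send_error
        self.finalize_sent = False

    def announce_sync(self):
        # A sync can only be announced from the READY state.
        if self.state is not ServerState.READY:
            return False
        self.state = ServerState.SYNC_SERVER
        self.finalize_sent = False
        return True

    def finalize_sync(self):
        ok = self._send("finalize-sync")
        if ok:
            self.finalize_sent = True
        elif self._handle_send_error:
            # Assumed recovery: abort the sync so later joiners can be served.
            self.state = ServerState.READY
        # Without error handling the coordinator stays in SYNC_SERVER and
        # waits for a confirmation of a message that was never sent.
        return ok

    def on_finalize_confirmed(self):
        # The confirmed finalize-sync arrives back over fevs.
        if self.state is ServerState.SYNC_SERVER and self.finalize_sent:
            self.state = ServerState.READY


def failing_send(message):
    """Simulates the send failure described in the ticket."""
    return False


if __name__ == "__main__":
    for handled in (False, True):
        coord = Coordinator(failing_send, handle_send_error=handled)
        coord.announce_sync()
        coord.finalize_sync()
        print(handled, coord.state.name, coord.announce_sync())
```

With `handle_send_error=False` the model reproduces the reported behaviour: the second `announce_sync()` is refused because the coordinator never leaves the sync-server state.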

