4-node cluster with a single PBE, loaded with 10k objects.

---

**[tickets:#1835] Imm: Immd healthcheck callback timed out on active controller when starting opensaf on PL-4 and stopping opensaf on PL-3 simultaneously.**

**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Tue May 17, 2016 10:27 AM UTC by Madhurika Koppula
**Last Updated:** Tue May 17, 2016 10:27 AM UTC
**Owner:** nobody
**Attachments:**

- [messages_SC-1](https://sourceforge.net/p/opensaf/tickets/1835/attachment/messages_SC-1) (794.5 kB; application/octet-stream)


Setup:
Changeset- 7613
Version - opensaf 5.0
4-node cluster with a single PBE.

Steps to reproduce:

1) Bring up the active controller, the standby controller, and payload PL-3.
2) Bring up payload PL-4 and, during the Immnd start-up sync of PL-4, stop opensaf on payload PL-3.

Below is the syslog snippet showing the Immd healthcheck callback timeout on the active controller SC-1.


May 17 15:00:25 REG-S1 osafmsgd[11279]: ER saImmOiImplementerSet failed with return value=6
May 17 15:01:35 REG-S1 osafimmloadd: ER Too many TRY_AGAIN on saImmOmSearchNext - aborting
May 17 15:01:35 REG-S1 osafimmnd[11165]: ER SYNC APPARENTLY FAILED status:1
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO -SERVER STATE: IMM_SERVER_SYNC_SERVER --> IMM_SERVER_READY
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE (2761)
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO Epoch set to 8 in ImmModel
May 17 15:01:35 REG-S1 osafimmnd[11165]: NO Coord broadcasting ABORT_SYNC, epoch:8

May 17 15:05:13 REG-S1 osafamfnd[11227]: NO SU failover probation timer started (timeout: 1200000000000 ns)
May 17 15:05:13 REG-S1 osafamfnd[11227]: NO Performing failover of 'safSu=SC-1,safSg=2N,safApp=OpenSAF' (SU failover count: 1)

**May 17 15:05:13 REG-S1 osafamfnd[11227]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated from 'componentFailover' to 'suFailover'
May 17 15:05:13 REG-S1 osafamfnd[11227]: NO 'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 'healthCheckcallbackTimeout' : Recovery is 'suFailover'
May 17 15:05:13 REG-S1 osafamfnd[11227]: ER safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:healthCheckcallbackTimeout Recovery is:suFailover**

May 17 15:05:13 REG-S1 osafamfnd[11227]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, SupervisionTime = 60
May 17 15:05:13 REG-S1 opensaf_reboot: Rebooting local node; timeout=60
May 17 15:05:17 REG-S1 kernel: [21682.049674] md: stopping all md devices.

The syslog of the active controller is attached.
The Immnd traces are too large to attach.
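
For anyone triaging the attached syslog, here is a minimal Python sketch that measures the gap between two events in it, for example the sync abort and the IMMD healthcheck fault. It is not part of OpenSAF. The helper names (`parse_line`, `first_match`, `seconds_between`) are hypothetical, the regular expression assumes the classic syslog layout seen in the snippet with one entry per line (not wrapped as in this mail), and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Assumed layout: "May 17 15:01:35 REG-S1 osafimmnd[11165]: <message>"
# The "[pid]" part is optional (e.g. "kernel:" or "osafimmloadd:").
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)$"
)


def parse_line(line, year=2016):
    """Split one syslog line into fields; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # Collapse the padding syslog uses for single-digit days.
    ts_text = " ".join(m["ts"].split())
    ts = datetime.strptime(f"{year} {ts_text}", "%Y %b %d %H:%M:%S")
    return {"time": ts, "host": m["host"], "proc": m["proc"],
            "pid": m["pid"], "msg": m["msg"]}


def first_match(entries, marker):
    """Return the first parsed entry whose message contains marker."""
    for entry in entries:
        if marker in entry["msg"]:
            return entry
    return None


def seconds_between(lines, start_marker, end_marker, year=2016):
    """Seconds from the first start_marker line to the first end_marker line."""
    entries = [e for e in (parse_line(l, year) for l in lines) if e is not None]
    start = first_match(entries, start_marker)
    end = first_match(entries, end_marker)
    if start is None or end is None:
        return None
    return (end["time"] - start["time"]).total_seconds()


# Two entries taken from the snippet above, each on a single line.
SAMPLE = [
    "May 17 15:01:35 REG-S1 osafimmnd[11165]: NO Coord broadcasting "
    "ABORT_SYNC, epoch:8",
    "May 17 15:05:13 REG-S1 osafamfnd[11227]: NO "
    "'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to "
    "'healthCheckcallbackTimeout' : Recovery is 'suFailover'",
]

print(seconds_between(SAMPLE, "ABORT_SYNC", "healthCheckcallbackTimeout"))  # 218.0
```

With the two entries above, the fault is logged 218 seconds after the ABORT_SYNC broadcast. That is only the interval between the log lines; it does not by itself show that the aborted sync caused the timeout.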

