- **status**: review --> fixed
- **Comment**:
changeset: 8736:c3c90b5fb832
branch: opensaf-5.0.x
parent: 8732:ea44141c05ee
user: Nagendra Kumar<nagendr...@oracle.com>
date: Thu Mar 30 10:17:41 2017 +0530
summary: amfd: handle BAD_HANDLE return during config read [#2361]
changeset: 8737:f9a5a957c16a
branch: opensaf-5.1.x
parent: 8733:be2fd9824bc4
user: Nagendra Kumar<nagendr...@oracle.com>
date: Thu Mar 30 10:18:05 2017 +0530
summary: amfd: handle BAD_HANDLE return during config read [#2361]
changeset: 8738:a10d52313ef5
tag: tip
parent: 8735:68a5e668f807
user: Nagendra Kumar<nagendr...@oracle.com>
date: Thu Mar 30 10:18:25 2017 +0530
summary: amfd: handle BAD_HANDLE return during config read [#2361]
[staging:c3c90b]
[staging:f9a5a9]
[staging:a10d52]
---
**[tickets:#2361] AMFD: amfd crashed with healthCheckcallbackTimeout causing both controllers to reboot**
**Status:** fixed
**Milestone:** 5.0.2
**Created:** Fri Mar 10, 2017 09:08 AM UTC by Chani Srivastava
**Last Updated:** Tue Mar 14, 2017 10:42 AM UTC
**Owner:** Nagendra Kumar
**Environment details**
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )
**Steps**
1. Bring up OpenSAF on four nodes and create a load of 1 lakh (100,000) objects.
2. Run IMM test cases on the standby controller.
SC-1 syslog
Mar 7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' recovery action escalated
from 'componentFailover' to 'suFailover'
Mar 7 19:45:58 OSAF-SC1 osafamfnd[4720]: NO
'safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to
'healthCheckcallbackTimeout' : Recovery is 'suFailover'
**Mar 7 19:45:58 OSAF-SC1 osafamfnd[4720]: ER
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due
to:healthCheckcallbackTimeout Recovery is:suFailover
Mar 7 19:45:58 OSAF-SC1 osafamfnd[4720]: Rebooting OpenSAF NodeId = 131343 EE
Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId =
131343, SupervisionTime = 60**
Mar 7 19:45:58 OSAF-SC1 opensaf_reboot: Rebooting local node; timeout=60
SC-2 syslog
Mar 7 19:41:00 OSAF-SC2 osafamfd[4339]: ER Failed to read configuration, AMF
will not start
Mar 7 19:41:00 OSAF-SC2 osafamfd[4339]: ER avd_imm_config_get FAILED
**Mar 7 19:41:00 OSAF-SC2 osafamfnd[4349]: ER AMFD has unexpectedly crashed.
Rebooting node**
Mar 7 19:41:00 OSAF-SC2 osafamfnd[4349]: Rebooting OpenSAF NodeId = 131599 EE
Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId =
131599, SupervisionTime = 60
Mar 7 19:41:00 OSAF-SC2 opensaf_reboot: Rebooting local node; timeout=60
The amfd, immnd and immd traces are shared separately because they are very large.
---