- **status**: unassigned --> assigned
- **assigned_to**: Nagendra Kumar


---

**[tickets:#2338] amfd crashed while changing role from quiesced to active**

**Status:** assigned
**Milestone:** 5.2.RC1
**Created:** Fri Mar 03, 2017 05:41 AM UTC by Ritu Raj
**Last Updated:** Fri Mar 03, 2017 05:42 AM UTC
**Owner:** Nagendra Kumar
**Attachments:**

- 
[osafamfd.tgz](https://sourceforge.net/p/opensaf/tickets/2338/attachment/osafamfd.tgz)
 (2.8 MB; application/octet-stream)
- 
[syslog.7z](https://sourceforge.net/p/opensaf/tickets/2338/attachment/syslog.7z)
 (649.4 kB; application/octet-stream)


#Environment details
OS : Suse 64bit
Changeset : 8634 ( 5.2.FC)
Setup : 4 nodes ( 2 controllers and 2 payloads with 1PBE enabled )


#Summary
amfd crashed while changing role from quiesced to active

#Steps followed & Observed behaviour
   1. Invoke switchovers
   2. After a few successful switchovers, SC-1 had the active role and SC-2 had the standby role.
   3. Invoke one more switchover. SC-1 took the quiesced role and SC-2 successfully became active. After this, cpd crashed on SC-2. While SC-1 was changing role from quiesced back to active, amfd crashed on SC-1, which resulted in a cluster reset.

>>For the cpd crash, refer to ticket #2337

Syslog of SC-1:
Mar  2 14:12:00 TestBed-R1 osafimmnd[2138]: NO PBE-OI established on this SC. Dumping incrementally to file imm.db
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER Impl Set Failed for SaAmfNodeSwBundle, returned 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: ER avd_imm_applier_set FAILED, 5
Mar  2 14:12:03 TestBed-R1 osafamfd[2178]: src/amf/amfd/role.cc:807: avd_mds_qsd_role_evh: Assertion '0' failed.
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: ER AMFD has unexpectedly crashed. Rebooting node
Mar  2 14:12:03 TestBed-R1 osafamfnd[2188]: Rebooting OpenSAF NodeId = 131343 EE Name = , Reason: AMFD has unexpectedly crashed. Rebooting node, OwnNodeId = 131343, SupervisionTime = 60



BT:
(gdb) thread apply all bt

Thread 4 (Thread 0x7f2e04fe4b00 (LWP 2182)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout_ts=0x7f2e04fe4180, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x00007f2e04146261 in osaf_poll (io_fds=0x7f2e04fe41c0, i_nfds=1, i_timeout=30000) at src/base/osaf_poll.c:44
3  0x00007f2e04146430 in osaf_poll_one_fd (i_fd=15, i_timeout=30000) at src/base/osaf_poll.c:128
4  0x00007f2e0418d360 in rda_read_msg (sockfd=15, msg=0x7f2e04fe4260 "10 1", size=64) at src/rde/agent/rda_papi.cc:673
5  0x00007f2e0418cb40 in rda_callback_task (rda_callback_cb=0x7f2e0549c440) at src/rde/agent/rda_papi.cc:150
6  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
7  0x00007f2e034209cd in clone () from /lib64/libc.so.6
8  0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f2e05004b00 (LWP 2181)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e04188958 in mdtm_process_recv_events () at src/mds/mds_dt_tipc.c:669
2  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
3  0x00007f2e034209cd in clone () from /lib64/libc.so.6
4  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f2e0503ab00 (LWP 2180)):
0  0x00007f2e034174f6 in poll () from /lib64/libc.so.6
1  0x00007f2e0414633b in osaf_ppoll (io_fds=0x7f2e0503a270, i_nfds=1, i_timeout_ts=0x7f2e0503a2a0, i_sigmask=0x0) at src/base/osaf_poll.c:105
2  0x00007f2e04150604 in ncs_tmr_wait () at src/base/sysf_tmr.c:406
3  0x00007f2e036c47b6 in start_thread () from /lib64/libpthread.so.0
4  0x00007f2e034209cd in clone () from /lib64/libc.so.6
5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f2e05007720 (LWP 2178)):
0  0x00007f2e0337bb55 in raise () from /lib64/libc.so.6
1  0x00007f2e0337d131 in abort () from /lib64/libc.so.6
2  0x00007f2e0414b6e7 in __osafassert_fail (__file=0x7f2e05215e2f "src/amf/amfd/role.cc", __line=807, __func=0x7f2e05216c90 <avd_mds_qsd_role_evh(cl_cb_tag*, AVD_EVT*)::__FUNCTION__> "avd_mds_qsd_role_evh", __assertion=0x7f2e05216548 "0") at src/base/sysf_def.c:281
3  0x00007f2e05182755 in avd_mds_qsd_role_evh (cb=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/role.cc:807
4  0x00007f2e05156536 in process_event (cb_now=0x7f2e054640c0 <_control_block>, evt=0x7f2dfc000b20) at src/amf/amfd/main.cc:811
5  0x00007f2e051560ee in main_loop () at src/amf/amfd/main.cc:702
6  0x00007f2e051566fd in main (argc=2, argv=0x7fff5826f318) at src/amf/amfd/main.cc:861
(gdb)





Notes:
1. Syslogs of both controllers attached
2. amfd bt attached
3. amfd trace attached

The two nodes are not in time sync; there is a time gap between them.
Relative to SC-2, SC-1 is about 50 minutes ahead.
Time Diff
==========
TestBed-R1:~  date
Thu Mar 2 16:34:45 IST 2017
TestBed-R2:~  date
Thu Mar 2 15:44:30 IST 2017
=========
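
When correlating the two attached syslogs, the clock gap above has to be subtracted from the SC-1 timestamps. The following is only a minimal sketch of that arithmetic, not part of the original report. It assumes the two `date` outputs were captured at roughly the same moment and that the syslog entries are from 2017 (syslog lines carry no year). The helper names `parse_date_output` and `sc1_to_sc2_clock` are hypothetical.

```python
from datetime import datetime


def parse_date_output(line):
    """Parse `date` output such as 'Thu Mar 2 16:34:45 IST 2017'.

    The weekday and timezone tokens are dropped; both nodes report IST,
    so the timezone does not affect the difference.
    """
    _weekday, month, day, clock, _tz, year = line.split()
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


# `date` output taken from the report above.
sc1_now = parse_date_output("Thu Mar 2 16:34:45 IST 2017")  # TestBed-R1 (SC-1)
sc2_now = parse_date_output("Thu Mar 2 15:44:30 IST 2017")  # TestBed-R2 (SC-2)

# How far SC-1's clock runs ahead of SC-2's clock.
offset = sc1_now - sc2_now


def sc1_to_sc2_clock(syslog_stamp, year=2017):
    """Convert an SC-1 syslog timestamp ('Mar  2 14:12:03') to SC-2's clock.

    The year is an assumption, since syslog timestamps do not include one.
    """
    month, day, clock = syslog_stamp.split()
    sc1_time = datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")
    return sc1_time - offset


print(offset)                               # 0:50:15
print(sc1_to_sc2_clock("Mar  2 14:12:03"))  # 2017-03-02 13:21:48
```

Under these assumptions, the amfd assertion logged at 14:12:03 on SC-1 would correspond to roughly 13:21:48 in the SC-2 syslog.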

