AMFD and AMFND traces around this time. In these two cases the cores were 
generated by AMFND and the watchdog, respectively. In both cases a timer was 
stopped and tracing was being done around that time.

For AMFD crash:
AMFD traces:
Mar 28 17:12:43.075338 osafamfd [6614:6614:src/mbc/mbcsv_act.c:0412] << 
ncs_mbscv_rcv_decode
Mar 28 17:12:43.075345 osafamfd [6614:6614:src/mbc/mbcsv_util.c:0929] >> 
mbcsv_send_msg: event type: 12
Mar 28 17:12:43.075352 osafamfd [6614:6614:src/mbc/mbcsv_util.c:0954] TR 
NCS_MBCSV_MSG_SYNC_SEND_RSP event
Mar 28 17:12:43.075399 osafamfd [6614:6614:src/mbc/mbcsv_mds.c:0185] >> 
mbcsv_mds_send_msg: sending to vdest:1
Mar 28 17:12:43.075407 osafamfd [6614:6614:src/mbc/mbcsv_mds.c:0218] TR send 
type MDS_SENDTYPE_RRSP:
Mar 28 17:12:43.075576 osafamfd [6614:6614:src/mbc/mbcsv_mds.c:0244] << 
mbcsv_mds_send_msg: success
Mar 28 17:12:43.075599 osafamfd [6614:6614:src/mbc/mbcsv_util.c:0999] << 
mbcsv_send_msg
Mar 28 17:12:43.075606 osafamfd [6614:6614:src/mbc/mbcsv_act.c:0452] << 
ncs_mbcsv_rcv_async_update
Mar 28 17:12:43.075615 osafamfd [6614:6614:src/mbc/mbcsv_pr_evts.c:0222] << 
mbcsv_process_events
Mar 28 17:12:43.075625 osafamfd [6614:6614:src/mbc/mbcsv_pr_evts.c:0278] << 
mbcsv_hdl_dispatch_all
Mar 28 17:12:43.075633 osafamfd [6614:6614:src/mbc/mbcsv_api.c:0435] << 
mbcsv_process_dispatch_request: retval: 1
Mar 28 17:12:52.871798 osafamfd [6614:6616:src/mbc/mbcsv_tmr.c:0250] TR Timer 
expired. my role:2, svc_id:10, pwe_hdl:65537, peer_anchor:565213973364764, tmr 
type:NCS_MBCSV_TMR_SEND_WARM_SYNC
Mar 28 17:13:43.342772 osafamfd [6614:6616:src/amf/amfd/timer.cc:0154] >> 
avd_tmr_exp
Mar 28 17:13:43.342824 osafamfd [6614:6616:src/amf/amfd/timer.cc:0175] << 
avd_tmr_exp
Mar 28 17:13:43.348644 osafamfd [6614:6614:src/amf/amfd/main.cc:0774] >> 
process_event: evt->rcv_evt 14
Mar 28 17:13:43.348673 osafamfd [6614:6614:src/amf/amfd/ndfsm.cc:1058] >> 
avd_tmr_snd_hb_evh: seq_id=1212
Mar 28 17:13:43.349443 osafamfd [6614:6614:src/amf/amfd/timer.cc:0113] >> 
avd_stop_tmr: 0

messages:
Mar 28 17:13:43 PM_SC-1 osafamfnd[6629]: ER AMF director heart beat timeout, 
generating core for amfd
Mar 28 17:13:43 PM_SC-1 kernel: [17482.341638] ata1.00: device reported invalid 
CHS sector 0
Mar 28 17:13:43 PM_SC-1 osaffmd[6546]: NO AMFND down on: 2020f
Mar 28 17:13:43 PM_SC-1 kernel: [17482.341647] ata1: EH complete
Mar 28 17:13:44 PM_SC-1 osafamfnd[6629]: Rebooting OpenSAF NodeId = 131343 EE 
Name = , Reason: AMF director heart beat timeout, OwnNodeId = 131343, 
SupervisionTime = 60
Mar 28 17:13:44 PM_SC-1 osaffmd[6546]: NO FM down on: 2020f


For amfnd:
amfnd:
Mar 28 17:12:27.530364 osafamfnd [6502:6502:src/amf/amfnd/cbq.cc:0242] >> 
avnd_evt_ava_resp_evh
Mar 28 17:12:27.530373 osafamfnd [6502:6502:src/amf/amfnd/proxy.cc:0509] TR 
safComp=AMFWDOG,safSu=SC-2,safSg=NoRed,safApp=OpenSAF: Type=15
Mar 28 17:12:27.530382 osafamfnd [6502:6502:src/amf/amfnd/proxy.cc:0612] >> 
avnd_int_ext_comp_val: safComp=AMFWDOG,safSu=SC-2,safSg=NoRed,safApp=OpenSAF
Mar 28 17:12:27.530390 osafamfnd [6502:6502:src/amf/amfnd/proxy.cc:0000] << 
avnd_int_ext_comp_val
Mar 28 17:12:27.530403 osafamfnd [6502:6502:src/amf/amfnd/tmr.cc:0126] TR 
callback response timer stopped
Mar 28 17:12:27.530412 osafamfnd [6502:6502:src/amf/amfnd/cbq.cc:0543] << 
avnd_evt_ava_resp_evh
Mar 28 17:12:27.530419 osafamfnd [6502:6502:src/amf/amfnd/main.cc:0669] TR Evt 
Type:33 success
Mar 28 17:12:27.530427 osafamfnd [6502:6502:src/amf/amfnd/main.cc:0674] << 
avnd_evt_process
Mar 28 17:12:33.631398 osafamfnd [6502:6502:src/amf/amfnd/main.cc:0646] >> 
avnd_evt_process

messages:
Mar 28 17:13:27 PM_SC-2 osafamfwd[6518]: TIMEOUT receiving AMF health check 
request, generating core for amfnd
Mar 28 17:13:35 PM_SC-2 kernel: [16964.152949] ata1.00: qc timeout (cmd 0xe7)
Mar 28 17:13:35 PM_SC-2 kernel: [16964.152994] ata1.00: FLUSH failed Emask 0x4
Mar 28 17:13:35 PM_SC-2 kernel: [16964.153005] ata1: hard resetting link
Mar 28 17:13:35 PM_SC-2 kernel: [16964.472367] ata1: SATA link up 3.0 Gbps 
(SStatus 123 SControl 300)
Mar 28 17:13:35 PM_SC-2 kernel: [16964.473051] ata1.00: configured for UDMA/133
Mar 28 17:13:35 PM_SC-2 kernel: [16964.473055] ata1.00: retrying FLUSH 0xe7 
Emask 0x4
Mar 28 17:13:43 PM_SC-2 kernel: [16972.985932] ata1.00: device reported invalid 
CHS sector 0
Mar 28 17:13:44 PM_SC-2 osafamfwd[6518]: Last received healthcheck cnt=1208 at 
Tue Mar 28 17:12:27 2017
Mar 28 17:13:44 PM_SC-2 osafamfwd[6518]: Rebooting OpenSAF NodeId = 0 EE Name = 
No EE Mapped, Reason: AMF unexpectedly crashed, OwnNodeId = 131599, 
SupervisionTime = 60
Mar 28 17:13:44 PM_SC-2 osafclmd[6482]: AL AMF Node Director is down, terminate 
this process
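
The silent periods in the traces above can be measured with a small script. This is only a sketch for reading the excerpts, not part of OpenSAF: the function name `find_gaps`, the regular expression, and the 10 second threshold are made up for illustration, and it assumes the two numbers in the square brackets are process id and thread id. It also assumes the log lines have been un-wrapped to one entry per line and that the month abbreviation parses in the current locale.

```python
import re
from datetime import datetime

# Prefix of a trace line, e.g.
# "Mar 28 17:12:43.075633 osafamfd [6614:6614:src/mbc/mbcsv_api.c:0435] << ..."
# Assumption: the bracketed numbers are pid:tid.
TRACE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<proc>\S+) \[(?P<pid>\d+):(?P<tid>\d+):"
)

def find_gaps(lines, threshold_s=10.0, year=2017):
    """Return (tid, previous_ts, current_ts, gap_seconds) for each thread
    whose consecutive trace entries are at least threshold_s apart."""
    last_seen = {}
    gaps = []
    for line in lines:
        m = TRACE_RE.match(line)
        if not m:
            continue  # wrapped continuation or a non-trace line
        # The trace timestamp carries no year, so one is supplied.
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
        tid = m["tid"]
        prev = last_seen.get(tid)
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap >= threshold_s:
                gaps.append((tid, prev, ts, gap))
        last_seen[tid] = ts
    return gaps

# Four entries copied from the AMFD traces above, un-wrapped.
sample = [
    "Mar 28 17:12:43.075633 osafamfd [6614:6614:src/mbc/mbcsv_api.c:0435] << mbcsv_process_dispatch_request: retval: 1",
    "Mar 28 17:12:52.871798 osafamfd [6614:6616:src/mbc/mbcsv_tmr.c:0250] TR Timer expired. my role:2, svc_id:10, pwe_hdl:65537, peer_anchor:565213973364764, tmr type:NCS_MBCSV_TMR_SEND_WARM_SYNC",
    "Mar 28 17:13:43.342772 osafamfd [6614:6616:src/amf/amfd/timer.cc:0154] >> avd_tmr_exp",
    "Mar 28 17:13:43.348644 osafamfd [6614:6614:src/amf/amfd/main.cc:0774] >> process_event: evt->rcv_evt 14",
]

for tid, start, end, gap in find_gaps(sample):
    print(f"thread {tid}: no trace from {start.time()} to {end.time()} ({gap:.3f} s)")
```

On these four entries the script reports a gap of roughly 60 seconds for 6614:6614 and roughly 50 seconds for 6614:6616. The first figure appears to line up with the SupervisionTime = 60 reported in the messages log, though that by itself does not establish the cause.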



---

**[tickets:#2403] amf: amfd and amfnd crashes while calling TRACE() API.**

**Status:** unassigned
**Milestone:** 5.2.RC2
**Created:** Thu Mar 30, 2017 08:39 AM UTC by Praveen
**Last Updated:** Thu Mar 30, 2017 08:39 AM UTC
**Owner:** nobody



Observed AMFD and AMFND crashes when calling TRACE() API. 

amfd:
\#0  0x00007f44834db70d in write () from /lib64/libpthread.so.0
\#1  0x00007f4483eb9af9 in output_ () from /usr/local/lib/libopensaf_core.so.0
\#2  0x00007f4484dbc714 in Trace::trace(char const*, char const*, unsigned int, 
unsigned int, char const*, ...) ()
    at ./src/base/logtrace.h:166
\#3  0x00007f4484e7b3cb in avd_stop_tmr(cl_cb_tag*, avd_tmr_tag*) () at 
src/amf/amfd/timer.cc:113
\#4  0x00007f4484e03556 in avd_tmr_snd_hb_evh(cl_cb_tag*, AVD_EVT*) () at 
src/amf/amfd/ndfsm.cc:1066
\#5  0x00007f4484e005b4 in process_event(cl_cb_tag*, AVD_EVT*) () at 
src/amf/amfd/main.cc:792
\#6  0x00007f4484db9a1e in main () at src/amf/amfd/main.cc:693


amfnd:
(gdb) bt
\#0  0x00007fdd0f3f270d in write () from /lib64/libpthread.so.0
\#1  0x00007fdd0fb57af9 in output_ () from /usr/local/lib/libopensaf_core.so.0
\#2  0x00007fdd10844a64 in Trace::trace(char const*, char const*, unsigned int, 
unsigned int, char const*, ...) ()
    at ./src/base/logtrace.h:166
\#3  0x00007fdd1086e695 in avnd_main_process() () at src/amf/amfnd/main.cc:646
\#4  0x00007fdd1084342f in main () at src/amf/amfnd/main.cc:207



