- **status**: unassigned --> assigned
- **assigned_to**: Praveen
---
**[tickets:#2320] clm: standby clmd crashes due to missing node information**
**Status:** assigned
**Milestone:** 5.2.RC2
**Created:** Wed Feb 22, 2017 12:55 PM UTC by Zoran Milinkovic
**Last Updated:** Fri Feb 24, 2017 10:06 AM UTC
**Owner:** Praveen
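The standby CLMD service on SC-2 crashed with an assertion failure in `ckpt_proc_node_rec` (clms_mbcsv.c:468) because the node record for PL-3 was missing from its database.
Syslog from SC-2: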
~~~
Feb 13 00:43:31 SC-2-2 osafamfd[5082]: NO Cold sync complete!
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: Ruling epoch noted as:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: SaImmRepositoryInitModeT changed and noted as 'SA_IMM_KEEP_REPOSITORY'
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_R_AVAILABLE
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO NODE STATE-> IMM_NODE_FULLY_AVAILABLE 19082
Feb 13 00:43:31 SC-2-2 osafimmnd[5024]: NO Epoch set to 5 in ImmModel
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2020f old epoch: 4 new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2010f old epoch: 4 new epoch:5
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO IMMND coord at 2010f
Feb 13 00:43:31 SC-2-2 osafimmd[5009]: NO SBY: New Epoch for IMMND process at node 2030f old epoch: 0 new epoch:5
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ER Node is NULL,problem with the database.
Feb 13 00:43:31 SC-2-2 osafclmd[5066]: ../../opensaf/src/clm/clmd/clms_mbcsv.c:468: ckpt_proc_node_rec: Assertion '0' failed.
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: NO 'safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF' faulted due to 'avaDown' : Recovery is 'nodeFailfast'
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: ER safComp=CLM,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery is:nodeFailfast
Feb 13 00:43:32 SC-2-2 osafamfnd[5096]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131599, SupervisionTime = 60
Feb 13 00:43:32 SC-2-2 opensaf_reboot: Rebooting local node; timeout=60
~~~
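The syslog above prints node IDs in two notations: hexadecimal in the osafimmd lines (2010f, 2020f, 2030f) and decimal in the osafamfnd reboot line (131599). The sketch below converts between the two so the entries can be matched up. It assumes the IDs are plain integers, and the helper names are hypothetical, not part of OpenSAF. The decimal 131599 corresponds to 2020f, which is consistent with the rebooting node being SC-2. That 2030f is PL-3 is presumed from the ticket description, not stated in the log.

```python
def node_id_to_hex(node_id: int) -> str:
    """Format a decimal node ID as lowercase hex without a prefix,
    matching the notation in the osafimmd log lines."""
    return format(node_id, "x")


def hex_to_node_id(text: str) -> int:
    """Parse a hex node ID such as '2030f' into its decimal value."""
    return int(text, 16)


if __name__ == "__main__":
    # NodeId / OwnNodeId from the osafamfnd reboot line on SC-2
    print(node_id_to_hex(131599))
    # Node whose IMMND epoch went from 0 to 5 just before the assertion
    print(hex_to_node_id("2030f"))
```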
Coredump:
~~~
[New LWP 5066]
[New LWP 5069]
[New LWP 5068]
[New LWP 5070]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib64/opensaf/osafclmd'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fbc7be880c7 in raise () from /lib64/libc.so.6
### BT ###
#0 0x00007fbc7be880c7 in raise () from /lib64/libc.so.6
#1 0x00007fbc7be89478 in abort () from /lib64/libc.so.6
#2 0x00007fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468,
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec",
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at
../../opensaf/src/base/sysf_def.c:281
#3 0x00007fbc7e1aa016 in ckpt_proc_node_rec (cb=<optimized out>,
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
#4 0x00007fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=<optimized out>,
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
#5 ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
#6 mbcsv_callback (arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
#7 0x00007fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60,
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
#8 0x00007fbc7c857146 in ncs_mbcsv_rcv_async_update (peer=0x7fbc7f217a60,
evt=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:440
#9 0x00007fbc7c85dd30 in mbcsv_process_events (rcvd_evt=0x7fbc740036e0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at
../../opensaf/src/mbc/mbcsv_pr_evts.c:168
#10 0x00007fbc7c85de9b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4283432961) at ../../opensaf/src/mbc/mbcsv_pr_evts.c:272
#11 0x00007fbc7c8586c2 in mbcsv_process_dispatch_request (arg=0x7fff27b6b310)
at ../../opensaf/src/mbc/mbcsv_api.c:423
#12 0x00007fbc7e1aa7be in clms_mbcsv_dispatch (mbcsv_hdl=<optimized out>) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:687
#13 0x00007fbc7e19e4e4 in main (argc=<optimized out>, argv=<optimized out>) at
../../opensaf/src/clm/clmd/clms_main.c:535
### BT FULL ###
#0 0x00007fbc7be880c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fbc7be89478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468,
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec",
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at
../../opensaf/src/base/sysf_def.c:281
No locals.
#3 0x00007fbc7e1aa016 in ckpt_proc_node_rec (cb=<optimized out>,
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
param = 0x7fbc7f218a60
node = 0x0
ip = 0x0
__FUNCTION__ = "ckpt_proc_node_rec"
#4 0x00007fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=<optimized out>,
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
ckpt_cluster_rec = <optimized out>
rc = 1
num_bytes = <optimized out>
hdr = 0x7fbc7f218a50
ckpt_finalize_rec = <optimized out>
ckpt_node_rec = <optimized out>
ckpt_node_config_rec = <optimized out>
ckpt_node_del_rec = <optimized out>
ckpt_node_down_rec = <optimized out>
ckpt_msg = 0x7fbc7f218a50
ckpt_client_rec = <optimized out>
ckpt_csync_node_rec = <optimized out>
ckpt_agent_down = <optimized out>
#5 ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
rc = 1
msg_fmt_version = 1
#6 mbcsv_callback (arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
rc = 1
__FUNCTION__ = "mbcsv_callback"
#7 0x00007fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60,
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
parg = {
i_op = NCS_MBCSV_CBOP_DEC,
i_client_hdl = 0,
i_ckpt_hdl = 4292870177,
info = {
encode = {
io_msg_type = NCS_MBCSV_MSG_ASYNC_UPDATE,
io_action = NCS_MBCSV_ACT_ADD,
io_reo_type = 6,
io_reo_hdl = 0,
io_uba = {
start = 0x0,
ub = 0x0,
bufp = 0x7000000000 <error: Cannot access memory at address
0x7000000000>,
res = 112,
ttl = 0,
max = 2132894108
},
io_req_context = 9209973925752930305,
i_peer_version = 21264
},
decode = {
i_msg_type = NCS_MBCSV_MSG_ASYNC_UPDATE,
i_action = NCS_MBCSV_ACT_ADD,
i_reo_type = 6,
i_uba = {
start = 0x0,
ub = 0x0,
bufp = 0x0,
res = 0,
ttl = 112,
max = 112
},
o_req_context = 140447563473308,
i_peer_version = 1
},
peer = {
i_service = NCS_SERVICE_ID_LEAP_TMR,
i_peer_version = 1
},
notify = {
i_uba = {
start = 0x100000001,
ub = 0x6,
bufp = 0x0,
res = 0,
ttl = 0,
max = 0
},
i_peer_version = 0,
i_msg = 0x70
},
error = {
i_code = NCS_MBCSV_WARM_SYNC_TMR_EXP,
i_err = true,
i_arg = 0x6,
i_peer_version = 0
}
}
}
status = 2
mbc_inst = 0x7fbc7f2057f0
__FUNCTION__ = "ncs_mbscv_rcv_decode"
#8 0x00007fbc7c857146 in ncs_mbcsv_rcv_async_update (peer=0x7fbc7f217a60,
evt=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:440
__FUNCTION__ = "ncs_mbcsv_rcv_async_update"
#9 0x00007fbc7c85dd30 in mbcsv_process_events (rcvd_evt=0x7fbc740036e0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at
../../opensaf/src/mbc/mbcsv_pr_evts.c:168
mbc_reg = 0x7fbc7f2057f0
peer = <optimized out>
ckpt = <optimized out>
hdl_to_give = 4291821601
__FUNCTION__ = "mbcsv_process_events"
#10 0x00007fbc7c85de9b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4283432961) at ../../opensaf/src/mbc/mbcsv_pr_evts.c:272
rcvd_evt = <optimized out>
rc = 1
__FUNCTION__ = "mbcsv_hdl_dispatch_all"
#11 0x00007fbc7c8586c2 in mbcsv_process_dispatch_request (arg=0x7fff27b6b310)
at ../../opensaf/src/mbc/mbcsv_api.c:423
mbc_reg = <optimized out>
rc = SA_AIS_OK
mail_box = 4283432961
__FUNCTION__ = "mbcsv_process_dispatch_request"
#12 0x00007fbc7e1aa7be in clms_mbcsv_dispatch (mbcsv_hdl=<optimized out>) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:687
mbcsv_arg = {
i_op = NCS_MBCSV_OP_DISPATCH,
i_mbcsv_hdl = 4293918753,
info = {
initialize = {
i_mbcsv_cb = 0x2,
i_version = 0,
i_service = NCS_SERVICE_ID_BASE,
o_mbcsv_hdl = 0
},
sel_obj_get = {
o_select_obj = 2
},
dispatch = {
i_disp_flags = SA_DISPATCH_ALL
},
finalize = {
i_dummy = 2
},
open = {
i_pwe_hdl = 2,
i_client_hdl = 0,
o_ckpt_hdl = 0
},
close = {
i_ckpt_hdl = 2
},
chg_role = {
i_ckpt_hdl = 2,
i_ha_state = 0
},
send_ckpt = {
i_ckpt_hdl = 2,
i_send_type = NCS_MBCSV_SND_SYNC,
i_reo_type = 0,
i_reo_hdl = 0,
i_action = NCS_MBCSV_ACT_DONT_CARE,
io_no_peer = false
},
send_notify = {
i_ckpt_hdl = 2,
i_msg_dest = NCS_MBCSV_ACTIVE,
i_msg = 0x0
},
send_data_req = {
i_ckpt_hdl = 2,
i_uba = {
start = 0x0,
ub = 0x0,
bufp = 0x0,
res = 0,
ttl = 0,
max = 0
}
},
obj_set = {
i_ckpt_hdl = 2,
i_obj = NCS_MBCSV_OBJ_WARM_SYNC_ON_OFF,
i_val = 0
},
obj_get = {
i_ckpt_hdl = 2,
i_obj = NCS_MBCSV_OBJ_WARM_SYNC_ON_OFF,
o_val = 0
}
}
}
#13 0x00007fbc7e19e4e4 in main (argc=<optimized out>, argv=<optimized out>) at
../../opensaf/src/clm/clmd/clms_main.c:535
mbx_fd = <optimized out>
error = <optimized out>
rc = <optimized out>
term_fd = 17
timeout = <optimized out>
__FUNCTION__ = "main"
### THREAD APPLY ALL BT ###
Thread 4 (Thread 0x7fbc7e333b00 (LWP 5070)):
#0 0x00007fbc7bf2fbfd in poll () from /lib64/libc.so.6
#1 0x00007fbc7c84e361 in poll (__timeout=30000, __nfds=1,
__fds=0x7fbc7e3331f0) at /usr/include/bits/poll2.h:46
#2 osaf_ppoll (io_fds=io_fds@entry=0x7fbc7e3331f0, i_nfds=i_nfds@entry=1,
i_timeout_ts=i_timeout_ts@entry=0x7fbc7e3331c0, i_sigmask=i_sigmask@entry=0x0)
at ../../opensaf/src/base/osaf_poll.c:105
#3 0x00007fbc7c84e4fb in osaf_poll (io_fds=io_fds@entry=0x7fbc7e3331f0,
i_nfds=i_nfds@entry=1, i_timeout=i_timeout@entry=30000) at
../../opensaf/src/base/osaf_poll.c:44
#4 0x00007fbc7c84e545 in osaf_poll_one_fd (i_fd=11, i_timeout=30000) at
../../opensaf/src/base/osaf_poll.c:128
#5 0x00007fbc7c87f2d7 in rda_read_msg (sockfd=11, msg=msg@entry=0x7fbc7e333290
"10 2", size=64) at ../../opensaf/src/rde/agent/rda_papi.cc:673
#6 0x00007fbc7c87f564 in rda_callback_task (rda_callback_cb=0x7fbc7f2034a0) at
../../opensaf/src/rde/agent/rda_papi.cc:150
#7 0x00007fbc7c2030a4 in start_thread () from /lib64/libpthread.so.0
#8 0x00007fbc7bf3802d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7fbc7e373b00 (LWP 5068)):
#0 0x00007fbc7bf2fbfd in poll () from /lib64/libc.so.6
#1 0x00007fbc7c84e361 in poll (__timeout=5100, __nfds=1, __fds=0x7fbc7e373260)
at /usr/include/bits/poll2.h:46
#2 osaf_ppoll (io_fds=io_fds@entry=0x7fbc7e373260, i_nfds=i_nfds@entry=1,
i_timeout_ts=0x7fbc7e373280, i_sigmask=i_sigmask@entry=0x0) at
../../opensaf/src/base/osaf_poll.c:105
#3 0x00007fbc7c85549f in ncs_tmr_wait () at
../../opensaf/src/base/sysf_tmr.c:414
#4 0x00007fbc7c2030a4 in start_thread () from /lib64/libpthread.so.0
#5 0x00007fbc7bf3802d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7fbc7e353b00 (LWP 5069)):
#0 0x00007fbc7bf2fbfd in poll () from /lib64/libc.so.6
#1 0x00007fbc7c87c9cf in poll (__timeout=20000, __nfds=3,
__fds=0x7fbc7e353280) at /usr/include/bits/poll2.h:46
#2 mdtm_process_recv_events () at ../../opensaf/src/mds/mds_dt_tipc.c:684
#3 0x00007fbc7c2030a4 in start_thread () from /lib64/libpthread.so.0
#4 0x00007fbc7bf3802d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fbc7e3ab740 (LWP 5066)):
#0 0x00007fbc7be880c7 in raise () from /lib64/libc.so.6
#1 0x00007fbc7be89478 in abort () from /lib64/libc.so.6
#2 0x00007fbc7c85202e in __osafassert_fail (__file=__file@entry=0x7fbc7e1b7d50
"../../opensaf/src/clm/clmd/clms_mbcsv.c", __line=__line@entry=468,
__func=__func@entry=0x7fbc7e1b8820 <__FUNCTION__.12739> "ckpt_proc_node_rec",
__assertion=__assertion@entry=0x7fbc7e1b78ea "0") at
../../opensaf/src/base/sysf_def.c:281
#3 0x00007fbc7e1aa016 in ckpt_proc_node_rec (cb=<optimized out>,
data=0x7fbc7f218a50) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:468
#4 0x00007fbc7e1ae044 in ckpt_decode_async_update (cbk_arg=<optimized out>,
cb=0x7fbc7e3be100 <_clms_cb>) at ../../opensaf/src/clm/clmd/clms_mbcsv.c:2310
#5 ckpt_decode_cbk_handler (cbk_arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:1997
#6 mbcsv_callback (arg=0x7fff27b6b1a0) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:719
#7 0x00007fbc7c856f76 in ncs_mbscv_rcv_decode (peer=peer@entry=0x7fbc7f217a60,
evt=evt@entry=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:393
#8 0x00007fbc7c857146 in ncs_mbcsv_rcv_async_update (peer=0x7fbc7f217a60,
evt=0x7fbc740036e0) at ../../opensaf/src/mbc/mbcsv_act.c:440
#9 0x00007fbc7c85dd30 in mbcsv_process_events (rcvd_evt=0x7fbc740036e0,
mbcsv_hdl=mbcsv_hdl@entry=4293918753) at
../../opensaf/src/mbc/mbcsv_pr_evts.c:168
#10 0x00007fbc7c85de9b in mbcsv_hdl_dispatch_all (mbcsv_hdl=4293918753,
mbx=mbx@entry=4283432961) at ../../opensaf/src/mbc/mbcsv_pr_evts.c:272
#11 0x00007fbc7c8586c2 in mbcsv_process_dispatch_request (arg=0x7fff27b6b310)
at ../../opensaf/src/mbc/mbcsv_api.c:423
#12 0x00007fbc7e1aa7be in clms_mbcsv_dispatch (mbcsv_hdl=<optimized out>) at
../../opensaf/src/clm/clmd/clms_mbcsv.c:687
#13 0x00007fbc7e19e4e4 in main (argc=<optimized out>, argv=<optimized out>) at
../../opensaf/src/clm/clmd/clms_main.c:535
~~~