<143>1 2019-12-13T12:45:22.523861+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26365"] >> mds_mcm_svc_up
<143>1 2019-12-13T12:45:22.523868+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26367"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
<143>1 2019-12-13T12:45:22.523893+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26374"] Increase svc_cnt
<143>1 2019-12-13T12:45:22.524007+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26397"] >> mds_mcm_svc_up
<143>1 2019-12-13T12:45:22.524013+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26399"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
<143>1 2019-12-13T12:45:22.524032+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26406"] Increase svc_cnt
...
<143>1 2019-12-13T12:45:22.524136+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26429"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524142+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26431"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524175+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26442"] MCM:API: svc_down : ROLE MISMATCH : svc_id = CLMA(35) on
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest =
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
<143>1 2019-12-13T12:45:22.524191+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26446"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524197+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26448"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524229+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26459"] MCM:API: svc_down : ROLE MISMATCH : svc_id = IMMND(25) on
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest =
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
~~~
Unexpectedly, two more identical svc down events are received, so svc_cnt drops
to -2:
~~~
<143>1 2019-12-13T12:45:22.524245+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26463"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.52425+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26465"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524282+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26476"] MCM:API: svc_down : ROLE MISMATCH : svc_id = CLMA(35) on
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest =
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
<143>1 2019-12-13T12:45:22.524297+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26480"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524302+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26482"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524334+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="26493"] MCM:API: svc_down : ROLE MISMATCH : svc_id = IMMND(25) on
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest =
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
~~~
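To picture the bookkeeping these events drive, here is a minimal sketch
(hypothetical names, not the actual MDS structures in src/mds/mds_c_api.c): the
per-Adest counter is incremented on every remote svc up and decremented
unconditionally on every svc down, so four down events against two ups leave it
at -2.
~~~
#include <stdio.h>

/* Hypothetical per-Adest bookkeeping; the real structure in
 * src/mds/mds_c_api.c has many more fields. */
struct adest_info_sketch {
	int svc_cnt; /* remote services currently counted on this Adest */
};

void on_remote_svc_up(struct adest_info_sketch *a)
{
	a->svc_cnt++; /* the "Increase svc_cnt" lines in the log */
}

void on_remote_svc_down(struct adest_info_sketch *a)
{
	/* Decremented unconditionally: a DOWN with no matching UP still
	 * lowers the counter, so duplicate DOWN events for the same
	 * remote service drive it negative. */
	a->svc_cnt--;
}

int main(void)
{
	struct adest_info_sketch a = {0};

	on_remote_svc_up(&a);   /* CLMS(34) up           -> svc_cnt = 1  */
	on_remote_svc_up(&a);   /* CLMS(34) up, 2nd sub  -> svc_cnt = 2  */
	on_remote_svc_down(&a); /* CLMS(34) down         -> svc_cnt = 1  */
	on_remote_svc_down(&a); /* CLMS(34) down         -> svc_cnt = 0  */
	on_remote_svc_down(&a); /* duplicate down        -> svc_cnt = -1 */
	on_remote_svc_down(&a); /* duplicate down        -> svc_cnt = -2 */

	printf("svc_cnt = %d\n", a.svc_cnt); /* prints: svc_cnt = -2 */
	return 0;
}
~~~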
So svc_cnt was already wrong at that point.
Then later, I guess svc_cnt is -1; a svc up occurs, the down timer unexpectedly
starts, and then a svc down leaves svc_cnt at -1 again.
When the timer expires, svc_cnt is not zero (it is -1), so the Adest is not
deleted from the list, but the timer info is freed anyway:
~~~
<143>1 2019-12-13T12:45:55.47399+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="76841"] Increase svc_cnt
<142>1 2019-12-13T12:45:55.993951+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="78837"] MCM:API: Adest <0x0002010f, 217> down timer start
<143>1 2019-12-13T12:45:55.9939+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="78826"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:55.993904+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="78827"] MCM:API: LOCAL SVC INFO : svc_id = CLMA(35) | PWE id = 1 |
VDEST id = 65535 |
<143>1 2019-12-13T12:45:55.993914+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="78828"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 |
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
...
<143>1 2019-12-13T12:46:05.97889+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="80963"] FREE tmr_req_info=0x7f713c022840
<142>1 2019-12-13T12:46:06.079167+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="80966"] TIMEOUT Adest <0x0002010f, 217> down
<143>1 2019-12-13T12:46:06.079178+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="80967"] Need reset tmr_req_info=0x7f713c02c680
<143>1 2019-12-13T12:46:06.079186+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="80968"] FREE tmr_req_info=0x7f713c02c680
~~~
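A rough sketch of the expiry path as I read the trace above (hypothetical field
names again, not the real down-timer code in mds_c_api.c): with svc_cnt
non-zero the Adest entry is kept in the list, while its timer bookkeeping is
freed anyway, so the entry is left holding a reference to a timer that no
longer exists.
~~~
#include <stdlib.h>

/* Hypothetical names again; the real logic is the Adest down-timer
 * handling in src/mds/mds_c_api.c. */
struct tmr_req_info_sketch {
	void *adest; /* back-pointer given to the timer when it is started */
};

struct adest_info_sketch {
	int svc_cnt;                          /* -1 in the failing trace    */
	int tmr_running;                      /* set when down timer starts */
	struct tmr_req_info_sketch *tmr_info; /* allocated at timer start   */
};

/* Called when the Adest down timer expires ("TIMEOUT Adest <...> down"). */
void on_adest_down_tmr_expired(struct adest_info_sketch *a)
{
	if (a->svc_cnt == 0) {
		/* Expected case: nothing is left on the Adest, so the whole
		 * entry would be removed here ("DEL Adest from List")
		 * together with its timer state. */
		return;
	}

	/* Failing case from the trace: svc_cnt is -1, so the entry stays
	 * in the list.  Its timer request info is freed anyway ("FREE
	 * tmr_req_info=..."), but tmr_running/tmr_info are left as they
	 * are, so the entry now references a timer that no longer exists. */
	free(a->tmr_info);
}
~~~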
The next svc up increases svc_cnt to zero. With svc_cnt at zero, it orders a
stop/free of the Adest down timer, which has already expired and been freed,
and the coredump is generated:
~~~
<143>1 2019-12-13T12:46:07.900413+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="87883"] Increase svc_cnt
<143>1 2019-12-13T12:46:07.900497+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="87905"] tmr_req_info=0x7f713c02c680
<143>1 2019-12-13T12:46:07.9005+01:00 PL-5 osafimmnd 160 mds.log [meta
sequenceId="87906"] Stop and Free Adest <0x0002010f, 217> down timer
<141>1 2019-12-13T12:46:08.044753+01:00 PL-5 osafimmnd 299 mds.log [meta
sequenceId="1"] BEGIN MDS LOGGING| ARCHW=a|64bit=1
~~~
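Sketching the final step (hypothetical names; the real entry points are
mds_mcm_svc_up() and stop_mds_down_tmr() in src/mds/mds_c_api.c): the svc up
brings svc_cnt back to zero, a down timer is still recorded for the Adest, and
stopping/freeing it touches the timer that already expired and was freed, which
matches the crash inside base::timer::TimerHandle::Stop() in the backtrace
below.
~~~
#include <stdlib.h>

/* Same hypothetical entry as in the previous sketch. */
struct tmr_req_info_sketch {
	void *adest;
};

struct adest_info_sketch {
	int svc_cnt;                          /* -1 just before this svc up */
	int tmr_running;                      /* still set from timer start */
	struct tmr_req_info_sketch *tmr_info; /* already freed at expiry    */
};

void on_remote_svc_up(struct adest_info_sketch *a)
{
	a->svc_cnt++; /* "Increase svc_cnt": -1 -> 0 */

	if (a->svc_cnt == 0 && a->tmr_running) {
		/* "Stop and Free Adest <...> down timer": the timer behind
		 * tmr_info already expired and was released, so stopping it
		 * means touching freed memory; in the real code this is
		 * ncs_tmr_stop() reached from stop_mds_down_tmr(), and the
		 * crash lands in base::timer::TimerHandle::Stop() as the
		 * backtrace below shows. */
		free(a->tmr_info); /* double free of the stale pointer */
		a->tmr_info = NULL;
		a->tmr_running = 0;
	}
}
~~~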
---
**[tickets:#3129] mds: crash in stop_mds_down_tmr()**
**Status:** assigned
**Milestone:** 5.20.01
**Created:** Mon Dec 16, 2019 05:29 AM UTC by Thuan Tran
**Last Updated:** Mon Dec 16, 2019 05:30 AM UTC
**Owner:** Thuan Tran
An osafimmnd coredump is generated:
~~~
Core was generated by `/usr/local/lib/opensaf/osafimmnd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f714a16e3a0 in
std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*,
std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
[Current thread is 1 (Thread 0x7f714b121b00 (LWP 164))]
Thread 1 (Thread 0x7f714b121b00 (LWP 164)):
#0 0x00007f714a16e3a0 in
std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*,
std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#1 0x00007f714a69f9f0 in std::_Rb_tree<base::timer::TimerHandle::QueueEntry,
base::timer::TimerHandle::QueueEntry,
std::_Identity<base::timer::TimerHandle::QueueEntry>,
std::less<base::timer::TimerHandle::QueueEntry>,
std::allocator<base::timer::TimerHandle::QueueEntry> >::_M_erase_aux
(this=this@entry=0x55719ddecc88, __position=...) at
/usr/include/c++/7/bits/stl_tree.h:2473
__y = <optimized out>
#2 0x00007f714a69f5d4 in std::_Rb_tree<base::timer::TimerHandle::QueueEntry,
base::timer::TimerHandle::QueueEntry,
std::_Identity<base::timer::TimerHandle::QueueEntry>,
std::less<base::timer::TimerHandle::QueueEntry>,
std::allocator<base::timer::TimerHandle::QueueEntry>
>::erase[abi:cxx11](std::_Rb_tree_const_iterator<base::timer::TimerHandle::QueueEntry>)
(__position=..., this=0x55719ddecc88) at
/usr/include/c++/7/bits/stl_tree.h:1113
__result = <optimized out>
__result = <optimized out>
#3 std::multiset<base::timer::TimerHandle::QueueEntry,
std::less<base::timer::TimerHandle::QueueEntry>,
std::allocator<base::timer::TimerHandle::QueueEntry>
>::erase[abi:cxx11](std::_Rb_tree_const_iterator<base::timer::TimerHandle::QueueEntry>)
(__position=..., this=0x55719ddecc88) at
/usr/include/c++/7/bits/stl_multiset.h:630
No locals.
#4 base::timer::TimerHandle::Stop (this=this@entry=0x55719ddecc40,
timer=timer@entry=0x7f713c035780) at src/base/timer/timer_handle.cc:54
old_expiration = <optimized out>
new_expiration = <optimized out>
iter = <optimized out>
#5 0x00007f714a693265 in base::NcsTmrHandle::Stop (this=0x55719ddecc40,
timer=0x7f713c035780) at src/base/ncssysf_tmr.cc:103
lock = {mutex_ = 0x55719ddeccb8}
#6 0x00007f714a693b84 in ncs_tmr_stop (tmrID=<optimized out>) at
src/base/ncssysf_tmr.cc:240
tmr = <optimized out>
#7 0x00007f714a6aaa20 in stop_mds_down_tmr (adest_info=0x7f713c01e400) at
src/mds/mds_c_api.c:84
tmr_req_info = 0x7f713c02c680
tmr_req_info = <optimized out>
#8 0x00007f714a6ae04d in mds_mcm_svc_up (pwe_id=<optimized out>,
svc_id=svc_id@entry=34, role=V_DEST_RL_STANDBY,
scope=scope@entry=NCSMDS_SCOPE_NONE, vdest_id=vdest_id@entry=16,
vdest_policy=NCS_VDEST_TYPE_MxN, adest=<optimized out>, my_pcon=false,
local_svc_hdl=562945658454041, subtn_ref_val=4, svc_sub_part_ver=1 '\001',
archword_type=10 '\n') at src/mds/mds_c_api.c:2026
adest_info = 0x7f713c01e400
status = <optimized out>
local_subtn_scope = NCSMDS_SCOPE_NONE
local_subtn_view = MDS_VIEW_NORMAL
active_adest = 0
dest_role = 32625
tmr_running = 157
cbinfo = {i_yr_svc_hdl = 0, i_yr_svc_id = 0, i_op = MDS_CALLBACK_COPY,
info = {cpy = {i_msg = 0x0, i_last = false, i_to_svc_id = 0, o_cpy = 0x0,
i_rem_svc_pvt_ver = 0 '\000', o_msg_fmt_ver = 0}, enc = {i_msg = 0x0,
i_to_svc_id = 0, io_uba = 0x0, i_rem_svc_pvt_ver = 0 '\000', o_msg_fmt_ver =
0}, dec = {io_uba = 0x0, i_fr_svc_id = 0, i_is_resp = false, o_msg = 0x0,
i_node_id = 0, i_msg_fmt_ver = 0, i_node_name = '\000' <repeats 254 times>},
enc_flat = {i_msg = 0x0, i_to_svc_id = 0, io_uba = 0x0, i_rem_svc_pvt_ver = 0
'\000', o_msg_fmt_ver = 0}, dec_flat = {io_uba = 0x0, i_fr_svc_id = 0,
i_is_resp = false, o_msg = 0x0, i_node_id = 0, i_msg_fmt_ver = 0, i_node_name =
'\000' <repeats 254 times>}, receive = {i_msg = 0x0, i_rsp_reqd = false,
i_msg_ctxt = {length = 0 '\000', data = '\000' <repeats 11 times>}, i_fr_dest =
0, i_fr_svc_id = 0, i_fr_anc = 0, i_to_dest = 0, i_to_svc_id = 0, i_priority =
0, i_node_id = 0, i_node_name = '\000' <repeats 254 times>, sender_pwe_hdl = 0,
i_msg_fmt_ver = 0, pid = 0, uid = 0, gid = 0}, direct_receive = {i_direct_buff
= 0x0, i_direct_buff_len = 0, i_rsp_reqd = false, i_msg_ctxt = {length = 0
'\000', data = '\000' <repeats 11 times>}, i_fr_dest = 0, i_fr_svc_id = 0,
i_fr_anc = 0, i_to_dest = 0, i_to_svc_id = 0, i_priority = 0, i_node_id = 0,
i_node_name = '\000' <repeats 254 times>, sender_pwe_hdl = 0, i_msg_fmt_ver =
0}, svc_evt = {i_change = NCSMDS_NONE, i_dest = 0, i_anc = 0, i_role = 0,
i_node_id = 0, i_pwe_id = 0, i_svc_id = 0, i_your_id = 0, svc_pwe_hdl = 0,
i_rem_svc_pvt_ver = 0 '\000', i_dest_details = '\000' <repeats 318 times>},
sys_evt = {i_change = NCSMDS_NONE, i_node_id = 0, i_evt_mask = 0}, quiesced_ack
= {i_dummy = 0}, node_evt = {node_chg = NCSMDS_NODE_UP, node_id = 0,
addr_family = 0, length = 0, ip_addr_len = 0, ip_addr = '\000' <repeats 45
times>, i_node_name_len = 0, i_node_name = '\000' <repeats 254 times>},
msg_loss_evt = {i_dest = 0, i_pwe_id = 0, i_svc_id = 0, i_vdest_id = 0}}}
subtn_result_info = 0x0
active_subtn_result_info = 0x0
next_active_result_info = 0x0
log_subtn_result_info = 0x0
__FUNCTION__ = "mds_mcm_svc_up"
#9 0x00007f714a6beea7 in mds_mdtm_process_recvdata (rcv_bytes=<optimized out>,
buff_in=<optimized out>) at src/mds/mds_dt_trans.c:1137
pwe_id = <optimized out>
svc_id = <optimized out>
role = <optimized out>
scope = NCSMDS_SCOPE_NONE
vdest = 16
policy = NCS_VDEST_TYPE_MxN
svc_hdl = 562945658454041
svc_sub_part_ver = 1 '\001'
archword_type = 10 '\n'
msg_type = MDTM_LIB_UP_TYPE
mds_version = <optimized out>
mds_indentifire = <optimized out>
server_type = <optimized out>
server_instance_lower = 2685861904
node_id = <optimized out>
process_id = 217
ref_val = 4
adest = 564113889558745
src_nodeid = <optimized out>
src_process_id = <optimized out>
buff_dump = 0
tcp_id = <optimized out>
buffer = 0x7f713c037b22 ""
__PRETTY_FUNCTION__ = "mds_mdtm_process_recvdata"
#10 0x00007f714a6bfe87 in mdtm_process_poll_recv_data_tcp () at
src/mds/mds_dt_trans.c:903
recd_bytes = <optimized out>
__FUNCTION__ = "mdtm_process_poll_recv_data_tcp"
__PRETTY_FUNCTION__ = "mdtm_process_poll_recv_data_tcp"
#11 0x00007f714a6c029f in mdtm_process_recv_events_tcp () at
src/mds/mds_dt_trans.c:995
pollres = <optimized out>
~~~