<143>1 2019-12-13T12:45:22.523861+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26365"] >> mds_mcm_svc_up
<143>1 2019-12-13T12:45:22.523868+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26367"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
<143>1 2019-12-13T12:45:22.523893+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26374"] Increase svc_cnt
<143>1 2019-12-13T12:45:22.524007+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26397"] >> mds_mcm_svc_up
<143>1 2019-12-13T12:45:22.524013+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26399"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
<143>1 2019-12-13T12:45:22.524032+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26406"] Increase svc_cnt
...
<143>1 2019-12-13T12:45:22.524136+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26429"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524142+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26431"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524175+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26442"] MCM:API: svc_down : ROLE MISMATCH : svc_id = CLMA(35) on 
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest = 
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
<143>1 2019-12-13T12:45:22.524191+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26446"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524197+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26448"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524229+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26459"] MCM:API: svc_down : ROLE MISMATCH : svc_id = IMMND(25) on 
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest = 
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
~~~
Unexpect receive 2 more same svc down events, then svc_cnt is -2
~~~
<143>1 2019-12-13T12:45:22.524245+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26463"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.52425+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26465"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524282+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26476"] MCM:API: svc_down : ROLE MISMATCH : svc_id = CLMA(35) on 
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest = 
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
<143>1 2019-12-13T12:45:22.524297+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26480"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:22.524302+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26482"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 1 | MY_PCON = 0 |
<142>1 2019-12-13T12:45:22.524334+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="26493"] MCM:API: svc_down : ROLE MISMATCH : svc_id = IMMND(25) on 
DEST id = 65535 got DOWN for svc_id = CLMS(34) on Vdest id = 16 on Adest = 
<rem_node[0x2020f]:dest_pid[217]>, rem_svc_pvt_ver=1, rem_svc_archword=10
~~~
So, svc_cnt was wrong at this point.

Then later, I guess svc_cnt is -1 then svc up, down timer unexpected start and 
then svc down, svc_cnt is -1
When timer expire, svc_cnt is not zero (-1) then not DEL Adest from List but 
timer info is freed.
~~~
<143>1 2019-12-13T12:45:55.47399+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="76841"] Increase svc_cnt
<142>1 2019-12-13T12:45:55.993951+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="78837"] MCM:API: Adest <0x0002010f, 217> down timer start
<143>1 2019-12-13T12:45:55.9939+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="78826"] >> mds_mcm_svc_down
<143>1 2019-12-13T12:45:55.993904+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="78827"] MCM:API: LOCAL SVC INFO  : svc_id = CLMA(35) | PWE id = 1 | 
VDEST id = 65535 |
<143>1 2019-12-13T12:45:55.993914+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="78828"] MCM:API: REMOTE SVC INFO : svc_id = CLMS(34) | PWE id = 1 | 
VDEST id = 16 | POLICY = 2 | SCOPE = 4 | ROLE = 2 | MY_PCON = 0 |
...
<143>1 2019-12-13T12:46:05.97889+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="80963"] FREE tmr_req_info=0x7f713c022840
<142>1 2019-12-13T12:46:06.079167+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="80966"] TIMEOUT Adest <0x0002010f, 217> down
<143>1 2019-12-13T12:46:06.079178+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="80967"] Need reset tmr_req_info=0x7f713c02c680
<143>1 2019-12-13T12:46:06.079186+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="80968"] FREE tmr_req_info=0x7f713c02c680
~~~
Next svc up increase svc_cnt become zero
Next svc up, svc_cnt is zero, order stop/free down timer which already expired 
and freed and the coredump generated.
~~~
<143>1 2019-12-13T12:46:07.900413+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="87883"] Increase svc_cnt
<143>1 2019-12-13T12:46:07.900497+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="87905"] tmr_req_info=0x7f713c02c680
<143>1 2019-12-13T12:46:07.9005+01:00 PL-5 osafimmnd 160 mds.log [meta 
sequenceId="87906"] Stop and Free Adest <0x0002010f, 217> down timer
<141>1 2019-12-13T12:46:08.044753+01:00 PL-5 osafimmnd 299 mds.log [meta 
sequenceId="1"] BEGIN MDS LOGGING| ARCHW=a|64bit=1
~~~


---

** [tickets:#3129] mds: crash in stop_mds_down_tmr()**

**Status:** assigned
**Milestone:** 5.20.01
**Created:** Mon Dec 16, 2019 05:29 AM UTC by Thuan Tran
**Last Updated:** Mon Dec 16, 2019 05:30 AM UTC
**Owner:** Thuan Tran


osafimmnd coredump is generated
~~~
Core was generated by `/usr/local/lib/opensaf/osafimmnd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f714a16e3a0 in 
std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, 
std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
[Current thread is 1 (Thread 0x7f714b121b00 (LWP 164))]

Thread 1 (Thread 0x7f714b121b00 (LWP 164)):
#0  0x00007f714a16e3a0 in 
std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, 
std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#1  0x00007f714a69f9f0 in std::_Rb_tree<base::timer::TimerHandle::QueueEntry, 
base::timer::TimerHandle::QueueEntry, 
std::_Identity<base::timer::TimerHandle::QueueEntry>, 
std::less<base::timer::TimerHandle::QueueEntry>, 
std::allocator<base::timer::TimerHandle::QueueEntry> >::_M_erase_aux 
(this=this@entry=0x55719ddecc88, __position=...) at 
/usr/include/c++/7/bits/stl_tree.h:2473
        __y = <optimized out>
#2  0x00007f714a69f5d4 in std::_Rb_tree<base::timer::TimerHandle::QueueEntry, 
base::timer::TimerHandle::QueueEntry, 
std::_Identity<base::timer::TimerHandle::QueueEntry>, 
std::less<base::timer::TimerHandle::QueueEntry>, 
std::allocator<base::timer::TimerHandle::QueueEntry> 
>::erase[abi:cxx11](std::_Rb_tree_const_iterator<base::timer::TimerHandle::QueueEntry>)
 (__position=..., this=0x55719ddecc88) at 
/usr/include/c++/7/bits/stl_tree.h:1113
        __result = <optimized out>
        __result = <optimized out>
#3  std::multiset<base::timer::TimerHandle::QueueEntry, 
std::less<base::timer::TimerHandle::QueueEntry>, 
std::allocator<base::timer::TimerHandle::QueueEntry> 
>::erase[abi:cxx11](std::_Rb_tree_const_iterator<base::timer::TimerHandle::QueueEntry>)
 (__position=..., this=0x55719ddecc88) at 
/usr/include/c++/7/bits/stl_multiset.h:630
No locals.
#4  base::timer::TimerHandle::Stop (this=this@entry=0x55719ddecc40, 
timer=timer@entry=0x7f713c035780) at src/base/timer/timer_handle.cc:54
        old_expiration = <optimized out>
        new_expiration = <optimized out>
        iter = <optimized out>
#5  0x00007f714a693265 in base::NcsTmrHandle::Stop (this=0x55719ddecc40, 
timer=0x7f713c035780) at src/base/ncssysf_tmr.cc:103
        lock = {mutex_ = 0x55719ddeccb8}
#6  0x00007f714a693b84 in ncs_tmr_stop (tmrID=<optimized out>) at 
src/base/ncssysf_tmr.cc:240
        tmr = <optimized out>
#7  0x00007f714a6aaa20 in stop_mds_down_tmr (adest_info=0x7f713c01e400) at 
src/mds/mds_c_api.c:84
        tmr_req_info = 0x7f713c02c680
        tmr_req_info = <optimized out>
#8  0x00007f714a6ae04d in mds_mcm_svc_up (pwe_id=<optimized out>, 
svc_id=svc_id@entry=34, role=V_DEST_RL_STANDBY, 
scope=scope@entry=NCSMDS_SCOPE_NONE, vdest_id=vdest_id@entry=16, 
vdest_policy=NCS_VDEST_TYPE_MxN, adest=<optimized out>, my_pcon=false, 
local_svc_hdl=562945658454041, subtn_ref_val=4, svc_sub_part_ver=1 '\001', 
archword_type=10 '\n') at src/mds/mds_c_api.c:2026
        adest_info = 0x7f713c01e400
        status = <optimized out>
        local_subtn_scope = NCSMDS_SCOPE_NONE
        local_subtn_view = MDS_VIEW_NORMAL
        active_adest = 0
        dest_role = 32625
        tmr_running = 157
        cbinfo = {i_yr_svc_hdl = 0, i_yr_svc_id = 0, i_op = MDS_CALLBACK_COPY, 
info = {cpy = {i_msg = 0x0, i_last = false, i_to_svc_id = 0, o_cpy = 0x0, 
i_rem_svc_pvt_ver = 0 '\000', o_msg_fmt_ver = 0}, enc = {i_msg = 0x0, 
i_to_svc_id = 0, io_uba = 0x0, i_rem_svc_pvt_ver = 0 '\000', o_msg_fmt_ver = 
0}, dec = {io_uba = 0x0, i_fr_svc_id = 0, i_is_resp = false, o_msg = 0x0, 
i_node_id = 0, i_msg_fmt_ver = 0, i_node_name = '\000' <repeats 254 times>}, 
enc_flat = {i_msg = 0x0, i_to_svc_id = 0, io_uba = 0x0, i_rem_svc_pvt_ver = 0 
'\000', o_msg_fmt_ver = 0}, dec_flat = {io_uba = 0x0, i_fr_svc_id = 0, 
i_is_resp = false, o_msg = 0x0, i_node_id = 0, i_msg_fmt_ver = 0, i_node_name = 
'\000' <repeats 254 times>}, receive = {i_msg = 0x0, i_rsp_reqd = false, 
i_msg_ctxt = {length = 0 '\000', data = '\000' <repeats 11 times>}, i_fr_dest = 
0, i_fr_svc_id = 0, i_fr_anc = 0, i_to_dest = 0, i_to_svc_id = 0, i_priority = 
0, i_node_id = 0, i_node_name = '\000' <repeats 254 times>, sender_pwe_hdl = 0, 
i_msg_fmt_ver = 0, pid = 0, uid = 0, gid = 0}, direct_receive = {i_direct_buff 
= 0x0, i_direct_buff_len = 0, i_rsp_reqd = false, i_msg_ctxt = {length = 0 
'\000', data = '\000' <repeats 11 times>}, i_fr_dest = 0, i_fr_svc_id = 0, 
i_fr_anc = 0, i_to_dest = 0, i_to_svc_id = 0, i_priority = 0, i_node_id = 0, 
i_node_name = '\000' <repeats 254 times>, sender_pwe_hdl = 0, i_msg_fmt_ver = 
0}, svc_evt = {i_change = NCSMDS_NONE, i_dest = 0, i_anc = 0, i_role = 0, 
i_node_id = 0, i_pwe_id = 0, i_svc_id = 0, i_your_id = 0, svc_pwe_hdl = 0, 
i_rem_svc_pvt_ver = 0 '\000', i_dest_details = '\000' <repeats 318 times>}, 
sys_evt = {i_change = NCSMDS_NONE, i_node_id = 0, i_evt_mask = 0}, quiesced_ack 
= {i_dummy = 0}, node_evt = {node_chg = NCSMDS_NODE_UP, node_id = 0, 
addr_family = 0, length = 0, ip_addr_len = 0, ip_addr = '\000' <repeats 45 
times>, i_node_name_len = 0, i_node_name = '\000' <repeats 254 times>}, 
msg_loss_evt = {i_dest = 0, i_pwe_id = 0, i_svc_id = 0, i_vdest_id = 0}}}
        subtn_result_info = 0x0
        active_subtn_result_info = 0x0
        next_active_result_info = 0x0
        log_subtn_result_info = 0x0
        __FUNCTION__ = "mds_mcm_svc_up"
#9  0x00007f714a6beea7 in mds_mdtm_process_recvdata (rcv_bytes=<optimized out>, 
buff_in=<optimized out>) at src/mds/mds_dt_trans.c:1137
        pwe_id = <optimized out>
        svc_id = <optimized out>
        role = <optimized out>
        scope = NCSMDS_SCOPE_NONE
        vdest = 16
        policy = NCS_VDEST_TYPE_MxN
        svc_hdl = 562945658454041
        svc_sub_part_ver = 1 '\001'
        archword_type = 10 '\n'
        msg_type = MDTM_LIB_UP_TYPE
        mds_version = <optimized out>
        mds_indentifire = <optimized out>
        server_type = <optimized out>
        server_instance_lower = 2685861904
        node_id = <optimized out>
        process_id = 217
        ref_val = 4
        adest = 564113889558745
        src_nodeid = <optimized out>
        src_process_id = <optimized out>
        buff_dump = 0
        tcp_id = <optimized out>
        buffer = 0x7f713c037b22 ""
        __PRETTY_FUNCTION__ = "mds_mdtm_process_recvdata"
#10 0x00007f714a6bfe87 in mdtm_process_poll_recv_data_tcp () at 
src/mds/mds_dt_trans.c:903
        recd_bytes = <optimized out>
        __FUNCTION__ = "mdtm_process_poll_recv_data_tcp"
        __PRETTY_FUNCTION__ = "mdtm_process_poll_recv_data_tcp"
#11 0x00007f714a6c029f in mdtm_process_recv_events_tcp () at 
src/mds/mds_dt_trans.c:995
        pollres = <optimized out>
~~~


---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is 
subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at 
https://sourceforge.net/p/opensaf/admin/tickets/options.  Or, if this is a 
mailing list, you can unsubscribe from the mailing list.
_______________________________________________
Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets

Reply via email to