Hi guys,
I tried to reproduce the issue on the latest OpenSAF (5.22.11 -
7089987e9f2d7e5b2f039c14dfb942d2830a27cc), but I did not succeed.
I have a setup of 2 controllers and 2 payloads with the headless feature enabled
and 1 PBE with 100k objects.
I stopped OpenSAF on one of the payloads, but the issue did not reproduce.
So, can I close this ticket?
Thanks
Mohan
---
**[tickets:#2110] AMF: amfd aborted on both controllers after opensafd stopped on payload**
**Status:** assigned
**Milestone:** future
**Created:** Tue Oct 11, 2016 05:35 AM UTC by Srikanth R
**Last Updated:** Fri Mar 24, 2017 03:15 AM UTC
**Owner:** Minh Hon Chau
Changeset : 5.1GA 8190
Setup : 4-node setup with PBE enabled (1 lakh, i.e. 100,000, objects) and the
headless feature enabled.
Steps performed :
-> Brought up opensaf on the 4-node setup.
-> Ran the IMM test application on Oct 8 and also performed middleware failovers.
-> Left the setup idle for two days.
-> On Oct 10 at 14:07:38, stopped opensaf on PL-4, after which amfd on both
controllers aborted.
Oct 10 14:07:38 SLES-SLOT1 osafimmnd[2748]: NO Global discard node received for nodeId:2040f pid:3261
Oct 10 14:07:38 SLES-SLOT1 osafamfd[2788]: NO Node 'PL-4' left the cluster
Oct 10 14:07:38 SLES-SLOT1 osafamfd[2788]: su.cc:2006: dec_curr_act_si: Assertion 'saAmfSUNumCurrActiveSIs > 0' failed.
Oct 10 14:07:38 SLES-SLOT1 osafamfnd[2798]: WA AMF director unexpectedly crashed
Below is the back trace :
2 0x00007f7426025197 in __osafassert_fail (__file=0x51b4ed "su.cc", __line=2006, __func=0x51ce30 <AVD_SU::dec_curr_act_si()::__FUNCTION__> "dec_curr_act_si", __assertion=0x51c884 "saAmfSUNumCurrActiveSIs > 0") at sysf_def.c:281
3 0x00000000004de88c in AVD_SU::dec_curr_act_si (this=0x7bde40) at su.cc:2006
4 0x00000000004c504e in avd_susi_delete (cb=0x75dba0 <_control_block>, susi=0x7eb940, ckpt=false) at siass.cc:554
5 0x000000000049a326 in SG_NORED::node_fail (this=0x7bc210, cb=0x75dba0 <_control_block>, su=0x7bde40) at sg_nored_fsm.cc:781
6 0x00000000004bd4d7 in avd_node_down_mw_susi_failover (cb=0x75dba0 <_control_block>, avnd=0x7b04d0) at sgproc.cc:1983
7 0x0000000000461a77 in avd_node_failover (node=0x7b04d0) at ndproc.cc:1142
8 0x0000000000459d63 in avd_mds_avnd_down_evh (cb=0x75dba0 <_control_block>, evt=0x7f741c002270) at ndfsm.cc:684
9 0x0000000000453f60 in process_event (cb_now=0x75dba0 <_control_block>, evt=0x7f741c002270) at main.cc:775
10 0x0000000000453c83 in main_loop () at main.cc:696
11 0x00000000004541ff in main (argc=2, argv=0x7fffedc7f828) at main.cc:848
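The assertion in frame 3 guards a counter decrement. As a rough illustration only (this is not the OpenSAF source; SketchSu and its members are hypothetical names), the sketch below shows one way such a guard can fire: the counter is decremented when it is already zero, for example if the same SI assignment were accounted as removed twice. Whether that is what happened in this ticket is not established by the logs. The sketch throws instead of aborting so the failure can be observed.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for an AMF service unit; not the OpenSAF AVD_SU class.
struct SketchSu {
  std::uint32_t num_curr_active_sis = 0;

  void inc_curr_act_si() { ++num_curr_active_sis; }

  // Mirrors the condition named in the assertion message: the counter must
  // be positive before it is decremented. The real code aborts through
  // osafassert; this sketch throws so the caller can observe the failure.
  void dec_curr_act_si() {
    if (num_curr_active_sis == 0) {
      throw std::logic_error("saAmfSUNumCurrActiveSIs > 0");
    }
    --num_curr_active_sis;
  }
};

// Returns true if removing the same assignment twice trips the guard.
bool second_delete_trips_guard() {
  SketchSu su;
  su.inc_curr_act_si();    // one active assignment: 0 -> 1
  su.dec_curr_act_si();    // first removal: 1 -> 0
  try {
    su.dec_curr_act_si();  // second removal: counter is already 0
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

Calling second_delete_trips_guard() returns true in this sketch, which corresponds to the point where amfd would abort instead.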
Below is the amfd trace :
Oct 10 14:07:38.712919 osafamfd [2788:imm.cc:1751] << avd_saImmOiRtObjectDelete
Oct 10 14:07:38.712922 osafamfd [2788:csi.cc:1292] << avd_compcsi_delete
Oct 10 14:07:38.712925 osafamfd [2788:mbcsv_api.c:0773] >> mbcsv_process_snd_ckpt_request: Sending checkpoint data to all STANDBY peers, as per the send-type specified
Oct 10 14:07:38.712928 osafamfd [2788:mbcsv_api.c:0803] TR svc_id:10, pwe_hdl:65537
Oct 10 14:07:38.712931 osafamfd [2788:mbcsv_util.c:0343] >> mbcsv_send_ckpt_data_to_all_peers
Oct 10 14:07:38.712934 osafamfd [2788:mbcsv_util.c:0387] TR dispatching FSM for NCSMBCSV_SEND_ASYNC_UPDATE
Oct 10 14:07:38.712936 osafamfd [2788:mbcsv_act.c:0101] TR ASYNC update to be sent. role: 1, svc_id: 10, pwe_hdl: 65537
Oct 10 14:07:38.712939 osafamfd [2788:mbcsv_util.c:0399] TR calling encode callback
Oct 10 14:07:38.712942 osafamfd [2788:chkop.cc:0228] TR Async update
Oct 10 14:07:38.712945 osafamfd [2788:ckpt_enc.cc:0681] >> enc_siass: io_action '2'
Oct 10 14:07:38.712998 osafamfd [2788:ckpt_enc.cc:0704] << enc_siass
Oct 10 14:07:38.713001 osafamfd [2788:mbcsv_util.c:0438] TR send the encoded message to any other peer with same s/w version
Oct 10 14:07:38.713004 osafamfd [2788:mbcsv_util.c:0441] TR dispatching FSM for NCSMBCSV_SEND_ASYNC_UPDATE
Oct 10 14:07:38.713006 osafamfd [2788:mbcsv_act.c:0101] TR ASYNC update to be sent. role: 1, svc_id: 10, pwe_hdl: 65537
Oct 10 14:07:38.713009 osafamfd [2788:mbcsv_mds.c:0185] >> mbcsv_mds_send_msg: sending to vdest:1
Oct 10 14:07:38.713012 osafamfd [2788:mbcsv_mds.c:0201] TR send type MDS_SENDTYPE_RED
Oct 10 14:07:38.713023 osafamfd [2788:mbcsv_mds.c:0244] << mbcsv_mds_send_msg: success
Oct 10 14:07:38.713027 osafamfd [2788:mbcsv_util.c:0492] << mbcsv_send_ckpt_data_to_all_peers
Oct 10 14:07:38.713030 osafamfd [2788:mbcsv_api.c:0868] << mbcsv_process_snd_ckpt_request: retval: 1
Oct 10 14:07:38.713033 osafamfd [2788:siass.cc:0496] >> avd_susi_delete: safSu=PL-4,safSg=NoRed,safApp=OpenSAF safSi=NoRed4,safApp=OpenSAF
Oct 10 14:09:23.708873 osafamfd [2802:main.cc:0500] >> initialize