[tickets] [opensaf:tickets] #3346 osaf: build failed with gcc/g++ 12
- **Type**: defect --> enhancement --- **[tickets:#3346] osaf: build failed with gcc/g++ 12** **Status:** fixed **Milestone:** 5.24.02 **Created:** Fri Jan 12, 2024 11:22 AM UTC by Thang Duc Nguyen **Last Updated:** Mon Jan 22, 2024 04:21 AM UTC **Owner:** Thang Duc Nguyen --- Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.___ Opensaf-tickets mailing list Opensaf-tickets@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.24.09 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Huu The Truong

After the standby SC went down, another SC was promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that point, the new standby SC received a peer info response from the old standby SC and wrongly promoted itself to active.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

However, the existing active SC was still alive, so the newly promoted SC rebooted itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
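The ticket is still under review and does not describe the fix; the sketch below is only a hypothetical illustration of the missing guard (all names are assumptions, not OpenSAF code): a standby should initiate controller failover on a node-down event only when the departed peer actually held the active role.

```cpp
enum class Role { kActive, kStandby, kUndefined };

// Hypothetical guard: promote on node-down only if the down peer was the
// active SC. In the reported scenario the down peer was the old standby,
// so a check like this would have prevented the bad promotion.
bool should_promote_to_active(Role own_role, Role down_peer_role) {
  return own_role == Role::kStandby && down_peer_role == Role::kActive;
}
```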
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.24.09 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Mohan Kanakam

During section create, one ckptnd sends an async request (a normal MDS send) to another ckptnd. The receiving ckptnd, however, responds as if it had received a sync request that requires a reply to the sender ckptnd. ckptnd must respond when a sync request arrives, but it must not respond to an async request. The following message appears in the MDS log when creating the section:

~~~
sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
~~~
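The distinction the ticket describes can be sketched as follows. This is not the actual ckptnd/MDS code (the struct and field names are assumptions); it only illustrates that a responder should reply only when the request carries a non-empty sync context:

```cpp
#include <cstdint>

// Hypothetical receive-info: a synchronous MDS send carries a sync
// context, while a normal (async) send does not -- hence the
// "Invalid Sync CTXT Len" log when a reply is attempted anyway.
struct RecvInfo {
  uint32_t ctxt_len;  // 0 for an async (normal) send
};

// Reply only to genuine sync requests; silently consume async ones.
bool should_respond(const RecvInfo& info) {
  return info.ctxt_len > 0;
}
```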
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.24.09 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Hieu Hong Hoang

We created a class ScopeLock to support recursive mutexes, and it is used heavily in the log module. However, the C++ standard library provides std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
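A minimal sketch of the proposal (the names `log_mutex`, `inner`, `outer` are illustrative, not the actual log-module code): the standard RAII wrappers already cover the ScopeLock use case, including re-entrant locking.

```cpp
#include <mutex>

// Illustrative stand-ins for the log module's shared state.
static std::recursive_mutex log_mutex;
static int counter = 0;

void inner() {
  // The same thread may re-acquire a std::recursive_mutex without deadlock.
  std::lock_guard<std::recursive_mutex> lock(log_mutex);
  ++counter;
}

int outer() {
  // std::unique_lock works here too and additionally supports deferred
  // locking and early unlock(); lock_guard suffices for plain scoping.
  std::lock_guard<std::recursive_mutex> lock(log_mutex);
  inner();  // re-entrant call while the mutex is held
  ++counter;
  return counter;
}
```

std::unique_lock<std::recursive_mutex> is the drop-in choice when the lock must be released before scope exit or moved around; otherwise std::lock_guard is the lighter option.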
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roamming mode
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3312] fmd: sc failed to failover in roamming mode** **Status:** assigned **Milestone:** 5.24.09 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Huu The Truong

Shutdown SC-6 (role is standby):

~~~
2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6)
~~~

SC-10 changed role to standby:

~~~
2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY
~~~

However, a service of the old standby was still alive, so SC-10 received peer info from the old standby (SC-6) and mistook it for the active SC going down. SC-10 changed role to active and then rebooted.

~~~
2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1
2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request
2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer
2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3323 imm: PL sync failed after reconnected with SC
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3323] imm: PL sync failed after reconnected with SC** **Status:** unassigned **Milestone:** 5.24.09 **Created:** Wed Oct 05, 2022 09:37 AM UTC by Son Tran Ngoc **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** nobody

Active SC1 and PL4 suddenly lost contact (possibly for environmental reasons). They re-established contact, but the PL4 sync failed because PL4 did not update the active SC1 information and discarded messages from the IMMD on SC1.

PL4 sync failure log:

~~~
2022-09-22 04:07:05.230 DEBUG: Syncing node PL-4 (timeout=120)
2022-09-22 04:08:06.325 WARNING: waiting more than 60 sec for node PL-4 to sync
~~~

PL4 discarding messages from SC1:

~~~
2022-09-22 04:07:08.406 PL-4 osafimmnd[354]: WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f
2022-09-22 04:07:09.013 PL-4 osafimmnd[354]: message repeated 243 times: [ WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f]
~~~

Steps to reproduce:

1. Start SCs and PLs.
2. Block traffic between SC1 and PL4 (make sure to block traffic after the IMM state transition IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT).
3. Unblock traffic between SC1 and PL4.
[tickets] [opensaf:tickets] #3335 imm: Valgrind reported errors
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3335] imm: Valgrind reported errors** **Status:** assigned **Milestone:** 5.24.09 **Created:** Mon Apr 10, 2023 03:06 AM UTC by PhanTranQuocDat **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** PhanTranQuocDat

Valgrind detects memory leaks.

/var/lib/lxc/SC-2/rootfs/var/log/opensaf/immd.valgrind:

~~~
==417== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 108 of 111
==417==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x52ACC2B: sysf_alloc_pkt (sysf_mem.c:429)
==417==    by 0x529BA1F: ncs_enc_init_space_pp (hj_ubaid.c:144)
==417==    by 0x52C9996: mdtm_fill_data (mds_dt_common.c:1454)
==417==    by 0x52CACCD: mdtm_process_recv_message_common (mds_dt_common.c:544)
==417==    by 0x52CB071: mdtm_process_recv_data (mds_dt_common.c:1126)
==417==    by 0x52D5B8E: mdtm_process_recv_events (mds_dt_tipc.c:1144)
==417==    by 0x55106DA: start_thread (pthread_create.c:463)
==417==    by 0x584961E: clone (clone.S:95)
~~~

/var/lib/lxc/SC-1/rootfs/var/log/opensaf/immd.valgrind:

~~~
==417== 7 bytes in 1 blocks are definitely lost in loss record 6 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x1172C4: mbcsv_dec_async_update (immd_mbcsv.c:1128)
==417==    by 0x1172C4: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x1172C4: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
--
==417== 8 bytes in 1 blocks are definitely lost in loss record 15 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x117288: mbcsv_dec_async_update (immd_mbcsv.c:1119)
==417==    by 0x117288: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x117288: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
--
==417== 16 bytes in 1 blocks are definitely lost in loss record 20 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x11724C: mbcsv_dec_async_update (immd_mbcsv.c:1110)
==417==    by 0x11724C: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x11724C: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
~~~
[tickets] [opensaf:tickets] #3341 log: memleak detected by valgrind
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3341] log: memleak detected by valgrind** **Status:** review **Milestone:** 5.24.09 **Created:** Thu Aug 17, 2023 04:41 AM UTC by Thien Minh Huynh **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Thien Minh Huynh

~~~
==526== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 172 of 175
==526==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==526==    by 0x5B3BC2B: sysf_alloc_pkt (sysf_mem.c:429)
==526==    by 0x5B2A9CD: ncs_enc_init_space (hj_ubaid.c:108)
==526==    by 0x5B4A03D: ncs_mbcsv_encode_message (mbcsv_util.c:899)
==526==    by 0x5B4A56C: mbcsv_send_msg (mbcsv_util.c:1029)
==526==    by 0x5B47112: mbcsv_process_events (mbcsv_pr_evts.c:139)
==526==    by 0x5B473BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==526==    by 0x5B41A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
==526==    by 0x13C6AD: lgs_mbcsv_dispatch(unsigned int) (lgs_mbcsv.cc:509)
==526==    by 0x119919: main (lgs_main.cc:592)
~~~
[tickets] [opensaf:tickets] #3343 amf: SU is not in healthy state
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3343] amf: SU is not in healthy state** **Status:** review **Milestone:** 5.24.09 **Created:** Wed Sep 13, 2023 11:30 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Thang Duc Nguyen

The system ends up in an unhealthy state in the following scenario:

1. Deploy a 2N model, where each PI SU contains 1 PI comp and 1 NPI comp.
2. Terminate the PI component, then lock that SU. (Some sleep time is added in the instantiation script.)
3. The SU is left in LOCKED (AdminState) and UNINSTANTIATED (PresenceState).

amf-adm cannot be used to repair the SU; only a node reboot recovers from the issue.
[tickets] [opensaf:tickets] #3347 smf: Valgrind reported errors
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3347] smf: Valgrind reported errors** **Status:** assigned **Milestone:** 5.24.09 **Created:** Mon Feb 19, 2024 09:07 AM UTC by Nguyen Huynh Tai **Last Updated:** Mon Feb 19, 2024 09:07 AM UTC **Owner:** Nguyen Huynh Tai

~~~
14:49:09 Verify valgrind result
14:49:09 ==585== 2 errors in context 1 of 6:
14:49:09 ==585== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==585==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==585==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==585==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==585==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==585==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==585==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==585==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==585==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==585==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==585==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==585==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==585==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==585==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==585==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==585==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==585== Address 0x6aeaa4f is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==585==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==585==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==585==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586== 2 errors in context 1 of 6:
14:49:09 ==586== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==586==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==586==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==586==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==586==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==586==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==586==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==586==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==586==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==586==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==586==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==586==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==586==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==586==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==586== Address 0x6afcdaf is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==586==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==586==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586== 2 errors in context 1 of 6:
14:49:09 ==586== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==586==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==586==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==586==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==586==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==586==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==586==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==586==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==586==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==586==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==586==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==586==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==586==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==586==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==586== Address 0x6b0f10f is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==586==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==586==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==586==    by 0
~~~
[tickets] [opensaf:tickets] #3335 imm: Valgrind reported errors
- **Milestone**: 5.23.07 --> 5.23.12 --- **[tickets:#3335] imm: Valgrind reported errors** **Status:** assigned **Milestone:** 5.23.12 **Created:** Mon Apr 10, 2023 03:06 AM UTC by PhanTranQuocDat **Last Updated:** Fri Apr 28, 2023 08:18 AM UTC **Owner:** PhanTranQuocDat
[tickets] [opensaf:tickets] #3336 amf: node did not reboot in split-brain prevention
- **Milestone**: 5.23.07 --> 5.23.12 --- **[tickets:#3336] amf: node did not reboot in split-brain prevention** **Status:** assigned **Milestone:** 5.23.12 **Created:** Wed Apr 26, 2023 08:48 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Apr 26, 2023 08:48 AM UTC **Owner:** Thang Duc Nguyen

With split-brain prevention via arbitration enabled, and relaxation mode enabled: the arbitrator went down, then one SC went down, but the remaining SC stayed alive. It should have been rebooted in this case.

~~~
2023-04-04T07:52:12.137+02:00 SC-2.1 osafamfd[5337]: NO Node 'SC-1' is down. Start failover delay timer
...
2023-04-04T07:52:19.286+02:00 SC-2.1 osafamfd[5337]: NO Relaxed node promotion is enabled, peer SC is connected
~~~
[tickets] [opensaf:tickets] #3040 Amfnd: reboot if mismatch msg id b/w amfd and amfnd
- **Milestone**: 5.22.06 --> 5.23.03 --- ** [tickets:#3040] Amfnd: reboot if mismatch msg id b/w amfd and amfnd** **Status:** fixed **Milestone:** 5.23.03 **Created:** Thu May 16, 2019 07:33 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 29, 2023 09:10 AM UTC **Owner:** Thang Duc Nguyen

During SC failover, a message received on the ACTIVE AMFD may not have been checkpointed to the AMFD on the STANDBY SC, yet the AMFND still processes the ack for that message and removes it from its queue. When the STANDBY SC takes over as ACTIVE, the message ids between AMFD and AMFND on the new ACTIVE node are mismatched. As a consequence, CLM track start cannot be invoked to update cluster member nodes if those nodes were rebooted. A reboot recovery is needed when the message id received by amfd does not match the message id sent by amfnd.
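The recovery condition in the last sentence amounts to a simple comparison. The function below is an illustrative sketch only, not the actual AMF code (the real message ids live inside the AVD/AVND message handling, and the names here are assumptions):

```cpp
#include <cstdint>

// Hypothetical check: after failover, if the message id the new active
// amfd reports having received differs from the id this amfnd last sent,
// their views have diverged and a node reboot recovery is warranted.
bool needs_reboot_recovery(uint32_t amfd_received_msg_id,
                           uint32_t amfnd_sent_msg_id) {
  return amfd_received_msg_id != amfnd_sent_msg_id;
}
```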
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
commit db861cc07e61330bf0c1686869b18e33d302f255
Author: hieu.h.hoang
Date: Wed Nov 30 08:27:05 2022 +0700

    mds: Fix failed test cases in mdstest [#2798]

    A number of test cases failed because they retrieved the event without polling the selection object. The solution is to poll the selection object first.

--- ** [tickets:#2798] mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Mon Mar 27, 2023 11:31 PM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip)

Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967, mdstest 5 1:

~~~
LOG_NO("\nAction: Retrieve only ONE event\n");
if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500,
                          NCSMDS_SCOPE_INTRACHASSIS, 2,
                          svcids) != NCSCC_RC_SUCCESS) {
  LOG_NO("\nFail\n");
  FAIL = 1;
} else {
  LOG_NO("\nAction: Retrieve only ONE event\n");
  if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500,
                           SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) {
    LOG_NO("Fail, retrieve ONE\n");
    FAIL = 1;
  } else
    LOG_NO("\nSuccess\n");
~~~

After the subscription request succeeds, mdstest is expected to receive two MDS_UP events, for services 600 and 700. These events are retrieved in the next step of the test case (mds_service_retrieve). The problem is that the MDS_UP events are processed in a separate (parallel) thread, the MDS core thread, not in the test case's main thread. In a bad scenario, if the MDS core thread has not run before the RETRIEVE operation in the main thread, the RETRIEVE request with the SA_DISPATCH_ONE flag returns an error and the test case fails.

~~~
<143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe    // MDS SUBSCRIBE request
...
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE    // MDS RETRIEVE request with SA_DISPATCH_ONE flag came before MDS UP events were processed
<139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL
<142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events
<142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event
<143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up
<143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 |
<143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 |
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED
2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE
~~~

The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, and 14 6.
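The fix named in the commit message, polling the selection object before retrieving, follows a generic pattern. The helper below is a sketch with a plain file descriptor standing in for the MDS selection object (the mds_* calls themselves are not reproduced here):

```cpp
#include <poll.h>
#include <unistd.h>

// Wait until fd becomes readable (an event is pending) or timeout_ms
// elapses. In mdstest, fd would be the selection object obtained at MDS
// service install time; only after this returns true should the test
// call mds_service_retrieve() with SA_DISPATCH_ONE.
bool wait_readable(int fd, int timeout_ms) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  return poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN) != 0;
}
```

Polling first removes the race: the main thread blocks until the MDS core thread has queued the MDS_UP events, instead of dispatching into an empty queue.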
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
- **Milestone**: future --> 5.23.03 --- ** [tickets:#2798] mds: mdstest 5 1,5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Mon Dec 19, 2022 01:47 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip) Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967 mdstest 5 1 ~~~ LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500, NCSMDS_SCOPE_INTRACHASSIS, 2, svcids) != NCSCC_RC_SUCCESS) { LOG_NO("\nFail\n"); FAIL = 1; } else { LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500, SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) { LOG_NO("Fail, retrieve ONE\n"); FAIL = 1; } else LOG_NO("\nSuccess\n"); ~~~ After the subscription request being successful, mdstest would expectedly receive two 2 MDS_UP events of services 600 and 700. These info will be retrieved in the next step of the test case (mds_service_retrieve). The problem here is, these MDS_UP events are processed in a separate (parallel) thread (mds core thread) from the test case's main thread. In a bad scenario, if the mds core thread cannot be processed before the RETRIEVE operations in the main thread, the RETRIEVE request with "SA_DISPATCH_ONE" flag will return "error", and the test case will fail. <143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe*** // MDS SUBSCRIBE request*** ... 
~~~
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE // MDS RETRIEVE request with SA_DISPATCH_ONE flag arrived before the MDS UP events were processed
<139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL
<142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events
<142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event
<143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up
<143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 |
<143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 |
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED
2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE
~~~

The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, and 14 6.
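The race described above (a retrieve issued before the core thread has queued the subscription events) is usually worked around in test code by retrying the transient operation with a short back-off. The sketch below is illustrative, not the actual mdstest fix; `RetryUntil` is an invented helper that a test could wrap around `mds_service_retrieve`.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: retry an operation that can fail transiently
// (e.g. an MDS RETRIEVE issued before the mds core thread has processed
// the MDS_UP events) until it succeeds or a deadline expires.
bool RetryUntil(const std::function<bool()>& op,
                std::chrono::milliseconds deadline,
                std::chrono::milliseconds interval) {
  const auto end = std::chrono::steady_clock::now() + deadline;
  while (std::chrono::steady_clock::now() < end) {
    if (op()) return true;                  // operation succeeded
    std::this_thread::sleep_for(interval);  // give the core thread time to run
  }
  return op();  // one final attempt at the deadline
}
```

A test case would call, for example, `RetryUntil([&] { return mds_service_retrieve(...) == NCSCC_RC_SUCCESS; }, ...)` instead of failing on the first attempt.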
[tickets] [opensaf:tickets] #3331 Valgrind reports errors
- **status**: assigned --> fixed --- **[tickets:#3331] Valgrind reports errors** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 01, 2023 02:20 AM UTC by PhanTranQuocDat **Last Updated:** Wed Mar 22, 2023 08:35 AM UTC **Owner:** PhanTranQuocDat

Valgrind detected the following errors:

~~~
==484== 542 (536 direct, 6 indirect) bytes in 1 blocks are definitely lost in loss record 312 of 368
==484==    at 0x4C3217F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==484==    by 0x1636BC: csiattr_create(std::__cxx11::basic_string, std::allocator > const&, SaImmAttrValuesT_2 const**) (csiattr.cc:78)
==484==    by 0x164443: csiattr_create_apply (csiattr.cc:519)
==484==    by 0x164443: csiattr_ccb_apply_cb(CcbUtilOperationData*) (csiattr.cc:713)
==484==    by 0x172155: ccb_apply_cb(unsigned long long, unsigned long long) (imm.cc:1265)
==484==    by 0x54B0C94: imma_process_callback_info(imma_cb*, imma_client_node*, imma_callback_info*, unsigned long long)

==407== Invalid read of size 1
==407==    at 0x5732C3A: mds_svc_op_uninstall (mds_svc_op.c:375)
==407==    by 0x57320C7: ncsmds_api (mds_papi.c:147)
==407==    by 0x54A31D2: imma_mds_unregister(imma_cb*) (imma_mds.cc:171)
==407==    by 0x54A25D4: imma_destroy (imma_init.cc:219)
==407==    by 0x54A25D4: imma_shutdown(ncsmds_svc_id) (imma_init.cc:337)
==407==    by 0x54AF316: saImmOmFinalize (imma_om_api.cc:940)
==407==    by 0x5061961: immutil_saImmOmFinalize (immutil.c:1572)
==407==    by 0x141267: hydra_config_get (main.cc:775)
==407==    by 0x141267: avnd_cb_create (main.cc:349)

==461== Mismatched free() / delete / delete []
==461==    at 0x4C3323B: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==461==    by 0x1308D4: comp_init(avnd_comp_tag*, SaImmAttrValuesT_2 const**) (compdb.cc:1422)
==461==    by 0x131066: avnd_comp_config_reinit(avnd_comp_tag*) (compdb.cc:1759)
==461==    by 0x123FD7: avnd_comp_clc_uninst_inst_hdler(avnd_cb_tag*, avnd_comp_tag*) (clc.cc:1584)
==461==    by 0x124390: avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) (clc.cc:887)
==461==    by 0x153FE6: avnd_su_pres_uninst_suinst_hdler(avnd_cb_tag*, avnd_su_tag*, avnd_comp_tag*) (susm.cc:2145)
==461==    by 0x1567C0: avnd_su_pres_fsm_run(avnd_cb_tag*, avnd_su_tag*, avnd_comp_tag*, avnd_su_pres_fsm_ev) (susm.cc:1604)
==461==    by 0x15C3AA: avnd_evt_ir_evh(avnd_cb_tag*, avnd_evt_tag*) (susm.cc:4236)
==461==    by 0x141D25: avnd_evt_process (main.cc:692)
==461==    by 0x141D25: avnd_main_process() (main.cc:644)
==461==    by 0x1170AD: main (main.cc:225)

==407== Invalid read of size 8
==407==    at 0x119942: avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) (cbq.cc:678)
==407==    by 0x141D15: avnd_evt_process (main.cc:692)
==407==    by 0x141D15: avnd_main_process() (main.cc:644)
==407==    by 0x1170AD: main (main.cc:225)
==407== Address 0x8bb2ad0 is 64 bytes inside a block of size 112 free'd
==407==    at 0x4C3323B: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==407==    by 0x11962B: avnd_comp_cbq_rec_pop_and_del(avnd_cb_tag*, avnd_comp_tag*, unsigned int, bool) (cbq.cc:973)
==407==    by 0x119941: avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) (cbq.cc:678)
==407==    by 0x141D15: avnd_evt_process (main.cc:692)

==428== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 285 of 289
==428==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==428==    by 0x5914C2B: sysf_alloc_pkt (sysf_mem.c:429)
==428==    by 0x5903A1F: ncs_enc_init_space_pp (hj_ubaid.c:144)
==428==    by 0x5931996: mdtm_fill_data (mds_dt_common.c:1453)
==428==    by 0x5932CC2: mdtm_process_recv_message_common (mds_dt_common.c:544)
==428==    by 0x5933061: mdtm_process_recv_data (mds_dt_common.c:1125)
==428==    by 0x59351D6: mds_mdtm_process_recvdata (mds_dt_trans.c:1217)
==428==    by 0x5936426: mdtm_process_poll_recv_data_tcp (mds_dt_trans.c:903)
==428==    by 0x593683E: mdtm_process_recv_events_tcp (mds_dt_trans.c:995)
==428==    by 0x61196DA: start_thread (pthread_create.c:463)
~~~
[tickets] [opensaf:tickets] #3332 rde: incorrect use of pointer
--- **[tickets:#3332] rde: incorrect use of pointer** **Status:** unassigned **Milestone:** 5.23.03 **Created:** Mon Mar 27, 2023 06:47 AM UTC by Gary Lee **Last Updated:** Mon Mar 27, 2023 06:47 AM UTC **Owner:** nobody

In rda_papi.cc, there is a use after free: rda_callback_cb is freed and then dereferenced on the very next line.

~~~
if (m_NCS_TASK_START(rda_callback_cb->task_handle) != NCSCC_RC_SUCCESS) {
  m_NCS_MEM_FREE(rda_callback_cb, 0, 0, 0);
  m_NCS_TASK_RELEASE(rda_callback_cb->task_handle);
  rc = PCSRDA_RC_TASK_SPAWN_FAILED;
  break;
}
~~~
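A minimal sketch of the fix, under the obvious assumption that the intended order is to use the handle before freeing the block. The names below are simplified stand-ins for the OpenSAF macros, not the actual patch.

```cpp
#include <cstdlib>

// Simplified analogue of the rda_papi.cc error path. The defect: the
// control block was freed (m_NCS_MEM_FREE) before its task_handle was
// read (m_NCS_TASK_RELEASE). The fix is to release first, then free.
struct CallbackCb {
  void* task_handle;
};

static void* g_released = nullptr;
// Stand-in for m_NCS_TASK_RELEASE; records which handle it was given.
void ReleaseTask(void* handle) { g_released = handle; }

void CleanupOnSpawnFailure(CallbackCb* cb) {
  ReleaseTask(cb->task_handle);  // use the handle while cb is still valid
  free(cb);                      // only then free the control block
}
```

The same reordering applied to the real code would swap the `m_NCS_MEM_FREE` and `m_NCS_TASK_RELEASE` lines.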
[tickets] [opensaf:tickets] #3102 mds: waste 1.5s in waiting Adest already down to send response message type
- **Milestone**: 5.20.02 --> 5.22.11 --- **[tickets:#3102] mds: waste 1.5s in waiting Adest already down to send response message type** **Status:** fixed **Milestone:** 5.22.11 **Created:** Thu Oct 17, 2019 09:23 AM UTC by Thuan Tran **Last Updated:** Thu Jul 28, 2022 07:20 AM UTC **Owner:** Thuan Tran **Attachments:** - [mds.log](https://sourceforge.net/p/opensaf/tickets/3102/attachment/mds.log) (16.9 kB; application/octet-stream)

On the Active SC, run the following commands:

~~~
pkill -STOP osafntfd
ntfsend &
pkill -9 ntfsend
pkill -CONT osafntfd
~~~

Checking mds.log shows that osafntfd is stuck for 1.5 s waiting for an agent that is already down before sending a response-type message.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.23.03 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Thu Nov 17, 2022 04:11 PM UTC **Owner:** Huu The Truong

After the standby SC goes down, another SC is promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that time, the new standby SC receives a peer info response from the old standby SC indicating that it needs to be promoted to become the active SC.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

But another active SC is still alive, which leads the standby SC to reboot itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.23.03 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:59 AM UTC **Owner:** Hieu Hong Hoang

We created a ScopeLock class to support recursive mutexes, and it is used heavily in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
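The replacement the ticket proposes looks roughly like this; the mutex and counter names here are illustrative, not the actual log-module code.

```cpp
#include <mutex>

// Sketch: std::unique_lock over std::recursive_mutex replaces the
// home-grown ScopeLock. A recursive mutex lets the same thread
// re-acquire the lock it already holds, which is what ScopeLock provided.
std::recursive_mutex log_mutex;  // hypothetical module-level lock
int counter = 0;                 // hypothetical state guarded by the lock

void Inner() {
  std::unique_lock<std::recursive_mutex> lock(log_mutex);  // re-lock by same thread is safe
  ++counter;
}

void Outer() {
  std::unique_lock<std::recursive_mutex> lock(log_mutex);  // first acquisition
  Inner();  // would deadlock with a plain std::mutex
}
```

With a plain `std::mutex`, the nested acquisition in `Inner()` would deadlock; `std::recursive_mutex` makes the pattern legal without any custom class.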
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.23.03 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Thu Oct 06, 2022 02:30 PM UTC **Owner:** Mohan Kanakam

During section create, one ckptnd sends an async request (a normal mds send) to another ckptnd. However, the receiving ckptnd responds to the request on the assumption that it received a sync request and has to reply to the sender ckptnd. In some cases a response is required (when a sync request arrives at ckptnd), but in other cases the request is async and no response should be sent. The following message appears in the mds log when creating the section:

~~~
sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
~~~
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roaming mode
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3312] fmd: sc failed to failover in roaming mode** **Status:** assigned **Milestone:** 5.23.03 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Thu Nov 17, 2022 04:08 PM UTC **Owner:** Huu The Truong

Shut down SC-6 (role is standby):

~~~
2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6)
~~~

SC-10 changed its role to standby:

~~~
2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY
~~~

However, a service of the old standby is still alive, which leads SC-10 to receive peer info from the old standby (SC-6). It mistakes this for the active SC going down. SC-10 changed its role to active and then rebooted.

~~~
2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1
2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request
2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer
2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3322 log: log agent in main process is disabled after child process exits
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3322] log: log agent in main process is disabled after child process exits** **Status:** review **Milestone:** 5.23.03 **Created:** Wed Oct 05, 2022 09:36 AM UTC by Hieu Hong Hoang **Last Updated:** Thu Oct 06, 2022 10:14 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [loga](https://sourceforge.net/p/opensaf/tickets/3322/attachment/loga) (28.8 kB; application/octet-stream) - [osaflogd](https://sourceforge.net/p/opensaf/tickets/3322/attachment/osaflogd) (527.4 kB; application/octet-stream)

While the log agent is in use, if a new process is created by forking the current process, both processes share the same mds address. When the child process exits, the destructor of the log agent is invoked. It unregisters with mds, so all subscribing services detect that this log agent is down. However, the main process is still using that mds address, and all requests from the main process become invalid because logd thinks this log agent has already gone down.

Log analysis:

* The main process initializes the mds address:

~~~
<143>1 2022-10-05T10:41:05.537677+02:00 SC-2 logtest 763 loga [meta sequenceId="89"] 763:log/agent/lga_mds.cc:1287 >> lga_mds_init
<143>1 2022-10-05T10:41:05.537782+02:00 SC-2 logtest 763 loga [meta sequenceId="90"] 763:log/agent/lga_mds.cc:1334 << lga_mds_init
~~~

* Duplicate the main process using https://man7.org/linux/man-pages/man2/fork.2.html .
Then the duplicated process exits and the destructor of the log agent is invoked:

~~~
<143>1 2022-10-05T10:41:05.541101+02:00 SC-2 logtest 763 loga [meta sequenceId="156"] 772:log/agent/lga_agent.cc:167 >> ~LogAgent
<143>1 2022-10-05T10:41:05.54126+02:00 SC-2 logtest 763 loga [meta sequenceId="157"] 772:log/agent/lga_state.cc:160 >> stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541297+02:00 SC-2 logtest 763 loga [meta sequenceId="158"] 772:log/agent/lga_state.cc:166 TR stop_recovery2_thread RecoveryState::kNormal no thread to stop
<143>1 2022-10-05T10:41:05.541315+02:00 SC-2 logtest 763 loga [meta sequenceId="159"] 772:log/agent/lga_state.cc:183 << stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541322+02:00 SC-2 logtest 763 loga [meta sequenceId="160"] 772:log/agent/lga_util.cc:125 >> lga_shutdown
<143>1 2022-10-05T10:41:05.541329+02:00 SC-2 logtest 763 loga [meta sequenceId="161"] 772:log/agent/lga_mds.cc:1351 >> lga_mds_deinit
<143>1 2022-10-05T10:41:05.541573+02:00 SC-2 logtest 763 loga [meta sequenceId="162"] 772:log/agent/lga_mds.cc:1362 << lga_mds_deinit
~~~

* Logd detects that this log agent is down and deletes all clients of this log agent:

~~~
<143>1 2022-10-05T10:41:05.541593+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3386"] 452:log/logd/lgs_mds.cc:1230 T8 MDS DOWN dest: 2020f3aafb5a7, node ID: 2020f, svc_id: 21
<143>1 2022-10-05T10:41:05.541636+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3387"] 447:log/logd/lgs_evt.cc:415 >> proc_lga_updn_mds_msg
<143>1 2022-10-05T10:41:05.541648+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3388"] 447:log/logd/lgs_evt.cc:436 TR proc_lga_updn_mds_msg: LGSV_LGS_EVT_LGA_DOWN mds_dest = 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541656+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3389"] 447:log/logd/lgs_evt.cc:338 >> lgs_client_delete_by_mds_dest: mds_dest 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541663+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3390"] 447:log/logd/lgs_evt.cc:191 >> lgs_client_delete: client_id 9
<143>1 2022-10-05T10:41:05.541678+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3391"] 447:log/logd/lgs_evt.cc:213 T4 client_id: 9, REMOVE stream id: 2
<143>1 2022-10-05T10:41:05.541686+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3392"] 447:log/logd/lgs_stream.cc:856 >> log_stream_close: safLgStrCfg=saLogSystem,safApp=safLogService
<143>1 2022-10-05T10:41:05.541713+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3393"] 447:log/logd/lgs_stream.cc:922 << log_stream_close: rc=0, numOpeners=7
<143>1 2022-10-05T10:41:05.541737+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3394"] 447:log/logd/lgs_evt.cc:239 << lgs_client_delete
<143>1 2022-10-05T10:41:05.541744+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3395"] 447:log/logd/lgs_evt.cc:348 << lgs_client_delete_by_mds_dest
~~~

* The main process sends a writing request to logd:

~~~
<143>1 2022-10-05T10:41:07.541457+02:00 SC-2 logtest 763 loga [meta sequenceId="182"] 763:log/agent/lga_mds.cc:1439 >> lga_mds_msg_async_send
<143>1 2022-10-05T10:41:07.541488+02:00 SC-2 logtest 763 loga [meta sequenceId="183"] 763:log/agent/lga_mds.cc:789 >> lga_mds_enc
<143>1 2022-10-05T10:41:07.541516+02:00 SC-2 logtest 763 loga [meta sequenceId="184"] 763:log/agent/lga_mds.cc:820 T2 msgtype: 0
<143>1 2022-10-05T10:41:07.541524+02:00 SC-2 logtest 763 loga [meta sequenceId="185"] 763:log/agent/lga_mds.cc:834 T2 api_info.type: 4
<143>1 2022-10-05T10:41:07.541533+02:00 SC-2 logtest 763 loga [meta sequenceId="186"] 763:log/age
~~~
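One common guard for this class of fork() problem is to remember which pid initialized the agent and skip the MDS teardown when the destructor runs in a forked child. The class and member names below are illustrative, not the actual lga_agent.cc code.

```cpp
#include <sys/types.h>
#include <unistd.h>

// Sketch: only the process that registered with MDS may unregister.
// A forked child inherits the agent object, so its destructor must not
// tear down the shared MDS address that the parent still uses.
class LogAgent {
 public:
  LogAgent() : owner_pid_(getpid()) {
    // lga_mds_init() would run here in the real agent
  }
  ~LogAgent() {
    if (getpid() != owner_pid_) return;  // forked child: leave MDS alone
    // lga_mds_deinit() would run here, only in the owning process
  }
  bool OwnsMds() const { return getpid() == owner_pid_; }

 private:
  const pid_t owner_pid_;
};
```

With this guard, the child's exit no longer causes logd to see an LGA_DOWN for an address the parent is still using.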
[tickets] [opensaf:tickets] #3323 imm: PL sync failed after reconnected with SC
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3323] imm: PL sync failed after reconnected with SC** **Status:** review **Milestone:** 5.23.03 **Created:** Wed Oct 05, 2022 09:37 AM UTC by Son Tran Ngoc **Last Updated:** Thu Nov 17, 2022 02:42 PM UTC **Owner:** Son Tran Ngoc

Active SC1 and PL4 suddenly lost their connection (possibly for environmental reasons). They re-established contact, but PL4 failed to sync because it did not update the active SC1 information and discarded messages from IMMD on SC1.

PL4 sync failure log:

~~~
2022-09-22 04:07:05.230 DEBUG: Syncing node PL-4 (timeout=120)
2022-09-22 04:08:06.325 WARNING: waiting more than 60 sec for node PL-4 to sync
~~~

PL4 discarding messages from SC1:

~~~
2022-09-22 04:07:08.406 PL-4 osafimmnd[354]: WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f
2022-09-22 04:07:09.013 PL-4 osafimmnd[354]: message repeated 243 times: [ WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f]
~~~

Steps to reproduce:

1. Start the SCs and PLs.
2. Block traffic between SC1 and PL4 (make sure to block traffic after the IMM state transition IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT).
3. Unblock traffic between SC1 and PL4.
[tickets] [opensaf:tickets] #3294 mds: refactor huge api functions
- **Milestone**: future --> 5.22.11 --- **[tickets:#3294] mds: refactor huge api functions** **Status:** fixed **Milestone:** 5.22.11 **Created:** Tue Oct 26, 2021 06:02 AM UTC by Hieu Hong Hoang **Last Updated:** Mon Aug 08, 2022 03:15 AM UTC **Owner:** Hieu Hong Hoang

Some functions have 1,500+ lines of code, which makes them hard to maintain. We should refactor them into smaller sub-functions. For example:

~~~
line 1863: uint32_t mds_mcm_svc_up(PW_ENV_ID pwe_id, MDS_SVC_ID svc_id, V_DEST_RL role,
line 1864:                         NCSMDS_SCOPE_TYPE scope, MDS_VDEST_ID vdest_id,
line 1865:                         NCS_VDEST_TYPE vdest_policy, MDS_DEST adest,
line 1866:                         bool my_pcon, MDS_SVC_HDL local_svc_hdl,
line 1867:                         MDS_SUBTN_REF_VAL subtn_ref_val,
line 1868:                         MDS_SVC_PVT_SUB_PART_VER svc_sub_part_ver,
line 1869:                         MDS_SVC_ARCHWORD_TYPE archword_type)
line 1870: {
...
line 3494: }
~~~
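The general shape of such a refactor: keep the public entry point, delegate each concern to a small static helper with an early-exit error path. The helper names and the parameter struct below are invented for the sketch; they are not the actual OpenSAF change.

```cpp
// Illustrative only: splitting a ~1600-line function by concern while
// keeping one public entry point. Each helper returns 0 on success.
struct SvcUpParams {
  int pwe_id;
  int svc_id;
  // ... the remaining mds_mcm_svc_up arguments would go here ...
};

static int ValidateSvcUp(const SvcUpParams&) { return 0; }       // argument checks
static int UpdateSubscriptions(const SvcUpParams&) { return 0; } // table updates
static int NotifySubscribers(const SvcUpParams&) { return 0; }   // user callbacks

int mds_mcm_svc_up_refactored(const SvcUpParams& p) {
  if (int rc = ValidateSvcUp(p)) return rc;        // early exit on bad input
  if (int rc = UpdateSubscriptions(p)) return rc;  // state change next
  return NotifySubscribers(p);                     // callbacks last
}
```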
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: fm --> imm --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 01:01 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3310 log: log agent is blocked for 10s
- **Component**: unknown --> log --- **[tickets:#3310] log: log agent is blocked for 10s** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 08, 2022 03:32 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 01:02 AM UTC **Owner:** Hieu Hong Hoang

In [ticket 3291](https://sourceforge.net/p/opensaf/tickets/3291/), a new message was introduced, sent from the log director to the log agent. When a log agent that contains #3291 runs against a log director that does not contain #3291, that log agent must wait for 10 s. We should avoid this.

~~~
<143>1 2022-02-19T14:06:22.358273+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749743"] 27233:log/agent/lga_agent.cc:409 >> saLogInitialize
...
<143>1 2022-02-19T14:06:32.359774+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749784"] 27233:log/agent/lga_agent.cc:348 TR Waiting for initial clm status timeout
<143>1 2022-02-19T14:06:32.359838+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749785"] 27233:log/agent/lga_agent.cc:361 << WaitLogServerUp
~~~
[tickets] [opensaf:tickets] #3310 log: log agent is blocked for 10s
- **Component**: log --> unknown --- **[tickets:#3310] log: log agent is blocked for 10s** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 08, 2022 03:32 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Hieu Hong Hoang

In [ticket 3291](https://sourceforge.net/p/opensaf/tickets/3291/), a new message was introduced, sent from the log director to the log agent. When a log agent that contains #3291 runs against a log director that does not contain #3291, that log agent must wait for 10 s. We should avoid this.

~~~
<143>1 2022-02-19T14:06:22.358273+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749743"] 27233:log/agent/lga_agent.cc:409 >> saLogInitialize
...
<143>1 2022-02-19T14:06:32.359774+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749784"] 27233:log/agent/lga_agent.cc:348 TR Waiting for initial clm status timeout
<143>1 2022-02-19T14:06:32.359838+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749785"] 27233:log/agent/lga_agent.cc:361 << WaitLogServerUp
~~~
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: unknown --> fm --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 01:01 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: imm --> unknown --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.22.11 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong

After the standby SC goes down, another SC is promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that time, the new standby SC receives a peer info response from the old standby SC indicating that it needs to be promoted to become the active SC.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

But another active SC is still alive, which leads the standby SC to reboot itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.22.11 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Hieu Hong Hoang We created a ScopeLock class to support recursive mutexes, and it is used widely in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
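The replacement the ticket proposes can be sketched as follows; the guarded data here is hypothetical, with std::unique_lock playing the role of the hand-written ScopeLock:

```cpp
#include <mutex>

// Hypothetical shared state guarded by a recursive mutex, as in the log module.
static std::recursive_mutex g_mutex;
static int g_counter = 0;

// std::unique_lock releases the mutex automatically when the scope ends,
// which is the behavior the custom ScopeLock class provided.
void Increment() {
  std::unique_lock<std::recursive_mutex> lock(g_mutex);
  ++g_counter;
}

// Because the mutex is std::recursive_mutex, IncrementTwice() may call
// Increment() while already holding the lock, without deadlocking.
int IncrementTwice() {
  std::unique_lock<std::recursive_mutex> lock(g_mutex);
  Increment();
  Increment();
  return g_counter;
}
```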
[tickets] [opensaf:tickets] #3299 immnd: unexpected reboot node after merge network back
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3299] immnd: unexpected reboot node after merge network back** **Status:** assigned **Milestone:** 5.22.11 **Created:** Wed Dec 01, 2021 03:07 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong Split the network into two partitions and create a dummy object on each partition. When the network merged back, the node rebooted twice. The first reboot: 2021-09-07 12:20:06.926 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:06.935 SC-7 osafamfnd[431]: NO AVD NEW_ACTIVE, adest:1 2021-09-07 12:20:06.935 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:06.952 SC-7 opensaf_reboot: Do quick local node reboot The second reboot is unexpected: 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:11.989 SC-7 osafclmna[333]: NO safNode=SC-7,safCluster=myClmCluster Joined cluster, nodeid=2070f 2021-09-07 12:20:11.992 SC-7 opensaf_reboot: Do quick local node reboot 2021-09-07 12:20:12.022 SC-7 opensafd[305]: ER Service RDE has unexpectedly crashed. Unable to continue, exiting
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.22.11 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Mohan Kanakam During section creation, one ckptnd sends an async request (a normal MDS send) to another ckptnd. The receiving ckptnd, however, responds to the request on the assumption that it was a sync request requiring a response to the sender ckptnd. A ckptnd needs to respond when a sync request arrives, but it need not respond to an async request. We get the following messages in the mds log when creating the section: sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roaming mode
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3312] fmd: sc failed to failover in roaming mode** **Status:** assigned **Milestone:** 5.22.11 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong Shutdown SC-6 (role is standby): 2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6) SC-10 changed role to standby: 2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY However, a service of the old standby was still alive, so SC-10 received peer info from the old standby (SC-6) and mistook this for the active SC going down. SC-10 changed role to active and then rebooted. 2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE 2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE 2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s) 2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1 2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request 2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer 2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
[tickets] [opensaf:tickets] #3316 base: increase buffer for list of members in a group
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3316] base: increase buffer for list of members in a group** **Status:** assigned **Milestone:** 5.22.11 **Created:** Tue May 24, 2022 03:36 AM UTC by PhanTranQuocDat **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** PhanTranQuocDat When accessing IMM, a user must be authenticated as the superuser or as a member of an authorized group. When the authorized group has too many members, the default system buffer is too small to hold the member list, leading to the error: Numerical result out of range.
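"Numerical result out of range" is ERANGE, which the getgr* family of calls returns when the supplied buffer cannot hold the member list. A sketch of the usual fix, assuming a getgrnam_r-style lookup; the function name and starting size are illustrative, not taken from the base code:

```cpp
#include <grp.h>
#include <cerrno>
#include <string>
#include <vector>

// Retry the group lookup with a growing buffer instead of relying on one
// fixed default size, so groups with many members still resolve.
bool LookupGroup(const std::string& name, struct group* grp) {
  std::vector<char> buf(1024);  // illustrative starting size
  for (;;) {
    struct group* result = nullptr;
    int rc = getgrnam_r(name.c_str(), grp, buf.data(), buf.size(), &result);
    if (rc == ERANGE) {
      buf.resize(buf.size() * 2);  // buffer too small for the member list
      continue;
    }
    return rc == 0 && result != nullptr;  // false also when group not found
  }
}
```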
[tickets] [opensaf:tickets] #3300 dtm: osaftransportd replied Stream not found when delete stream
- **Type**: enhancement --> defect --- **[tickets:#3300] dtm: osaftransportd replied Stream not found when delete stream** **Status:** wontfix **Milestone:** 5.22.01 **Created:** Thu Dec 09, 2021 06:35 AM UTC by Thien Minh Huynh **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thien Minh Huynh If a stream is deleted right after trace is enabled, the error 'Stream not found' is raised. The error can cause confusion by suggesting a wrong stream name. ~~~ root@SC-1:~# osaflog --delete osafimmnd ERROR: osaftransportd replied 'Stream not found' ~~~
[tickets] [opensaf:tickets] #3292 log: Introduce an initial clm node status
- **Type**: enhancement --> defect --- **[tickets:#3292] log: Introduce an initial clm node status** **Status:** duplicate **Milestone:** 5.22.01 **Created:** Mon Oct 18, 2021 09:02 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang Currently, loga does not know when logd is ready for a request. In this ticket, we introduce a new message from logd to loga: when logd detects that a log agent is up, it sends the initial clm node status to loga. Loga should wait for that message before sending any request to logd. This could solve the message-priority issue in [1396](https://sourceforge.net/p/opensaf/tickets/1396/). In 1396, the agent down message was received before the initial client message but processed after it, due to the priority of messages. If loga waits for the initial clm node status message before sending the initial client message, all messages will be processed in the right order. The following is the order of processing messages: 1. logd: agent up message 2. loga: initial clm node status message 3. logd: initial client message 4. logd: final client message 5. logd: agent down message
[tickets] [opensaf:tickets] #209 plmd crashed while deleting plm entities at various points.
- **Milestone**: 5.22.01 --> future --- **[tickets:#209] plmd crashed while deleting plm entities at various points.** **Status:** review **Milestone:** future **Created:** Wed May 15, 2013 07:02 AM UTC by Mathi Naickan **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** MeenakshiTK When the command "immcfg -d safHE=7220_slot_1,safDomain=domain_1" was run, plmd crashed with a segmentation fault. The above object has three children: dpb_1, dpb_2 and PL-13. plmd crashed with the following backtrace: Program terminated with signal 11, Segmentation fault. #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 1360 if (0 == strcmp(tail->plm_entity->dn_name_str, (gdb) bt #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 #1 0x08088c8d in plms_chld_get (ent=0x80fe450, chld_list=0xbbc5d9f8) at plms_utils.c:842 #2 0x0805ca90 in plms_delete_objects (obj_type=6, obj_name=0x810a2a8) at plms_imm.c:697 #3 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=68719608079, ccb_id=2) at plms_imm.c:1425 #4 0x032fc46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #5 0x032fb393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #6 0x032ebcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #7 0x08051bff in main (argc=2, argv=0xbbc5e414) at plms_main.c:484 #8 0x033aee0c in libc_start_main () from /lib/libc.so.6 #9 0x0804c401 in _start () While deleting an entity that has no children, it crashed with the following backtrace: #0 0x0805cb14 in plms_delete_objects (obj_type=7, obj_name=0x8109588) at plms_imm.c:707 #1 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=824633852431, ccb_id=6) at plms_imm.c:1425 #2 0x0189c46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #3 0x0189b393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #4 0x0188bcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #5 0x08051bff in main (argc=2, argv=0xbe1d6db4) at plms_main.c:484 #6 0x0194ee0c in libc_start_main () from /lib/libc.so.6 #7 0x0804c401 in _start () Also, check the following issue: a crash in the plmc_err callback when ee_id is passed as an empty string.
[tickets] [opensaf:tickets] #2082 CKPT : Track cbk not invoked for section creation after cpnd restart
- **Milestone**: 5.22.01 --> future --- **[tickets:#2082] CKPT : Track cbk not invoked for section creation after cpnd restart** **Status:** review **Milestone:** future **Created:** Thu Sep 29, 2016 11:06 AM UTC by Srikanth R **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Changeset: 7997 5.1.FC The track callback is not invoked after cpnd restart. Below are the APIs called from the applications spawned on two nodes, i.e. payloads. On the first node: -> Initialize with cpsv -> Create a ckpt with the ACTIVE REPLICA flag. On the second node: -> Initialize with cpsv. On the first node: -> Open the checkpoint in writing mode -> Open the checkpoint in reading mode -> Kill the cpnd process -> Register for the track callback. On the second node: -> Open the ckpt in read mode -> Kill the cpnd process -> Register for the track callback. After ensuring that both agents have registered for the track callback, create a section from the application on the first node. For section creation, the callback should be invoked for the applications on both nodes. Currently the callback is not invoked for the application on the second node. Without cpnd restart, the callback is invoked for both applications.
[tickets] [opensaf:tickets] #2454 amfnd: Clean up variable of active amfd status
- **Milestone**: 5.22.01 --> future --- **[tickets:#2454] amfnd: Clean up variable of active amfd status** **Status:** review **Milestone:** future **Created:** Thu May 04, 2017 12:34 PM UTC by Minh Hon Chau **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Minh Hon Chau In amfnd, we have the variable cb->is_avd_down and a set of macros: m_AVND_CB_IS_AVD_UP, m_AVND_CB_AVD_UP_SET, m_AVND_CB_AVD_UP_RESET, which use the flag AVND_CB_FLAG_AVD_UP; they all indicate whether the active amfd is up or down. Amfnd should use only the variable @is_avd_down or the macros, not both.
[tickets] [opensaf:tickets] #2500 build: Schema files (.xsd) are missing from distribution tarballs
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2500] build: Schema files (.xsd) are missing from distribution tarballs** **Status:** unassigned **Milestone:** future **Created:** Fri Jun 16, 2017 10:45 AM UTC by Anders Widell **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Anders Widell The XML schema (xsd) files are missing in the release tarballs, at least for IMM and probably also for SMF.
[tickets] [opensaf:tickets] #2849 smf: Incorrect logging may flood syslog
- **Milestone**: 5.22.01 --> future --- **[tickets:#2849] smf: Incorrect logging may flood syslog** **Status:** review **Milestone:** future **Created:** Tue May 08, 2018 11:29 AM UTC by elunlen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Krishna Pawar In the function getNodeDestination() in SmfUtils there is a: LOG_NO("%s: className '%s'", __FUNCTION__, className); That should be changed to a TRACE or removed. It is printed in a loop that may go on for 10 seconds with a delay of 2 seconds, meaning that this printout may happen 5 times. The problem, however, is that this function is called in a loop by waitForNodeDestination(). The logging is not done at very high speed though: one log every 2 seconds.
[tickets] [opensaf:tickets] #2861 osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1
- **Milestone**: 5.22.01 --> future --- **[tickets:#2861] osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1** **Status:** unassigned **Milestone:** future **Created:** Mon May 21, 2018 02:25 PM UTC by Hans Nordebäck **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** nobody Compiling with gcc 8.1 with -Wno-format-truncation -Wno-stringop-overflow -Wno-format-overflow makes the compilation succeed, but these -Wno- flags should not be used; the faulty constructs should instead be corrected.
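For instance, -Wformat-truncation typically points at an snprintf whose result is never checked; the correction is to handle truncation rather than suppress the warning. A hedged sketch, with an illustrative function not taken from the OpenSAF tree:

```cpp
#include <cstddef>
#include <cstdio>

// snprintf returns the length it wanted to write; treating truncation (or an
// encoding error, reported as a negative value) as a failure fixes the kind
// of construct gcc 8.1 warns about instead of silencing the warning.
bool BuildPath(char* out, size_t outlen, const char* dir, const char* file) {
  int n = snprintf(out, outlen, "%s/%s", dir, file);
  return n >= 0 && static_cast<size_t>(n) < outlen;
}
```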
[tickets] [opensaf:tickets] #2914 clm : add missing testcases in Clm apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2914] clm : add missing testcases in Clm apitest** **Status:** review **Milestone:** future **Created:** Mon Aug 20, 2018 01:34 PM UTC by Richa **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Adding missing test cases in clm apitest.
[tickets] [opensaf:tickets] #2930 ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.
- **Milestone**: 5.22.01 --> future --- **[tickets:#2930] ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.** **Status:** review **Milestone:** future **Created:** Thu Sep 20, 2018 07:52 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Steps to reproduce: 1) Create a non-collocated checkpoint on SC-1 and PL-4. 2) Create one section on SC-1. 3) During the switchover operation, close the application on PL-4. 4) After the switchover, close the application on SC-1. 5) Observe that the checkpoint is not deleted from /dev/shm on SC-1 and SC-2. SC-1: root@mohan-VirtualBox:/home/mohan/opensaf-code/src/ckpt/ckptnd# ls /dev/shm/ opensaf_CPND_CHECKPOINT_INFO_131343 opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO pulse-shm-1049372244 pulse-shm-2170855640 pulse-shm-493188609 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131343_2 pulse-shm-2086668063 pulse-shm-3681026513 SC-2: root@mohan-VirtualBox:~# ls /dev/shm opensaf_CPND_CHECKPOINT_INFO_131599 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131599_2 pulse-shm-2892283080 pulse-shm-2910971180 pulse-shm-3340597930 pulse-shm-528662130 pulse-shm-551961907
[tickets] [opensaf:tickets] #2932 ckpt: converting the checkpoint service from c to c++
- **Milestone**: 5.22.01 --> future --- **[tickets:#2932] ckpt: converting the checkpoint service from c to c++ ** **Status:** review **Milestone:** future **Created:** Mon Oct 01, 2018 01:25 PM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Converting the checkpoint service from C to C++.
[tickets] [opensaf:tickets] #2966 lck: add missing test case of lck apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2966] lck: add missing test case of lck apitest** **Status:** review **Milestone:** future **Created:** Mon Nov 19, 2018 05:30 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2967 msg: add missing test case of msg apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2967] msg: add missing test case of msg apitest** **Status:** review **Milestone:** future **Created:** Tue Nov 20, 2018 05:33 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #3254 Enhancement of NTF notification
- **Milestone**: 5.22.01 --> future --- **[tickets:#3254] Enhancement of NTF notification** **Status:** assigned **Milestone:** future **Created:** Thu Mar 18, 2021 03:54 AM UTC by Thanh Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thanh Nguyen When IMM changes an attribute with the NOTIFY flag, NTF sends the changed attribute values and the old attribute values fetched from IMM. Fetching the old attribute values from IMM might fail under a certain condition, in which the old values are overwritten before NTF attempts to fetch them. To avoid this situation, IMM will spontaneously send the old attribute values to NTF.
[tickets] [opensaf:tickets] #3280 dtm: loss of TCP connection requires node reboot
- **Milestone**: 5.22.01 --> future --- **[tickets:#3280] dtm: loss of TCP connection requires node reboot** **Status:** unassigned **Milestone:** future **Created:** Fri Aug 27, 2021 11:33 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Sometimes we see loss of TCP connection between payloads, or between a controller and payloads, in the cluster. Example: if we have 2 controllers and 10 payloads (starting from PL-3 to PL-10), we see TCP connection loss between PL-4 and PL-5, while the connections of PL-4 with the other payloads remain established. We also see connection loss between PL-7 and SC-2, while the connections of PL-7 with the other nodes remain established. This results in a PL-7 reboot when controller failover happens, i.e. SC-1 fails and SC-2 takes the active role: PL-7 thinks there was a single controller in the cluster and it reboots. This can be reproduced by adding an iptables rule to drop the packets. So the expected behavior is that dtmd on PL-4/PL-5 retries the connection a few times before declaring the node down. The only drawback of this approach is that it will delay the application failover time, or even the controller failover time. Any suggestions?
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.22.04 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang We created a ScopeLock class to support recursive mutexes, and it is used widely in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
[tickets] [opensaf:tickets] #3294 mds: refactor huge api functions
- **Milestone**: 5.22.01 --> future --- **[tickets:#3294] mds: refactor huge api functions** **Status:** assigned **Milestone:** future **Created:** Tue Oct 26, 2021 06:02 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang Some functions have 1.5K+ lines of code, which makes them hard to maintain. We should refactor them into smaller sub-functions. For example: ~~~ line 1863: uint32_t mds_mcm_svc_up(PW_ENV_ID pwe_id, MDS_SVC_ID svc_id, V_DEST_RL role, line 1864: NCSMDS_SCOPE_TYPE scope, MDS_VDEST_ID vdest_id, line 1865: NCS_VDEST_TYPE vdest_policy, MDS_DEST adest, line 1866: bool my_pcon, MDS_SVC_HDL local_svc_hdl, line 1867: MDS_SUBTN_REF_VAL subtn_ref_val, line 1868: MDS_SVC_PVT_SUB_PART_VER svc_sub_part_ver, line 1869: MDS_SVC_ARCHWORD_TYPE archword_type) line 1870: { line 3494: } ~~~
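The direction the ticket suggests can be sketched as passing one context object through small named steps instead of one 1600-line body; all names below are hypothetical, not the real MDS types:

```cpp
// Hypothetical context: the real function's many parameters become one
// struct that each small step can read and update.
struct SvcUpContext {
  int pwe_id;
  int svc_id;
  bool valid = false;
};

// Each sub-function does one well-named piece of the original body.
static void ValidateInput(SvcUpContext* ctx) { ctx->valid = ctx->svc_id > 0; }
static void UpdateSubscriptions(SvcUpContext* /*ctx*/) { /* ... */ }

// The top-level function shrinks to a readable sequence of steps.
int SvcUpSketch(SvcUpContext* ctx) {
  ValidateInput(ctx);
  if (!ctx->valid) return 1;  // early out on invalid input
  UpdateSubscriptions(ctx);
  return 0;
}
```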
[tickets] [opensaf:tickets] #3295 dtm: osaflog tool does not work with short argument
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3295] dtm: osaflog tool does not work with short argument** **Status:** assigned **Milestone:** 5.22.04 **Created:** Thu Nov 18, 2021 03:05 AM UTC by Thien Minh Huynh **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Vu Minh Hoang Steps to reproduce: osaflog -p mds.log Expected: the messages of the mds stream are printed out, as with the long argument --print.
[tickets] [opensaf:tickets] #3299 immnd: unexpected reboot node after merge network back
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3299] immnd: unexpected reboot node after merge network back** **Status:** assigned **Milestone:** 5.22.04 **Created:** Wed Dec 01, 2021 03:07 AM UTC by Huu The Truong **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Huu The Truong Split the network into two partitions and create a dummy object on each partition. When the network merged back, the node rebooted twice. The first reboot: 2021-09-07 12:20:06.926 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:06.935 SC-7 osafamfnd[431]: NO AVD NEW_ACTIVE, adest:1 2021-09-07 12:20:06.935 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:06.952 SC-7 opensaf_reboot: Do quick local node reboot The second reboot is unexpected: 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:11.989 SC-7 osafclmna[333]: NO safNode=SC-7,safCluster=myClmCluster Joined cluster, nodeid=2070f 2021-09-07 12:20:11.992 SC-7 opensaf_reboot: Do quick local node reboot 2021-09-07 12:20:12.022 SC-7 opensafd[305]: ER Service RDE has unexpectedly crashed. Unable to continue, exiting
[tickets] [opensaf:tickets] #3304 mds: Packet loss without log
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3304] mds: Packet loss without log** **Status:** review **Milestone:** 5.22.04 **Created:** Tue Jan 04, 2022 03:47 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang The mds module has a packet loss feature that provides a callback invoked when packet loss is detected. Because the feature is turned off by default, no callback is invoked and no log is written when packet loss occurs. With this ticket, a packet loss log will be printed by default. These logs are helpful in many cases.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.22.04 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Huu The Truong After the standby SC went down, another SC was promoted to become the new standby SC. 2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f: 2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY At that time, the new standby SC received a peer info response from the old standby SC and was promoted to become the active SC. 2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE 2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE ... 2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request 2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer But another active SC was still alive, so the standby SC rebooted itself because the cluster can have only one active SC. 2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC 2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
[tickets] [opensaf:tickets] #3285 amf: amfd takes time to update runtime of node in large cluster size
- **Component**: unknown --> amf --- **[tickets:#3285] amf: amfd takes time to update runtime of node in large cluster size** **Status:** fixed **Milestone:** 5.22.01 **Created:** Thu Sep 23, 2021 08:55 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thang Duc Nguyen In a large cluster (larger than 36 nodes) with many components residing on each node, the runtime attributes of nodes (AdminState and OperationalState) take time to update in IMM, so an application can read a stale node state from IMM even though AMF has already updated its database. Suggestion: use a sync request to update these node attributes with high priority.
[tickets] [opensaf:tickets] #3285 amf: amfd takes time to update runtime of node in large cluster size
- **Component**: amf --> unknown --- **[tickets:#3285] amf: amfd takes time to update runtime of node in large cluster size** **Status:** fixed **Milestone:** 5.22.01 **Created:** Thu Sep 23, 2021 08:55 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thang Duc Nguyen In a large cluster (larger than 36 nodes) with many components residing on each node, the runtime attributes of nodes (AdminState and OperationalState) take time to update in IMM, so an application can read a stale node state from IMM even though AMF has already updated its database. Suggestion: use a sync request to update these node attributes with high priority.
[tickets] [opensaf:tickets] #3268 imm: New admin op to force IMM Agent using new timeout value
- **Component**: unknown --> imm --- **[tickets:#3268] imm: New admin op to force IMM Agent using new timeout value** **Status:** fixed **Milestone:** 5.21.09 **Created:** Mon Jun 21, 2021 01:02 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 08:06 AM UTC **Owner:** Thien Minh Huynh This ticket is created in conjunction with #3260, in which a single-step upgrade fails due to many component timeouts. Technically, the IMMA_SYNC_TIMEOUT environment variable can be exported with an appropriate value for each component on every node; however, if the cluster has several nodes, each with hundreds of components, updating IMMA_SYNC_TIMEOUT on a live site is difficult. This ticket introduces a configurable attribute to distribute the new timeout to all IMM clients and force them to use it.
[tickets] [opensaf:tickets] #3268 imm: New admin op to force IMM Agent using new timeout value
- **Component**: imm --> unknown --- **[tickets:#3268] imm: New admin op to force IMM Agent using new timeout value** **Status:** fixed **Milestone:** 5.21.09 **Created:** Mon Jun 21, 2021 01:02 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thien Minh Huynh This ticket is created in conjunction with #3260, in which a single-step upgrade fails due to many component timeouts. Technically, the IMMA_SYNC_TIMEOUT environment variable can be exported with an appropriate value for each component on every node; however, if the cluster has several nodes, each with hundreds of components, updating IMMA_SYNC_TIMEOUT on a live site is difficult. This ticket introduces a configurable attribute to distribute the new timeout to all IMM clients and force them to use it.
[tickets] [opensaf:tickets] #2849 smf: Incorrect logging may flood syslog
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2849] smf: Incorrect logging may flood syslog** **Status:** review **Milestone:** 5.21.12 **Created:** Tue May 08, 2018 11:29 AM UTC by elunlen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Krishna Pawar In the function getNodeDestination() in SmfUtils there is a: LOG_NO("%s: className '%s'", __FUNCTION__, className); That should be changed to a TRACE or removed. It is printed in a loop that may go on for 10 seconds with a delay of 2 seconds, meaning that this printout may happen 5 times. The problem, however, is that this function is called in a loop by waitForNodeDestination(). The logging is not done at very high speed though: one log every 2 seconds.
[tickets] [opensaf:tickets] #2855 mds: Improve tipc receive logic
- **Milestone**: 5.21.09 --> future --- **[tickets:#2855] mds: Improve tipc receive logic** **Status:** accepted **Milestone:** future **Created:** Wed May 16, 2018 12:29 PM UTC by Hans Nordebäck **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau The tipc receive buffer is 2MB and should not be extended, as that would affect e.g. available kernel memory. The receive thread normally runs as a realtime thread (RR) with the gl_mds_library_mutex taken, in contention with other MDS time-sharing threads. In several cases TIPC_OVERLOAD has occurred because messages are not consumed fast enough, and TIPC drops messages when the receive buffer is full. This ticket will change the receive thread from a realtime thread to the standard round-robin time-sharing policy, and a new realtime thread will be created that only receives messages and adds them to a larger buffer using a lock-free algorithm. The old receive thread will run as a standard time-sharing thread and will consume messages from this shared buffer. The recvmsg call in recvfrom_connectionless will be changed to read from the shared buffer.
[tickets] [opensaf:tickets] #2856 imm: create test cases to verify the ticket 2422
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- ** [tickets:#2856] imm: create test cases to verify the ticket 2422** **Status:** unassigned **Milestone:** future **Created:** Thu May 17, 2018 06:29 AM UTC by Vu Minh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Vu Minh Nguyen Create test cases for the ticket [#2422].
[tickets] [opensaf:tickets] #2861 osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2861] osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Mon May 21, 2018 02:25 PM UTC by Hans Nordebäck **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody Compiling with gcc 8.1 using -Wno-format-truncation -Wno-stringop-overflow -Wno-format-overflow makes the compilation succeed, but these -Wno- flags should not be used; the faulty constructs should instead be corrected.
[tickets] [opensaf:tickets] #209 plmd crashed while deleting plm entities at various points.
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#209] plmd crashed while deleting plm entities at various points.** **Status:** review **Milestone:** 5.21.12 **Created:** Wed May 15, 2013 07:02 AM UTC by Mathi Naickan **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** MeenakshiTK When the command "immcfg -d safHE=7220_slot_1,safDomain=domain_1" is run, plm crashed with a segmentation fault. The above object has three children: dpb_1, dpb_2 and PL-13. plmd crashed with the following backtrace: Program terminated with signal 11, Segmentation fault. #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 1360 if (0 == strcmp(tail->plm_entity->dn_name_str, (gdb) bt #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 #1 0x08088c8d in plms_chld_get (ent=0x80fe450, chld_list=0xbbc5d9f8) at plms_utils.c:842 #2 0x0805ca90 in plms_delete_objects (obj_type=6, obj_name=0x810a2a8) at plms_imm.c:697 #3 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=68719608079, ccb_id=2) at plms_imm.c:1425 #4 0x032fc46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #5 0x032fb393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #6 0x032ebcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #7 0x08051bff in main (argc=2, argv=0xbbc5e414) at plms_main.c:484 #8 0x033aee0c in libc_start_main () from /lib/libc.so.6 #9 0x0804c401 in _start () While deleting an entity that does not have any child, it crashed with the following backtrace: #0 0x0805cb14 in plms_delete_objects (obj_type=7, obj_name=0x8109588) at plms_imm.c:707 #1 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=824633852431, ccb_id=6) at plms_imm.c:1425 #2 0x0189c46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #3 0x0189b393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #4 0x0188bcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #5 0x08051bff in main (argc=2, argv=0xbe1d6db4) at plms_main.c:484 #6 0x0194ee0c in libc_start_main () from /lib/libc.so.6 #7 0x0804c401 in _start () Also, check the following issue: a crash in the plmc_err callback when ee_id is passed as an empty string.
[tickets] [opensaf:tickets] #2082 CKPT : Track cbk not invoked for section creation after cpnd restart
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2082] CKPT : Track cbk not invoked for section creation after cpnd restart** **Status:** review **Milestone:** 5.21.12 **Created:** Thu Sep 29, 2016 11:06 AM UTC by Srikanth R **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Changeset: 7997 5.1.FC The track callback is not invoked after cpnd restart. Below are the APIs called from the applications spawned on two nodes, i.e. payloads. On the first node: -> Initialize with cpsv -> Create a ckpt with the ACTIVE REPLICA flag. On the second node: -> Initialize with cpsv. On the first node: -> Open the checkpoint in write mode -> Open the checkpoint in read mode -> Kill the cpnd process -> Register for the track callback. On the second node: -> Open the ckpt in read mode -> Kill the cpnd process -> Register for the track callback. After ensuring that both agents have registered for the track callback, create a section from the application on the first node. For section creation, the callback should be invoked for the applications on both nodes. Currently the callback is not invoked for the application on the second node. Without a cpnd restart, the callback is invoked for both applications.
[tickets] [opensaf:tickets] #2451 clm: Make the cluster reset admin op safe
- **status**: review --> unassigned - **Milestone**: 5.21.09 --> future --- **[tickets:#2451] clm: Make the cluster reset admin op safe** **Status:** unassigned **Milestone:** future **Created:** Wed May 03, 2017 10:51 AM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hans Nordebäck The cluster reset admin operation that was implemented in ticket [#2053] is not safe: if a node reboots very fast it can come up again and join the old cluster before the other nodes have rebooted. See mail discussion: https://sourceforge.net/p/opensaf/mailman/message/35398725/ This can be solved by implementing a two-phase cluster reset or by introducing a cluster generation number which is increased at each cluster reset (for both ordered and spontaneous cluster resets). A node will not be allowed to join a cluster with a different cluster generation without first rebooting.
[tickets] [opensaf:tickets] #2454 amfnd: Clean up variable of active amfd status
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2454] amfnd: Clean up variable of active amfd status** **Status:** review **Milestone:** 5.21.12 **Created:** Thu May 04, 2017 12:34 PM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau In amfnd we have the variable cb->is_avd_down and the set of macros m_AVND_CB_IS_AVD_UP, m_AVND_CB_AVD_UP_SET and m_AVND_CB_AVD_UP_RESET, which use the flag AVND_CB_FLAG_AVD_UP; they all indicate whether the active amfd is down or up. Amfnd should use only the variable @is_avd_down or the macros, not both.
[tickets] [opensaf:tickets] #2500 build: Schema files (.xsd) are missing from distribution tarballs
- **status**: accepted --> unassigned - **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2500] build: Schema files (.xsd) are missing from distribution tarballs** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Jun 16, 2017 10:45 AM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell The XML schema (xsd) files are missing in the release tarballs, at least for IMM and probably also for SMF.
[tickets] [opensaf:tickets] #2541 nid: order of system log print out is not correct
- **Milestone**: 5.21.09 --> future --- **[tickets:#2541] nid: order of system log print out is not correct** **Status:** review **Milestone:** future **Created:** Wed Aug 02, 2017 07:52 AM UTC by Rafael Odzakow **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Rafael Odzakow Using echo -n in opensafd delays the write to the log in a systemd environment, causing an inconsistent order of the logs: "Starting opensaf" ends up after "Startup finished" in the system log.
[tickets] [opensaf:tickets] #2612 amfd: return SA_AIS_ERR_NO_RESOURCES for CSI CCB if SG is unstable
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2612] amfd: return SA_AIS_ERR_NO_RESOURCES for CSI CCB if SG is unstable** **Status:** assigned **Milestone:** future **Created:** Thu Oct 05, 2017 03:23 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Alex Jones Return SA_AIS_ERR_NO_RESOURCES if CSI ccb apply fails due to SG being unstable. This is similar to ticket #2184, but for CSIs.
[tickets] [opensaf:tickets] #2693 clm: Use longer election delay time on isolated nodes
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- ** [tickets:#2693] clm: Use longer election delay time on isolated nodes** **Status:** unassigned **Milestone:** future **Created:** Tue Nov 21, 2017 04:52 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell In addition to the CLMNA_ELECTION_DELAY_TIME configuration, allow configuration of a separate (longer) election delay time to be used on isolated nodes, i.e. nodes that cannot see any other node on the network. This will decrease the possibility of split-brain in situations where a node is temporarily disconnected from the rest of the cluster.
[tickets] [opensaf:tickets] #2768 nid: Use host name when node_name is missing or empty
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- **[tickets:#2768] nid: Use host name when node_name is missing or empty** **Status:** unassigned **Milestone:** future **Created:** Mon Jan 22, 2018 01:00 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell In the case /etc/opensaf/node_name is missing or empty, we can use the hostname of the machine we are running on. This would reduce the amount of per-node configuration needed when hostname and node_name are the same. Also, make /etc/opensaf/node_name empty by default when installing OpenSAF.
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
- **Milestone**: 5.21.09 --> future --- **[tickets:#2798] mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** review **Milestone:** future **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hoa Le **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip) Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967 mdstest 5 1 ~~~ LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500, NCSMDS_SCOPE_INTRACHASSIS, 2, svcids) != NCSCC_RC_SUCCESS) { LOG_NO("\nFail\n"); FAIL = 1; } else { LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500, SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) { LOG_NO("Fail, retrieve ONE\n"); FAIL = 1; } else LOG_NO("\nSuccess\n"); ~~~ After the subscription request succeeds, mdstest is expected to receive two MDS_UP events, for services 600 and 700. This info is retrieved in the next step of the test case (mds_service_retrieve). The problem here is that these MDS_UP events are processed in a separate (parallel) thread (the mds core thread), not the test case's main thread. In a bad scenario, if the mds core thread is not scheduled before the RETRIEVE operation in the main thread, the RETRIEVE request with the "SA_DISPATCH_ONE" flag returns an error and the test case fails. <143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe*** // MDS SUBSCRIBE request*** ... 
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE*** // MDS RETRIEVE request with SA DISPATCH ONE flag came before MDS UP events being processed*** <139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL <142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events <142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event <143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up <143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 | <143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 | 2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event 2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED 2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6.
[tickets] [opensaf:tickets] #3281 mds: Wrong sending NO_ACTIVE after split brain detection
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#3281] mds: Wrong sending NO_ACTIVE after split brain detection** **Status:** assigned **Milestone:** 5.21.12 **Created:** Mon Sep 06, 2021 01:47 AM UTC by Hieu Hong Hoang **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [dump_mds_5.21.06.patch](https://sourceforge.net/p/opensaf/tickets/3281/attachment/dump_mds_5.21.06.patch) (3.5 kB; application/octet-stream) - [mds.log.1](https://sourceforge.net/p/opensaf/tickets/3281/attachment/mds.log.1) (12.0 MB; application/octet-stream) Configuration: - Cluster with 10 SCs, SC absence allowed. - Split the cluster into four partitions: [[SC-1, SC-2], [SC-3, SC-4, SC-5, SC-6], [SC-7, SC-8, SC-9], [SC-10]] - Roles of SCs after the network splits: [[SC-1(ACT), SC-2(STB)], [SC-3(ACT), SC-4, SC-5(STB), SC-6], [SC-7(STB), SC-8(ACT), SC-9], [SC-10(ACT)]] - Network merges Observation: - All active and standby SCs rebooted due to split brain detection, except the SCs in partition 1. The SCs in partition 1 do not reboot because the active and standby SCs in the other partitions rebooted too fast. - The ntf agent on SC-1 fails to send notifications to the ntf server and never recovers. 
~~~ <143>1 2021-09-04T07:11:56.225221+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136348"] 439:amf/amfd/imm.cc:419 >> execute <143>1 2021-09-04T07:11:56.225223+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136349"] 439:amf/amfd/ntf.cc:804 >> exec: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:11:56.225227+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136350"] 439:amf/amfd/ntf.cc:491 >> avd_try_send_notification: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:11:56.225231+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136351"] 439:ntf/agent/ntfa_api.c:2016 >> saNtfNotificationSend <143>1 2021-09-04T07:11:56.225235+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136352"] 439:ntf/agent/ntfa_api.c:62 TR NTFS server is unavailable <143>1 2021-09-04T07:11:56.225238+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136353"] 439:ntf/agent/ntfa_api.c:2260 << saNtfNotificationSend <143>1 2021-09-04T07:11:56.225241+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136354"] 439:amf/amfd/ntf.cc:513 TR Notification Send unsuccesful TRY_AGAIN or TIMEOUT rc:6 <143>1 2021-09-04T07:11:56.225243+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136355"] 439:amf/amfd/ntf.cc:532 << avd_try_send_notification <143>1 2021-09-04T07:11:56.225246+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136356"] 439:amf/amfd/ntf.cc:811 TR TRY-AGAIN <143>1 2021-09-04T07:11:56.225249+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136357"] 439:amf/amfd/ntf.cc:822 << exec <143>1 2021-09-04T07:11:56.225252+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136358"] 439:amf/amfd/imm.cc:427 << execute: 2 ... 
<143>1 2021-09-04T07:26:00.185418+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191040"] 439:amf/amfd/imm.cc:419 >> execute <143>1 2021-09-04T07:26:00.185465+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191041"] 439:amf/amfd/ntf.cc:804 >> exec: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:26:00.185475+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191042"] 439:amf/amfd/ntf.cc:491 >> avd_try_send_notification: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:26:00.185485+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191043"] 439:ntf/agent/ntfa_api.c:2016 >> saNtfNotificationSend <143>1 2021-09-04T07:26:00.185497+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191044"] 439:ntf/agent/ntfa_api.c:59 TR NTFS server is down <143>1 2021-09-04T07:26:00.185505+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191045"] 439:ntf/agent/ntfa_api.c:2260 << saNtfNotificationSend <143>1 2021-09-04T07:26:00.185513+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191046"] 439:amf/amfd/ntf.cc:513 TR Notification Send unsuccesful TRY_AGAIN or TIMEOUT rc:6 <143>1 2021-09-04T07:26:00.18552+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191047"] 439:amf/amfd/ntf.cc:532 << avd_try_send_notification <143>1 2021-09-04T07:26:00.185528+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191048"] 439:amf/amfd/ntf.cc:811 TR TRY-AGAIN <143>1 2021-09-04T07:26:00.185536+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191049"] 439:amf/amfd/ntf.cc:822 << exec <143>1 2021-09-04T07:26:00.185543+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191050"] 439:amf/amfd/imm.cc:427 << execute: 2 ~~~ Reason: - Before the active SC-3 rebooted, SC-1 still had enough time to connect to the ntf server in SC-3. When SC-3 rebooted, the ntf agent in SC-1 received a NO_ACTIVE message of NTFS service. Actually, the ntf server in SC-1 is still in active state. - The following mds log is generated by the opensaf code which applied the patch dump_mds_5.21.06.patch. 
- The ntf agent in SC-1 detects that the ntf server in SC-3 is up. MDS updates the active destination of the NTFS service to
[tickets] [opensaf:tickets] #3280 dtm: loss of TCP connection requires node reboot
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3280] dtm: loss of TCP connection requires node reboot** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Aug 27, 2021 11:33 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Sometimes we see loss of a TCP connection between payloads, or between a controller and a payload, in the cluster. Example: if we have 2 controllers and 10 payloads (starting from PL-3 to PL-10), we see TCP connection loss between PL-4 and PL-5, while the connections of PL-4 with the other payloads remain established. We also see connection loss between PL-7 and SC-2, while the connections of PL-7 with the other nodes remain established. This results in a PL-7 reboot when controller failover happens, i.e. SC-1 fails and SC-2 takes the ACT role: PL-7 thinks there was a single controller in the cluster, so it reboots. This can be reproduced by adding an iptables rule to drop the packets. So the expected behavior is that dtmd on PL-4/PL-5 retries the connection a few times before declaring the node down. The only drawback of this approach is that it will delay the application failover time, or even the controller failover time. Any suggestions? --- Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.___ Opensaf-tickets mailing list Opensaf-tickets@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
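The retry behaviour proposed above could be sketched roughly as follows. This is only an illustration: `reconnect_with_retry`, `kMaxRetries`, and the retry budget are assumptions, not existing dtmd code. The idea is that instead of declaring the peer down on the first failed connection, dtmd attempts to re-establish it a few times first.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Illustrative constants - a real budget would have to be tuned against
// the failover-delay concern raised in the ticket.
constexpr int kMaxRetries = 3;
constexpr auto kRetryDelay = std::chrono::milliseconds(100);

// Retries the connection attempt a few times before giving up.
// Returns true if the peer came back within the retry budget; only a
// false result should lead to declaring the node down.
bool reconnect_with_retry(const std::function<bool()>& try_connect) {
  for (int attempt = 0; attempt < kMaxRetries; ++attempt) {
    if (try_connect()) return true;
    std::this_thread::sleep_for(kRetryDelay);
  }
  return false;  // budget exhausted: declare the node down
}
```

Note that in the genuine-failure case each retry adds `kRetryDelay` to the detection time, which is exactly the failover-delay drawback the ticket mentions.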
[tickets] [opensaf:tickets] #3271 amf: Issue of headless restoration with Roaming SC
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3271] amf: Issue of headless restoration with Roaming SC** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Jul 02, 2021 05:07 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody In robustness test of roaming SC cluster recovery from split brain, the test performs rolling split then rejoin every active SC by 3 seconds with promote active timer = 0. The following log shows the issue starting point. The SC-1 is promoted to active right after the previous active is split. amfnd on SC-1 starts to send headless state information to amfd on SC-1 (this case does not happen without roaming SC, where the active SC after headless does not have amfnd's headless information in the SC). Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=OpenSAF <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=All-NWayActive,safApp=ABC-012,safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=ABC-012,safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=748f4402ae,safApp=ABC-456,safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=OpenSAF,safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <0, 1, 3> Jun 30 19:19:44 SC-1 
osafamfnd[8802]: NO 10 CSICOMP states sent Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO 33 COMP states sent Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Sending node up due to NCSMDS_NEW_ACTIVE Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21a0f, msg_id:0 Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21a0f, msg_id:0 amfd on SC-1 restores the headless information Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Received node_up from 21a0f: msg_id 1 Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Enter restore headless cached RTAs from IMM Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Leave reading headless cached RTAs from IMM: SUCCESS Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Node '47740d42-79f8-a1c9-ea73-8cb599ef2deb' joined the cluster Jun 30 19:19:47 SC-1 osafamfnd[8802]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=b769074fb6,safSg=2N,safApp=OpenSAF' The other SCs rejoin and miss the headless restoration by amfd-SC1, which causes amfd-SC1 to be inconsistent with the amfnd(s) on the other SCs Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21c0f, msg_id:0 Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21c0f, msg_id:0 Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21b0f, msg_id:0 Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21b0f, msg_id:0 Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1 Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1 Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1 Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1 Jun 30 19:19:57 SC-1 osafamfd[8779]: NO Cluster startup is done
[tickets] [opensaf:tickets] #3270 build: make rpm got unversioned python
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3270] build: make rpm got unversioned python** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Jun 28, 2021 07:10 AM UTC by Thien Minh Huynh **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thien Minh Huynh %{__python} no longer points to python2. ~~~ rpmbuild -bb --clean --rmspec --rmsource \ --define "_topdir `pwd`/rpms" --define "_tmppath `pwd`/rpms/tmp" \ `pwd`/rpms/SPECS/opensaf.spec error: attempt to use unversioned python, define %__python to /usr/bin/python2 or /usr/bin/python3 explicitly error: line 1556: %{python_sitelib}/pyosaf/*.py make[1]: *** [Makefile:26843: rpm] Error 1 make[1]: Leaving directory '/root/osaf-build/opensaf-5.21.06' make: *** [makefile:8: all] Error 2 ~~~
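The error message itself suggests the likely fix: define `%__python` explicitly. A possible sketch, assuming python3 is the intended interpreter (the exact spec change that was committed may differ):

~~~
# Either on the rpmbuild command line:
rpmbuild -bb --define "__python /usr/bin/python3" ...

# Or near the top of opensaf.spec:
%define __python /usr/bin/python3
~~~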
[tickets] [opensaf:tickets] #3254 Enhancement of NTF notification
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3254] Enhancement of NTF notification** **Status:** assigned **Milestone:** 5.21.12 **Created:** Thu Mar 18, 2021 03:54 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thanh Nguyen When IMM changes an attribute with the NOTIFY flag, NTF sends the changed attribute values together with the old attribute values fetched from IMM. Fetching the old attribute values from IMM may fail in the case where the old values are overwritten before NTF attempts to fetch them. To avoid this situation, IMM will spontaneously send the old attribute values to NTF.
[tickets] [opensaf:tickets] #3106 dtm: flush the logtrace asap when the logtrace owner is terminated
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3106] dtm: flush the logtrace asap when the logtrace owner is terminated** **Status:** review **Milestone:** future **Created:** Thu Oct 24, 2019 03:57 AM UTC by Vu Minh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Vu Minh Nguyen This ticket will add a mechanism to the logtrace server so that it can detect when the logtrace owner has terminated and flush right away, to avoid losing traces from the trace files.
[tickets] [opensaf:tickets] #3094 mds: Reassembly timer timeout causes message discarded
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3094] mds: Reassembly timer timeout causes message discarded** **Status:** unassigned **Milestone:** future **Created:** Sat Sep 28, 2019 09:59 PM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau Run test: export MDS_TIPC_FCTRL_ENABLED=1; ckpttest 20 11 Test sometimes failed because the Reassembly timer timeout and discarded all fragment. Some outlined log: - ckptnd as a mds receiver, receives the first fragment of big message, start Reassembly timer (5 seconds hard coded) <142>1 2019-09-28T19:41:33.372579+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477553"] MDTM: Reassembly started <143>1 2019-09-28T19:41:33.372582+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477554"] MDTM: Recd fragmented message(first frag) with Frag Seqnum=4 SVC Seq num =3, from src Adest = <72075194378064089> <142>1 2019-09-28T19:41:33.372585+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477555"] MDTM: User Recd msg len=65223 <143>1 2019-09-28T19:41:33.372603+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477556"] MCM_DB:RecvMessage:TimerStart:Reassemble:Hdl=0xfee7:SrcSvcId=CPA(18):SrcVdest=65535,DestSvcHdl=562945658454033 <143>1 2019-09-28T19:41:33.372616+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477557"] MDTM: size: 65262 anc is NULL <143>1 2019-09-28T19:41:33.37262+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477558"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32770, fseq:5], rcvwnd[acked:4, rcv:4, nacked:0] <143>1 2019-09-28T19:41:33.372623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477559"] MDTM: Recd message with Fragment Seqnum=4, frag_num=2, from src_id=<0x01001001:3859124441> <143>1 2019-09-28T19:41:33.372669+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477568"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32771, fseq:6], rcvwnd[acked:4, rcv:5, nacked:0] <143>1 
2019-09-28T19:41:33.372673+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477569"] MDTM: Recd message with Fragment Seqnum=4, frag_num=3, from src_id=<0x01001001:3859124441> - The big message causes Tipc buffer overflow <139>1 2019-09-28T19:41:33.384242+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477862"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32801, fseq:36], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss] <<..>> <139>1 2019-09-28T19:41:33.386422+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478063"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32800, fseq:35], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission] <<..>> <139>1 2019-09-28T19:41:33.386658+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478091"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32813, fseq:48], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission] <<..>> <139>1 2019-09-28T19:41:33.384873+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477905"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32814, fseq:49], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss] <<..>> - The transmission problem is resolved, but the Reassembly timer has expired, any message from sender has passed the tipc flow control with correct sequence number will be dropped at mds_dt_common <142>1 2019-09-28T19:41:38.392219+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478200"] MDTM: Processing Timer mailbox events <143>1 2019-09-28T19:41:38.392328+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478201"] MDTM: Tmr Mailbox Processing:Reassemble Tmr Hdl=0xfee7 <142>1 2019-09-28T19:41:38.392623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478206"] MSG loss not enbaled mds_mcm_msg_loss <143>1 2019-09-28T19:41:39.380193+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478210"] MDTM: size: 65262 anc is NULL <143>1 
2019-09-28T19:41:39.380227+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478211"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32822, fseq:57], rcvwnd[acked:56, rcv:56, nacked:0] <143>1 2019-09-28T19:41:39.380243+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478212"] MDTM: Recd message with Fragment Seqnum=4, frag_num=54, from src_id=<0x01001001:3859124441> <143>1 2019-09-28T19:41:39.380263+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478213"] MDS_DT_COMMON : reassembly queue doesnt exist seq_num=4, Adest = <0x01001001,3859124441 <139>1 2019-09-28T19:41:39.380273+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478214"] MDTM: Some stale message recd, hence dropping Adest = <72075194378064089> <143>1 2019-09-28T19:41:39.380309+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478215"] MDTM: size: 65262 anc is NULL <143>1 2019-09-28T19:41:39.380322+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478216"] FCTRL:
[tickets] [opensaf:tickets] #3000 rde: rdegetrole timeout in rda
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3000] rde: rdegetrole timeout in rda** **Status:** unassigned **Milestone:** future **Created:** Wed Jan 16, 2019 07:58 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody For the "rdegetrole" command (or a set-role operation), rda sends the get-role request and polls for the response with a 30 s timeout. But there are places in rde that can loop for more than 30 s when the etcd plugin is used. That causes the poll event to time out, and the "rdegetrole" command (or set role) gets an error:
~~~
while (rc == SA_AIS_ERR_FAILED_OPERATION && retries < kMaxRetry) {
  ++retries;
  std::this_thread::sleep_for(kSleepInterval);
  ...
}
~~~
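The constraint behind any fix can be stated as simple arithmetic: the worst-case time spent in retry loops like the one above must stay below the 30 s RDA poll timeout. A small sketch, where the constants are assumptions for illustration, not the real rde values:

```cpp
#include <chrono>

// Assumed values for illustration only.
constexpr int kMaxRetry = 10;
constexpr auto kSleepInterval = std::chrono::seconds(2);
constexpr auto kRdaPollTimeout = std::chrono::seconds(30);

// Worst-case wall time spent in the retry loop (ignoring the time of
// the etcd calls themselves, which only makes the real bound worse).
constexpr std::chrono::seconds worst_case_retry_time() {
  return kMaxRetry * kSleepInterval;
}

// The invariant a fix needs to guarantee so "rdegetrole" cannot time out.
constexpr bool fits_in_rda_timeout() {
  return worst_case_retry_time() < kRdaPollTimeout;
}
```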
[tickets] [opensaf:tickets] #2979 mds: create a UNIX socket for local node communication when using TIPC
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2979] mds: create a UNIX socket for local node communication when using TIPC** **Status:** unassigned **Milestone:** future **Created:** Tue Dec 04, 2018 08:32 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody We need to create a UNIX socket for local node communication when using TIPC, much like DTM does for TCP. This is useful when using AMF container/contained and TIPC as the transport. Currently, to allow communication between the amf-lib in the container and amfnd on the host, we need to allow the container access to the host's network stack so it can see the TIPC address. If we created a UNIX socket for local node TIPC communication we could pass this socket file to the container instead of exposing the host's entire network stack. MDS can then use the socket file for TIPC, like it does with TCP.
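A minimal sketch of what the proposed local socket might look like (an assumption for illustration, not existing MDS code): create a datagram AF_UNIX socket bound to a well-known path, which could then be bind-mounted into the container instead of sharing the host network stack.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Creates and binds a UNIX-domain datagram socket at `path`.
// Returns the fd, or -1 on error; the caller owns the fd and the file.
int create_local_socket(const std::string& path) {
  int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  unlink(path.c_str());  // remove a stale socket file, if any
  std::strncpy(addr.sun_path, path.c_str(), sizeof(addr.sun_path) - 1);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```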
[tickets] [opensaf:tickets] #2976 msg: allow msg lib to be used in a container
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2976] msg: allow msg lib to be used in a container** **Status:** accepted **Milestone:** future **Created:** Mon Nov 26, 2018 02:49 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Alex Jones The implementation of MSG has the node director create the IPC msg queue while the agent is also able to access and use the IPC msg queue. This breaks when trying to run only the agent inside a container. saMsgMessageGet expects access to the IPC message queue locally, so this function always fails. Need to investigate whether we can have the container running the agent share the IPC message queue with the node director, or whether we need to reimplement the code so that only one entity has access to the IPC message queue.
[tickets] [opensaf:tickets] #2967 msg: add missing test case of msg apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2967] msg: add missing test case of msg apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Tue Nov 20, 2018 05:33 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2966 lck: add missing test case of lck apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2966] lck: add missing test case of lck apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Nov 19, 2018 05:30 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2948 rde: race between quiesced node and standby node become active node
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2948] rde: race between quiesced node and standby node become active node** **Status:** unassigned **Milestone:** future **Created:** Mon Oct 29, 2018 10:48 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody Summary of the issue: the cluster is configured with 5 SCs. - At cluster start: SC5 is active, SC4 is standby, and SC1-SC3 are quiesced. SC1 and SC2 are promoting themselves to become ACTIVE (via the rde component). SC5 answers the take-over request from SC2 first and rejects it. Then SC5 is rebooted (manually) before answering the request from SC1. - After SC5 is rebooted, SC4 also promotes itself to become ACTIVE and sends a take-over request. - Unfortunately, the node that becomes ACTIVE is the quiesced SC1, not the standby SC4, so all data synced from the previous ACTIVE node may be lost.
[tickets] [opensaf:tickets] #2932 ckpt: converting the checkpoint service from c to c++
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2932] ckpt: converting the checkpoint service from c to c++ ** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Oct 01, 2018 01:25 PM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Converting the checkpoint service from c to c++.
[tickets] [opensaf:tickets] #2930 ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2930] ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.** **Status:** review **Milestone:** 5.21.12 **Created:** Thu Sep 20, 2018 07:52 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Steps to reproduce: 1) Create a non-collocated checkpoint on SC-1 and PL-4. 2) Create one section on SC-1. 3) During the switch-over operation, close the application on PL-4. 4) After the switch over, close the application on SC-1. 5) Observe that the checkpoint is not deleted from /dev/shm on SC-1 and SC-2. SC-1: root@mohan-VirtualBox:/home/mohan/opensaf-code/src/ckpt/ckptnd# ls /dev/shm/ opensaf_CPND_CHECKPOINT_INFO_131343 opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO pulse-shm-1049372244 pulse-shm-2170855640 pulse-shm-493188609 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131343_2 pulse-shm-2086668063 pulse-shm-3681026513 SC-2: root@mohan-VirtualBox:~# ls /dev/shm opensaf_CPND_CHECKPOINT_INFO_131599 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131599_2 pulse-shm-2892283080 pulse-shm-2910971180 pulse-shm-3340597930 pulse-shm-528662130 pulse-shm-551961907
[tickets] [opensaf:tickets] #2914 clm : add missing testcases in Clm apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2914] clm : add missing testcases in Clm apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Aug 20, 2018 01:34 PM UTC by Richa **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Adding missing test cases in clm apitest.
[tickets] [opensaf:tickets] #2908 dtm: Add support for connection-oriented TIPC
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2908] dtm: Add support for connection-oriented TIPC** **Status:** assigned **Milestone:** future **Created:** Wed Aug 01, 2018 01:20 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau DTM currently uses TCP connections for communication between nodes. This ticket proposes that support shall be added for using connection-oriented TIPC instead of, or possibly at the same time as, TCP. The reason for using both TCP and TIPC at the same time would be that DTM can monitor the connectivity between nodes using both TIPC and TCP. The connectivity should then only be considered to be up if both TCP and TIPC are connected. CLM could use this information to disallow cluster membership of nodes that are not connected using both TCP and TIPC. However, it could still be possible to send messages between the nodes using just one of the two connection types - this could be useful to avoid split-brain problems. Another reason for adding support for TIPC in DTM is that we can avoid the problem that our current TIPC implementation can lose messages, and we would no longer require real-time priority for the MDS thread. In fact, the MDS thread could be completely removed and we could remove the MDS code for handling TIPC (only DTM would need to support TIPC). This is a rather large enhancement if all features mentioned above are implemented. However, as a first step a very small implementation could simply add support in DTM for using TIPC instead of TCP. This ought to be easy to implement. We would then have three possible configurations: * TCP using DTM * TIPC using DTM (new) * TIPC without DTM
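The dual-transport rule described above (cluster membership requires both transports up, while messaging needs only one surviving transport) can be sketched as follows; the type and function names are illustrative, not DTM code:

```cpp
// Sketch of the proposed dual-transport connectivity rule.
struct LinkState {
  bool tcp_connected;
  bool tipc_connected;
};

// CLM-visible connectivity: only up when both transports are connected.
bool membership_allowed(const LinkState& s) {
  return s.tcp_connected && s.tipc_connected;
}

// Messages can still be sent while at least one transport survives,
// which helps avoid split-brain during partial connectivity loss.
bool can_send(const LinkState& s) {
  return s.tcp_connected || s.tipc_connected;
}
```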
[tickets] [opensaf:tickets] #2878 ntf: initialize client in log service loop forever
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2878] ntf: initialize client in log service loop forever ** **Status:** review **Milestone:** future **Created:** Fri Jun 15, 2018 06:33 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Canh Truong When NTFD is started, the NtfAdmin class object is created and the log service client is initialized in this object. The initialization of the client may loop forever if the TRY_AGAIN error is always returned:
~~~
do {
  result = saLogInitialize(&logHandle, &logCallbacks, &logversion);
  if (SA_AIS_ERR_TRY_AGAIN == result) {
    if (first_try) {
      LOG_WA("saLogInitialize returns try again, retries...");
      first_try = 0;
    }
    usleep(AIS_TIMEOUT);
    logversion = kLogVersion;
  }
} while (SA_AIS_ERR_TRY_AGAIN == result);
~~~
If the log service has not started in time, or is busy, NTFD is stuck in the loop. NTFD then takes a long time to start, and AMF does not receive the CSI set callback response (within 30 seconds?), so this error is printed out: "2018-04-23 02:13:56.326 SC-2 osafamfnd[272]: ER safComp=NTF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:csiSetcallbackTimeout Recovery is:nodeFailfast " The initialization of the log client in NTFD should be updated.
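A bounded variant of the loop above is one way the retry could be capped so that NTFD fails fast instead of blocking the AMF CSI-set callback. This is a sketch under assumptions: the stubbed `SaAisErrorT` values match the AIS specification, but `init_log_client`, `kMaxInitRetries`, and `kRetryInterval` are illustrative names, not the committed fix.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Stub of the two AIS return codes this sketch needs (values per the
// SAI-AIS specification).
enum SaAisErrorT { SA_AIS_OK = 1, SA_AIS_ERR_TRY_AGAIN = 6 };

// Illustrative bounds; they must keep the total retry time well below
// the AMF callback timeout.
constexpr int kMaxInitRetries = 10;
constexpr auto kRetryInterval = std::chrono::milliseconds(100);

// Retries `init` on TRY_AGAIN up to kMaxInitRetries times, then returns
// the last result so the caller can report the failure instead of
// looping forever.
SaAisErrorT init_log_client(const std::function<SaAisErrorT()>& init) {
  SaAisErrorT rc = SA_AIS_ERR_TRY_AGAIN;
  for (int i = 0; i < kMaxInitRetries && rc == SA_AIS_ERR_TRY_AGAIN; ++i) {
    rc = init();
    if (rc == SA_AIS_ERR_TRY_AGAIN)
      std::this_thread::sleep_for(kRetryInterval);
  }
  return rc;
}
```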
[tickets] [opensaf:tickets] #3282 amfd: coredump while deleting csi
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3282] amfd: coredump while deleting csi** **Status:** assigned **Milestone:** 5.21.12 **Created:** Thu Sep 09, 2021 03:11 AM UTC by Hieu Hong Hoang **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [bt.SC-1](https://sourceforge.net/p/opensaf/tickets/3282/attachment/bt.SC-1) (10.7 kB; application/octet-stream) When a service unit is assigned to a service instance but does not support all component service instances of that service instance, amfd fails when deleting the unsupported component service instances. For example, an application has two SUs and one SI configured as below: ~~~ SU1 contains COMP_A and COMP_B SU2 contains COMP_A SI1 has CSI_A and CSI_B COMP_A supports CSI_A, COMP_B supports CSI_B. ~~~ After OpenSAF assigns SI1 to SU1 and SU2, amfd crashes if we delete CSI_B. ~~~ 2021-09-07 11:32:21.213 SC-1 osafimmnd[376]: NO Ccb 23 COMMITTED (immcfg_SC-1_1562) 2021-09-07 11:32:21.213 SC-1 osafamfd[439]: src/amf/amfd/csi.cc:945: ccb_apply_delete_hdlr: Assertion 't_csicomp' failed. 2021-09-07 11:32:21.305 SC-1 osafamfnd[459]: ER AMFD has unexpectedly crashed. Rebooting node ~~~
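The assertion in `ccb_apply_delete_hdlr` fires because no CSI-component record was ever created for the SU that does not support the CSI. A hedged sketch of the defensive pattern that would avoid the crash (the types and names here are illustrative stand-ins, not the real amfd data model): treat a missing record as "nothing to delete" rather than asserting.

```cpp
#include <map>
#include <string>

// Illustrative stand-ins for amfd's CSI-component records.
struct CsiComp { std::string comp; std::string csi; };
using CsiCompDb = std::map<std::string, CsiComp>;

// Deletes the record for `key` if present. Returns false when the
// record does not exist (e.g. the SU never supported this CSI),
// instead of asserting and taking the node down.
bool delete_csicomp(CsiCompDb& db, const std::string& key) {
  auto it = db.find(key);
  if (it == db.end()) return false;  // previously: assert failure -> reboot
  db.erase(it);
  return true;
}
```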
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
- **status**: assigned --> fixed - **Milestone**: 5.21.06 --> 5.21.09 --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** fixed **Milestone:** 5.21.09 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 06:09 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
- **status**: fixed --> assigned --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** assigned **Milestone:** 5.21.06 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 06:09 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
commit bbe47278c2499bc738bf0c2dc8cc4e9a026d Author: Minh Chau Date: Tue Jul 13 18:00:41 2021 +1000 rde: Add timeout of waiting for peer info [#3263] This ticket revisits the waiting for peer info and fixes the problem of disordered peer_up and peer info in commit d1593b03b3c9bec292b14dde65264c261760bf46 --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** fixed **Milestone:** 5.21.06 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Wed May 26, 2021 11:07 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3279 ntf: Errors under compilation for 32 bit machine
- **status**: assigned --> fixed --- ** [tickets:#3279] ntf: Errors under compilation for 32 bit machine** **Status:** fixed **Milestone:** 5.21.09 **Created:** Thu Aug 26, 2021 01:15 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 06:01 AM UTC **Owner:** Thanh Nguyen The patch for ticket 3277 has compilation errors when compiled for a 32 bit machine. The following is an extract from one compilation:
~~~
  CXX      src/ntf/ntfd/bin_osafntfd-NtfSubscription.o
In file included from ./src/base/ncs_osprm.h:32:0,
                 from ./src/mds/mds_papi.h:32,
                 from ./src/ntf/common/ntfsv_msg.h:26,
                 from ./src/ntf/ntfd/ntfs_com.h:33,
                 from ./src/ntf/ntfd/NtfNotification.h:29,
                 from ./src/ntf/ntfd/NtfFilter.h:29,
                 from ./src/ntf/ntfd/NtfSubscription.h:25,
                 from src/ntf/ntfd/NtfSubscription.cc:22:
src/ntf/ntfd/NtfSubscription.cc: In member function 'void NtfSubscription::sendNotification(NtfSmartPtr&, NtfClient*)':
./src/base/logtrace.h:173:65: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
   logtrace_trace(__FILE__, __LINE__, CAT_TRACE, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:305:9: note: in expansion of macro 'TRACE'
   TRACE("Nodeid: %u, MdsDest: %lu", evt->info.mds_info.node_id,
./src/base/logtrace.h:163:61: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
     logtrace_log(__FILE__, __LINE__, LOG_ERR, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:316:9: note: in expansion of macro 'LOG_ER'
   LOG_ER("Down event missed for app with mdsdest: %lu on node: %u",
  CXX      src/ntf/ntfd/bin_osafntfd-NtfLogger.o
  CXX      src/ntf/ntfd/bin_osafntfd-NtfReader.o
cc1plus: all warnings being treated as errors
Makefile:21672: recipe for target 'src/ntf/ntfd/bin_osafntfd-NtfSubscription.o' failed
~~~
[tickets] [opensaf:tickets] #3279 ntf: Errors under compilation for 32 bit machine
commit 2249c5f7035ad7ec31b2ecd71b88a4d35745acd3 Author: Thanh Nguyen Date: Tue Aug 31 08:06:52 2021 +1000 ntf: Fix compilation errors for 32 bit machine [#3279] The patch for ticket 3277 failed compilation for a 32 bit machine. This patch fixes these compilation errors. --- ** [tickets:#3279] ntf: Errors under compilation for 32 bit machine** **Status:** assigned **Milestone:** 5.21.09 **Created:** Thu Aug 26, 2021 01:15 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thanh Nguyen The patch for ticket 3277 has compilation errors when compiled for a 32 bit machine. The following is an extract from one compilation:
~~~
  CXX      src/ntf/ntfd/bin_osafntfd-NtfSubscription.o
In file included from ./src/base/ncs_osprm.h:32:0,
                 from ./src/mds/mds_papi.h:32,
                 from ./src/ntf/common/ntfsv_msg.h:26,
                 from ./src/ntf/ntfd/ntfs_com.h:33,
                 from ./src/ntf/ntfd/NtfNotification.h:29,
                 from ./src/ntf/ntfd/NtfFilter.h:29,
                 from ./src/ntf/ntfd/NtfSubscription.h:25,
                 from src/ntf/ntfd/NtfSubscription.cc:22:
src/ntf/ntfd/NtfSubscription.cc: In member function 'void NtfSubscription::sendNotification(NtfSmartPtr&, NtfClient*)':
./src/base/logtrace.h:173:65: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
   logtrace_trace(__FILE__, __LINE__, CAT_TRACE, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:305:9: note: in expansion of macro 'TRACE'
   TRACE("Nodeid: %u, MdsDest: %lu", evt->info.mds_info.node_id,
./src/base/logtrace.h:163:61: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
     logtrace_log(__FILE__, __LINE__, LOG_ERR, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:316:9: note: in expansion of macro 'LOG_ER'
   LOG_ER("Down event missed for app with mdsdest: %lu on node: %u",
  CXX      src/ntf/ntfd/bin_osafntfd-NtfLogger.o
  CXX      src/ntf/ntfd/bin_osafntfd-NtfReader.o
cc1plus: all warnings being treated as errors
Makefile:21672: recipe for target 'src/ntf/ntfd/bin_osafntfd-NtfSubscription.o' failed
~~~
[tickets] [opensaf:tickets] #2726 smf: Smfnd does not protect global variables used in more than one thread
- **Milestone**: future --> 5.21.06 --- ** [tickets:#2726] smf: Smfnd does not protect global variables used in more than one thread** **Status:** fixed **Milestone:** 5.21.06 **Created:** Mon Dec 04, 2017 11:29 AM UTC by elunlen **Last Updated:** Mon Apr 26, 2021 03:15 AM UTC **Owner:** Thanh Nguyen Several global variables (the cb structure) are handled in both the main thread and the MDS thread, but no mutex is used for protection. Make the handling of these global variables thread safe.