[tickets] [opensaf:tickets] #3346 osaf: build failed with gcc/g++ 12
- **Type**: defect --> enhancement --- **[tickets:#3346] osaf: build failed with gcc/g++ 12** **Status:** fixed **Milestone:** 5.24.02 **Created:** Fri Jan 12, 2024 11:22 AM UTC by Thang Duc Nguyen **Last Updated:** Mon Jan 22, 2024 04:21 AM UTC **Owner:** Thang Duc Nguyen --- Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.___ Opensaf-tickets mailing list Opensaf-tickets@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.24.09 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Huu The Truong

After the standby SC went down, another SC was promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that point, the new standby SC received a peer info response from the old standby SC and wrongly promoted itself to active.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

However, the existing active SC was still alive, so the newly promoted SC rebooted itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
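The ticket is still under review and does not describe the fix; the sketch below is only a hypothetical illustration of the missing guard (all names are assumptions, not OpenSAF code): a standby should initiate controller failover on a node-down event only when the departed peer actually held the active role.

```cpp
enum class Role { kActive, kStandby, kUndefined };

// Hypothetical guard: promote on node-down only if the down peer was the
// active SC. In the reported scenario the down peer was the old standby,
// so a check like this would have prevented the bad promotion.
bool should_promote_to_active(Role own_role, Role down_peer_role) {
  return own_role == Role::kStandby && down_peer_role == Role::kActive;
}
```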
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.24.09 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Mohan Kanakam

During section create, one ckptnd sends an async request (a normal MDS send) to another ckptnd. The receiving ckptnd, however, responds as if it had received a sync request that requires a reply to the sender ckptnd. ckptnd must respond when a sync request arrives, but it must not respond to an async request. The following message appears in the MDS log when creating the section:

~~~
sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
~~~
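The distinction the ticket describes can be sketched as follows. This is not the actual ckptnd/MDS code (the struct and field names are assumptions); it only illustrates that a responder should reply only when the request carries a non-empty sync context:

```cpp
#include <cstdint>

// Hypothetical receive-info: a synchronous MDS send carries a sync
// context, while a normal (async) send does not -- hence the
// "Invalid Sync CTXT Len" log when a reply is attempted anyway.
struct RecvInfo {
  uint32_t ctxt_len;  // 0 for an async (normal) send
};

// Reply only to genuine sync requests; silently consume async ones.
bool should_respond(const RecvInfo& info) {
  return info.ctxt_len > 0;
}
```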
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.24.09 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Hieu Hong Hoang

We created a class ScopeLock to support recursive mutexes, and it is used heavily in the log module. However, the C++ standard library provides std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
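A minimal sketch of the proposal (the names `log_mutex`, `inner`, `outer` are illustrative, not the actual log-module code): the standard RAII wrappers already cover the ScopeLock use case, including re-entrant locking.

```cpp
#include <mutex>

// Illustrative stand-ins for the log module's shared state.
static std::recursive_mutex log_mutex;
static int counter = 0;

void inner() {
  // The same thread may re-acquire a std::recursive_mutex without deadlock.
  std::lock_guard<std::recursive_mutex> lock(log_mutex);
  ++counter;
}

int outer() {
  // std::unique_lock works here too and additionally supports deferred
  // locking and early unlock(); lock_guard suffices for plain scoping.
  std::lock_guard<std::recursive_mutex> lock(log_mutex);
  inner();  // re-entrant call while the mutex is held
  ++counter;
  return counter;
}
```

std::unique_lock<std::recursive_mutex> is the drop-in choice when the lock must be released before scope exit or moved around; otherwise std::lock_guard is the lighter option.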
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roamming mode
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3312] fmd: sc failed to failover in roamming mode** **Status:** assigned **Milestone:** 5.24.09 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Huu The Truong

Shutdown SC-6 (role is standby):

~~~
2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6)
~~~

SC-10 changed role to standby:

~~~
2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY
~~~

However, a service of the old standby was still alive, so SC-10 received peer info from the old standby (SC-6) and mistook it for the active SC going down. SC-10 changed role to active and then rebooted.

~~~
2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1
2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request
2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer
2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3323 imm: PL sync failed after reconnected with SC
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3323] imm: PL sync failed after reconnected with SC** **Status:** unassigned **Milestone:** 5.24.09 **Created:** Wed Oct 05, 2022 09:37 AM UTC by Son Tran Ngoc **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** nobody

Active SC1 and PL4 suddenly lost contact (possibly for environmental reasons). They re-established contact, but the PL4 sync failed because PL4 did not update the active SC1 information and discarded messages from the IMMD on SC1.

PL4 sync failure log:

~~~
2022-09-22 04:07:05.230 DEBUG: Syncing node PL-4 (timeout=120)
2022-09-22 04:08:06.325 WARNING: waiting more than 60 sec for node PL-4 to sync
~~~

PL4 discarding messages from SC1:

~~~
2022-09-22 04:07:08.406 PL-4 osafimmnd[354]: WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f
2022-09-22 04:07:09.013 PL-4 osafimmnd[354]: message repeated 243 times: [ WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f]
~~~

Steps to reproduce:

1. Start SCs and PLs.
2. Block traffic between SC1 and PL4 (make sure to block traffic after the IMM state transition IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT).
3. Unblock traffic between SC1 and PL4.
[tickets] [opensaf:tickets] #3335 imm: Valgrind reported errors
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3335] imm: Valgrind reported errors** **Status:** assigned **Milestone:** 5.24.09 **Created:** Mon Apr 10, 2023 03:06 AM UTC by PhanTranQuocDat **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** PhanTranQuocDat

Valgrind detects memory leaks.

/var/lib/lxc/SC-2/rootfs/var/log/opensaf/immd.valgrind:

~~~
==417== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 108 of 111
==417==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x52ACC2B: sysf_alloc_pkt (sysf_mem.c:429)
==417==    by 0x529BA1F: ncs_enc_init_space_pp (hj_ubaid.c:144)
==417==    by 0x52C9996: mdtm_fill_data (mds_dt_common.c:1454)
==417==    by 0x52CACCD: mdtm_process_recv_message_common (mds_dt_common.c:544)
==417==    by 0x52CB071: mdtm_process_recv_data (mds_dt_common.c:1126)
==417==    by 0x52D5B8E: mdtm_process_recv_events (mds_dt_tipc.c:1144)
==417==    by 0x55106DA: start_thread (pthread_create.c:463)
==417==    by 0x584961E: clone (clone.S:95)
~~~

/var/lib/lxc/SC-1/rootfs/var/log/opensaf/immd.valgrind:

~~~
==417== 7 bytes in 1 blocks are definitely lost in loss record 6 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x1172C4: mbcsv_dec_async_update (immd_mbcsv.c:1128)
==417==    by 0x1172C4: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x1172C4: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
--
==417== 8 bytes in 1 blocks are definitely lost in loss record 15 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x117288: mbcsv_dec_async_update (immd_mbcsv.c:1119)
==417==    by 0x117288: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x117288: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
--
==417== 16 bytes in 1 blocks are definitely lost in loss record 20 of 117
==417==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==417==    by 0x4E430B1: immsv_evt_dec_inline_string (immsv_evt.c:238)
==417==    by 0x11724C: mbcsv_dec_async_update (immd_mbcsv.c:1110)
==417==    by 0x11724C: immd_mbcsv_decode_proc (immd_mbcsv.c:1402)
==417==    by 0x11724C: immd_mbcsv_callback (immd_mbcsv.c:411)
==417==    by 0x52B1335: ncs_mbscv_rcv_decode (mbcsv_act.c:409)
==417==    by 0x52B14D0: ncs_mbcsv_rcv_async_update (mbcsv_act.c:460)
==417==    by 0x52B829F: mbcsv_process_events (mbcsv_pr_evts.c:166)
==417==    by 0x52B83BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==417==    by 0x52B2A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
~~~
[tickets] [opensaf:tickets] #3341 log: memleak detected by valgrind
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3341] log: memleak detected by valgrind** **Status:** review **Milestone:** 5.24.09 **Created:** Thu Aug 17, 2023 04:41 AM UTC by Thien Minh Huynh **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Thien Minh Huynh

~~~
==526== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 172 of 175
==526==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==526==    by 0x5B3BC2B: sysf_alloc_pkt (sysf_mem.c:429)
==526==    by 0x5B2A9CD: ncs_enc_init_space (hj_ubaid.c:108)
==526==    by 0x5B4A03D: ncs_mbcsv_encode_message (mbcsv_util.c:899)
==526==    by 0x5B4A56C: mbcsv_send_msg (mbcsv_util.c:1029)
==526==    by 0x5B47112: mbcsv_process_events (mbcsv_pr_evts.c:139)
==526==    by 0x5B473BA: mbcsv_hdl_dispatch_all (mbcsv_pr_evts.c:271)
==526==    by 0x5B41A19: mbcsv_process_dispatch_request (mbcsv_api.c:426)
==526==    by 0x13C6AD: lgs_mbcsv_dispatch(unsigned int) (lgs_mbcsv.cc:509)
==526==    by 0x119919: main (lgs_main.cc:592)
~~~
[tickets] [opensaf:tickets] #3343 amf: SU is not in healthy state
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3343] amf: SU is not in healthy state** **Status:** review **Milestone:** 5.24.09 **Created:** Wed Sep 13, 2023 11:30 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Nov 15, 2023 12:56 AM UTC **Owner:** Thang Duc Nguyen

The system ends up in an unhealthy state in the following scenario:

1. Deploy a 2N model, where each PI SU contains 1 PI comp and 1 NPI comp.
2. Terminate the PI component, then lock that SU. (Some sleep time is added in the instantiation script.)
3. The SU is left in LOCKED (AdminState) and UNINSTANTIATED (PresenceState).

amf-adm cannot be used to repair the SU; only a node reboot recovers from the issue.
[tickets] [opensaf:tickets] #3347 smf: Valgrind reported errors
- **Milestone**: 5.24.02 --> 5.24.09 --- **[tickets:#3347] smf: Valgrind reported errors** **Status:** assigned **Milestone:** 5.24.09 **Created:** Mon Feb 19, 2024 09:07 AM UTC by Nguyen Huynh Tai **Last Updated:** Mon Feb 19, 2024 09:07 AM UTC **Owner:** Nguyen Huynh Tai

~~~
14:49:09 Verify valgrind result
14:49:09 ==585== 2 errors in context 1 of 6:
14:49:09 ==585== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==585==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==585==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==585==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==585==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==585==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==585==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==585==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==585==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==585==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==585==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==585==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==585==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==585==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==585==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==585==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==585== Address 0x6aeaa4f is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==585==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==585==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==585==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586== 2 errors in context 1 of 6:
14:49:09 ==586== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==586==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==586==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==586==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==586==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==586==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==586==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==586==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==586==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==586==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==586==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==586==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==586==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==586==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==586== Address 0x6afcdaf is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==586==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==586==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586== 2 errors in context 1 of 6:
14:49:09 ==586== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
14:49:09 ==586==    at 0x5509B62: sendto (sendto.c:27)
14:49:09 ==586==    by 0x52C5ACF: mds_retry_sendto (mds_dt_tipc.c:3154)
14:49:09 ==586==    by 0x52C5CC4: mdtm_sendto (mds_dt_tipc.c:3211)
14:49:09 ==586==    by 0x52C68EF: mds_mdtm_send_tipc (mds_dt_tipc.c:2815)
14:49:09 ==586==    by 0x52B1AD6: mcm_msg_encode_full_or_flat_and_send (mds_c_sndrcv.c:1774)
14:49:09 ==586==    by 0x52B31B6: mds_mcm_send_msg_enc (mds_c_sndrcv.c:1255)
14:49:09 ==586==    by 0x52B574C: mcm_pvt_normal_snd_process_common (mds_c_sndrcv.c:1194)
14:49:09 ==586==    by 0x52B6323: mcm_pvt_normal_svc_snd (mds_c_sndrcv.c:1017)
14:49:09 ==586==    by 0x52B6323: mds_mcm_send (mds_c_sndrcv.c:781)
14:49:09 ==586==    by 0x52B6323: mds_send (mds_c_sndrcv.c:458)
14:49:09 ==586==    by 0x52BEFDB: ncsmds_api (mds_papi.c:165)
14:49:09 ==586==    by 0x4E41519: smfsv_mds_msg_send (smfsv_evt.c:1365)
14:49:09 ==586==    by 0x10AE71: smfnd_cbk_req_proc (smfnd_evt.c:336)
14:49:09 ==586==    by 0x10B465: proc_cbk_req_rsp (smfnd_evt.c:545)
14:49:09 ==586==    by 0x10B465: smfnd_process_mbx (smfnd_evt.c:591)
14:49:09 ==586== Address 0x6b0f10f is 63 bytes inside a block of size 67 alloc'd
14:49:09 ==586==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
14:49:09 ==586==    by 0x52C654B: mds_mdtm_send_tipc (mds_dt_tipc.c:2734)
14:49:09 ==586==    by 0
~~~
[tickets] [opensaf:tickets] #3335 imm: Valgrind reported errors
- **Milestone**: 5.23.07 --> 5.23.12 --- **[tickets:#3335] imm: Valgrind reported errors** **Status:** assigned **Milestone:** 5.23.12 **Created:** Mon Apr 10, 2023 03:06 AM UTC by PhanTranQuocDat **Last Updated:** Fri Apr 28, 2023 08:18 AM UTC **Owner:** PhanTranQuocDat
[tickets] [opensaf:tickets] #3336 amf: node did not reboot in split-brain prevention
- **Milestone**: 5.23.07 --> 5.23.12 --- **[tickets:#3336] amf: node did not reboot in split-brain prevention** **Status:** assigned **Milestone:** 5.23.12 **Created:** Wed Apr 26, 2023 08:48 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Apr 26, 2023 08:48 AM UTC **Owner:** Thang Duc Nguyen

With split-brain prevention via arbitration enabled, and relaxation mode enabled: the arbitrator went down, then one SC went down, but the remaining SC stayed alive. It should have been rebooted in this case.

~~~
2023-04-04T07:52:12.137+02:00 SC-2.1 osafamfd[5337]: NO Node 'SC-1' is down. Start failover delay timer
...
2023-04-04T07:52:19.286+02:00 SC-2.1 osafamfd[5337]: NO Relaxed node promotion is enabled, peer SC is connected
~~~
[tickets] [opensaf:tickets] #3040 Amfnd: reboot if mismatch msg id b/w amfd and amfnd
- **Milestone**: 5.22.06 --> 5.23.03 --- ** [tickets:#3040] Amfnd: reboot if mismatch msg id b/w amfd and amfnd** **Status:** fixed **Milestone:** 5.23.03 **Created:** Thu May 16, 2019 07:33 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 29, 2023 09:10 AM UTC **Owner:** Thang Duc Nguyen

During SC failover, a message received on the ACTIVE AMFD may not have been checkpointed to the AMFD on the STANDBY SC, yet the AMFND still processes the ack for that message and removes it from its queue. When the STANDBY SC takes over as ACTIVE, the message ids between AMFD and AMFND on the new ACTIVE node are mismatched. As a consequence, CLM track start cannot be invoked to update cluster member nodes if those nodes were rebooted. A reboot recovery is needed when the message id received by amfd does not match the message id sent by amfnd.
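The recovery condition in the last sentence amounts to a simple comparison. The function below is an illustrative sketch only, not the actual AMF code (the real message ids live inside the AVD/AVND message handling, and the names here are assumptions):

```cpp
#include <cstdint>

// Hypothetical check: after failover, if the message id the new active
// amfd reports having received differs from the id this amfnd last sent,
// their views have diverged and a node reboot recovery is warranted.
bool needs_reboot_recovery(uint32_t amfd_received_msg_id,
                           uint32_t amfnd_sent_msg_id) {
  return amfd_received_msg_id != amfnd_sent_msg_id;
}
```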
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
commit db861cc07e61330bf0c1686869b18e33d302f255
Author: hieu.h.hoang
Date: Wed Nov 30 08:27:05 2022 +0700

    mds: Fix failed test cases in mdstest [#2798]

    A number of test cases failed because they retrieved the event without polling the selection object. The solution is to poll the selection object first.

--- ** [tickets:#2798] mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Mon Mar 27, 2023 11:31 PM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip)

Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967, mdstest 5 1:

~~~
LOG_NO("\nAction: Retrieve only ONE event\n");
if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500,
                          NCSMDS_SCOPE_INTRACHASSIS, 2,
                          svcids) != NCSCC_RC_SUCCESS) {
  LOG_NO("\nFail\n");
  FAIL = 1;
} else {
  LOG_NO("\nAction: Retrieve only ONE event\n");
  if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500,
                           SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) {
    LOG_NO("Fail, retrieve ONE\n");
    FAIL = 1;
  } else
    LOG_NO("\nSuccess\n");
~~~

After the subscription request succeeds, mdstest is expected to receive two MDS_UP events, for services 600 and 700. These events are retrieved in the next step of the test case (mds_service_retrieve). The problem is that the MDS_UP events are processed in a separate (parallel) thread, the MDS core thread, not in the test case's main thread. In a bad scenario, if the MDS core thread has not run before the RETRIEVE operation in the main thread, the RETRIEVE request with the SA_DISPATCH_ONE flag returns an error and the test case fails.

~~~
<143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe    // MDS SUBSCRIBE request
...
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE    // MDS RETRIEVE request with SA_DISPATCH_ONE flag came before MDS UP events were processed
<139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL
<142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events
<142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event
<143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up
<143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 |
<143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 |
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED
2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE
~~~

The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, and 14 6.
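The fix named in the commit message, polling the selection object before retrieving, follows a generic pattern. The helper below is a sketch with a plain file descriptor standing in for the MDS selection object (the mds_* calls themselves are not reproduced here):

```cpp
#include <poll.h>
#include <unistd.h>

// Wait until fd becomes readable (an event is pending) or timeout_ms
// elapses. In mdstest, fd would be the selection object obtained at MDS
// service install time; only after this returns true should the test
// call mds_service_retrieve() with SA_DISPATCH_ONE.
bool wait_readable(int fd, int timeout_ms) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  return poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN) != 0;
}
```

Polling first removes the race: the main thread blocks until the MDS core thread has queued the MDS_UP events, instead of dispatching into an empty queue.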
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
- **Milestone**: future --> 5.23.03 --- ** [tickets:#2798] mds: mdstest 5 1,5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Mon Dec 19, 2022 01:47 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip) Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967 mdstest 5 1 ~~~ LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500, NCSMDS_SCOPE_INTRACHASSIS, 2, svcids) != NCSCC_RC_SUCCESS) { LOG_NO("\nFail\n"); FAIL = 1; } else { LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500, SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) { LOG_NO("Fail, retrieve ONE\n"); FAIL = 1; } else LOG_NO("\nSuccess\n"); ~~~ After the subscription request being successful, mdstest would expectedly receive two 2 MDS_UP events of services 600 and 700. These info will be retrieved in the next step of the test case (mds_service_retrieve). The problem here is, these MDS_UP events are processed in a separate (parallel) thread (mds core thread) from the test case's main thread. In a bad scenario, if the mds core thread cannot be processed before the RETRIEVE operations in the main thread, the RETRIEVE request with "SA_DISPATCH_ONE" flag will return "error", and the test case will fail. <143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe*** // MDS SUBSCRIBE request*** ... 
~~~
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE // MDS RETRIEVE request with SA_DISPATCH_ONE flag arrived before the MDS UP events were processed
<139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL
<142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events
<142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event
<143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up
<143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 |
<143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 |
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event
2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED
2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE
~~~

The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, and 14 6.
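The race described above (a retrieve issued before the core thread has queued the subscription events) is usually worked around in test code by retrying the transient operation with a short back-off. The sketch below is illustrative, not the actual mdstest fix; `RetryUntil` is an invented helper that a test could wrap around `mds_service_retrieve`.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: retry an operation that can fail transiently
// (e.g. an MDS RETRIEVE issued before the mds core thread has processed
// the MDS_UP events) until it succeeds or a deadline expires.
bool RetryUntil(const std::function<bool()>& op,
                std::chrono::milliseconds deadline,
                std::chrono::milliseconds interval) {
  const auto end = std::chrono::steady_clock::now() + deadline;
  while (std::chrono::steady_clock::now() < end) {
    if (op()) return true;                  // operation succeeded
    std::this_thread::sleep_for(interval);  // give the core thread time to run
  }
  return op();  // one final attempt at the deadline
}
```

A test case would call, for example, `RetryUntil([&] { return mds_service_retrieve(...) == NCSCC_RC_SUCCESS; }, ...)` instead of failing on the first attempt.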
[tickets] [opensaf:tickets] #3331 Valgrind reports errors
- **status**: assigned --> fixed --- **[tickets:#3331] Valgrind reports errors** **Status:** fixed **Milestone:** 5.23.03 **Created:** Wed Mar 01, 2023 02:20 AM UTC by PhanTranQuocDat **Last Updated:** Wed Mar 22, 2023 08:35 AM UTC **Owner:** PhanTranQuocDat

Valgrind detected the following errors:

~~~
==484== 542 (536 direct, 6 indirect) bytes in 1 blocks are definitely lost in loss record 312 of 368
==484==    at 0x4C3217F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==484==    by 0x1636BC: csiattr_create(std::__cxx11::basic_string, std::allocator > const&, SaImmAttrValuesT_2 const**) (csiattr.cc:78)
==484==    by 0x164443: csiattr_create_apply (csiattr.cc:519)
==484==    by 0x164443: csiattr_ccb_apply_cb(CcbUtilOperationData*) (csiattr.cc:713)
==484==    by 0x172155: ccb_apply_cb(unsigned long long, unsigned long long) (imm.cc:1265)
==484==    by 0x54B0C94: imma_process_callback_info(imma_cb*, imma_client_node*, imma_callback_info*, unsigned long long)

==407== Invalid read of size 1
==407==    at 0x5732C3A: mds_svc_op_uninstall (mds_svc_op.c:375)
==407==    by 0x57320C7: ncsmds_api (mds_papi.c:147)
==407==    by 0x54A31D2: imma_mds_unregister(imma_cb*) (imma_mds.cc:171)
==407==    by 0x54A25D4: imma_destroy (imma_init.cc:219)
==407==    by 0x54A25D4: imma_shutdown(ncsmds_svc_id) (imma_init.cc:337)
==407==    by 0x54AF316: saImmOmFinalize (imma_om_api.cc:940)
==407==    by 0x5061961: immutil_saImmOmFinalize (immutil.c:1572)
==407==    by 0x141267: hydra_config_get (main.cc:775)
==407==    by 0x141267: avnd_cb_create (main.cc:349)

==461== Mismatched free() / delete / delete []
==461==    at 0x4C3323B: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==461==    by 0x1308D4: comp_init(avnd_comp_tag*, SaImmAttrValuesT_2 const**) (compdb.cc:1422)
==461==    by 0x131066: avnd_comp_config_reinit(avnd_comp_tag*) (compdb.cc:1759)
==461==    by 0x123FD7: avnd_comp_clc_uninst_inst_hdler(avnd_cb_tag*, avnd_comp_tag*) (clc.cc:1584)
==461==    by 0x124390: avnd_comp_clc_fsm_run(avnd_cb_tag*, avnd_comp_tag*, avnd_comp_clc_pres_fsm_ev) (clc.cc:887)
==461==    by 0x153FE6: avnd_su_pres_uninst_suinst_hdler(avnd_cb_tag*, avnd_su_tag*, avnd_comp_tag*) (susm.cc:2145)
==461==    by 0x1567C0: avnd_su_pres_fsm_run(avnd_cb_tag*, avnd_su_tag*, avnd_comp_tag*, avnd_su_pres_fsm_ev) (susm.cc:1604)
==461==    by 0x15C3AA: avnd_evt_ir_evh(avnd_cb_tag*, avnd_evt_tag*) (susm.cc:4236)
==461==    by 0x141D25: avnd_evt_process (main.cc:692)
==461==    by 0x141D25: avnd_main_process() (main.cc:644)
==461==    by 0x1170AD: main (main.cc:225)

==407== Invalid read of size 8
==407==    at 0x119942: avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) (cbq.cc:678)
==407==    by 0x141D15: avnd_evt_process (main.cc:692)
==407==    by 0x141D15: avnd_main_process() (main.cc:644)
==407==    by 0x1170AD: main (main.cc:225)
==407== Address 0x8bb2ad0 is 64 bytes inside a block of size 112 free'd
==407==    at 0x4C3323B: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==407==    by 0x11962B: avnd_comp_cbq_rec_pop_and_del(avnd_cb_tag*, avnd_comp_tag*, unsigned int, bool) (cbq.cc:973)
==407==    by 0x119941: avnd_evt_tmr_cbk_resp_evh(avnd_cb_tag*, avnd_evt_tag*) (cbq.cc:678)
==407==    by 0x141D15: avnd_evt_process (main.cc:692)

==428== 8,072 (56 direct, 8,016 indirect) bytes in 1 blocks are definitely lost in loss record 285 of 289
==428==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==428==    by 0x5914C2B: sysf_alloc_pkt (sysf_mem.c:429)
==428==    by 0x5903A1F: ncs_enc_init_space_pp (hj_ubaid.c:144)
==428==    by 0x5931996: mdtm_fill_data (mds_dt_common.c:1453)
==428==    by 0x5932CC2: mdtm_process_recv_message_common (mds_dt_common.c:544)
==428==    by 0x5933061: mdtm_process_recv_data (mds_dt_common.c:1125)
==428==    by 0x59351D6: mds_mdtm_process_recvdata (mds_dt_trans.c:1217)
==428==    by 0x5936426: mdtm_process_poll_recv_data_tcp (mds_dt_trans.c:903)
==428==    by 0x593683E: mdtm_process_recv_events_tcp (mds_dt_trans.c:995)
==428==    by 0x61196DA: start_thread (pthread_create.c:463)
~~~
[tickets] [opensaf:tickets] #3332 rde: incorrect use of pointer
--- **[tickets:#3332] rde: incorrect use of pointer** **Status:** unassigned **Milestone:** 5.23.03 **Created:** Mon Mar 27, 2023 06:47 AM UTC by Gary Lee **Last Updated:** Mon Mar 27, 2023 06:47 AM UTC **Owner:** nobody

In rda_papi.cc, there is a use after free: rda_callback_cb is freed and then dereferenced on the very next line.

~~~
if (m_NCS_TASK_START(rda_callback_cb->task_handle) != NCSCC_RC_SUCCESS) {
  m_NCS_MEM_FREE(rda_callback_cb, 0, 0, 0);
  m_NCS_TASK_RELEASE(rda_callback_cb->task_handle);
  rc = PCSRDA_RC_TASK_SPAWN_FAILED;
  break;
}
~~~
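A minimal sketch of the fix, under the obvious assumption that the intended order is to use the handle before freeing the block. The names below are simplified stand-ins for the OpenSAF macros, not the actual patch.

```cpp
#include <cstdlib>

// Simplified analogue of the rda_papi.cc error path. The defect: the
// control block was freed (m_NCS_MEM_FREE) before its task_handle was
// read (m_NCS_TASK_RELEASE). The fix is to release first, then free.
struct CallbackCb {
  void* task_handle;
};

static void* g_released = nullptr;
// Stand-in for m_NCS_TASK_RELEASE; records which handle it was given.
void ReleaseTask(void* handle) { g_released = handle; }

void CleanupOnSpawnFailure(CallbackCb* cb) {
  ReleaseTask(cb->task_handle);  // use the handle while cb is still valid
  free(cb);                      // only then free the control block
}
```

The same reordering applied to the real code would swap the `m_NCS_MEM_FREE` and `m_NCS_TASK_RELEASE` lines.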
[tickets] [opensaf:tickets] #3102 mds: waste 1.5s in waiting Adest already down to send response message type
- **Milestone**: 5.20.02 --> 5.22.11 --- **[tickets:#3102] mds: waste 1.5s in waiting Adest already down to send response message type** **Status:** fixed **Milestone:** 5.22.11 **Created:** Thu Oct 17, 2019 09:23 AM UTC by Thuan Tran **Last Updated:** Thu Jul 28, 2022 07:20 AM UTC **Owner:** Thuan Tran **Attachments:** - [mds.log](https://sourceforge.net/p/opensaf/tickets/3102/attachment/mds.log) (16.9 kB; application/octet-stream)

On the Active SC, run the following commands:

~~~
pkill -STOP osafntfd
ntfsend &
pkill -9 ntfsend
pkill -CONT osafntfd
~~~

Checking mds.log shows that osafntfd is stuck for 1.5 s waiting for an agent that is already down before sending a response-type message.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.23.03 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Thu Nov 17, 2022 04:11 PM UTC **Owner:** Huu The Truong

After the standby SC goes down, another SC is promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that time, the new standby SC receives a peer info response from the old standby SC indicating that it needs to be promoted to become the active SC.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

But another active SC is still alive, which leads the standby SC to reboot itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.23.03 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:59 AM UTC **Owner:** Hieu Hong Hoang

We created a ScopeLock class to support recursive mutexes, and it is used heavily in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
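The replacement the ticket proposes looks roughly like this; the mutex and counter names here are illustrative, not the actual log-module code.

```cpp
#include <mutex>

// Sketch: std::unique_lock over std::recursive_mutex replaces the
// home-grown ScopeLock. A recursive mutex lets the same thread
// re-acquire the lock it already holds, which is what ScopeLock provided.
std::recursive_mutex log_mutex;  // hypothetical module-level lock
int counter = 0;                 // hypothetical state guarded by the lock

void Inner() {
  std::unique_lock<std::recursive_mutex> lock(log_mutex);  // re-lock by same thread is safe
  ++counter;
}

void Outer() {
  std::unique_lock<std::recursive_mutex> lock(log_mutex);  // first acquisition
  Inner();  // would deadlock with a plain std::mutex
}
```

With a plain `std::mutex`, the nested acquisition in `Inner()` would deadlock; `std::recursive_mutex` makes the pattern legal without any custom class.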
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.23.03 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Thu Oct 06, 2022 02:30 PM UTC **Owner:** Mohan Kanakam

During section create, one ckptnd sends an async request (a normal mds send) to another ckptnd. However, the receiving ckptnd responds to the request on the assumption that it received a sync request and has to reply to the sender ckptnd. In some cases a response is required (when a sync request arrives at ckptnd), but in other cases the request is async and no response should be sent. The following message appears in the mds log when creating the section:

~~~
sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
~~~
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roaming mode
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3312] fmd: sc failed to failover in roaming mode** **Status:** assigned **Milestone:** 5.23.03 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Thu Nov 17, 2022 04:08 PM UTC **Owner:** Huu The Truong

Shut down SC-6 (role is standby):

~~~
2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6)
~~~

SC-10 changed its role to standby:

~~~
2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY
~~~

However, a service of the old standby is still alive, which leads SC-10 to receive peer info from the old standby (SC-6). It mistakes this for the active SC going down. SC-10 changed its role to active and then rebooted.

~~~
2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE
2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s)
2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1
2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request
2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer
2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3322 log: log agent in main process is disabled after child process exits
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3322] log: log agent in main process is disabled after child process exits** **Status:** review **Milestone:** 5.23.03 **Created:** Wed Oct 05, 2022 09:36 AM UTC by Hieu Hong Hoang **Last Updated:** Thu Oct 06, 2022 10:14 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [loga](https://sourceforge.net/p/opensaf/tickets/3322/attachment/loga) (28.8 kB; application/octet-stream) - [osaflogd](https://sourceforge.net/p/opensaf/tickets/3322/attachment/osaflogd) (527.4 kB; application/octet-stream)

While the log agent is in use, if a new process is created by forking the current process, both processes share the same mds address. When the child process exits, the destructor of the log agent is invoked. It unregisters with mds, so all subscribing services detect that this log agent is down. However, the main process is still using that mds address, and all requests from the main process become invalid because logd thinks this log agent has already gone down.

Log analysis:

* The main process initializes the mds address:

~~~
<143>1 2022-10-05T10:41:05.537677+02:00 SC-2 logtest 763 loga [meta sequenceId="89"] 763:log/agent/lga_mds.cc:1287 >> lga_mds_init
<143>1 2022-10-05T10:41:05.537782+02:00 SC-2 logtest 763 loga [meta sequenceId="90"] 763:log/agent/lga_mds.cc:1334 << lga_mds_init
~~~

* Duplicate the main process using https://man7.org/linux/man-pages/man2/fork.2.html .
Then the duplicated process exits and the destructor of the log agent is invoked:

~~~
<143>1 2022-10-05T10:41:05.541101+02:00 SC-2 logtest 763 loga [meta sequenceId="156"] 772:log/agent/lga_agent.cc:167 >> ~LogAgent
<143>1 2022-10-05T10:41:05.54126+02:00 SC-2 logtest 763 loga [meta sequenceId="157"] 772:log/agent/lga_state.cc:160 >> stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541297+02:00 SC-2 logtest 763 loga [meta sequenceId="158"] 772:log/agent/lga_state.cc:166 TR stop_recovery2_thread RecoveryState::kNormal no thread to stop
<143>1 2022-10-05T10:41:05.541315+02:00 SC-2 logtest 763 loga [meta sequenceId="159"] 772:log/agent/lga_state.cc:183 << stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541322+02:00 SC-2 logtest 763 loga [meta sequenceId="160"] 772:log/agent/lga_util.cc:125 >> lga_shutdown
<143>1 2022-10-05T10:41:05.541329+02:00 SC-2 logtest 763 loga [meta sequenceId="161"] 772:log/agent/lga_mds.cc:1351 >> lga_mds_deinit
<143>1 2022-10-05T10:41:05.541573+02:00 SC-2 logtest 763 loga [meta sequenceId="162"] 772:log/agent/lga_mds.cc:1362 << lga_mds_deinit
~~~

* Logd detects that this log agent is down and deletes all clients of this log agent:

~~~
<143>1 2022-10-05T10:41:05.541593+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3386"] 452:log/logd/lgs_mds.cc:1230 T8 MDS DOWN dest: 2020f3aafb5a7, node ID: 2020f, svc_id: 21
<143>1 2022-10-05T10:41:05.541636+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3387"] 447:log/logd/lgs_evt.cc:415 >> proc_lga_updn_mds_msg
<143>1 2022-10-05T10:41:05.541648+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3388"] 447:log/logd/lgs_evt.cc:436 TR proc_lga_updn_mds_msg: LGSV_LGS_EVT_LGA_DOWN mds_dest = 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541656+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3389"] 447:log/logd/lgs_evt.cc:338 >> lgs_client_delete_by_mds_dest: mds_dest 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541663+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3390"] 447:log/logd/lgs_evt.cc:191 >> lgs_client_delete: client_id 9
<143>1 2022-10-05T10:41:05.541678+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3391"] 447:log/logd/lgs_evt.cc:213 T4 client_id: 9, REMOVE stream id: 2
<143>1 2022-10-05T10:41:05.541686+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3392"] 447:log/logd/lgs_stream.cc:856 >> log_stream_close: safLgStrCfg=saLogSystem,safApp=safLogService
<143>1 2022-10-05T10:41:05.541713+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3393"] 447:log/logd/lgs_stream.cc:922 << log_stream_close: rc=0, numOpeners=7
<143>1 2022-10-05T10:41:05.541737+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3394"] 447:log/logd/lgs_evt.cc:239 << lgs_client_delete
<143>1 2022-10-05T10:41:05.541744+02:00 SC-1 osaflogd 447 osaflogd [meta sequenceId="3395"] 447:log/logd/lgs_evt.cc:348 << lgs_client_delete_by_mds_dest
~~~

* The main process sends a writing request to logd:

~~~
<143>1 2022-10-05T10:41:07.541457+02:00 SC-2 logtest 763 loga [meta sequenceId="182"] 763:log/agent/lga_mds.cc:1439 >> lga_mds_msg_async_send
<143>1 2022-10-05T10:41:07.541488+02:00 SC-2 logtest 763 loga [meta sequenceId="183"] 763:log/agent/lga_mds.cc:789 >> lga_mds_enc
<143>1 2022-10-05T10:41:07.541516+02:00 SC-2 logtest 763 loga [meta sequenceId="184"] 763:log/agent/lga_mds.cc:820 T2 msgtype: 0
<143>1 2022-10-05T10:41:07.541524+02:00 SC-2 logtest 763 loga [meta sequenceId="185"] 763:log/agent/lga_mds.cc:834 T2 api_info.type: 4
<143>1 2022-10-05T10:41:07.541533+02:00 SC-2 logtest 763 loga [meta sequenceId="186"] 763:log/age
~~~
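One common guard for this class of fork() problem is to remember which pid initialized the agent and skip the MDS teardown when the destructor runs in a forked child. The class and member names below are illustrative, not the actual lga_agent.cc code.

```cpp
#include <sys/types.h>
#include <unistd.h>

// Sketch: only the process that registered with MDS may unregister.
// A forked child inherits the agent object, so its destructor must not
// tear down the shared MDS address that the parent still uses.
class LogAgent {
 public:
  LogAgent() : owner_pid_(getpid()) {
    // lga_mds_init() would run here in the real agent
  }
  ~LogAgent() {
    if (getpid() != owner_pid_) return;  // forked child: leave MDS alone
    // lga_mds_deinit() would run here, only in the owning process
  }
  bool OwnsMds() const { return getpid() == owner_pid_; }

 private:
  const pid_t owner_pid_;
};
```

With this guard, the child's exit no longer causes logd to see an LGA_DOWN for an address the parent is still using.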
[tickets] [opensaf:tickets] #3323 imm: PL sync failed after reconnected with SC
- **Milestone**: 5.22.11 --> 5.23.03 --- **[tickets:#3323] imm: PL sync failed after reconnected with SC** **Status:** review **Milestone:** 5.23.03 **Created:** Wed Oct 05, 2022 09:37 AM UTC by Son Tran Ngoc **Last Updated:** Thu Nov 17, 2022 02:42 PM UTC **Owner:** Son Tran Ngoc

Active SC1 and PL4 suddenly lost their connection (possibly for environmental reasons). They re-established contact, but PL4 failed to sync because it did not update the active SC1 information and discarded messages from IMMD on SC1.

PL4 sync failure log:

~~~
2022-09-22 04:07:05.230 DEBUG: Syncing node PL-4 (timeout=120)
2022-09-22 04:08:06.325 WARNING: waiting more than 60 sec for node PL-4 to sync
~~~

PL4 discarding messages from SC1:

~~~
2022-09-22 04:07:08.406 PL-4 osafimmnd[354]: WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f
2022-09-22 04:07:09.013 PL-4 osafimmnd[354]: message repeated 243 times: [ WA DISCARD message from IMMD 2010f as ACT:0 SBY:2020f]
~~~

Steps to reproduce:

1. Start the SCs and PLs.
2. Block traffic between SC1 and PL4 (make sure to block traffic after the IMM state transition IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT).
3. Unblock traffic between SC1 and PL4.
[tickets] [opensaf:tickets] #3294 mds: refactor huge api functions
- **Milestone**: future --> 5.22.11 --- **[tickets:#3294] mds: refactor huge api functions** **Status:** fixed **Milestone:** 5.22.11 **Created:** Tue Oct 26, 2021 06:02 AM UTC by Hieu Hong Hoang **Last Updated:** Mon Aug 08, 2022 03:15 AM UTC **Owner:** Hieu Hong Hoang

Some functions have 1,500+ lines of code, which makes them hard to maintain. We should refactor them into smaller sub-functions. For example:

~~~
line 1863: uint32_t mds_mcm_svc_up(PW_ENV_ID pwe_id, MDS_SVC_ID svc_id, V_DEST_RL role,
line 1864:                         NCSMDS_SCOPE_TYPE scope, MDS_VDEST_ID vdest_id,
line 1865:                         NCS_VDEST_TYPE vdest_policy, MDS_DEST adest,
line 1866:                         bool my_pcon, MDS_SVC_HDL local_svc_hdl,
line 1867:                         MDS_SUBTN_REF_VAL subtn_ref_val,
line 1868:                         MDS_SVC_PVT_SUB_PART_VER svc_sub_part_ver,
line 1869:                         MDS_SVC_ARCHWORD_TYPE archword_type)
line 1870: {
...
line 3494: }
~~~
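The general shape of such a refactor: keep the public entry point, delegate each concern to a small static helper with an early-exit error path. The helper names and the parameter struct below are invented for the sketch; they are not the actual OpenSAF change.

```cpp
// Illustrative only: splitting a ~1600-line function by concern while
// keeping one public entry point. Each helper returns 0 on success.
struct SvcUpParams {
  int pwe_id;
  int svc_id;
  // ... the remaining mds_mcm_svc_up arguments would go here ...
};

static int ValidateSvcUp(const SvcUpParams&) { return 0; }       // argument checks
static int UpdateSubscriptions(const SvcUpParams&) { return 0; } // table updates
static int NotifySubscribers(const SvcUpParams&) { return 0; }   // user callbacks

int mds_mcm_svc_up_refactored(const SvcUpParams& p) {
  if (int rc = ValidateSvcUp(p)) return rc;        // early exit on bad input
  if (int rc = UpdateSubscriptions(p)) return rc;  // state change next
  return NotifySubscribers(p);                     // callbacks last
}
```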
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: fm --> imm --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 01:01 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3310 log: log agent is blocked for 10s
- **Component**: unknown --> log --- **[tickets:#3310] log: log agent is blocked for 10s** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 08, 2022 03:32 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 01:02 AM UTC **Owner:** Hieu Hong Hoang

In [ticket 3291](https://sourceforge.net/p/opensaf/tickets/3291/), a new message was introduced, sent from the log director to the log agent. When a log agent that contains #3291 runs against a log director that does not contain #3291, that log agent must wait for 10 s. We should avoid this.

~~~
<143>1 2022-02-19T14:06:22.358273+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749743"] 27233:log/agent/lga_agent.cc:409 >> saLogInitialize
...
<143>1 2022-02-19T14:06:32.359774+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749784"] 27233:log/agent/lga_agent.cc:348 TR Waiting for initial clm status timeout
<143>1 2022-02-19T14:06:32.359838+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749785"] 27233:log/agent/lga_agent.cc:361 << WaitLogServerUp
~~~
[tickets] [opensaf:tickets] #3310 log: log agent is blocked for 10s
- **Component**: log --> unknown --- **[tickets:#3310] log: log agent is blocked for 10s** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 08, 2022 03:32 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Hieu Hong Hoang

In [ticket 3291](https://sourceforge.net/p/opensaf/tickets/3291/), a new message was introduced, sent from the log director to the log agent. When a log agent that contains #3291 runs against a log director that does not contain #3291, that log agent must wait for 10 s. We should avoid this.

~~~
<143>1 2022-02-19T14:06:22.358273+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749743"] 27233:log/agent/lga_agent.cc:409 >> saLogInitialize
...
<143>1 2022-02-19T14:06:32.359774+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749784"] 27233:log/agent/lga_agent.cc:348 TR Waiting for initial clm status timeout
<143>1 2022-02-19T14:06:32.359838+01:00 SC-2 osafamfd 27233 osafamfd [meta sequenceId="749785"] 27233:log/agent/lga_agent.cc:361 << WaitLogServerUp
~~~
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: unknown --> fm --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 01:01 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3311 imm: PBE logging inconsistent
- **Component**: imm --> unknown --- **[tickets:#3311] imm: PBE logging inconsistent** **Status:** fixed **Milestone:** 5.22.06 **Created:** Tue Mar 15, 2022 01:32 AM UTC by Thang Duc Nguyen **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Vu Minh Hoang

There are several syslog messages for PBE, but their wording is inconsistent, which makes troubleshooting harder. E.g.:

- LOG_WA("**Persistent back-end** process has apparently died.");
- LOG_NO("**Persistent Back-End** capability configured, Pbe file:%s (suffix may get added)", immnd_cb->mPbeFile);
- LOG_NO("**Persistent Back End** OI attached, pid: %u", pbe_pid);
- LOG_ER("IMM RELOAD with NO **persistent back end** => ensure cluster restart by IMMD exit at both SCs, exiting");

These should be made consistent.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.22.11 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong

After the standby SC goes down, another SC is promoted to become the new standby SC.

~~~
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f:
2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY
~~~

At that time, the new standby SC receives a peer info response from the old standby SC indicating that it needs to be promoted to become the active SC.

~~~
2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE
2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE
...
2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request
2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer
~~~

But another active SC is still alive, which leads the standby SC to reboot itself, because the cluster may have only one active SC.

~~~
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC
2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
~~~
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.22.11 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Hieu Hong Hoang We created a ScopeLock class to support recursive mutexes, and it is used widely in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
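The replacement the ticket proposes can be sketched as follows; the guarded data here is hypothetical, with std::unique_lock playing the role of the hand-written ScopeLock:

```cpp
#include <mutex>

// Hypothetical shared state guarded by a recursive mutex, as in the log module.
static std::recursive_mutex g_mutex;
static int g_counter = 0;

// std::unique_lock releases the mutex automatically when the scope ends,
// which is the behavior the custom ScopeLock class provided.
void Increment() {
  std::unique_lock<std::recursive_mutex> lock(g_mutex);
  ++g_counter;
}

// Because the mutex is std::recursive_mutex, IncrementTwice() may call
// Increment() while already holding the lock, without deadlocking.
int IncrementTwice() {
  std::unique_lock<std::recursive_mutex> lock(g_mutex);
  Increment();
  Increment();
  return g_counter;
}
```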
[tickets] [opensaf:tickets] #3299 immnd: unexpected reboot node after merge network back
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3299] immnd: unexpected reboot node after merge network back** **Status:** assigned **Milestone:** 5.22.11 **Created:** Wed Dec 01, 2021 03:07 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong Split the network into two partitions and create a dummy object on each partition. When the network merged back, the node rebooted twice. The first reboot: 2021-09-07 12:20:06.926 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:06.935 SC-7 osafamfnd[431]: NO AVD NEW_ACTIVE, adest:1 2021-09-07 12:20:06.935 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:06.952 SC-7 opensaf_reboot: Do quick local node reboot The second reboot is unexpected: 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:11.989 SC-7 osafclmna[333]: NO safNode=SC-7,safCluster=myClmCluster Joined cluster, nodeid=2070f 2021-09-07 12:20:11.992 SC-7 opensaf_reboot: Do quick local node reboot 2021-09-07 12:20:12.022 SC-7 opensafd[305]: ER Service RDE has unexpectedly crashed. Unable to continue, exiting
[tickets] [opensaf:tickets] #3306 ckpt: checkpoint node director responding to async call.
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3306] ckpt: checkpoint node director responding to async call.** **Status:** accepted **Milestone:** 5.22.11 **Created:** Thu Feb 17, 2022 10:46 AM UTC by Mohan Kanakam **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Mohan Kanakam During section creation, one ckptnd sends an async request (a normal MDS send) to another ckptnd. The receiving ckptnd, however, responds to the request on the assumption that it was a sync request requiring a response to the sender ckptnd. A ckptnd needs to respond when a sync request arrives, but it need not respond to an async request. We get the following messages in the mds log when creating the section: sc1-VirtualBox osafckptnd 27692 mds.log [meta sequenceId="2"] MDS_SND_RCV: Invalid Sync CTXT Len
[tickets] [opensaf:tickets] #3312 fmd: sc failed to failover in roaming mode
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3312] fmd: sc failed to failover in roaming mode** **Status:** assigned **Milestone:** 5.22.11 **Created:** Tue Mar 29, 2022 03:44 AM UTC by Huu The Truong **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** Huu The Truong Shutdown SC-6 (role is standby): 2022-03-07 12:14:52.551 INFO: * Stop standby SC (SC-6) SC-10 changed role to standby: 2022-03-07 12:14:54.919 SC-10 osafrded[384]: NO RDE role set to STANDBY However, a service of the old standby was still alive, so SC-10 received peer info from the old standby (SC-6) and mistook this for the active SC going down. SC-10 changed role to active and then rebooted. 2022-03-07 12:14:55.522 SC-10 osaffmd[394]: NO Controller Failover: Setting role to ACTIVE 2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO RDE role set to ACTIVE 2022-03-07 12:14:55.522 SC-10 osafrded[384]: NO Running '/usr/local/lib/opensaf/opensaf_sc_active' with 0 argument(s) 2022-03-07 12:14:55.654 SC-10 opensaf_sc_active: 49cbd770-9e07-11ec-b3b4-525400fd3480 expected on SC-1 2022-03-07 12:14:55.656 SC-10 osafntfd[439]: NO ACTIVE request 2022-03-07 12:14:55.656 SC-10 osaffmd[394]: NO Controller promoted. Stop supervision timer 2022-03-07 12:14:55.657 SC-10 osafclmd[450]: NO ACTIVE request 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: NO FAILOVER StandBy --> Active 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC 2022-03-07 12:14:55.657 SC-10 osafamfd[461]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 20a0f, SupervisionTime = 60
[tickets] [opensaf:tickets] #3316 base: increase buffer for list of members in a group
- **Milestone**: 5.22.06 --> 5.22.11 --- **[tickets:#3316] base: increase buffer for list of members in a group** **Status:** assigned **Milestone:** 5.22.11 **Created:** Tue May 24, 2022 03:36 AM UTC by PhanTranQuocDat **Last Updated:** Wed Jun 01, 2022 12:57 AM UTC **Owner:** PhanTranQuocDat When accessing IMM, a user must be authenticated as the superuser or as a member of an authorized group. When the authorized group has too many members, the default system buffer is too small to hold the member list, leading to the error: Numerical result out of range.
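"Numerical result out of range" is ERANGE, which the getgr* family of calls returns when the supplied buffer cannot hold the member list. A sketch of the usual fix, assuming a getgrnam_r-style lookup; the function name and starting size are illustrative, not taken from the base code:

```cpp
#include <grp.h>
#include <cerrno>
#include <string>
#include <vector>

// Retry the group lookup with a growing buffer instead of relying on one
// fixed default size, so groups with many members still resolve.
bool LookupGroup(const std::string& name, struct group* grp) {
  std::vector<char> buf(1024);  // illustrative starting size
  for (;;) {
    struct group* result = nullptr;
    int rc = getgrnam_r(name.c_str(), grp, buf.data(), buf.size(), &result);
    if (rc == ERANGE) {
      buf.resize(buf.size() * 2);  // buffer too small for the member list
      continue;
    }
    return rc == 0 && result != nullptr;  // false also when group not found
  }
}
```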
[tickets] [opensaf:tickets] #3300 dtm: osaftransportd replied Stream not found when delete stream
- **Type**: enhancement --> defect --- **[tickets:#3300] dtm: osaftransportd replied Stream not found when delete stream** **Status:** wontfix **Milestone:** 5.22.01 **Created:** Thu Dec 09, 2021 06:35 AM UTC by Thien Minh Huynh **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thien Minh Huynh If a stream is deleted right after trace is enabled, the error 'Stream not found' is raised. The error can cause confusion by suggesting a wrong stream name. ~~~ root@SC-1:~# osaflog --delete osafimmnd ERROR: osaftransportd replied 'Stream not found' ~~~
[tickets] [opensaf:tickets] #3292 log: Introduce an initial clm node status
- **Type**: enhancement --> defect --- **[tickets:#3292] log: Introduce an initial clm node status** **Status:** duplicate **Milestone:** 5.22.01 **Created:** Mon Oct 18, 2021 09:02 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang Currently, loga does not know when logd is ready for a request. In this ticket, we introduce a new message from logd to loga: when logd detects that a log agent is up, it sends the initial clm node status to loga. Loga should wait for that message before sending any request to logd. This could solve the message-priority issue in [1396](https://sourceforge.net/p/opensaf/tickets/1396/). In 1396, the agent down message was received before the initial client message but processed after it, due to the priority of messages. If loga waits for the initial clm node status message before sending the initial client message, all messages will be processed in the right order. The following is the order of processing messages: 1. logd: agent up message 2. loga: initial clm node status message 3. logd: initial client message 4. logd: final client message 5. logd: agent down message
[tickets] [opensaf:tickets] #209 plmd crashed while deleting plm entities at various points.
- **Milestone**: 5.22.01 --> future --- **[tickets:#209] plmd crashed while deleting plm entities at various points.** **Status:** review **Milestone:** future **Created:** Wed May 15, 2013 07:02 AM UTC by Mathi Naickan **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** MeenakshiTK When the command "immcfg -d safHE=7220_slot_1,safDomain=domain_1" was run, plmd crashed with a segmentation fault. The above object has three children: dpb_1, dpb_2 and PL-13. plmd crashed with the following backtrace: Program terminated with signal 11, Segmentation fault. #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 1360 if (0 == strcmp(tail->plm_entity->dn_name_str, (gdb) bt #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 #1 0x08088c8d in plms_chld_get (ent=0x80fe450, chld_list=0xbbc5d9f8) at plms_utils.c:842 #2 0x0805ca90 in plms_delete_objects (obj_type=6, obj_name=0x810a2a8) at plms_imm.c:697 #3 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=68719608079, ccb_id=2) at plms_imm.c:1425 #4 0x032fc46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #5 0x032fb393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #6 0x032ebcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #7 0x08051bff in main (argc=2, argv=0xbbc5e414) at plms_main.c:484 #8 0x033aee0c in libc_start_main () from /lib/libc.so.6 #9 0x0804c401 in _start () While deleting an entity that has no children, it crashed with the following backtrace: #0 0x0805cb14 in plms_delete_objects (obj_type=7, obj_name=0x8109588) at plms_imm.c:707 #1 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=824633852431, ccb_id=6) at plms_imm.c:1425 #2 0x0189c46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #3 0x0189b393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #4 0x0188bcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #5 0x08051bff in main (argc=2, argv=0xbe1d6db4) at plms_main.c:484 #6 0x0194ee0c in libc_start_main () from /lib/libc.so.6 #7 0x0804c401 in _start () Also, check the following issue: a crash in the plmc_err callback when ee_id is passed as an empty string.
[tickets] [opensaf:tickets] #2082 CKPT : Track cbk not invoked for section creation after cpnd restart
- **Milestone**: 5.22.01 --> future --- **[tickets:#2082] CKPT : Track cbk not invoked for section creation after cpnd restart** **Status:** review **Milestone:** future **Created:** Thu Sep 29, 2016 11:06 AM UTC by Srikanth R **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Changeset: 7997 5.1.FC The track callback is not invoked after cpnd restart. Below are the APIs called from the applications spawned on two nodes, i.e. payloads. On the first node: -> Initialize with cpsv -> Create a ckpt with the ACTIVE REPLICA flag. On the second node: -> Initialize with cpsv. On the first node: -> Open the checkpoint in writing mode -> Open the checkpoint in reading mode -> Kill the cpnd process -> Register for the track callback. On the second node: -> Open the ckpt in read mode -> Kill the cpnd process -> Register for the track callback. After ensuring that both agents have registered for the track callback, create a section from the application on the first node. For section creation, the callback should be invoked for the applications on both nodes. Currently the callback is not invoked for the application on the second node. Without cpnd restart, the callback is invoked for both applications.
[tickets] [opensaf:tickets] #2454 amfnd: Clean up variable of active amfd status
- **Milestone**: 5.22.01 --> future --- **[tickets:#2454] amfnd: Clean up variable of active amfd status** **Status:** review **Milestone:** future **Created:** Thu May 04, 2017 12:34 PM UTC by Minh Hon Chau **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Minh Hon Chau In amfnd, we have the variable cb->is_avd_down and a set of macros: m_AVND_CB_IS_AVD_UP, m_AVND_CB_AVD_UP_SET, m_AVND_CB_AVD_UP_RESET, which use the flag AVND_CB_FLAG_AVD_UP; they all indicate whether the active amfd is up or down. Amfnd should use only the variable @is_avd_down or the macros, not both.
[tickets] [opensaf:tickets] #2500 build: Schema files (.xsd) are missing from distribution tarballs
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2500] build: Schema files (.xsd) are missing from distribution tarballs** **Status:** unassigned **Milestone:** future **Created:** Fri Jun 16, 2017 10:45 AM UTC by Anders Widell **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Anders Widell The XML schema (xsd) files are missing in the release tarballs, at least for IMM and probably also for SMF.
[tickets] [opensaf:tickets] #2849 smf: Incorrect logging may flood syslog
- **Milestone**: 5.22.01 --> future --- **[tickets:#2849] smf: Incorrect logging may flood syslog** **Status:** review **Milestone:** future **Created:** Tue May 08, 2018 11:29 AM UTC by elunlen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Krishna Pawar In the function getNodeDestination() in SmfUtils there is a: LOG_NO("%s: className '%s'", __FUNCTION__, className); That should be changed to a TRACE or removed. It is printed in a loop that may go on for 10 seconds with a delay of 2 seconds, meaning that this printout may happen 5 times. The problem, however, is that this function is called in a loop by waitForNodeDestination(). The logging is not done at very high speed though: one log every 2 seconds.
[tickets] [opensaf:tickets] #2861 osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1
- **Milestone**: 5.22.01 --> future --- **[tickets:#2861] osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1** **Status:** unassigned **Milestone:** future **Created:** Mon May 21, 2018 02:25 PM UTC by Hans Nordebäck **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** nobody Compiling with gcc 8.1 with -Wno-format-truncation -Wno-stringop-overflow -Wno-format-overflow makes the compilation succeed, but these -Wno- flags should not be used; the faulty constructs should instead be corrected.
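For instance, -Wformat-truncation typically points at an snprintf whose result is never checked; the correction is to handle truncation rather than suppress the warning. A hedged sketch, with an illustrative function not taken from the OpenSAF tree:

```cpp
#include <cstddef>
#include <cstdio>

// snprintf returns the length it wanted to write; treating truncation (or an
// encoding error, reported as a negative value) as a failure fixes the kind
// of construct gcc 8.1 warns about instead of silencing the warning.
bool BuildPath(char* out, size_t outlen, const char* dir, const char* file) {
  int n = snprintf(out, outlen, "%s/%s", dir, file);
  return n >= 0 && static_cast<size_t>(n) < outlen;
}
```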
[tickets] [opensaf:tickets] #2914 clm : add missing testcases in Clm apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2914] clm : add missing testcases in Clm apitest** **Status:** review **Milestone:** future **Created:** Mon Aug 20, 2018 01:34 PM UTC by Richa **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Adding missing test cases in clm apitest.
[tickets] [opensaf:tickets] #2930 ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.
- **Milestone**: 5.22.01 --> future --- **[tickets:#2930] ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.** **Status:** review **Milestone:** future **Created:** Thu Sep 20, 2018 07:52 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Steps to reproduce: 1) Create a non-collocated checkpoint on SC-1 and PL-4. 2) Create one section on SC-1. 3) During the switchover operation, close the application on PL-4. 4) After the switchover, close the application on SC-1. 5) Observe that the checkpoint is not deleted from /dev/shm on SC-1 and SC-2. SC-1: root@mohan-VirtualBox:/home/mohan/opensaf-code/src/ckpt/ckptnd# ls /dev/shm/ opensaf_CPND_CHECKPOINT_INFO_131343 opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO pulse-shm-1049372244 pulse-shm-2170855640 pulse-shm-493188609 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131343_2 pulse-shm-2086668063 pulse-shm-3681026513 SC-2: root@mohan-VirtualBox:~# ls /dev/shm opensaf_CPND_CHECKPOINT_INFO_131599 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131599_2 pulse-shm-2892283080 pulse-shm-2910971180 pulse-shm-3340597930 pulse-shm-528662130 pulse-shm-551961907
[tickets] [opensaf:tickets] #2932 ckpt: converting the checkpoint service from c to c++
- **Milestone**: 5.22.01 --> future --- **[tickets:#2932] ckpt: converting the checkpoint service from c to c++ ** **Status:** review **Milestone:** future **Created:** Mon Oct 01, 2018 01:25 PM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Converting the checkpoint service from C to C++.
[tickets] [opensaf:tickets] #2966 lck: add missing test case of lck apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2966] lck: add missing test case of lck apitest** **Status:** review **Milestone:** future **Created:** Mon Nov 19, 2018 05:30 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2967 msg: add missing test case of msg apitest
- **Milestone**: 5.22.01 --> future --- ** [tickets:#2967] msg: add missing test case of msg apitest** **Status:** review **Milestone:** future **Created:** Tue Nov 20, 2018 05:33 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #3254 Enhancement of NTF notification
- **Milestone**: 5.22.01 --> future --- **[tickets:#3254] Enhancement of NTF notification** **Status:** assigned **Milestone:** future **Created:** Thu Mar 18, 2021 03:54 AM UTC by Thanh Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thanh Nguyen When IMM changes an attribute with the NOTIFY flag, NTF sends the changed attribute values and the old attribute values fetched from IMM. Fetching the old attribute values from IMM might fail under a certain condition, in which the old values are overwritten before NTF attempts to fetch them. To avoid this situation, IMM will spontaneously send the old attribute values to NTF.
[tickets] [opensaf:tickets] #3280 dtm: loss of TCP connection requires node reboot
- **Milestone**: 5.22.01 --> future --- **[tickets:#3280] dtm: loss of TCP connection requires node reboot** **Status:** unassigned **Milestone:** future **Created:** Fri Aug 27, 2021 11:33 AM UTC by Mohan Kanakam **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Mohan Kanakam Sometimes we see loss of TCP connection between payloads, or between a controller and payloads, in the cluster. Example: if we have 2 controllers and 10 payloads (starting from PL-3 to PL-10), we see TCP connection loss between PL-4 and PL-5, while the connections of PL-4 with the other payloads remain established. We also see connection loss between PL-7 and SC-2, while the connections of PL-7 with the other nodes remain established. This results in a PL-7 reboot when controller failover happens, i.e. SC-1 fails and SC-2 takes the active role: PL-7 thinks there was a single controller in the cluster and it reboots. This can be reproduced by adding an iptables rule to drop the packets. So the expected behavior is that dtmd on PL-4/PL-5 retries the connection a few times before declaring the node down. The only drawback of this approach is that it will delay the application failover time, or even the controller failover time. Any suggestions?
[tickets] [opensaf:tickets] #3293 log: Replace ScopeLock by standard lock
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3293] log: Replace ScopeLock by standard lock** **Status:** review **Milestone:** 5.22.04 **Created:** Fri Oct 22, 2021 12:24 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang We created a ScopeLock class to support recursive mutexes, and it is used widely in the log module. However, the C++ standard library has std::unique_lock, which supports std::recursive_mutex. We should use the standard lock instead of maintaining a custom class.
[tickets] [opensaf:tickets] #3294 mds: refactor huge api functions
- **Milestone**: 5.22.01 --> future --- **[tickets:#3294] mds: refactor huge api functions** **Status:** assigned **Milestone:** future **Created:** Tue Oct 26, 2021 06:02 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang Some functions have 1.5K+ lines of code, which makes them hard to maintain. We should refactor them into smaller sub-functions. For example: ~~~ line 1863: uint32_t mds_mcm_svc_up(PW_ENV_ID pwe_id, MDS_SVC_ID svc_id, V_DEST_RL role, line 1864: NCSMDS_SCOPE_TYPE scope, MDS_VDEST_ID vdest_id, line 1865: NCS_VDEST_TYPE vdest_policy, MDS_DEST adest, line 1866: bool my_pcon, MDS_SVC_HDL local_svc_hdl, line 1867: MDS_SUBTN_REF_VAL subtn_ref_val, line 1868: MDS_SVC_PVT_SUB_PART_VER svc_sub_part_ver, line 1869: MDS_SVC_ARCHWORD_TYPE archword_type) line 1870: { line 3494: } ~~~
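The direction the ticket suggests can be sketched as passing one context object through small named steps instead of one 1600-line body; all names below are hypothetical, not the real MDS types:

```cpp
// Hypothetical context: the real function's many parameters become one
// struct that each small step can read and update.
struct SvcUpContext {
  int pwe_id;
  int svc_id;
  bool valid = false;
};

// Each sub-function does one well-named piece of the original body.
static void ValidateInput(SvcUpContext* ctx) { ctx->valid = ctx->svc_id > 0; }
static void UpdateSubscriptions(SvcUpContext* /*ctx*/) { /* ... */ }

// The top-level function shrinks to a readable sequence of steps.
int SvcUpSketch(SvcUpContext* ctx) {
  ValidateInput(ctx);
  if (!ctx->valid) return 1;  // early out on invalid input
  UpdateSubscriptions(ctx);
  return 0;
}
```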
[tickets] [opensaf:tickets] #3295 dtm: osaflog tool does not work with short argument
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3295] dtm: osaflog tool does not work with short argument** **Status:** assigned **Milestone:** 5.22.04 **Created:** Thu Nov 18, 2021 03:05 AM UTC by Thien Minh Huynh **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Vu Minh Hoang Steps to reproduce: osaflog -p mds.log Expected: the messages of the mds stream are printed out, as with the long argument --print.
[tickets] [opensaf:tickets] #3299 immnd: unexpected reboot node after merge network back
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3299] immnd: unexpected reboot node after merge network back** **Status:** assigned **Milestone:** 5.22.04 **Created:** Wed Dec 01, 2021 03:07 AM UTC by Huu The Truong **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Huu The Truong Split the network into two partitions and create a dummy object on each partition. When the network merged back, the node rebooted twice. The first reboot: 2021-09-07 12:20:06.926 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:06.935 SC-7 osafamfnd[431]: NO AVD NEW_ACTIVE, adest:1 2021-09-07 12:20:06.935 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:06.952 SC-7 opensaf_reboot: Do quick local node reboot The second reboot is unexpected: 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: NO Used to be on another partition. Rebooting... 2021-09-07 12:20:11.963 SC-7 osafimmnd[376]: Quick local node rebooting, Reason: Used to be on another partition. Rebooting... 2021-09-07 12:20:11.989 SC-7 osafclmna[333]: NO safNode=SC-7,safCluster=myClmCluster Joined cluster, nodeid=2070f 2021-09-07 12:20:11.992 SC-7 opensaf_reboot: Do quick local node reboot 2021-09-07 12:20:12.022 SC-7 opensafd[305]: ER Service RDE has unexpectedly crashed. Unable to continue, exiting
[tickets] [opensaf:tickets] #3304 mds: Packet loss without log
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3304] mds: Packet loss without log** **Status:** review **Milestone:** 5.22.04 **Created:** Tue Jan 04, 2022 03:47 AM UTC by Hieu Hong Hoang **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Hieu Hong Hoang The mds module has a packet loss feature that provides a callback invoked when packet loss is detected. Because the feature is turned off by default, no callback is invoked and no log is written when packet loss occurs. With this ticket, a packet loss log will be printed by default. These logs are helpful in many cases.
[tickets] [opensaf:tickets] #3288 fmd: failed during setting role from standby to active
- **Milestone**: 5.22.01 --> 5.22.04 --- **[tickets:#3288] fmd: failed during setting role from standby to active** **Status:** review **Milestone:** 5.22.04 **Created:** Tue Oct 05, 2021 03:11 AM UTC by Huu The Truong **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Huu The Truong After the standby SC went down, another SC was promoted to become the new standby SC. 2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Node Down event for node id 2040f: 2021-09-28 07:00:35.950 SC-2 osaffmd[392]: NO Current role: STANDBY At that time, the new standby SC received a peer info response from the old standby SC and was promoted to become the active SC. 2021-09-28 07:00:35.972 SC-2 osaffmd[392]: NO Controller Failover: Setting role to ACTIVE 2021-09-28 07:00:35.972 SC-2 osafrded[382]: NO RDE role set to ACTIVE ... 2021-09-28 07:00:36.113 SC-2 osafclmd[448]: NO ACTIVE request 2021-09-28 07:00:36.114 SC-2 osaffmd[392]: NO Controller promoted. Stop supervision timer But another active SC was still alive, so the standby SC rebooted itself because the cluster can have only one active SC. 2021-09-28 07:00:36.117 SC-2 osafamfd[459]: ER FAILOVER StandBy --> Active FAILED, Standby OUT OF SYNC 2021-09-28 07:00:36.117 SC-2 osafamfd[459]: Rebooting OpenSAF NodeId = 0 EE Name = No EE Mapped, Reason: FAILOVER failed, OwnNodeId = 2020f, SupervisionTime = 60
[tickets] [opensaf:tickets] #3285 amf: amfd takes time to update runtime of node in large cluster size
- **Component**: unknown --> amf --- **[tickets:#3285] amf: amfd takes time to update runtime of node in large cluster size** **Status:** fixed **Milestone:** 5.22.01 **Created:** Thu Sep 23, 2021 08:55 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thang Duc Nguyen In a large cluster (larger than 36 nodes) with many components residing on each node, the runtime attributes of nodes (AdminState and OperationalState) take time to update in IMM, so an application can read a stale node state from IMM even though AMF has already updated its database. Suggestion: use a sync request to update these node attributes with high priority.
[tickets] [opensaf:tickets] #3285 amf: amfd takes time to update runtime of node in large cluster size
- **Component**: amf --> unknown --- **[tickets:#3285] amf: amfd takes time to update runtime of node in large cluster size** **Status:** fixed **Milestone:** 5.22.01 **Created:** Thu Sep 23, 2021 08:55 AM UTC by Thang Duc Nguyen **Last Updated:** Sun Jan 23, 2022 09:58 PM UTC **Owner:** Thang Duc Nguyen In a large cluster (larger than 36 nodes) with many components residing on each node, the runtime attributes of nodes (AdminState and OperationalState) take time to update in IMM, so an application can read a stale node state from IMM even though AMF has already updated its database. Suggestion: use a sync request to update these node attributes with high priority.
[tickets] [opensaf:tickets] #3268 imm: New admin op to force IMM Agent using new timeout value
- **Component**: unknown --> imm --- **[tickets:#3268] imm: New admin op to force IMM Agent using new timeout value** **Status:** fixed **Milestone:** 5.21.09 **Created:** Mon Jun 21, 2021 01:02 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 08:06 AM UTC **Owner:** Thien Minh Huynh This ticket is created in conjunction with #3260, in which a single-step upgrade fails due to many component timeouts. Technically, the IMMA_SYNC_TIMEOUT environment variable can be exported with an appropriate value for each component on every node; however, if the cluster has several nodes, each with hundreds of components, updating IMMA_SYNC_TIMEOUT on a live site is difficult. This ticket introduces a configurable attribute to distribute the new timeout to all IMM clients and force them to use it.
[tickets] [opensaf:tickets] #3268 imm: New admin op to force IMM Agent using new timeout value
- **Component**: imm --> unknown --- **[tickets:#3268] imm: New admin op to force IMM Agent using new timeout value** **Status:** fixed **Milestone:** 5.21.09 **Created:** Mon Jun 21, 2021 01:02 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thien Minh Huynh This ticket is created in conjunction with #3260, in which a single-step upgrade fails due to many component timeouts. Technically, the IMMA_SYNC_TIMEOUT environment variable can be exported with an appropriate value for each component on every node; however, if the cluster has several nodes, each with hundreds of components, updating IMMA_SYNC_TIMEOUT on a live site is difficult. This ticket introduces a configurable attribute to distribute the new timeout to all IMM clients and force them to use it.
[tickets] [opensaf:tickets] #2849 smf: Incorrect logging may flood syslog
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2849] smf: Incorrect logging may flood syslog** **Status:** review **Milestone:** 5.21.12 **Created:** Tue May 08, 2018 11:29 AM UTC by elunlen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Krishna Pawar In the function getNodeDestination() in SmfUtils there is a: LOG_NO("%s: className '%s'", __FUNCTION__, className); That should be changed to a TRACE or removed. It is printed in a loop that may go on for 10 seconds with a delay of 2 seconds, meaning that this printout may happen 5 times. The problem, however, is that this function is called in a loop by waitForNodeDestination(). The logging is not done at very high speed though: one log every 2 seconds.
[tickets] [opensaf:tickets] #2855 mds: Improve tipc receive logic
- **Milestone**: 5.21.09 --> future --- **[tickets:#2855] mds: Improve tipc receive logic** **Status:** accepted **Milestone:** future **Created:** Wed May 16, 2018 12:29 PM UTC by Hans Nordebäck **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau The tipc receive buffer is 2MB and should not be extended, as that would affect e.g. available kernel memory. The receive thread normally runs as a realtime thread (RR) with the gl_mds_library_mutex taken, in contention with other MDS time-sharing threads. In several cases TIPC_OVERLOAD has occurred because messages are not consumed fast enough, and TIPC drops messages when the receive buffer is full. This ticket will change the receive thread from a realtime thread to the standard round-robin time-sharing policy, and a new realtime thread will be created that only receives messages and adds them to a larger buffer using a lock-free algorithm. The old receive thread will run as a standard time-sharing thread and will consume messages from this shared buffer. The recvmsg call in recvfrom_connectionless will be changed to read from the shared buffer.
[tickets] [opensaf:tickets] #2856 imm: create test cases to verify the ticket 2422
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- ** [tickets:#2856] imm: create test cases to verify the ticket 2422** **Status:** unassigned **Milestone:** future **Created:** Thu May 17, 2018 06:29 AM UTC by Vu Minh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Vu Minh Nguyen Create test cases for the ticket [#2422].
[tickets] [opensaf:tickets] #2861 osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2861] osaf: Faulty constructs in opensaf detected when compiling with gcc 8.1** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Mon May 21, 2018 02:25 PM UTC by Hans Nordebäck **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody Compiling with gcc 8.1 using -Wno-format-truncation -Wno-stringop-overflow -Wno-format-overflow makes the compilation succeed, but these -Wno- flags should not be used; the faulty constructs should instead be corrected.
[tickets] [opensaf:tickets] #209 plmd crashed while deleting plm entities at various points.
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#209] plmd crashed while deleting plm entities at various points.** **Status:** review **Milestone:** 5.21.12 **Created:** Wed May 15, 2013 07:02 AM UTC by Mathi Naickan **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** MeenakshiTK When the command "immcfg -d safHE=7220_slot_1,safDomain=domain_1" is run, plm crashed with a segmentation fault. The above object has three children: dpb_1, dpb_2 and PL-13. plmd crashed with the following backtrace: Program terminated with signal 11, Segmentation fault. #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 1360 if (0 == strcmp(tail->plm_entity->dn_name_str, (gdb) bt #0 0x08089af6 in plms_ent_to_ent_list_add (ent=0x80fe450, list=0xbbc5d9f8) at plms_utils.c:1360 #1 0x08088c8d in plms_chld_get (ent=0x80fe450, chld_list=0xbbc5d9f8) at plms_utils.c:842 #2 0x0805ca90 in plms_delete_objects (obj_type=6, obj_name=0x810a2a8) at plms_imm.c:697 #3 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=68719608079, ccb_id=2) at plms_imm.c:1425 #4 0x032fc46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #5 0x032fb393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #6 0x032ebcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #7 0x08051bff in main (argc=2, argv=0xbbc5e414) at plms_main.c:484 #8 0x033aee0c in libc_start_main () from /lib/libc.so.6 #9 0x0804c401 in _start () While deleting an entity that does not have any child, it crashed with the following backtrace: #0 0x0805cb14 in plms_delete_objects (obj_type=7, obj_name=0x8109588) at plms_imm.c:707 #1 0x0805fa05 in plms_imm_ccb_apply_cbk (imm_oi_hdl=824633852431, ccb_id=6) at plms_imm.c:1425 #2 0x0189c46f in imma_process_callback_info (cb=) at imma_proc.c:2005 #3 0x0189b393 in imma_hdl_callbk_dispatch_one (cb=) at imma_proc.c:1592 #4 0x0188bcfd in saImmOiDispatch (immOiHandle=) at imma_oi_api.c:548 #5 0x08051bff in main (argc=2, argv=0xbe1d6db4) at plms_main.c:484 #6 0x0194ee0c in libc_start_main () from /lib/libc.so.6 #7 0x0804c401 in _start () Also, check the following issue: a crash in the plmc_err callback when ee_id is passed as an empty string.
[tickets] [opensaf:tickets] #2082 CKPT : Track cbk not invoked for section creation after cpnd restart
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2082] CKPT : Track cbk not invoked for section creation after cpnd restart** **Status:** review **Milestone:** 5.21.12 **Created:** Thu Sep 29, 2016 11:06 AM UTC by Srikanth R **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Changeset: 7997 5.1.FC The track callback is not invoked after cpnd restart. Below are the APIs called from the applications spawned on two nodes, i.e. payloads. On the first node: -> Initialize with cpsv -> Create a ckpt with the ACTIVE REPLICA flag. On the second node: -> Initialize with cpsv. On the first node: -> Open the checkpoint in write mode -> Open the checkpoint in read mode -> Kill the cpnd process -> Register for the track callback. On the second node: -> Open the ckpt in read mode -> Kill the cpnd process -> Register for the track callback. After ensuring that both agents have registered for the track callback, create a section from the application on the first node. For section creation, the callback should be invoked for the applications on both nodes. Currently the callback is not invoked for the application on the second node. Without a cpnd restart, the callback is invoked for both applications.
[tickets] [opensaf:tickets] #2451 clm: Make the cluster reset admin op safe
- **status**: review --> unassigned - **Milestone**: 5.21.09 --> future --- **[tickets:#2451] clm: Make the cluster reset admin op safe** **Status:** unassigned **Milestone:** future **Created:** Wed May 03, 2017 10:51 AM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hans Nordebäck The cluster reset admin operation that was implemented in ticket [#2053] is not safe: if a node reboots very fast it can come up again and join the old cluster before the other nodes have rebooted. See mail discussion: https://sourceforge.net/p/opensaf/mailman/message/35398725/ This can be solved by implementing a two-phase cluster reset or by introducing a cluster generation number which is increased at each cluster reset (for both ordered and spontaneous cluster resets). A node will not be allowed to join a cluster with a different cluster generation without first rebooting.
[tickets] [opensaf:tickets] #2454 amfnd: Clean up variable of active amfd status
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#2454] amfnd: Clean up variable of active amfd status** **Status:** review **Milestone:** 5.21.12 **Created:** Thu May 04, 2017 12:34 PM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau In amfnd we have the variable cb->is_avd_down and the set of macros m_AVND_CB_IS_AVD_UP, m_AVND_CB_AVD_UP_SET and m_AVND_CB_AVD_UP_RESET, which use the flag AVND_CB_FLAG_AVD_UP; they all indicate whether the active amfd is down or up. Amfnd should use only the variable @is_avd_down or the macros, not both.
[tickets] [opensaf:tickets] #2500 build: Schema files (.xsd) are missing from distribution tarballs
- **status**: accepted --> unassigned - **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2500] build: Schema files (.xsd) are missing from distribution tarballs** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Jun 16, 2017 10:45 AM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell The XML schema (xsd) files are missing in the release tarballs, at least for IMM and probably also for SMF.
[tickets] [opensaf:tickets] #2541 nid: order of system log print out is not correct
- **Milestone**: 5.21.09 --> future --- **[tickets:#2541] nid: order of system log print out is not correct** **Status:** review **Milestone:** future **Created:** Wed Aug 02, 2017 07:52 AM UTC by Rafael Odzakow **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Rafael Odzakow Using echo -n in opensafd delays the write to the log in a systemd environment, causing an inconsistent order of the logs: "Starting opensaf" ends up after "Startup finished" in the system log.
[tickets] [opensaf:tickets] #2612 amfd: return SA_AIS_ERR_NO_RESOURCES for CSI CCB if SG is unstable
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2612] amfd: return SA_AIS_ERR_NO_RESOURCES for CSI CCB if SG is unstable** **Status:** assigned **Milestone:** future **Created:** Thu Oct 05, 2017 03:23 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Alex Jones Return SA_AIS_ERR_NO_RESOURCES if CSI ccb apply fails due to SG being unstable. This is similar to ticket #2184, but for CSIs.
[tickets] [opensaf:tickets] #2693 clm: Use longer election delay time on isolated nodes
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- ** [tickets:#2693] clm: Use longer election delay time on isolated nodes** **Status:** unassigned **Milestone:** future **Created:** Tue Nov 21, 2017 04:52 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell In addition to the CLMNA_ELECTION_DELAY_TIME configuration, allow configuration of a separate (longer) election delay time to be used on isolated nodes, i.e. nodes that cannot see any other node on the network. This will decrease the possibility of split-brain in situations where a node is temporarily disconnected from the rest of the cluster.
[tickets] [opensaf:tickets] #2768 nid: Use host name when node_name is missing or empty
- **status**: assigned --> unassigned - **Milestone**: 5.21.09 --> future --- **[tickets:#2768] nid: Use host name when node_name is missing or empty** **Status:** unassigned **Milestone:** future **Created:** Mon Jan 22, 2018 01:00 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Anders Widell In the case /etc/opensaf/node_name is missing or empty, we can use the hostname of the machine we are running on. This would reduce the amount of per-node configuration needed when hostname and node_name are the same. Also, make /etc/opensaf/node_name empty by default when installing OpenSAF.
[tickets] [opensaf:tickets] #2798 mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed
- **Milestone**: 5.21.09 --> future --- **[tickets:#2798] mds: mdstest 5 1, 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6 failed** **Status:** review **Milestone:** future **Created:** Wed Mar 07, 2018 04:19 AM UTC by Hoa Le **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hoa Le **Attachments:** - [mdstest_5_1.tar.gz](https://sourceforge.net/p/opensaf/tickets/2798/attachment/mdstest_5_1.tar.gz) (8.4 MB; application/gzip) Opensaf commit 5629f554686a498f328e0c79fc946379cbcf6967 mdstest 5 1 ~~~ LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_subscribe(gl_tet_adest.mds_pwe1_hdl, 500, NCSMDS_SCOPE_INTRACHASSIS, 2, svcids) != NCSCC_RC_SUCCESS) { LOG_NO("\nFail\n"); FAIL = 1; } else { LOG_NO("\nAction: Retrieve only ONE event\n"); if (mds_service_retrieve(gl_tet_adest.mds_pwe1_hdl, 500, SA_DISPATCH_ONE) != NCSCC_RC_SUCCESS) { LOG_NO("Fail, retrieve ONE\n"); FAIL = 1; } else LOG_NO("\nSuccess\n"); ~~~ After the subscription request succeeds, mdstest is expected to receive two MDS_UP events, for services 600 and 700. This info is retrieved in the next step of the test case (mds_service_retrieve). The problem here is that these MDS_UP events are processed in a separate (parallel) thread (the mds core thread), not the test case's main thread. In a bad scenario, if the mds core thread is not scheduled before the RETRIEVE operation in the main thread, the RETRIEVE request with the "SA_DISPATCH_ONE" flag returns an error and the test case fails. <143>1 2018-03-07T01:10:29.936907+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="155"] << mds_mcm_svc_subscribe*** // MDS SUBSCRIBE request*** ... 
<142>1 2018-03-07T01:10:29.937631+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="162"] MDS_SND_RCV: info->info.retrieve_msg.i_dispatchFlags == SA_DISPATCH_ONE*** // MDS RETRIEVE request with SA DISPATCH ONE flag came before MDS UP events being processed*** <139>1 2018-03-07T01:10:29.937729+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="163"] MDS_SND_RCV: msgelem == NULL <142>1 2018-03-07T01:10:29.937953+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="164"] MDTM: Processing pollin events <142>1 2018-03-07T01:10:29.938333+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="165"] MDTM: Received SVC event <143>1 2018-03-07T01:10:29.93838+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="166"] >> mds_mcm_svc_up <143>1 2018-03-07T01:10:29.938418+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="167"] MCM:API: LOCAL SVC INFO : svc_id = INTERNAL(500) | PWE id = 1 | VDEST id = 65535 | <143>1 2018-03-07T01:10:29.938439+07:00 SC-1 mdstest 473 mds.log [meta sequenceId="168"] MCM:API: REMOTE SVC INFO : svc_id = EXTERNAL(600) | PWE id = 1 | VDEST id = 65535 | POLICY = 1 | SCOPE = 3 | ROLE = 1 | MY_PCON = 0 | 2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Action: Retrieve only ONE event 2018-03-07 01:10:29.941 SC-1 mdstest: NO #012Request to ncsmds_api: MDS RETRIEVE has FAILED 2018-03-07 01:10:29.942 SC-1 mdstest: NO Fail, retrieve ONE The same issue was observed in mdstest 5 9, 4 10, 4 12, 10 1, 10 2, 14 5, 14 6.
[tickets] [opensaf:tickets] #3281 mds: Wrong sending NO_ACTIVE after split brain detection
- **Milestone**: 5.21.09 --> 5.21.12 --- **[tickets:#3281] mds: Wrong sending NO_ACTIVE after split brain detection** **Status:** assigned **Milestone:** 5.21.12 **Created:** Mon Sep 06, 2021 01:47 AM UTC by Hieu Hong Hoang **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [dump_mds_5.21.06.patch](https://sourceforge.net/p/opensaf/tickets/3281/attachment/dump_mds_5.21.06.patch) (3.5 kB; application/octet-stream) - [mds.log.1](https://sourceforge.net/p/opensaf/tickets/3281/attachment/mds.log.1) (12.0 MB; application/octet-stream) Configuration: - Cluster with 10 SCs, SC absence allowed. - Split the cluster into four partitions: [[SC-1, SC-2], [SC-3, SC-4, SC-5, SC-6], [SC-7, SC-8, SC-9], [SC-10]] - Roles of SCs after the network splits: [[SC-1(ACT), SC-2(STB)], [SC-3(ACT), SC-4, SC-5(STB), SC-6], [SC-7(STB), SC-8(ACT), SC-9], [SC-10(ACT)]] - Network merges Observation: - All active and standby SCs rebooted due to split brain detection, except the SCs in partition 1. The SCs in partition 1 do not reboot because the active and standby SCs in the other partitions rebooted too fast. - The ntf agent on SC-1 fails to send notifications to the ntf server and never recovers. 
~~~ <143>1 2021-09-04T07:11:56.225221+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136348"] 439:amf/amfd/imm.cc:419 >> execute <143>1 2021-09-04T07:11:56.225223+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136349"] 439:amf/amfd/ntf.cc:804 >> exec: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:11:56.225227+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136350"] 439:amf/amfd/ntf.cc:491 >> avd_try_send_notification: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:11:56.225231+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136351"] 439:ntf/agent/ntfa_api.c:2016 >> saNtfNotificationSend <143>1 2021-09-04T07:11:56.225235+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136352"] 439:ntf/agent/ntfa_api.c:62 TR NTFS server is unavailable <143>1 2021-09-04T07:11:56.225238+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136353"] 439:ntf/agent/ntfa_api.c:2260 << saNtfNotificationSend <143>1 2021-09-04T07:11:56.225241+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136354"] 439:amf/amfd/ntf.cc:513 TR Notification Send unsuccesful TRY_AGAIN or TIMEOUT rc:6 <143>1 2021-09-04T07:11:56.225243+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136355"] 439:amf/amfd/ntf.cc:532 << avd_try_send_notification <143>1 2021-09-04T07:11:56.225246+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136356"] 439:amf/amfd/ntf.cc:811 TR TRY-AGAIN <143>1 2021-09-04T07:11:56.225249+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136357"] 439:amf/amfd/ntf.cc:822 << exec <143>1 2021-09-04T07:11:56.225252+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="136358"] 439:amf/amfd/imm.cc:427 << execute: 2 ... 
<143>1 2021-09-04T07:26:00.185418+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191040"] 439:amf/amfd/imm.cc:419 >> execute <143>1 2021-09-04T07:26:00.185465+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191041"] 439:amf/amfd/ntf.cc:804 >> exec: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:26:00.185475+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191042"] 439:amf/amfd/ntf.cc:491 >> avd_try_send_notification: Ntf Type:3000, sent status:0 <143>1 2021-09-04T07:26:00.185485+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191043"] 439:ntf/agent/ntfa_api.c:2016 >> saNtfNotificationSend <143>1 2021-09-04T07:26:00.185497+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191044"] 439:ntf/agent/ntfa_api.c:59 TR NTFS server is down <143>1 2021-09-04T07:26:00.185505+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191045"] 439:ntf/agent/ntfa_api.c:2260 << saNtfNotificationSend <143>1 2021-09-04T07:26:00.185513+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191046"] 439:amf/amfd/ntf.cc:513 TR Notification Send unsuccesful TRY_AGAIN or TIMEOUT rc:6 <143>1 2021-09-04T07:26:00.18552+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191047"] 439:amf/amfd/ntf.cc:532 << avd_try_send_notification <143>1 2021-09-04T07:26:00.185528+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191048"] 439:amf/amfd/ntf.cc:811 TR TRY-AGAIN <143>1 2021-09-04T07:26:00.185536+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191049"] 439:amf/amfd/ntf.cc:822 << exec <143>1 2021-09-04T07:26:00.185543+02:00 SC-1 osafamfd 439 osafamfd [meta sequenceId="191050"] 439:amf/amfd/imm.cc:427 << execute: 2 ~~~ Reason: - Before the active SC-3 rebooted, SC-1 still had enough time to connect to the ntf server in SC-3. When SC-3 rebooted, the ntf agent in SC-1 received a NO_ACTIVE message of NTFS service. Actually, the ntf server in SC-1 is still in active state. - The following mds log is generated by the opensaf code which applied the patch dump_mds_5.21.06.patch. 
- The ntf agent in SC-1 detects that the ntf server in SC-3 is up. MDS updates the active destination of the NTFS service to
[tickets] [opensaf:tickets] #3280 dtm: loss of TCP connection requires node reboot
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3280] dtm: loss of TCP connection requires node reboot** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Aug 27, 2021 11:33 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Sometimes we see loss of a TCP connection between payloads, or between a controller and a payload, in the cluster. Example: if we have 2 controllers and 10 payloads (starting from PL-3 to PL-10), we see TCP connection loss between PL-4 and PL-5, while the connections of PL-4 with the other payloads remain established. We also see connection loss between PL-7 and SC-2, while the connections of PL-7 with the other nodes remain established. This results in a PL-7 reboot when controller failover happens, i.e. SC-1 fails and SC-2 takes the ACT role: PL-7 thinks there was a single controller in the cluster, so it reboots. This can be reproduced by adding an iptables rule to drop the packets. So the expected behavior is that dtmd on PL-4/PL-5 retries the connection a few times before declaring the node down. The only drawback of this approach is that it will delay the application failover time, or even the controller failover time. Any suggestions? --- Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.___ Opensaf-tickets mailing list Opensaf-tickets@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
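The retry behaviour proposed above could be sketched roughly as follows. This is only an illustration: `reconnect_with_retry`, `kMaxRetries`, and the retry budget are assumptions, not existing dtmd code. The idea is that instead of declaring the peer down on the first failed connection, dtmd attempts to re-establish it a few times first.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Illustrative constants - a real budget would have to be tuned against
// the failover-delay concern raised in the ticket.
constexpr int kMaxRetries = 3;
constexpr auto kRetryDelay = std::chrono::milliseconds(100);

// Retries the connection attempt a few times before giving up.
// Returns true if the peer came back within the retry budget; only a
// false result should lead to declaring the node down.
bool reconnect_with_retry(const std::function<bool()>& try_connect) {
  for (int attempt = 0; attempt < kMaxRetries; ++attempt) {
    if (try_connect()) return true;
    std::this_thread::sleep_for(kRetryDelay);
  }
  return false;  // budget exhausted: declare the node down
}
```

Note that in the genuine-failure case each retry adds `kRetryDelay` to the detection time, which is exactly the failover-delay drawback the ticket mentions.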
[tickets] [opensaf:tickets] #3271 amf: Issue of headless restoration with Roaming SC
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3271] amf: Issue of headless restoration with Roaming SC** **Status:** unassigned **Milestone:** 5.21.12 **Created:** Fri Jul 02, 2021 05:07 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody In robustness test of roaming SC cluster recovery from split brain, the test performs rolling split then rejoin every active SC by 3 seconds with promote active timer = 0. The following log shows the issue starting point. The SC-1 is promoted to active right after the previous active is split. amfnd on SC-1 starts to send headless state information to amfd on SC-1 (this case does not happen without roaming SC, where the active SC after headless does not have amfnd's headless information in the SC). Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=2N,safApp=OpenSAF <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=All-NWayActive,safApp=ABC-012,safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NWayActive,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=ABC-012,safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-012 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=748f4402ae,safApp=ABC-456,safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=ABC-456 <0, 1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SISU:safSi=d4bf28eca3,safApp=OpenSAF,safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <1, 3> Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Synced SU:safSu=b769074fb6,safSg=NoRed,safApp=OpenSAF <0, 1, 3> Jun 30 19:19:44 SC-1 
osafamfnd[8802]: NO 10 CSICOMP states sent Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO 33 COMP states sent Jun 30 19:19:44 SC-1 osafamfnd[8802]: NO Sending node up due to NCSMDS_NEW_ACTIVE Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21a0f, msg_id:0 Jun 30 19:19:45 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21a0f, msg_id:0 amfd on SC-1 restores the headless information Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Received node_up from 21a0f: msg_id 1 Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Enter restore headless cached RTAs from IMM Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Leave reading headless cached RTAs from IMM: SUCCESS Jun 30 19:19:47 SC-1 osafamfd[8779]: NO Node '47740d42-79f8-a1c9-ea73-8cb599ef2deb' joined the cluster Jun 30 19:19:47 SC-1 osafamfnd[8802]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=b769074fb6,safSg=2N,safApp=OpenSAF' The other SCs rejoin and miss the headless restoration by amfd-SC1, which causes amfd-SC1 to be inconsistent with the amfnd(s) on the other SCs Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21c0f, msg_id:0 Jun 30 19:19:48 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21c0f, msg_id:0 Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:12, msg_type:31, from node:21b0f, msg_id:0 Jun 30 19:19:52 SC-1 osafamfd[8779]: NO Receive message with event type:13, msg_type:32, from node:21b0f, msg_id:0 Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1 Jun 30 19:19:54 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1 Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21b0f: msg_id 1 Jun 30 19:19:56 SC-1 osafamfd[8779]: NO Received node_up from 21c0f: msg_id 1 Jun 30 19:19:57 SC-1 osafamfd[8779]: NO Cluster startup is done
[tickets] [opensaf:tickets] #3270 build: make rpm got unversioned python
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3270] build: make rpm got unversioned python** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Jun 28, 2021 07:10 AM UTC by Thien Minh Huynh **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thien Minh Huynh %{__python} no longer points to python2. ~~~ rpmbuild -bb --clean --rmspec --rmsource \ --define "_topdir `pwd`/rpms" --define "_tmppath `pwd`/rpms/tmp" \ `pwd`/rpms/SPECS/opensaf.spec error: attempt to use unversioned python, define %__python to /usr/bin/python2 or /usr/bin/python3 explicitly error: line 1556: %{python_sitelib}/pyosaf/*.py make[1]: *** [Makefile:26843: rpm] Error 1 make[1]: Leaving directory '/root/osaf-build/opensaf-5.21.06' make: *** [makefile:8: all] Error 2 ~~~
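The error message itself suggests the likely fix: define `%__python` explicitly. A possible sketch, assuming python3 is the intended interpreter (the exact spec change that was committed may differ):

~~~
# Either on the rpmbuild command line:
rpmbuild -bb --define "__python /usr/bin/python3" ...

# Or near the top of opensaf.spec:
%define __python /usr/bin/python3
~~~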
[tickets] [opensaf:tickets] #3254 Enhancement of NTF notification
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3254] Enhancement of NTF notification** **Status:** assigned **Milestone:** 5.21.12 **Created:** Thu Mar 18, 2021 03:54 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thanh Nguyen When IMM changes an attribute with the NOTIFY flag, NTF sends the changed attribute values together with the old attribute values fetched from IMM. Fetching the old attribute values from IMM may fail in the case where the old values are overwritten before NTF attempts to fetch them. To avoid this situation, IMM will spontaneously send the old attribute values to NTF.
[tickets] [opensaf:tickets] #3106 dtm: flush the logtrace asap when the logtrace owner is terminated
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3106] dtm: flush the logtrace asap when the logtrace owner is terminated** **Status:** review **Milestone:** future **Created:** Thu Oct 24, 2019 03:57 AM UTC by Vu Minh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Vu Minh Nguyen This ticket will add a mechanism to the logtrace server so that it can detect when the logtrace owner has terminated and flush right away, to avoid losing traces from the trace files.
[tickets] [opensaf:tickets] #3094 mds: Reassembly timer timeout causes message discarded
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3094] mds: Reassembly timer timeout causes message discarded** **Status:** unassigned **Milestone:** future **Created:** Sat Sep 28, 2019 09:59 PM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau Run test: export MDS_TIPC_FCTRL_ENABLED=1; ckpttest 20 11 Test sometimes failed because the Reassembly timer timeout and discarded all fragment. Some outlined log: - ckptnd as a mds receiver, receives the first fragment of big message, start Reassembly timer (5 seconds hard coded) <142>1 2019-09-28T19:41:33.372579+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477553"] MDTM: Reassembly started <143>1 2019-09-28T19:41:33.372582+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477554"] MDTM: Recd fragmented message(first frag) with Frag Seqnum=4 SVC Seq num =3, from src Adest = <72075194378064089> <142>1 2019-09-28T19:41:33.372585+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477555"] MDTM: User Recd msg len=65223 <143>1 2019-09-28T19:41:33.372603+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477556"] MCM_DB:RecvMessage:TimerStart:Reassemble:Hdl=0xfee7:SrcSvcId=CPA(18):SrcVdest=65535,DestSvcHdl=562945658454033 <143>1 2019-09-28T19:41:33.372616+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477557"] MDTM: size: 65262 anc is NULL <143>1 2019-09-28T19:41:33.37262+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477558"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32770, fseq:5], rcvwnd[acked:4, rcv:4, nacked:0] <143>1 2019-09-28T19:41:33.372623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477559"] MDTM: Recd message with Fragment Seqnum=4, frag_num=2, from src_id=<0x01001001:3859124441> <143>1 2019-09-28T19:41:33.372669+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477568"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32771, fseq:6], rcvwnd[acked:4, rcv:5, nacked:0] <143>1 
2019-09-28T19:41:33.372673+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477569"] MDTM: Recd message with Fragment Seqnum=4, frag_num=3, from src_id=<0x01001001:3859124441> - The big message causes Tipc buffer overflow <139>1 2019-09-28T19:41:33.384242+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477862"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32801, fseq:36], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss] <<..>> <139>1 2019-09-28T19:41:33.386422+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478063"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32800, fseq:35], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission] <<..>> <139>1 2019-09-28T19:41:33.386658+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478091"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32813, fseq:48], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission] <<..>> <139>1 2019-09-28T19:41:33.384873+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477905"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32814, fseq:49], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss] <<..>> - The transmission problem is resolved, but the Reassembly timer has expired, any message from sender has passed the tipc flow control with correct sequence number will be dropped at mds_dt_common <142>1 2019-09-28T19:41:38.392219+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478200"] MDTM: Processing Timer mailbox events <143>1 2019-09-28T19:41:38.392328+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478201"] MDTM: Tmr Mailbox Processing:Reassemble Tmr Hdl=0xfee7 <142>1 2019-09-28T19:41:38.392623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478206"] MSG loss not enbaled mds_mcm_msg_loss <143>1 2019-09-28T19:41:39.380193+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478210"] MDTM: size: 65262 anc is NULL <143>1 
2019-09-28T19:41:39.380227+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478211"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32822, fseq:57], rcvwnd[acked:56, rcv:56, nacked:0] <143>1 2019-09-28T19:41:39.380243+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478212"] MDTM: Recd message with Fragment Seqnum=4, frag_num=54, from src_id=<0x01001001:3859124441> <143>1 2019-09-28T19:41:39.380263+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478213"] MDS_DT_COMMON : reassembly queue doesnt exist seq_num=4, Adest = <0x01001001,3859124441 <139>1 2019-09-28T19:41:39.380273+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478214"] MDTM: Some stale message recd, hence dropping Adest = <72075194378064089> <143>1 2019-09-28T19:41:39.380309+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478215"] MDTM: size: 65262 anc is NULL <143>1 2019-09-28T19:41:39.380322+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478216"] FCTRL:
[tickets] [opensaf:tickets] #3000 rde: rdegetrole timeout in rda
- **Milestone**: 5.21.09 --> future --- ** [tickets:#3000] rde: rdegetrole timeout in rda** **Status:** unassigned **Milestone:** future **Created:** Wed Jan 16, 2019 07:58 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody For the "rdegetrole" command (or a set-role operation), rda sends the get-role request and polls for the response with a 30 s timeout. But there are places in rde that can loop for more than 30 s when the etcd plugin is used. That causes the poll event to time out, and the "rdegetrole" command (or set role) gets an error:
~~~
while (rc == SA_AIS_ERR_FAILED_OPERATION && retries < kMaxRetry) {
  ++retries;
  std::this_thread::sleep_for(kSleepInterval);
  ...
}
~~~
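The constraint behind any fix can be stated as simple arithmetic: the worst-case time spent in retry loops like the one above must stay below the 30 s RDA poll timeout. A small sketch, where the constants are assumptions for illustration, not the real rde values:

```cpp
#include <chrono>

// Assumed values for illustration only.
constexpr int kMaxRetry = 10;
constexpr auto kSleepInterval = std::chrono::seconds(2);
constexpr auto kRdaPollTimeout = std::chrono::seconds(30);

// Worst-case wall time spent in the retry loop (ignoring the time of
// the etcd calls themselves, which only makes the real bound worse).
constexpr std::chrono::seconds worst_case_retry_time() {
  return kMaxRetry * kSleepInterval;
}

// The invariant a fix needs to guarantee so "rdegetrole" cannot time out.
constexpr bool fits_in_rda_timeout() {
  return worst_case_retry_time() < kRdaPollTimeout;
}
```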
[tickets] [opensaf:tickets] #2979 mds: create a UNIX socket for local node communication when using TIPC
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2979] mds: create a UNIX socket for local node communication when using TIPC** **Status:** unassigned **Milestone:** future **Created:** Tue Dec 04, 2018 08:32 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody We need to create a UNIX socket for local node communication when using TIPC, much like DTM does for TCP. This is useful when using AMF container/contained and TIPC as the transport. Currently, to allow communication between the amf-lib in the container and amfnd on the host, we need to allow the container access to the host's network stack so it can see the TIPC address. If we created a UNIX socket for local node TIPC communication we could pass this socket file to the container instead of exposing the host's entire network stack. MDS can then use the socket file for TIPC, like it does with TCP.
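A minimal sketch of what the proposed local socket might look like (an assumption for illustration, not existing MDS code): create a datagram AF_UNIX socket bound to a well-known path, which could then be bind-mounted into the container instead of sharing the host network stack.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Creates and binds a UNIX-domain datagram socket at `path`.
// Returns the fd, or -1 on error; the caller owns the fd and the file.
int create_local_socket(const std::string& path) {
  int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  unlink(path.c_str());  // remove a stale socket file, if any
  std::strncpy(addr.sun_path, path.c_str(), sizeof(addr.sun_path) - 1);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```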
[tickets] [opensaf:tickets] #2976 msg: allow msg lib to be used in a container
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2976] msg: allow msg lib to be used in a container** **Status:** accepted **Milestone:** future **Created:** Mon Nov 26, 2018 02:49 PM UTC by Alex Jones **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Alex Jones The implementation of MSG has the node director create the IPC msg queue while the agent is also able to access and use the IPC msg queue. This breaks when trying to run only the agent inside a container. saMsgMessageGet expects access to the IPC message queue locally, so this function always fails. Need to investigate whether we can have the container running the agent share the IPC message queue with the node director, or whether we need to reimplement the code so that only one entity has access to the IPC message queue.
[tickets] [opensaf:tickets] #2967 msg: add missing test case of msg apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2967] msg: add missing test case of msg apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Tue Nov 20, 2018 05:33 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2966 lck: add missing test case of lck apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2966] lck: add missing test case of lck apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Nov 19, 2018 05:30 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam
[tickets] [opensaf:tickets] #2948 rde: race between quiesced node and standby node become active node
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2948] rde: race between quiesced node and standby node become active node** **Status:** unassigned **Milestone:** future **Created:** Mon Oct 29, 2018 10:48 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** nobody Summary of the issue: the cluster is configured with 5 SCs. - At cluster start: SC5 is active, SC4 is standby, and SC1-SC3 are quiesced. SC1 and SC2 are promoting themselves to become ACTIVE (via the rde component). SC5 answers the take-over request from SC2 first and rejects it. Then SC5 is rebooted (manually) before answering the request from SC1. - After SC5 is rebooted, SC4 also promotes itself to become ACTIVE and sends a take-over request. - Unfortunately, the node that becomes ACTIVE is the quiesced SC1, not the standby SC4, so all data synced from the previous ACTIVE node may be lost.
[tickets] [opensaf:tickets] #2932 ckpt: converting the checkpoint service from c to c++
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2932] ckpt: converting the checkpoint service from c to c++ ** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Oct 01, 2018 01:25 PM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Converting the checkpoint service from c to c++.
[tickets] [opensaf:tickets] #2930 ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2930] ckpt: non collocated checkpoint is not deleted from /dev/shm after switch over.** **Status:** review **Milestone:** 5.21.12 **Created:** Thu Sep 20, 2018 07:52 AM UTC by Mohan Kanakam **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Steps to reproduce: 1) Create a non-collocated checkpoint on SC-1 and PL-4. 2) Create one section on SC-1. 3) During the switch-over operation, close the application on PL-4. 4) After the switch over, close the application on SC-1. 5) Observe that the checkpoint is not deleted from /dev/shm on SC-1 and SC-2. SC-1: root@mohan-VirtualBox:/home/mohan/opensaf-code/src/ckpt/ckptnd# ls /dev/shm/ opensaf_CPND_CHECKPOINT_INFO_131343 opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO pulse-shm-1049372244 pulse-shm-2170855640 pulse-shm-493188609 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131343_2 pulse-shm-2086668063 pulse-shm-3681026513 SC-2: root@mohan-VirtualBox:~# ls /dev/shm opensaf_CPND_CHECKPOINT_INFO_131599 opensaf_NCS_GLND_EVT_CKPT_INFO opensaf_NCS_GLND_LCK_CKPT_INFO opensaf_NCS_GLND_RES_CKPT_INFO opensaf_NCS_MQND_QUEUE_CKPT_INFO opensaf_safCkpt=DemoCkpt,safApp=safCkptServic_131599_2 pulse-shm-2892283080 pulse-shm-2910971180 pulse-shm-3340597930 pulse-shm-528662130 pulse-shm-551961907
[tickets] [opensaf:tickets] #2914 clm : add missing testcases in Clm apitest
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#2914] clm : add missing testcases in Clm apitest** **Status:** review **Milestone:** 5.21.12 **Created:** Mon Aug 20, 2018 01:34 PM UTC by Richa **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Mohan Kanakam Adding missing test cases in clm apitest.
[tickets] [opensaf:tickets] #2908 dtm: Add support for connection-oriented TIPC
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2908] dtm: Add support for connection-oriented TIPC** **Status:** assigned **Milestone:** future **Created:** Wed Aug 01, 2018 01:20 PM UTC by Anders Widell **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Minh Hon Chau DTM currently uses TCP connections for communication between nodes. This ticket proposes that support shall be added for using connection-oriented TIPC instead of, or possibly at the same time as, TCP. The reason for using both TCP and TIPC at the same time would be that DTM can monitor the connectivity between nodes using both TIPC and TCP. The connectivity should then only be considered to be up if both TCP and TIPC are connected. CLM could use this information to disallow cluster membership of nodes that are not connected using both TCP and TIPC. However, it could still be possible to send messages between the nodes using just one of the two connection types - this could be useful to avoid split-brain problems. Another reason for adding support for TIPC in DTM is that we can avoid the problem that our current TIPC implementation can lose messages, and we would no longer require real-time priority for the MDS thread. In fact, the MDS thread could be completely removed and we could remove the MDS code for handling TIPC (only DTM would need to support TIPC). This is a rather large enhancement if all features mentioned above are implemented. However, as a first step a very small implementation could simply add support in DTM for using TIPC instead of TCP. This ought to be easy to implement. We would then have three possible configurations: * TCP using DTM * TIPC using DTM (new) * TIPC without DTM
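The dual-transport rule described above (cluster membership requires both transports up, while messaging needs only one surviving transport) can be sketched as follows; the type and function names are illustrative, not DTM code:

```cpp
// Sketch of the proposed dual-transport connectivity rule.
struct LinkState {
  bool tcp_connected;
  bool tipc_connected;
};

// CLM-visible connectivity: only up when both transports are connected.
bool membership_allowed(const LinkState& s) {
  return s.tcp_connected && s.tipc_connected;
}

// Messages can still be sent while at least one transport survives,
// which helps avoid split-brain during partial connectivity loss.
bool can_send(const LinkState& s) {
  return s.tcp_connected || s.tipc_connected;
}
```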
[tickets] [opensaf:tickets] #2878 ntf: initialize client in log service loop forever
- **Milestone**: 5.21.09 --> future --- ** [tickets:#2878] ntf: initialize client in log service loop forever ** **Status:** review **Milestone:** future **Created:** Fri Jun 15, 2018 06:33 AM UTC by Canh Truong **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Canh Truong When NTFD is started, the NtfAdmin class object is created and the log service client is initialized in this object. The initialization of the client may loop forever if the TRY_AGAIN error is always returned:
~~~
do {
  result = saLogInitialize(&logHandle, &logCallbacks, &logversion);
  if (SA_AIS_ERR_TRY_AGAIN == result) {
    if (first_try) {
      LOG_WA("saLogInitialize returns try again, retries...");
      first_try = 0;
    }
    usleep(AIS_TIMEOUT);
    logversion = kLogVersion;
  }
} while (SA_AIS_ERR_TRY_AGAIN == result);
~~~
If the log service has not started in time, or is busy, NTFD is stuck in the loop. NTFD then takes a long time to start, and AMF does not receive the CSI set callback response (within 30 seconds?), so this error is printed out: "2018-04-23 02:13:56.326 SC-2 osafamfnd[272]: ER safComp=NTF,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:csiSetcallbackTimeout Recovery is:nodeFailfast " The initialization of the log client in NTFD should be updated.
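A bounded variant of the loop above is one way the retry could be capped so that NTFD fails fast instead of blocking the AMF CSI-set callback. This is a sketch under assumptions: the stubbed `SaAisErrorT` values match the AIS specification, but `init_log_client`, `kMaxInitRetries`, and `kRetryInterval` are illustrative names, not the committed fix.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Stub of the two AIS return codes this sketch needs (values per the
// SAI-AIS specification).
enum SaAisErrorT { SA_AIS_OK = 1, SA_AIS_ERR_TRY_AGAIN = 6 };

// Illustrative bounds; they must keep the total retry time well below
// the AMF callback timeout.
constexpr int kMaxInitRetries = 10;
constexpr auto kRetryInterval = std::chrono::milliseconds(100);

// Retries `init` on TRY_AGAIN up to kMaxInitRetries times, then returns
// the last result so the caller can report the failure instead of
// looping forever.
SaAisErrorT init_log_client(const std::function<SaAisErrorT()>& init) {
  SaAisErrorT rc = SA_AIS_ERR_TRY_AGAIN;
  for (int i = 0; i < kMaxInitRetries && rc == SA_AIS_ERR_TRY_AGAIN; ++i) {
    rc = init();
    if (rc == SA_AIS_ERR_TRY_AGAIN)
      std::this_thread::sleep_for(kRetryInterval);
  }
  return rc;
}
```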
[tickets] [opensaf:tickets] #3282 amfd: coredump while deleting csi
- **Milestone**: 5.21.09 --> 5.21.12 --- ** [tickets:#3282] amfd: coredump while deleting csi** **Status:** assigned **Milestone:** 5.21.12 **Created:** Thu Sep 09, 2021 03:11 AM UTC by Hieu Hong Hoang **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Hieu Hong Hoang **Attachments:** - [bt.SC-1](https://sourceforge.net/p/opensaf/tickets/3282/attachment/bt.SC-1) (10.7 kB; application/octet-stream) When a service unit is assigned to a service instance but does not support all component service instances of that service instance, amfd fails when deleting the unsupported component service instances. For example, an application has two SUs and one SI configured as below: ~~~ SU1 contains COMP_A and COMP_B SU2 contains COMP_A SI1 has CSI_A and CSI_B COMP_A supports CSI_A, COMP_B supports CSI_B. ~~~ After OpenSAF assigns SI1 to SU1 and SU2, amfd crashes if we delete CSI_B. ~~~ 2021-09-07 11:32:21.213 SC-1 osafimmnd[376]: NO Ccb 23 COMMITTED (immcfg_SC-1_1562) 2021-09-07 11:32:21.213 SC-1 osafamfd[439]: src/amf/amfd/csi.cc:945: ccb_apply_delete_hdlr: Assertion 't_csicomp' failed. 2021-09-07 11:32:21.305 SC-1 osafamfnd[459]: ER AMFD has unexpectedly crashed. Rebooting node ~~~
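The assertion in `ccb_apply_delete_hdlr` fires because no CSI-component record was ever created for the SU that does not support the CSI. A hedged sketch of the defensive pattern that would avoid the crash (the types and names here are illustrative stand-ins, not the real amfd data model): treat a missing record as "nothing to delete" rather than asserting.

```cpp
#include <map>
#include <string>

// Illustrative stand-ins for amfd's CSI-component records.
struct CsiComp { std::string comp; std::string csi; };
using CsiCompDb = std::map<std::string, CsiComp>;

// Deletes the record for `key` if present. Returns false when the
// record does not exist (e.g. the SU never supported this CSI),
// instead of asserting and taking the node down.
bool delete_csicomp(CsiCompDb& db, const std::string& key) {
  auto it = db.find(key);
  if (it == db.end()) return false;  // previously: assert failure -> reboot
  db.erase(it);
  return true;
}
```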
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
- **status**: assigned --> fixed - **Milestone**: 5.21.06 --> 5.21.09 --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** fixed **Milestone:** 5.21.09 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 06:09 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
- **status**: fixed --> assigned --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** assigned **Milestone:** 5.21.06 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Tue Sep 14, 2021 06:09 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3263 rde: Cluster is unrecoverable after all nodes split-brain in roaming SC
commit bbe47278c2499bc738bf0c2dc8cc4e9a026d Author: Minh Chau Date: Tue Jul 13 18:00:41 2021 +1000 rde: Add timeout of waiting for peer info [#3263] This ticket revisits the waiting for peer info and fixes the problem of disordered peer_up and peer info in commit d1593b03b3c9bec292b14dde65264c261760bf46 --- ** [tickets:#3263] rde: Cluster is unrecoverable after all nodes split-brain in roaming SC** **Status:** fixed **Milestone:** 5.21.06 **Created:** Fri May 14, 2021 04:56 AM UTC by Minh Hon Chau **Last Updated:** Wed May 26, 2021 11:07 AM UTC **Owner:** Minh Hon Chau In a Roaming SC deployment, if a split-brain separates all nodes so that each partition has one SC, all SCs become active. At rejoin, each SC should detect itself as a duplicate active of one of the other SCs, and ideally they should all reboot. However, sometimes the last active SC is not detected as a duplicate because all the other SCs have already rebooted, so it does not find any other SC that is active alongside itself. As a result, since this last SC was not healthy throughout the split, it causes many errors when the other nodes try to rejoin after reboot.
[tickets] [opensaf:tickets] #3279 ntf: Errors under compilation for 32 bit machine
- **status**: assigned --> fixed --- ** [tickets:#3279] ntf: Errors under compilation for 32 bit machine** **Status:** fixed **Milestone:** 5.21.09 **Created:** Thu Aug 26, 2021 01:15 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 06:01 AM UTC **Owner:** Thanh Nguyen The patch for ticket 3277 has compilation errors when compiled for a 32 bit machine. The following is an extract from one compilation:
~~~
  CXX      src/ntf/ntfd/bin_osafntfd-NtfSubscription.o
In file included from ./src/base/ncs_osprm.h:32:0,
                 from ./src/mds/mds_papi.h:32,
                 from ./src/ntf/common/ntfsv_msg.h:26,
                 from ./src/ntf/ntfd/ntfs_com.h:33,
                 from ./src/ntf/ntfd/NtfNotification.h:29,
                 from ./src/ntf/ntfd/NtfFilter.h:29,
                 from ./src/ntf/ntfd/NtfSubscription.h:25,
                 from src/ntf/ntfd/NtfSubscription.cc:22:
src/ntf/ntfd/NtfSubscription.cc: In member function 'void NtfSubscription::sendNotification(NtfSmartPtr&, NtfClient*)':
./src/base/logtrace.h:173:65: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
   logtrace_trace(__FILE__, __LINE__, CAT_TRACE, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:305:9: note: in expansion of macro 'TRACE'
   TRACE("Nodeid: %u, MdsDest: %lu", evt->info.mds_info.node_id,
./src/base/logtrace.h:163:61: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
     logtrace_log(__FILE__, __LINE__, LOG_ERR, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:316:9: note: in expansion of macro 'LOG_ER'
   LOG_ER("Down event missed for app with mdsdest: %lu on node: %u",
  CXX      src/ntf/ntfd/bin_osafntfd-NtfLogger.o
  CXX      src/ntf/ntfd/bin_osafntfd-NtfReader.o
cc1plus: all warnings being treated as errors
Makefile:21672: recipe for target 'src/ntf/ntfd/bin_osafntfd-NtfSubscription.o' failed
~~~
[tickets] [opensaf:tickets] #3279 ntf: Errors under compilation for 32 bit machine
commit 2249c5f7035ad7ec31b2ecd71b88a4d35745acd3 Author: Thanh Nguyen Date: Tue Aug 31 08:06:52 2021 +1000 ntf: Fix compilation errors for 32 bit machine [#3279] The patch for ticket 3277 failed compilation for a 32 bit machine. This patch fixes these compilation errors. --- ** [tickets:#3279] ntf: Errors under compilation for 32 bit machine** **Status:** assigned **Milestone:** 5.21.09 **Created:** Thu Aug 26, 2021 01:15 AM UTC by Thanh Nguyen **Last Updated:** Tue Sep 14, 2021 05:53 AM UTC **Owner:** Thanh Nguyen The patch for ticket 3277 has compilation errors when compiled for a 32 bit machine. The following is an extract from one compilation:
~~~
  CXX      src/ntf/ntfd/bin_osafntfd-NtfSubscription.o
In file included from ./src/base/ncs_osprm.h:32:0,
                 from ./src/mds/mds_papi.h:32,
                 from ./src/ntf/common/ntfsv_msg.h:26,
                 from ./src/ntf/ntfd/ntfs_com.h:33,
                 from ./src/ntf/ntfd/NtfNotification.h:29,
                 from ./src/ntf/ntfd/NtfFilter.h:29,
                 from ./src/ntf/ntfd/NtfSubscription.h:25,
                 from src/ntf/ntfd/NtfSubscription.cc:22:
src/ntf/ntfd/NtfSubscription.cc: In member function 'void NtfSubscription::sendNotification(NtfSmartPtr&, NtfClient*)':
./src/base/logtrace.h:173:65: error: format '%lu' expects argument of type 'long unsigned int', but argument 6 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
   logtrace_trace(__FILE__, __LINE__, CAT_TRACE, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:305:9: note: in expansion of macro 'TRACE'
   TRACE("Nodeid: %u, MdsDest: %lu", evt->info.mds_info.node_id,
./src/base/logtrace.h:163:61: error: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'MDS_DEST {aka long long unsigned int}' [-Werror=format=]
     logtrace_log(__FILE__, __LINE__, LOG_ERR, (format), ##args)
src/ntf/ntfd/NtfSubscription.cc:316:9: note: in expansion of macro 'LOG_ER'
   LOG_ER("Down event missed for app with mdsdest: %lu on node: %u",
  CXX      src/ntf/ntfd/bin_osafntfd-NtfLogger.o
  CXX      src/ntf/ntfd/bin_osafntfd-NtfReader.o
cc1plus: all warnings being treated as errors
Makefile:21672: recipe for target 'src/ntf/ntfd/bin_osafntfd-NtfSubscription.o' failed
~~~
[tickets] [opensaf:tickets] #2726 smf: Smfnd does not protect global variables used in more than one thread
- **Milestone**: future --> 5.21.06 --- ** [tickets:#2726] smf: Smfnd does not protect global variables used in more than one thread** **Status:** fixed **Milestone:** 5.21.06 **Created:** Mon Dec 04, 2017 11:29 AM UTC by elunlen **Last Updated:** Mon Apr 26, 2021 03:15 AM UTC **Owner:** Thanh Nguyen Several global variables (the cb structure) are handled in both the main thread and the MDS thread, but no mutex is used for protection. Make the handling of these global variables thread safe.