[devel] [PATCH 1/1] base: fix creation of msg queues [#3107]

2020-02-13 Thread Alex Jones
Message queues stop working correctly after queue file is removed from /tmp. Message queue API uses "ftok" which relies on the file being permanent. The behaviour is undefined if the file is removed. Many systems clean out /tmp periodically, so this can break if the message queue is long lived. C
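The dependence of "ftok" on the key file can be sketched in C as follows. This is only an illustration, not code from the patch: the helper names and the path used in the usage note are hypothetical. It shows that ftok() needs the file to exist at the time of the call, so a periodic /tmp cleanup leaves later callers without a usable key.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) the key file; 0 on success. */
static int touch_file(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    return fclose(f);
}

/*
 * Hypothetical demo: returns 1 if ftok() yields a key while the file
 * exists and fails with (key_t)-1 once the file has been removed,
 * 0 in any other case.
 */
int ftok_fails_after_unlink(const char *path)
{
    if (touch_file(path) != 0)
        return 0;

    key_t before = ftok(path, 'Q');   /* file exists: a key is produced */

    if (unlink(path) != 0)            /* simulate a /tmp cleanup */
        return 0;

    key_t after = ftok(path, 'Q');    /* file is gone: ftok() fails */

    return before != (key_t)-1 && after == (key_t)-1;
}
```

Calling ftok_fails_after_unlink("/tmp/ftok_demo_keyfile") is expected to return 1 on a system where /tmp is writable. Recreating the file afterwards usually gives it a different inode, so a key derived from the new file may not match the original one.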

[devel] [PATCH 0/1] Review Request for base: fix creation of msg queues [#3107]

2020-02-13 Thread Alex Jones
5f9 Author: Alex Jones Date: Thu, 13 Feb 2020 08:39:46 -0500 base: fix creation of msg queues [#3107] Message queues stop working correctly after queue file is removed from /tmp. Message queue API uses "ftok" which relies on the file being permanent. The behaviour is undefined if

[devel] [PATCH 0/1] Review Request for amfd: fix calculating standby rank for SIrankedSU with non-unique rank [#3149]

2020-02-07 Thread Alex Jones
revision d21dd0c020e33fd8932481976571d3ed22580ef5 Author: Alex Jones Date: Fri, 7 Feb 2020 13:52:12 -0500 amfd: fix calculating standby rank for SIrankedSU with non-unique rank [#3149] Standby rank which is passed to CSI set and protection group callbacks may not be accurate. If SIrankedSUs exis

[devel] [PATCH 1/1] amfd: fix calculating standby rank for SIrankedSU with non-unique rank [#3149]

2020-02-07 Thread Alex Jones
Standby rank which is passed to CSI set and protection group callbacks may not be accurate. If SIrankedSUs exist with non-unique ranks, AVD_SI::get_sisu_rank() is not traversing all the SUs at that rank to determine the standby rank. AVD_SI::get_sisu_rank() needs to traverse all the SUs at the pa

Re: [devel] [PATCH 5/5] build: fix compile errors with gcc 9.x [#3134]

2020-02-04 Thread Alex Jones
i_value))[len] = '\0'; => strncpy with "len + 1" then later overwrite with `\0'. I suggest strncpy with "len" as original code to avoid redundant changes. Best Regards, ThuanTr From: Alex Jones [1] Sent: Monday, February 3, 2020 10:

[devel] [PATCH 3/5] build: fix gcc-9.x compiler problems [#3134]

2020-02-03 Thread Alex Jones
more fixes --- src/ntf/apitest/test_ntf_imcn.cc | 53 +++- src/plm/plmcd/plmc_read_config.c | 2 +- 2 files changed, 40 insertions(+), 15 deletions(-) diff --git a/src/ntf/apitest/test_ntf_imcn.cc b/src/ntf/apitest/test_ntf_imcn.cc index b1a1e87b4..51b9076c6 100644 --

[devel] [PATCH 4/5] build: fix compile errors from gcc-9.x [#3134]

2020-02-03 Thread Alex Jones
more issues --- src/imm/immloadd/imm_pbe_load.cc | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/imm/immloadd/imm_pbe_load.cc b/src/imm/immloadd/imm_pbe_load.cc index 72b926383..5f5aefcec 100644 --- a/src/imm/immloadd/imm_pbe_load.cc +++ b/src/imm/immloadd/imm_pbe_lo

[devel] [PATCH 1/5] build: fix errors from gcc 9.x [#3134]

2020-02-03 Thread Alex Jones
Mostly strncpy and strncat problems. --- src/base/daemon.c | 1 + src/ckpt/ckptd/cpd_imm.c | 4 ++-- src/ckpt/ckptnd/cpnd_res.c| 2 +- src/clm/clmd/clms_imm.cc | 2 +- src/dtm/dtmnd/dtm_intra_svc.cc
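As an illustration of the kind of strncpy problem newer GCC versions flag, here is a minimal C sketch. It assumes the diagnostics in question are of the -Wstringop-truncation family, which the snippet above does not confirm, and copy_bounded is a hypothetical helper, not a function from the patch. It copies at most dst_size - 1 bytes and always terminates the destination.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the OpenSAF patch.
 * Copies src into dst (capacity dst_size bytes), always NUL-terminates
 * when dst_size > 0, and returns 1 if the string was truncated.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 1;                       /* nothing can be stored */

    size_t len = strlen(src);
    size_t n = len < dst_size - 1 ? len : dst_size - 1;

    memcpy(dst, src, n);                /* copy only what fits */
    dst[n] = '\0';                      /* explicit termination */

    return len > n;                     /* report truncation to caller */
}
```

With a four-byte buffer, copying "abcdef" leaves "abc" in the buffer and reports truncation, while copying "hi" fits and reports none.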

[devel] [PATCH 5/5] build: fix compile errors with gcc 9.x [#3134]

2020-02-03 Thread Alex Jones
Rework fixes in NTF and SMF. --- src/ntf/apitest/test_ntf_imcn.cc | 2 +- src/smf/smfd/SmfUtils.cc | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/ntf/apitest/test_ntf_imcn.cc b/src/ntf/apitest/test_ntf_imcn.cc index 51b9076c6..04f155074 100644 --- a/src/ntf/apit

[devel] [PATCH 0/5] Review Request for build: fix errors from gcc 9.x [#3134]

2020-02-03 Thread Alex Jones
revision 1c9c9c9aa23f95939597b0e29055c94c24e2815a Author: Alex Jones Date: Mon, 3 Feb 2020 10:32:17 -0500 build: fix compile errors with gcc 9.x [#3134] Rework fixes in NTF and SMF. revision 560b3243c3bcd821ca67839de8a4ee2825422966 Author: Alex Jones Date: Mon, 3 Feb 2020 10:32:17 -0

[devel] [PATCH 2/5] build: fix errors from gcc 9.x [#3134]

2020-02-03 Thread Alex Jones
More compiler fixes --- src/imm/common/immpbe_dump.cc| 2 +- src/plm/plmcd/plmc_read_config.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/imm/common/immpbe_dump.cc b/src/imm/common/immpbe_dump.cc index 3bde78a3f..175bd0484 100644 --- a/src/imm/common/immpbe_dump.

[devel] [PATCH 0/1] Review Request for amfnd: don't quiesce comp which is in TERMINATION_FAILED state [#3147]

2020-01-30 Thread Alex Jones
revision fd81f84a655def349896e175c4615023f1f99151 Author: Alex Jones Date: Thu, 30 Jan 2020 10:58:28 -0500 amfnd: don't quiesce comp which is in TERMINATION_FAILED state [#3147] When SU goes into TERMINATION_FAILED because one of it

[devel] [PATCH 1/1] amfnd: don't quiesce comp which is in TERMINATION_FAILED state [#3147]

2020-01-30 Thread Alex Jones
When SU goes into TERMINATION_FAILED because one of its components went to TERMINATION_FAILED, amfnd will still send QUIESCED to those components, even though they are already terminating. This can cause the SG to go into unstable state, and get stuck. IsCompQualifiedAssignment does not check for

[devel] [PATCH 0/1] Review Request for uml: add support for plm to run under uml [#2922]

2018-09-17 Thread Alex Jones
revision 84ddb28a1b5fd0b9b24795196c523b5b050effbe Author: Alex Jones Date: Mon, 17 Sep 2018 15:42:04 -0400 uml: add support for plm to run under uml [#2922] Add support for plm to run under uml. Added Files: src/plm/config/openhpi.conf Complete diffstat: -- src/plm/config/op

[devel] [PATCH 1/1] uml: add support for plm to run under uml [#2922]

2018-09-17 Thread Alex Jones
Add support for plm to run under uml. --- src/plm/config/openhpi.conf| 18 tools/cluster_sim_uml/archive/scripts/40opensaf.rc | 30 +++ tools/cluster_sim_uml/build_uml| 95 -- 3 files changed, 138 insertions(+), 5 deletions(-

Re: [devel] [PATCH 0/2] Review Request for plma: align the function headers [#199]

2018-09-17 Thread Alex Jones
Ack. I will push it. Alex On 09/11/2018 08:55 AM, Meenakshi TK wrote: Summary: p

Re: [devel] [PATCH 1/1] plm: fix return codes for saPlmReadinessTrackResponse [#200]

2018-09-17 Thread Alex Jones
not be anything other than" + "START/VALIDATE. change_step: %d", trk_info->change_step); One typo above is datebase which should be database. Thanks, Meenakshi High Availability Solutions Pvt. Ltd. [2]www.hasolutions.in ----- Original Me

[devel] [PATCH 0/1] Review Request for plmd: fix adding and removing of invocation id to list [#197]

2018-09-14 Thread Alex Jones
services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision c0e8a1d9b6e1a8e53f8f0ffbff9b86c40ee0d6b6 Author: Alex J

[devel] [PATCH 1/1] plmd: fix adding and removing of invocation id to list [#197]

2018-09-14 Thread Alex Jones
Jan 22 11:09:03 localhost osafplmd[3988]: Invocation id mentioned in the resp, is not found in the grp->inocation_list. inv_id: 9 If multiple entities are part of the same entity group, and START or VALIDATE tracking is requested, if an admin operation is done on these entities, once one response

[devel] [PATCH 1/1] plm: fix return codes for saPlmReadinessTrackResponse [#200]

2018-09-07 Thread Alex Jones
saPlmReadinessTrackResponse sometimes returns SA_AIS_OK, when invalid parameters are passed. SaPlmReadinessTrackResponseT parameter is not checked for range. Also, the msg is sent asynchronously from the agent to plmd, so that errors from plmd cannot be passed back to the agent. Check the SaPlmRe
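The range check described above can be sketched in C as follows. All names and numeric values here are hypothetical stand-ins chosen for the example; the real types come from the SA Forum PLM headers, and the actual patch may be structured differently. The point is that the agent can reject an out-of-range response locally, before the asynchronous send to plmd.

```c
/* Hypothetical stand-ins for the response enum and return codes. */
typedef enum {
    DEMO_RESPONSE_OK = 1,
    DEMO_RESPONSE_REJECTED = 2,
    DEMO_RESPONSE_ERROR = 3
} demo_response_t;

typedef enum {
    DEMO_RC_OK = 1,
    DEMO_RC_ERR_INVALID_PARAM = 7
} demo_rc_t;

/*
 * Validate the response value in the agent. Because the message to the
 * daemon is sent asynchronously, this is the only place where an
 * invalid value can still be reported back to the caller.
 */
demo_rc_t demo_check_response(int response)
{
    if (response < DEMO_RESPONSE_OK || response > DEMO_RESPONSE_ERROR)
        return DEMO_RC_ERR_INVALID_PARAM;
    return DEMO_RC_OK;
}
```

A value inside the enum range passes the check; 0 or 4 is rejected before anything is sent.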

[devel] [PATCH 0/1] Review Request for plm: fix return codes for saPlmReadinessTrackResponse [#200]

2018-09-07 Thread Alex Jones
services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision c87593a8180c59b4c3e7f0bd0b8789dac72b0415 Author: Alex J

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-09-06 Thread Alex Jones
ility Solutions Pvt. Ltd. ([2]www.hasolutions.in) - OpenSAF Support and Services - Original Message - Subject: Re: [PATCH 1/1] amf: add support for container/contained [#70] From: "Alex Jones" [3] Date: 8/29/18 9:29 pm To: [4]nagen...@hasolu

Re: [devel] [PATCH 1/1] plm: remove unused function plms_hsm_finalize [#210]

2018-09-06 Thread Alex Jones
Ack. I will push it. Alex On 09/06/2018 04:40 AM, Meenakshi TK wrote: --- src

Re: [devel] [PATCH 0/1] Review Request for plm: correct first arguement of API saPlmEntityGroupAdd() in apitest [#1983]

2018-09-06 Thread Alex Jones
for plm: correct first arguement of API saPlmEntityGroupAdd() in apitest [#1983] From: "Alex Jones" [3] Date: 8/27/18 10:42 pm To: "Meenakshi TK" [4], [5]nagen...@hasolutions.in Cc: [6]opensaf-devel@lists.sourceforge.net Hi, This test is curr

Re: [devel] [PATCH 1/1] ckpt: add the ckpt reference to the CPND node info [#2082]

2018-09-04 Thread Alex Jones
Hi Mohan, I am not able to reproduce the problem as described in the ticket. Can you post your test code? Alex On 09/03/2018 03:32 AM, [1]mo...@hasolutions.in wrote:

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-08-29 Thread Alex Jones
Hi Alex No, I just ran kill 10 times to escalate restart to failover. Do you have a really small probation time in your demo config? Gary On 28/8/18 4:09 am, Alex Jones wrote: G'day Gary, I can't reproduce this. Do you have a script or

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-08-29 Thread Alex Jones
,safApp=Container saAmfSISUHAState=ACTIVE(1) saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1) Also, have you tried killing the amf_container_demo binary? Thanks Gary On 14/08/18 05:00, Alex Jones wrote: Hi Gary, I just resubmitted a new p

[devel] [PATCH 0/1] Review Request for plmd: fix crash when saPlmReadinessTrack is called in error [#2919]

2018-08-27 Thread Alex Jones
OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision e18dabd0a8385ff61ba1ab0540eba4ee58b5cc4e Author:

[devel] [PATCH 1/1] plmd: fix crash when saPlmReadinessTrack is called in error [#2919]

2018-08-27 Thread Alex Jones
plmd crashes when saPlmReadinessTrack is called with entities pointer set, but smaller than what plmd would return. In this case plmd is returning ERR_NO_SPACE, which is correct, but it is setting numberOfEntities without setting the entities pointer. This causes the edu routines to crash. It is

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-08-27 Thread Alex Jones
saAmfSUReadinessState=IN-SERVICE(2) amf-state si: safSi=SC-2N,safApp=OpenSAF saAmfSIAdminState=UNLOCKED(1) saAmfSIAssignmentState=FULLY_ASSIGNED(2) safSi=Contained_2N_1,safApp=Contained_2N saAmfSIAdminState=UNLOCKED(1) saAmfSIAssignme

Re: [devel] [PATCH 0/1] Review Request for plm: correct first arguement of API saPlmEntityGroupAdd() in apitest [#1983]

2018-08-27 Thread Alex Jones
Hi, This test is currently not enabled in test_saPlmEntityGroupCreate.c. Can you please enable it as part of this ticket? Alex On 08/20/2018 07:37 AM, Meenakshi TK wrote:

Re: [devel] [PATCH 1/1] ckpt: add new test case of API saCkptInitialize() of apitest [#2913]

2018-08-21 Thread Alex Jones
Hi Mohan, Ack from me. Alex On 08/21/2018 04:16 AM, mohan kanakam wrote:

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-08-15 Thread Alex Jones
saAmfSISUHAState=ACTIVE(1) saAmfSISUHAReadinessState=READY_FOR_ASSIGNMENT(1) Also, have you tried killing the amf_container_demo binary? Thanks Gary On 14/08/18 05:00, Alex Jones wrote: Hi Gary, I just resubmitted a new patch which breaks out the different components, and add

Re: [devel] [PATCH 1/1] amf: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
Hi Gary, I just resubmitted a new patch which breaks out the different components, and addresses the other comments here. But, #2 (rejecting all but NWay-active for container) should already be in there. Is there a specific test you ran that didn't work? Alex On 08/13/20

[devel] [PATCH 3/5] amf: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
Add support for container/contained amf common. --- src/amf/common/amf_amfparam.h | 22 ++ src/amf/common/amf_d2nmsg.h | 11 +++ src/amf/common/amf_defs.h | 2 ++ src/amf/common/amf_util.h | 3 ++- src/amf/common/d2nedu.c | 22 +- s

[devel] [PATCH 4/5] amf: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
Add support for container/contained for amf agent. --- src/amf/agent/amf_agent.cc | 73 +++--- src/amf/agent/ava_cb.h | 1 + src/amf/agent/ava_hdl.cc | 31 src/amf/agent/ava_mds.cc | 34 - src/amf/agent/ava_m

[devel] [PATCH 1/5] amfd: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
This ticket adds support for container/contained in amfd. --- src/amf/amfd/comp.cc | 65 ++-- src/amf/amfd/comp.h | 4 +- src/amf/amfd/comptype.cc | 6 +- src/amf/amfd/csi.cc | 6 ++ src/amf/amfd/csi.h | 3 + src/amf/amfd/ndproc.cc | 14 + src/am

[devel] [PATCH 5/5] amf: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
Add support for container/contained samples. --- samples/amf/Makefile.am | 2 +- samples/amf/container/AppConfig-contained-2N.xml | 327 + samples/amf/container/AppConfig-container.xml| 331 ++ samples/amf/container/Makefile.am| 45 ++

[devel] [PATCH 0/5] Review Request for amf: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
revision 9c9f7e04c39fca9030025b0a8394eabf328a4c70 Author: Alex Jones Date: Mon, 13 Aug 2018 14:48:14 -0400 amf: add support for container/contained [#70] Add support for container/contained samples. revision cf9d7565376059239c0902555c1c4811db6deff2 Author: Alex Jones Date: Mon, 13 Aug 2018 14:48:14 -0400

[devel] [PATCH 2/5] amfnd: add support for container/contained [#70]

2018-08-13 Thread Alex Jones
This ticket adds support for container/contained. --- src/amf/amfnd/amfnd.cc| 5 ++- src/amf/amfnd/avnd_cb.h | 2 + src/amf/amfnd/avnd_comp.h | 64 + src/amf/amfnd/avnd_evt.h | 1 + src/amf/amfnd/avnd_mds.h | 4 +- src/amf/amfnd/avnd_proc.h | 2 + src

Re: [devel] [PATCH 0/1] Review Request for amf: add support for container/contained [#70]

2018-08-06 Thread Alex Jones
Hi Alex I can reproduce the coredump by doing "immcfg -f AppConfig-2N.xml" (the amf_demo sample). It looks better with the patch. Thanks Gary From: Alex Jones [1] Organization: Ribbon Date: Saturday, 4 August 2018 at 12:59 am To: Gary Lee

Re: [devel] [PATCH 0/1] Review Request for amf: add support for container/contained [#70]

2018-08-03 Thread Alex Jones
;d On 3/8/18, 11:25 am, "Gary Lee" [1] wrote: Hi Alex I haven't had a chance to look at it, but I did run our regression tests with the patch. amfd is segfaulting regularly, with backtraces like the attachment. Thanks Gary From: Alex Jones [2] Organizatio

Re: [devel] [PATCH 0/1] Review Request for amf: add support for container/contained [#70]

2018-08-01 Thread Alex Jones
8 04:22 PM, Alex Jones wrote: Summary: amf: add support for container/contained [#70] Review request for Ticket(s): 70 Peer Reviewer(s): Nagu, Hans, Ravi, Gary Pull request to: Affected branch(es): develop Development branch: ticket-70 Base revision: 7f6f6c0531a0f5e4f2b0dc1abf4bab6962a3d1a9 Personal

[devel] [PATCH 0/1] Review Request for amf: add support for container/contained [#70]

2018-07-31 Thread Alex Jones
revision d33e50eeb51ccf8808c24a445637d6f1472c396e Author: Alex Jones Date: Tue, 31 Jul 2018 16:06:47 -0400 amf: add support for container/contained [#70] This ticket adds support for container/contained for AMF. Added Files: samples/amf/container/amf_container_demo.c samples/amf

[devel] [PATCH 0/1] Review Request for msg: update msg to use CLM B.04.01 [#2841]

2018-05-11 Thread Alex Jones
Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 9298b3b02ea0d1df99c9549402e427b7cefa7e78 Author: Alex Jones Date: Fri, 11 M

[devel] [PATCH 1/1] msg: update msg to use CLM B.04.01 [#2841]

2018-05-11 Thread Alex Jones
Update msgd and msgnd to use CLM B.04.01. --- src/msg/Makefile.am | 2 -- src/msg/common/mqsv_def.h | 5 + src/msg/msgd/mqd_api.c| 15 --- src/msg/msgd/mqd_clm.c| 17 +++-- src/msg/msgd/mqd_clm.h| 10 -- src/msg/msgnd/mqnd_init.c | 18 +++

[devel] [PATCH 1/1] msgd: put node down handling on thread [#2852]

2018-05-11 Thread Alex Jones
If multiple nodes go down simultaneously which are hosting msg queues (e.g. multiple VMs on a host, and the host goes down), msgd can take a long time to process the node downs which blocks the main thread, and therefore the healthcheck doesn't get processed, so msgd dies, which restarts the contro

[devel] [PATCH 0/1] Review Request for msgd: put node down handling on thread [#2852]

2018-05-11 Thread Alex Jones
n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 9bb598f8390aaf41c1e0dcd458ee0d82fae58999 Author: Alex Jones Date: Fri, 1

[devel] [PATCH 1/1] lck: fix errors when displaying SaLckResource class [#2070]

2018-05-07 Thread Alex Jones
When getting IMM info for a lock resource, SaLckResource, the information is often not correct. Both lckd and lcknd are not updating IMM correctly when SaLckResource information changes at runtime. Write test cases which make sure these attributes are being updated correctly. And fix the issues.

[devel] [PATCH 0/1] Review Request for lck: fix errors when displaying SaLckResource class [#2070]

2018-05-07 Thread Alex Jones
revision 8fe4377c25259e1430717d3b67e2c4cc2fd3c66f Author: Alex Jones Date: Mon, 7 May 2018 10:04:42 -0400 lck: fix errors when displaying SaLckResource class [#2070] When getting IMM info for a lock resource, SaLckResource, the information is often not correct. Both lckd and lcknd are not updating IMM corr

[devel] [PATCH 0/1] Review Request for plm: don't instantiate child EEs twice when unlocking parent EE [#2846]

2018-05-03 Thread Alex Jones
vices y OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 4efaccbcde991cd3ff848e43af6c6d007912af

[devel] [PATCH 1/1] plm: don't instantiate child EEs twice when unlocking parent EE [#2846]

2018-05-03 Thread Alex Jones
Child EEs (VMs) can fail to boot up when unlocking the parent EE. The current code resets the VM when unlocking the parent EE. This is done in plms_move_chld_ent_to_insvc(). Later in the unlock function, the child EEs are reset again. libvirt does not like these resets being done in less than 1 se

[devel] [PATCH 1/1] clmd: clear admin_op and stat_change for COMPLETED plm readiness cb [#2847]

2018-05-03 Thread Alex Jones
Sometimes CLM will reboot a node which was locked with PLM admin command. admin_op and stat_change are not being cleared in COMPLETED step in PLM readiness callback. Clear admin_op and stat_change. --- src/clm/clmd/clms.h | 2 +- src/clm/clmd/clms_plm.cc | 7 +++ src/clm/clmd/clms_u

[devel] [PATCH 0/1] Review Request for clmd: clear admin_op and stat_change for COMPLETED plm readiness cb [#2847]

2018-05-03 Thread Alex Jones
y OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision f566e34de691ace5bc7d2832bc1f06b481075db3 Au

Re: [devel] [PATCH 1/1] nid: restart opensafd on failure when systemd enabled [#2839]

2018-05-02 Thread Alex Jones
the core dump at e.g. which address you receive the signal. Perhaps you have found a "window" where immnd is not monitored? /Regards HansN On 04/25/2018 03:23 PM, Alex Jones wrote: Hi Hans, I understand. But, what if it doesn't fail in the nid phase?

[devel] [PATCH 1/1] fmd: fix regression interacting with PLM [#2844]

2018-04-30 Thread Alex Jones
fmd does not pass the EE to opensaf_reboot when attempting to reset the peer. The legacy code passed 0 to fm_mds_async_send. The new code passes NCSMDS_SCOPE_NONE, but doesn't update how bcast_scope is used. Change fm_mds_async_send to check bcast_scope. If it is not NCSMDS_SCOPE_NONE, then use i

[devel] [PATCH 0/1] Review Request for fmd: fix regression interacting with PLM [#2844]

2018-04-30 Thread Alex Jones
services y Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 9ab40a006c71a27c140cea5a32ab71b33facdb25 Author: Alex Jones D

[devel] [PATCH 0/4] Review Request for lck: resurrect apitests [#2437]

2018-04-27 Thread Alex Jones
revision 494407c7d28526ac0d616f9be8c2484981bbbeda Author: Alex Jones Date: Fri, 27 Apr 2018 14:37:12 -0400 lck: resurrect apitest [#2437] Resurrect apitest revision 106200a751299a2adf20574809845098e055b874 Author: Alex Jones Date: Fri, 27 Apr 2018 14:29:53 -0400 lck: resurrect apit

[devel] [PATCH 2/4] lck: resurrect apitests [#2437]

2018-04-27 Thread Alex Jones
Resurrect apitests --- src/lck/apitest/test_ErrUnavailable.cc | 2 +- src/lck/apitest/test_saLckLimitGet.cc | 2 +- src/lck/apitest/test_saLckResourceClass.cc | 10 +++--- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/lck/apitest/test_ErrUnavailable.cc b/src/lc

[devel] [PATCH 4/4] lck: resurrect apitest [#2437]

2018-04-27 Thread Alex Jones
Resurrect apitest --- src/lck/apitest/test_ErrUnavailable.cc | 2 +- src/lck/apitest/test_saLckLimitGet.cc | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/lck/apitest/test_ErrUnavailable.cc b/src/lck/apitest/test_ErrUnavailable.cc index db1e0b72f..715efe47c 100644 ---

[devel] [PATCH 3/4] lck: resurrect apitest [#2437]

2018-04-27 Thread Alex Jones
Resurrect apitest --- src/lck/Makefile.am | 5 + src/lck/apitest/test_saLckLimitGet.cc | 7 +-- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/src/lck/Makefile.am b/src/lck/Makefile.am index 5b3102722..db3e043e1 100644 --- a/src/lck/Makefile.am +++ b/src/

[devel] [PATCH 0/1] Review Request for msgd: handle abrupt restart of remote node [#2840]

2018-04-25 Thread Alex Jones
revision e04d343ab46a7409772001c61624eb39c2eb50aa Author: Alex Jones Date: Wed, 25 Apr 2018 10:27:13 -0400 msgd: handle abrupt restart of remote node [#2840] Sometimes when a remote node restarts abruptly, queues which were created on that node, are unable to be opened again when that node comes up. There is a race

[devel] [PATCH 1/1] msgd: handle abrupt restart of remote node [#2840]

2018-04-25 Thread Alex Jones
Sometimes when a remote node restarts abruptly, queues which were created on that node, are unable to be opened again when that node comes up. There is a race condition when the remote node goes down between msgd getting the CLM and MDS events indicating node down, and immd removing the implemente

Re: [devel] [PATCH 1/1] nid: restart opensafd on failure when systemd enabled [#2839]

2018-04-25 Thread Alex Jones
only happen if REBOOT_ON_FAIL_TIMEOUT is set, (i.e. not 0). I checked the latest version, the reboot works fine if e.g. immnd fails in the nid phase and REBOOT_ON_FAIL_TIMEOUT is set. /Thanks HansN From: Alex Jones [[1]mailto:ajo...@rbbn.com] Sent: 25 April 2018 15:05 To: Ha

Re: [devel] [PATCH 1/1] nid: restart opensafd on failure when systemd enabled [#2839]

2018-04-25 Thread Alex Jones
Hi Alex, please see comment below. /Thanks HansN On 04/23/2018 03:56 PM, Alex Jones wrote: Hi Hans, I just did some tests. Maybe there is a bug in nid, but when I do not have "Restart=on-failure", the node does not reboot when I run the command

Re: [devel] [PATCH 1/1] amfd: if rootCauseEntity is PLM entity don't engage lock/lock-in [#2835]

2018-04-23 Thread Alex Jones
, please see below for some comments/questions. /Regards HansN On 04/18/2018 03:41 PM, Alex Jones wrote: When using PLM an AMF node mapped to a CLM node mapped to a PLM EE, can get stuck in locked state when rebooting, or going through a PLM EE lock/unlock. When amfd receives a START

Re: [devel] [PATCH 1/1] nid: restart opensafd on failure when systemd enabled [#2839]

2018-04-23 Thread Alex Jones
andle the reboot request if Restart=on-failure is set? /BR HansN From: Alex Jones [1] Sent: 19 April 2018 17:27:27 To: Hans Nordebäck; Anders Widell Cc: [2]opensaf-devel@lists.sourceforge.net; Alex

[devel] [PATCH 0/1] Review Request for nid: restart opensafd on failure when systemd enabled [#2839]

2018-04-19 Thread Alex Jones
OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision c67596599b7728ea45e2d449d5ba3c3103bf8452 Author: Alex J

[devel] [PATCH 1/1] nid: restart opensafd on failure when systemd enabled [#2839]

2018-04-19 Thread Alex Jones
Under certain circumstances opensafd fails to start (immnd or dtmd crashes, etc). Apr 19 15:07:31 ams-idsp-46-novnfm osafdtmd[3315]: src/dtm/dtmnd/dtm_intra_svc.cc:1778: dtm_process_internode_service_up_msg: Assertion '0' failed. We can tell systemd to restart opensafd if it fails to start. ---

[devel] [PATCH 0/1] Review Request for amfd: if rootCauseEntity is PLM entity don't engage lock/lock-in [#2835]

2018-04-18 Thread Alex Jones
the patch from ticket 2834. revision 9e09af922cf88a56ee4984abe46b01f363117e30 Author: Alex Jones Date: Wed, 18 Apr 2018 09:08:41 -0400 amfd: if rootCauseEntity is PLM entity don't engage lock/lock-in [#2835] When using PLM an AMF node mapped to a CLM node mapped to a PLM EE, can get stuck

[devel] [PATCH 1/1] amfd: if rootCauseEntity is PLM entity don't engage lock/lock-in [#2835]

2018-04-18 Thread Alex Jones
When using PLM an AMF node mapped to a CLM node mapped to a PLM EE, can get stuck in locked state when rebooting, or going through a PLM EE lock/unlock. When amfd receives a START step from CLM tracking it attempts to gracefully shutdown the AMF node using AMF admin operations lock/lock-in. When P

[devel] [PATCH 1/1] plmd: use virDomainDestroy and virDomainCreate to reset VM [#2836]

2018-04-12 Thread Alex Jones
Abrupt restart or unlock-in of child EE does not always work. virDomainReset() does not always work. Use virDomainDestroy() and virDomainCreate() instead. --- src/plm/plmd/plms_virt.cc | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/src/plm/plmd/plms_virt.cc

[devel] [PATCH 0/1] Review Request for plmd: use virDomainDestroy and virDomainCreate to reset VM [#2836]

2018-04-12 Thread Alex Jones
OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 56a0e35daf04083c5fb76270dbf0163b03500d58 Author:

[devel] [PATCH 0/1] Review Request for clmd: pass rootCauseEntity from PLM tracking to CLM tracking clients [#2834]

2018-04-12 Thread Alex Jones
a37 Author: Alex Jones Date: Thu, 12 Apr 2018 10:53:19 -0400 clmd: pass rootCauseEntity from PLM tracking to CLM tracking clients [#2834] CLM tracking clients have no context for the tracking callback. PLM rootCauseEntity is not passed by CLM to its own tracking clients. When CLM tracking

[devel] [PATCH 1/1] clmd: pass rootCauseEntity from PLM tracking to CLM tracking clients [#2834]

2018-04-12 Thread Alex Jones
CLM tracking clients have no context for the tracking callback. PLM rootCauseEntity is not passed by CLM to its own tracking clients. When CLM tracking is invoked because of PLM tracking, pass on the rootCauseEntity. --- src/clm/clmd/clms_evt.cc | 4 +-- src/clm/clmd/clms_imm.cc | 80

Re: [devel] [PATCH 1/1] msg: updated the assert condition , to avoid core [#2802]

2018-04-05 Thread Alex Jones
Ack. Alex On 04/03/2018 06:46 AM, srinivas wrote: --- src/msg/apitest/test_Me

[devel] [PATCH 0/1] Review Request for plmd: handle admin-operation-pending for EE unlock [#2819]

2018-03-26 Thread Alex Jones
services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision ae59ca0e4d33b97d3fbc28d531452e391afe488a Author: Alex J

[devel] [PATCH 1/1] plmd: handle admin-operation-pending for EE unlock [#2819]

2018-03-26 Thread Alex Jones
If EE unlock fails, it is never retried when management is regained. The EE just sits in LOCKED admin state. If EE unlock fails, the code continues as if it did succeed, setting readiness state to in-service, etc. If EE unlock fails, just return ERR_DEPLOYMENT immediately, and don't set anything
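The "return immediately and don't set anything" behaviour can be sketched in C like this. Every name below is a hypothetical stand-in invented for the example, not a plmd type, function, or return code. The state fields are updated only after the unlock step has succeeded, so a failed unlock leaves the entity exactly as it was.

```c
/* Hypothetical stand-ins for illustration; not the plmd definitions. */
enum demo_unlock_rc {
    DEMO_UNLOCK_OK = 0,
    DEMO_UNLOCK_ERR_DEPLOYMENT = 1
};

struct demo_ee {
    int admin_unlocked;  /* 1 once the admin state is UNLOCKED */
    int in_service;      /* 1 once the readiness state is in-service */
};

/* Two fake unlock operations: one that succeeds, one that fails. */
int demo_hw_unlock_ok(void)   { return 0; }
int demo_hw_unlock_fail(void) { return -1; }

/*
 * Attempt the unlock first. On failure, return at once and leave the
 * EE state untouched; on success, update admin and readiness state.
 */
enum demo_unlock_rc demo_ee_unlock(struct demo_ee *ee, int (*do_unlock)(void))
{
    if (do_unlock() != 0)
        return DEMO_UNLOCK_ERR_DEPLOYMENT;

    ee->admin_unlocked = 1;
    ee->in_service = 1;
    return DEMO_UNLOCK_OK;
}
```

Passing the failing operation returns the deployment error with both fields still 0, which is the property that makes a later retry meaningful.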

Re: [devel] [PATCH 1/1] msg: updated the assert condition , to avoid core [#2802]

2018-03-26 Thread Alex Jones
Hi Srinivas, Two comments: 1. Put the new include file before the above "msg/..." files, so it is in alphabetical order 2. change the test, so there is only one aisrc_validate call in it. Otherwise, 2 PASSED show up for the test. Alex On 03/26/2018 07:23 AM,

[devel] [PATCH 0/1] Review Request for plmd: connect to hypervisor after middleware switchover [#2817]

2018-03-22 Thread Alex Jones
not handle admin-operation-pending for child EEs while the parent EE was not available. revision 28094fa2491d458478491d6343f0be4fb5ecdbd7 Author: Alex Jones Date: Thu, 22 Mar 2018 20:46:14 -0400 plmd: connect to hypervisor after middleware switchover [#2817] Any PLM admin operation whic

[devel] [PATCH 1/1] plmd: connect to hypervisor after middleware switchover [#2817]

2018-03-22 Thread Alex Jones
Any PLM admin operation which requires hypervisor assistance (e.g. unlock-in, abrupt restart) will fail after middleware switchover. When plmcds are reconnecting to the new active plmd, the plmd does not attempt to connect to the hypervisor if the EE is a virtual machine monitor. Connect to the h

[devel] [PATCH 1/1] plmd: connect to hypervisor after middleware switchover [#2817]

2018-03-21 Thread Alex Jones
After a middleware switchover, EE admin commands that need hypervisor support do not work (e.g. unlock-in, abrupt restart). After the switchover, the plmcds on the different nodes reconnect to the new plmd. But, the new plmd does not make any contact with the hypervisors. So, the commands fail. W

[devel] [PATCH 0/1] Review Request for plmd: connect to hypervisor after middleware switchover [#2817]

2018-03-21 Thread Alex Jones
revision 6042af1f311dc6b6ec270bd0aaa8e570e6477842 Author: Alex Jones Date: Wed, 21 Mar 2018 11:49:55 -0400 plmd: connect to hypervisor after middleware switchover [#2817] After a middleware switchover, EE admin commands that need hypervisor support do not work (e.g. unlock-in, abrupt restart). After the switc

[devel] [PATCH 0/1] Review Request for msgnd: prevent race condition during q transfer [#2816]

2018-03-20 Thread Alex Jones
services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 36e5a1d4fb123862cc442301140f70e8ce10a7c4 Author: Alex Jones Date:

[devel] [PATCH 1/1] msgnd: prevent race condition during q transfer [#2816]

2018-03-20 Thread Alex Jones
During q transfer when new node is opening the q, msgnd fails to create the runtime IMM object for the queue, and the open fails. When the transfer is done, the old side and owner of the runtime object doesn't delete the IMM object until after the q transfer response is sent. This is a race condit

[devel] [PATCH 1/1] plmd: enable dynamic tracing [#2796]

2018-03-07 Thread Alex Jones
Dynamic tracing does not work with plmd. plmd overrides the USR2 signal with its own dump routine. Remove the signal handler code for USR2 in plmd. --- src/plm/plmd/plms_main.c | 20 1 file changed, 20 deletions(-) diff --git a/src/plm/plmd/plms_main.c b/src/plm/plmd/plms_ma
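Why a second USR2 handler disables the first can be sketched in C as follows. This assumes, as the description implies, that the tracing facility also relies on SIGUSR2; the handler and function names are hypothetical and not taken from plmd. A process has one disposition per signal, so the handler installed last is the only one that runs.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t trace_toggles = 0;  /* bumped by trace handler */
static volatile sig_atomic_t dumps = 0;          /* bumped by dump handler */

/* Hypothetical stand-in for a tracing toggle handler. */
static void trace_handler(int sig) { (void)sig; trace_toggles++; }

/* Hypothetical stand-in for a daemon-specific dump handler. */
static void dump_handler(int sig) { (void)sig; dumps++; }

/* Install handler for signo; returns 0 on success. */
static int install(int signo, void (*handler)(int))
{
    struct sigaction sa;
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/*
 * Returns 1 if, after installing both handlers for SIGUSR2 in turn,
 * only the later (dump) handler runs when the signal is raised.
 */
int later_handler_wins(void)
{
    trace_toggles = 0;
    dumps = 0;

    if (install(SIGUSR2, trace_handler) != 0)
        return 0;
    if (install(SIGUSR2, dump_handler) != 0)   /* replaces trace_handler */
        return 0;
    if (raise(SIGUSR2) != 0)
        return 0;

    return trace_toggles == 0 && dumps == 1;
}
```

Removing the second installation, as the patch does for plmd's dump routine, leaves the first handler in place.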

[devel] [PATCH 0/1] Review Request for plmd: enable dynamic tracing [#2796]

2018-03-07 Thread Alex Jones
libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision c75a7990a32d4d0d05bad0ba69e920dd42d780e8 Author: Alex Jones Date: Wed, 7 Mar 2018 15:3

[devel] [PATCH 0/1] Review Request for msgd: during cold sync don't add tracking entries which already exist [#2793]

2018-03-06 Thread Alex Jones
vices y OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 916b838764c03891c5e35b18626d89aadbb5ca

[devel] [PATCH 1/1] msgd: during cold sync don't add tracking entries which already exist [#2793]

2018-03-06 Thread Alex Jones
Opening of an existing msg q using saMsgQueueOpen (for q failover) may take a long time. When cold sync is done, sometimes two MDS cold sync requests are sent by the standby, so the standby can receive 2 cold syncs. The standby code to process the cold sync response blindly adds the tracking entri
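The general shape of such a fix can be sketched as an add that first checks for an existing entry, so that applying the same cold sync twice leaves the table unchanged. This is only an illustration under stated assumptions: `struct track_table`, `track_add_once`, and `apply_cold_sync` are hypothetical names, and the fixed-size array stands in for whatever data structure msgd actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACK 16

/* Hypothetical tracking table keyed by an integer handle. */
struct track_table {
  unsigned handles[MAX_TRACK];
  size_t count;
};

/* Linear search; returns true if handle h is already tracked. */
static bool track_contains(const struct track_table *t, unsigned h) {
  for (size_t i = 0; i < t->count; i++) {
    if (t->handles[i] == h) return true;
  }
  return false;
}

/* Add h only if it is not already present.
 * Returns true if added, false if it was a duplicate or the table is full. */
static bool track_add_once(struct track_table *t, unsigned h) {
  assert(t != NULL);
  if (track_contains(t, h)) return false;
  if (t->count == MAX_TRACK) return false;
  t->handles[t->count++] = h;
  return true;
}

/* Apply one cold sync response; safe to call again with the same data. */
static void apply_cold_sync(struct track_table *t, const unsigned *handles,
                            size_t n) {
  for (size_t i = 0; i < n; i++) {
    (void)track_add_once(t, handles[i]);
  }
}
```

With this shape, a standby that receives the same cold sync response twice ends up with one entry per handle instead of duplicates.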

Re: [devel] [PATCH 1/1] cpnd: Correct duration of cpnd_tmr_start in cpnd_proc_update_remote [#2787]

2018-03-06 Thread Alex Jones
signature.asc Description: OpenPGP digital signature -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot__

Re: [devel] [PATCH 0/1] Review Request for cpnd: Correct duration of cpnd_tmr_start in cpnd_proc_update_remote [#2787]

2018-02-23 Thread Alex Jones

Re: [devel] [PATCH 0/1] Review Request for msg: implement metadata size and limit fetch operations [#2626]

2018-01-31 Thread Alex Jones

[devel] [PATCH 0/1] Review Request for plm: handle race condition for EE instantiation [#2514]

2018-01-02 Thread Alex Jones
revision 14dfc8f3e86559585b072a9c18025cb562caaeff Author: Alex Jones Date: Tue, 2 Jan 2018 10:45:31 -0500 plm: handle race condition for EE instantiation [#2514] A child EE which is a controller can be shut down because its parent EE (host) has not yet connected to PLM. If the controller is a VM, and the

[devel] [PATCH 1/1] plm: handle race condition for EE instantiation [#2514]

2018-01-02 Thread Alex Jones
A child EE which is a controller can be shut down because its parent EE (host) has not yet connected to PLM. If the controller is a VM, and the host is a payload, there is a race condition when instantiating the EEs. If the host doesn't connect to PLM first, then when the controller EE (child of ho

[devel] [PATCH 0/1] Review Request for plm: don't set readiness state to in-service if EE is terminating [#2734]

2017-12-13 Thread Alex Jones
vices y OpenSAF services n Core libraries n Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 1a3ad81467d91b4f98b76657821e256645a3e5

[devel] [PATCH 1/1] plm: don't set readiness state to in-service if EE is terminating [#2734]

2017-12-13 Thread Alex Jones
If an EE goes down during a controller switchover, the TERMINATED message sent by plmc to plmd may not be received because of the switchover. In this case the EE will be stuck in the terminating presence state. If any parent of the EE is in OOS, then we can definitely set the presence state to UNINST

Re: [devel] [PATCH 0/1] Review Request for plm: handle plmc clients which abruptly terminated [#2529]

2017-12-12 Thread Alex Jones

Re: [devel] [PATCH 1/1] clmd: add dynamically created EEs to PLM entity group on standby [#2730]

2017-12-11 Thread Alex Jones

[devel] [PATCH 1/1] plm: handle plmc clients which abruptly terminated [#2529]

2017-12-07 Thread Alex Jones
In virtual environments nodes can reboot very quickly (less than 1 minute). If the reboot is abrupt, plmd may not be aware that the EE went down until after it has already come back up because plmd relies on the TCP connection to plmcd on the node. In this case, plmd will set the readiness state to

[devel] [PATCH 0/1] Review Request for plm: handle plmc clients which abruptly terminated [#2529]

2017-12-07 Thread Alex Jones
revision caa9f9f93e507748ec6fb43c97d83967f4c6045b Author: Alex Jones Date: Thu, 7 Dec 2017 11:31:46 -0500 plm: handle plmc clients which abruptly terminated [#2529] In virtual environments nodes can reboot very quickly (less than 1 minute). If the reboot is abrupt, plmd may not be aware that the EE went
