[ovs-dev] [PATCH ovn] Fix connection string in case of changes in the ovndb-servers.ocf RA

2021-03-25 Thread Michele Baldessari
Currently if you update the master_ip attribute, pacemaker will restart
the resource but the ovn master role will still listen to the previous
old ip address. The reason for this is the following piece of code:

  conn=`ovn-nbctl get NB_global . connections`
  if [ "$conn" == "[]" ]; then
ovn-nbctl -- --id=@conn_uuid create Connection \
  target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${LISTEN_ON_IP}" \
  inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
  fi

Once the connection is set the first time 'get NB_global . connections'
will always return the UUID of the connection so we're never changing
the connection data.

Let's set the connection info whenever the connection is not "[]"

Tested as follows:
1) Started with the following listening ip addresses (resource is configured 
with master_ip=172.17.1.44):
tcp   LISTEN 0  10172.17.1.44:6641   0.0.0.0:*
users:(("ovsdb-server",pid=150360,fd=15)) ino:18609895 sk:4d <->
tcp   LISTEN 0  10172.17.1.44:6642   0.0.0.0:*
users:(("ovsdb-server",pid=150379,fd=15)) ino:18609038 sk:4e <->

2) Updated the resource master_ip with:
pcs resource update ovndb_servers master_ip=4.5.6.7

3) Observed the new listening ip addresses on the master node:
[root@controller-3 stdouts]# ss -natulpe |grep 664[12]
tcp   LISTEN 0  104.5.6.7:6641   0.0.0.0:*
users:(("ovsdb-server",pid=516770,fd=15)) ino:11641860 sk:4d <->
tcp   LISTEN 0  104.5.6.7:6642   0.0.0.0:*
users:(("ovsdb-server",pid=516789,fd=15)) ino:11642885 sk:4e <->

Previously we'd observe ovsdb-server listening to the old IP even after
the resource change.

Signed-off-by: Michele Baldessari 
---
 utilities/ovndb-servers.ocf | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/utilities/ovndb-servers.ocf b/utilities/ovndb-servers.ocf
index 7351c7d64a94..896d590ecc78 100755
--- a/utilities/ovndb-servers.ocf
+++ b/utilities/ovndb-servers.ocf
@@ -259,6 +259,9 @@ ovsdb_server_notify() {
 ovn-nbctl -- --id=@conn_uuid create Connection \
 target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${LISTEN_ON_IP}" \
 inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
+   else
+CONN_UID=$(sed -e 's/^\[//' -e 's/\]$//' <<< ${conn})
+ovn-nbctl set connection "${CONN_UID}" 
target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${LISTEN_ON_IP}"
 fi
 
 conn=`ovn-sbctl get SB_global . connections`
@@ -267,6 +270,9 @@ inactivity_probe=$INACTIVE_PROBE -- set NB_Global . 
connections=@conn_uuid
 ovn-sbctl -- --id=@conn_uuid create Connection \
 target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${LISTEN_ON_IP}" \
 inactivity_probe=$INACTIVE_PROBE -- set SB_Global . connections=@conn_uuid
+else
+CONN_UID=$(sed -e 's/^\[//' -e 's/\]$//' <<< ${conn})
+ovn-sbctl set connection "${CONN_UID}" 
target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${LISTEN_ON_IP}"
 fi
 
 else
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] Make the notify() calls work with IPv6 in the OCF resource-agent

2020-06-03 Thread Michele Baldessari
When the VIP is an IPv6 address we get the following error in the
resource agent:
ovndb_servers_notify_0:355:stderr [ + ovn-sbctl -- --id=@conn_uuid create 
Connection 'target=ptcp\:6642\:[fd00:fd00:fd00:2000::a2]' 
inactivity_probe=18 -- set SB_Global . connections=@conn_uuid ]
ovndb_servers_notify_0:355:stderr [ ovn-sbctl: 
ptcp\:6642\:[fd00:fd00:fd00:2000::a2]: unexpected "[" parsing string ]

This is because MASTER_IP is an IPv6 address and is being passed to
ovn-[ns]bctl without being escaped and the command errors out with
unexpected parsing string errors. The rest of the create Connection
command was already escaping the columns, we are just missing the ip
address bits in case of IPv6.

Let's make sure we escape the '[]:' characters and avoid this problem.
Tested this on an OpenStack environment on both IPv6 and IPv4.

Signed-off-by: Michele Baldessari 
---
 utilities/ovndb-servers.ocf | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/utilities/ovndb-servers.ocf b/utilities/ovndb-servers.ocf
index 56c2bc3227c3..354818a78d4d 100755
--- a/utilities/ovndb-servers.ocf
+++ b/utilities/ovndb-servers.ocf
@@ -249,7 +249,9 @@ ovsdb_server_notify() {
 if [ "x${LISTEN_ON_MASTER_IP_ONLY}" = xno ]; then
LISTEN_ON_IP="0.0.0.0"
 else
-   LISTEN_ON_IP=${MASTER_IP}
+   # ovn-[sn]bctl want ':[]' characters to be escaped. We do so in 
order
+   # to make this work when MASTER_IP is an IPv6 address
+   LISTEN_ON_IP=$(sed -e 's/\(\[\|\]\|:\)/\\\1/g' <<< ${MASTER_IP})
 fi
 conn=`ovn-nbctl get NB_global . connections`
 if [ "$conn" == "[]" ]
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Make the notify() calls work with IPv6

2020-06-03 Thread Michele Baldessari
On Sun, May 31, 2020 at 12:19:07PM +0100, Gabriele Cerami wrote:
> AFAIK OVN patches should have PATCH ovn prefix :)

Ah ok, I guess last time I tweaked the OCF resource agent it was still
in the ovs repo.

> On 31 May, Michele Baldessari wrote:
> > ---
> >  utilities/ovndb-servers.ocf | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/utilities/ovndb-servers.ocf b/utilities/ovndb-servers.ocf
> > index 56c2bc3227c3..2181bc5e273a 100755
> > --- a/utilities/ovndb-servers.ocf
> > +++ b/utilities/ovndb-servers.ocf
> > @@ -251,11 +251,14 @@ ovsdb_server_notify() {
> >  else
> > LISTEN_ON_IP=${MASTER_IP}
> 
> May I suggest a less intrusive one liner ? LISTEN_ON_IP=$(sed -e 
> 's/\(\[\|\]\|:\)/\\\1/g' <<< ${MASTER_IP})

Sure can do, although I need to retest everything then.

thanks

> >  fi
> > +# ovn-nbctl wants ':[]' characters to be escaped. We do so in order
> > +# to make this work when MASTER_IP is an IPv6 address
> > +LISTEN_ON_IP_ESCAPED=$(echo ${LISTEN_ON_IP} | sed -e 
> > 's/\(\[\|\]\|:\)/\\\1/g')
> 

-- 
Michele Baldessari
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] Make the notify() calls work with IPv6

2020-05-31 Thread Michele Baldessari
When the VIP is an IPv6 address we get the following error in the
resource agent:
ovndb_servers_notify_0:355:stderr [ + ovn-sbctl -- --id=@conn_uuid create 
Connection 'target=ptcp\:6642\:[fd00:fd00:fd00:2000::a2]' 
inactivity_probe=18 -- set SB_Global . connections=@conn_uuid ]
ovndb_servers_notify_0:355:stderr [ ovn-sbctl: 
ptcp\:6642\:[fd00:fd00:fd00:2000::a2]: unexpected "[" parsing string ]

This is because MASTER_IP is an IPv6 address and is being passed to
ovn-[ns]bctl without being escaped and the command errors out with
unexpected parsing string errors. The rest of the create Connection
command was already escaping the columns, we are just missing the ip
address bits in case of IPv6.

Let's make sure we escape the '[]:' characters and avoid this problem.
Tested this on an OpenStack environment on both IPv6 and IPv4.

Signed-off-by: Michele Baldessari 
---
 utilities/ovndb-servers.ocf | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/utilities/ovndb-servers.ocf b/utilities/ovndb-servers.ocf
index 56c2bc3227c3..2181bc5e273a 100755
--- a/utilities/ovndb-servers.ocf
+++ b/utilities/ovndb-servers.ocf
@@ -251,11 +251,14 @@ ovsdb_server_notify() {
 else
LISTEN_ON_IP=${MASTER_IP}
 fi
+# ovn-nbctl wants ':[]' characters to be escaped. We do so in order
+# to make this work when MASTER_IP is an IPv6 address
+LISTEN_ON_IP_ESCAPED=$(echo ${LISTEN_ON_IP} | sed -e 
's/\(\[\|\]\|:\)/\\\1/g')
 conn=`ovn-nbctl get NB_global . connections`
 if [ "$conn" == "[]" ]
 then
 ovn-nbctl -- --id=@conn_uuid create Connection \
-target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${LISTEN_ON_IP}" \
+target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${LISTEN_ON_IP_ESCAPED}" \
 inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
 fi
 
@@ -263,7 +266,7 @@ inactivity_probe=$INACTIVE_PROBE -- set NB_Global . 
connections=@conn_uuid
 if [ "$conn" == "[]" ]
 then
 ovn-sbctl -- --id=@conn_uuid create Connection \
-target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${LISTEN_ON_IP}" \
+target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${LISTEN_ON_IP_ESCAPED}" \
 inactivity_probe=$INACTIVE_PROBE -- set SB_Global . connections=@conn_uuid
 fi
 
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-14 Thread Michele Baldessari
On Wed, Aug 14, 2019 at 02:28:13PM +0300, Ilya Maximets wrote:
> On 14.08.2019 11:39, Michele Baldessari wrote:
> > In some of our destructive testing of ovn-dbs inside containers managed
> > by pacemaker we reached a situation where /var/run/openvswitch had
> > empty .pid files. The current code does not deal well with them
> > and pidfile_is_running() returns true in such a case and this confuses
> > the OCF resource agent.
> > 
> > - Before this change:
> > Inside a container run:
> >   killall ovsdb-server;
> >   echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
> > /var/run/openvswitch/ovnsb_db.pid
> > 
> > We will observe that the cluster is unable to ever recover because
> > it believes the ovn processes to be running when they really aren't and
> > eventually just fails:
> >  podman container set: ovn-dbs-bundle 
> > [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
> >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
> >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped controller-1
> >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2
> > 
> > - After this change the cluster is able to recover from this state and
> > correctly start the resource:
> >  podman container set: ovn-dbs-bundle 
> > [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
> >ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
> >ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave controller-1
> >ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2
> > 
> > Signed-off-by: Michele Baldessari 
> > ---
> >  ovn/utilities/ovn-ctl | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
> > index 7e5cd469c83c..65f03e28ddba 100755
> > --- a/ovn/utilities/ovn-ctl
> > +++ b/ovn/utilities/ovn-ctl
> > @@ -35,7 +35,7 @@ 
> > ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
> >  
> >  pidfile_is_running () {
> >  pidfile=$1
> > -test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
> > +test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && 
> > pid_exists "$pid"
> 
> Hi. Thanks for the fix!
> 
> Maybe it's better to add additional check for an empty argument to
> 'pid_exists' function instead? This will cover more cases like invocations
> from the utilities/ovs-lib.in.
> 
> I think, you may also add following tag to commit-message in this case:
> Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")
> 
> This patch also will be needed in ovn-org/ovn repository too.
> (Use 'PATCH ovn' subject prefix while sending patches targeted for ovn repo.)
> 
> Best regards, Ilya Maximets.

Thanks for the feedback Ilya, I have amended things (hopefully correctly) in
http://patchwork.ozlabs.org/patch/1147111/ (I could not figure out how
to update an existing patch in patchwork, I hope this is okay)

Kind regards,
Michele
-- 
Michele Baldessari
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2, ovn] Make pid_exists() more robust against empty pid argument

2019-08-14 Thread Michele Baldessari
In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them
and pidfile_is_running() returns true in such a case and this confuses
the OCF resource agent.

- Before this change:
Inside a container run:
  killall ovsdb-server;
  echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
/var/run/openvswitch/ovnsb_db.pid

We will observe that the cluster is unable to ever recover because
it believes the ovn processes to be running when they really aren't and
eventually just fails:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

Let's make sure pid_exists() returns false when the pid is an empty
string.

- After this change the cluster is able to recover from this state and
correctly start the resource:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")

Signed-off-by: Michele Baldessari 
---
v1 -> v2

- Implemented Ilya's suggestion and moved the check from
  pidfile_is_running() to pid_exists() and re-run my tests
---
 utilities/ovs-lib.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index fa840ec637f5..dc485413ef0c 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -127,7 +127,7 @@ fi
 pid_exists () {
 # This is better than "kill -0" because it doesn't require permission to
 # send a signal (so daemon_status in particular works as non-root).
-test -d /proc/"$1"
+test -n "$1" && test -d /proc/"$1"
 }
 
 pid_comm_check () {
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] Make pid_exists() more robust against empty pid argument

2019-08-14 Thread Michele Baldessari
In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them
and pidfile_is_running() returns true in such a case and this confuses
the OCF resource agent.

- Before this change:
Inside a container run:
  killall ovsdb-server;
  echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
/var/run/openvswitch/ovnsb_db.pid

We will observe that the cluster is unable to ever recover because
it believes the ovn processes to be running when they really aren't and
eventually just fails:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

Let's make sure pid_exists() returns false when the pid is an empty
string.

- After this change the cluster is able to recover from this state and
correctly start the resource:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")

Signed-off-by: Michele Baldessari 
---
v1 -> v2

- Implemented Ilya's suggestion and moved the check from
  pidfile_is_running() to pid_exists() and re-run my tests
---
 utilities/ovs-lib.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index fa840ec637f5..dc485413ef0c 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -127,7 +127,7 @@ fi
 pid_exists () {
 # This is better than "kill -0" because it doesn't require permission to
 # send a signal (so daemon_status in particular works as non-root).
-test -d /proc/"$1"
+test -n "$1" && test -d /proc/"$1"
 }
 
 pid_comm_check () {
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] Make pidfile_is_running more robust against empty pidfiles

2019-08-14 Thread Michele Baldessari
In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them
and pidfile_is_running() returns true in such a case and this confuses
the OCF resource agent.

- Before this change:
Inside a container run:
  killall ovsdb-server;
  echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > 
/var/run/openvswitch/ovnsb_db.pid

We will observe that the cluster is unable to ever recover because
it believes the ovn processes to be running when they really aren't and
eventually just fails:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Stopped controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

- After this change the cluster is able to recover from this state and
correctly start the resource:
 podman container set: ovn-dbs-bundle 
[192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers):   Master controller-0
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers):   Slave controller-1
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers):   Slave controller-2

Signed-off-by: Michele Baldessari 
---
 ovn/utilities/ovn-ctl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovn/utilities/ovn-ctl b/ovn/utilities/ovn-ctl
index 7e5cd469c83c..65f03e28ddba 100755
--- a/ovn/utilities/ovn-ctl
+++ b/ovn/utilities/ovn-ctl
@@ -35,7 +35,7 @@ ovn_northd_db_conf_file="$etcdir/ovn-northd-db-params.conf"
 
 pidfile_is_running () {
 pidfile=$1
-test -e "$pidfile" && pid=`cat "$pidfile"` && pid_exists "$pid"
+test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && 
pid_exists "$pid"
 } >/dev/null 2>&1
 
 stop_nb_ovsdb() {
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] OVN resource agent - make promotion synchronous

2019-07-17 Thread Michele Baldessari
On Wed, Jul 17, 2019 at 06:23:56AM -0500, Terry Wilson wrote:
> Is this just waiting on a couple of line length issues to be fixed? Or do
> those really matter?

Hi Terry,

I kind of ignored it because the ovn/utilities/ovndb-servers.ocf file
has already a bunch of lines > 79 chars. I can still fix it up if that
is preferred?

cheers,
Michele

> 
> On Tue, Jul 9, 2019 at 3:10 AM 0-day Robot  wrote:
> 
> > Bleep bloop.  Greetings Michele Baldessari, I am a robot and I have tried
> > out your patch.
> > Thanks for your contribution.
> >
> > I encountered some error that I wasn't expecting.  See the details below.
> >
> >
> > checkpatch:
> > WARNING: Line is 89 characters long (recommended limit is 79)
> > #50 FILE: ovn/utilities/ovndb-servers.ocf:545:
> > ocf_log debug "ovndb_servers: Waiting for promotion $host_name as
> > master to complete"
> >
> > WARNING: Line is 82 characters long (recommended limit is 79)
> > #58 FILE: ovn/utilities/ovndb-servers.ocf:553:
> > ocf_log debug "ovndb_servers: Promotion of $host_name as the master
> > completed"
> >
> > Lines checked: 64, Warnings: 2, Errors: 0
> >
> >
> > Please check this out.  If you feel there has been an error, please email
> > acon...@bytheb.org
> >
> > Thanks,
> > 0-day Robot
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >

-- 
Michele Baldessari
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] OVN resource agent - make promotion synchronous

2019-07-09 Thread Michele Baldessari
Currently inside the ovsdb_server_promote() function we call 'promote_ovnnb'
and 'promote_ovnsb' and then just record the new master state in the
CIB.

This creates a race because those two promote commands are asynchronous
so when we exit the ovsdb_server_promote() function the underlying DBs
are not guaranteed to be in master state. That means that clients might
connect to an instance that is in read-only mode.

We add a simple sleep loop where we wait for the underlying DB state to
confirm the master state. We do not need to add a timeout loop because
in case of an issue the resource timeout set within pacemaker will kick
in and the resource agent script will be killed by pacemaker.

Tested this within an openstack environment using ovn with roughly ~20
reboots and was unable to trigger the issue (before the patch we would
trigger the issue after a couple of reboots tops).

Signed-off-by: Michele Baldessari 
---
 ovn/utilities/ovndb-servers.ocf | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
index 10313304cb7c..cd47426689ef 100755
--- a/ovn/utilities/ovndb-servers.ocf
+++ b/ovn/utilities/ovndb-servers.ocf
@@ -516,6 +516,8 @@ ovsdb_server_stop() {
 }
 
 ovsdb_server_promote() {
+local state
+
 ovsdb_server_check_status ignore_northd
 rc=$?
 case $rc in
@@ -540,7 +542,15 @@ ovsdb_server_promote() {
 ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
 fi
 
-ocf_log debug "ovndb_servers: Promoting $host_name as the master"
+ocf_log debug "ovndb_servers: Waiting for promotion $host_name as master 
to complete"
+ovsdb_server_check_status
+state=$?
+while [ "$state" != "$OCF_RUNNING_MASTER" ]; do
+  sleep 1
+  ovsdb_server_check_status
+  state=$?
+done
+ocf_log debug "ovndb_servers: Promotion of $host_name as the master 
completed"
 # Record ourselves so that the agent has a better chance of doing
 # the right thing at startup
 ${CRM_ATTR_REPL_INFO} -v "$host_name"
-- 
2.21.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev