From a superficial look at the code, it seems that the raft code should
allow me to specify a DNS name instead of an IP address, e.g.
'tcp:ovsdb-0.ovsdb.svc:6644', which would be ideal. I thought I'd tried
that and it didn't work, but it's possible that one of the following is
true:
* I did try this, but ovn-ctl barfed without passing it to ovsdb-server.
* I did try this, but I have an old version of ovsdb-server which
  requires an IP address.
* I didn't actually try this, and I need to put down the crack pipe.

Using a DNS name would be the optimal behaviour in a K8S cluster, so
I'm suddenly feeling better about this again.

Matt

On Thu, 9 Jul 2020 at 15:53, Matthew Booth <mbo...@redhat.com> wrote:
>
> On Thu, 9 Jul 2020 at 11:53, Matthew Booth <mbo...@redhat.com> wrote:
> >
> > I'm running a 3-node ovsdb raft cluster in kubernetes without using
> > host networking, NET_ADMIN, or any special networking privileges. I'm
> > using a StatefulSet, so I have persistent storage and a persistent
> > network name. However, I don't have a persistent IP. I have studied 2
> > existing implementations of OVN including [1], but as they are both
> > focussed on providing SDN service to the cluster itself (which I'm
> > not: I'm just a regular tenant of the cluster), they both legitimately
> > use host networking and therefore don't suffer this issue.
> >
> > [1] https://github.com/ovn-org/ovn-kubernetes/blob/master/dist/templates/ovnkube-db-raft.yaml.j2
> >
> > I finally managed to test what happens when a pod's IP changes, and
> > the answer is: it breaks. Specifically, the logs are full of:
> >
> > 2020-07-09T10:09:16Z|06012|socket_util|ERR|Dropped 59 log messages in last 59 seconds (most recently, 1 seconds ago) due to excessive rate
> > 2020-07-09T10:09:16Z|06013|socket_util|ERR|6644:10.131.0.4: bind: Cannot assign requested address
> > 2020-07-09T10:09:16Z|06014|raft|WARN|Dropped 59 log messages in last 59 seconds (most recently, 1 seconds ago) due to excessive rate
> > 2020-07-09T10:09:16Z|06015|raft|WARN|ptcp:6644:10.131.0.4: listen failed (Cannot assign requested address)
> >
> > The reason it can't bind to 10.131.0.4 is that it's no longer a local
> > IP address.
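
The bind failure quoted above comes down to one question: is the address
recorded for this server still assigned to the pod? A minimal shell
sketch of that check follows. The helper name is made up and is not part
of OVS or OVN, and the list of local addresses is passed in explicitly
(something like `hostname -I` could supply it, but that is an assumption
about the container image). 10.131.0.27 is an invented "new" pod IP used
only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS/OVN): succeed if ADDR appears in the
# whitespace-separated list LOCAL_ADDRS, fail otherwise.
addr_is_local() {
    addr="$1"
    local_addrs="$2"
    for candidate in $local_addrs; do
        if [ "$candidate" = "$addr" ]; then
            return 0
        fi
    done
    return 1
}

# 10.131.0.4 is the address from the log; the second argument stands in for
# the pod's current addresses (invented for this example).
if addr_is_local "10.131.0.4" "127.0.0.1 10.131.0.27"; then
    echo "recorded raft address is still local"
else
    echo "recorded raft address is not local; bind will fail"
fi
```

A check like this only detects the problem before ovsdb-server starts
logging bind errors; it does nothing to repair the address stored in the
database.
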
> >
> > Note that this is binding the raft cluster port, not the client port.
> > I have clients connecting to a service IP, which is static. I can't
> > specifically test that it still works after the pod IPs change, but as
> > it worked before there's no reason to suspect it won't.
> >
> > My first thought was to use service IPs for the raft cluster, too, but
> > if it wants to bind to its local cluster IP that's never going to
> > work, because the service IP is never a local IP address (traffic is
> > forwarded by an external service).
> >
> > ovsdb-server is invoked in its container by ovn-ctl:
> >
> > exec /usr/share/openvswitch/scripts/ovn-ctl \
> >     --no-monitor \
> >     --db-nb-create-insecure-remote=yes \
> >     --db-nb-cluster-remote-addr="$(bracketify ${initialiser_ip})" \
> >     --db-nb-cluster-local-addr="$(bracketify ${LOCAL_IP})" \
> >     --db-nb-cluster-local-proto=tcp \
> >     --db-nb-cluster-remote-proto=tcp \
> >     --ovn-nb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
> >     run_nb_ovsdb
> >
> > initialiser_ip is the pod IP address of the pod which comes up first.
> > This is a bootstrapping thing, and afaik isn't relevant once the
> > cluster is initialised. It certainly doesn't appear in the command
> > line below. LOCAL_IP is the current ip address of this pod.
> > Surprisingly (to me), this doesn't appear in the ovsdb-server
> > invocation either.
> > The actual invocation is:
> >
> > ovsdb-server -vconsole:info -vfile:off
> > --log-file=/var/log/openvswitch/ovsdb-server-sb.log
> > --remote=punix:/pod-run/ovnsb_db.sock --pidfile=/pod-run/ovnsb_db.pid
> > --unixctl=ovnsb_db.ctl
> > --remote=db:OVN_Southbound,SB_Global,connections
> > --private-key=db:OVN_Southbound,SSL,private_key
> > --certificate=db:OVN_Southbound,SSL,certificate
> > --ca-cert=db:OVN_Southbound,SSL,ca_cert
> > --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols
> > --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers
> > --remote=ptcp:6642:0.0.0.0 /var/lib/openvswitch/ovnsb_db.db
> >
> > So it's getting its former IP address from somewhere. As the only
> > local state is the database itself, I assume it's reading it from the
> > DB's cluster table. Here's what it currently thinks about cluster
> > state:
> >
> > # ovs-appctl -t /pod-run/ovnsb_db.ctl cluster/status OVN_Southbound
> > 83c7
> > Name: OVN_Southbound
> > Cluster ID: 1524 (1524187a-8a7b-41d5-89cf-ad2d00141258)
> > Server ID: 83c7 (83c771fd-d866-4324-bdd6-707c1bf72010)
> > Address: tcp:10.131.0.4:6644
> > Status: cluster member
> > Role: candidate
> > Term: 41039
> > Leader: unknown
> > Vote: self
> >
> > Log: [5526, 5526]
> > Entries not yet committed: 0
> > Entries not yet applied: 0
> > Connections: (->7f46) (->66fc)
> > Servers:
> >     83c7 (83c7 at tcp:10.131.0.4:6644) (self) (voted for 83c7)
> >     7f46 (7f46 at tcp:10.129.2.9:6644)
> >     66fc (66fc at tcp:10.128.2.13:6644)
> >
> > This highlights the next problem, which is that both the other IPs
> > have changed, too. I know the new IP addresses of the other 2 cluster
> > nodes, although I don't know which one is 7f46 (but presumably it
> > knows). Even if I did know, presumably I can't modify the db while
> > it's not a member of the cluster anyway.
> > The only way I can currently think of to recover this situation is:
> >
> > * Scale back the cluster to just node-0
> > * node-0 converts itself to a standalone db
> > * node-0 converts itself to a cluster db with a new local IP
> > * Scale the cluster back up to 3 nodes, initialised from node-0
> >
> > I haven't tested this so there may be problems with it, but in any
> > case it's not a realistic solution.
> >
> > A much nicer solution would be to use a service IP for the raft
> > cluster, but from the above error message I'm not expecting that to
> > work because it won't be able to bind it. I'm going to test this
> > today, and I'll update if I find to the contrary.
>
> Just to confirm I tested this and, as expected, ovsdb-server fails to
> start with:
>
> 2020-07-09T14:49:30Z|00013|socket_util|ERR|6643:172.30.84.58: bind: Cannot assign requested address
> 2020-07-09T14:49:30Z|00014|raft|WARN|ptcp:6643:172.30.84.58: listen failed (Cannot assign requested address)
>
> In this case 172.30.84.58 is the stable service IP associated with
> this node, but it is not assigned directly to the node.
>
> Matt
> --
> Matthew Booth
> Red Hat OpenStack Engineer, Compute DFG
>
> Phone: +442070094448 (UK)

--
Matthew Booth
Red Hat OpenStack Engineer, Compute DFG

Phone: +442070094448 (UK)
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
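
For the DNS-name idea at the top of this message, here is a rough shell
sketch of how the raft address could be derived from the StatefulSet's
stable pod name rather than from the pod IP. The helper names and the
'ovsdb' service name are illustrative only. Whether ovn-ctl and
ovsdb-server actually accept a hostname here is exactly the open
question above, so treat this as untested.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OVS/OVN).

# Build the stable DNS name of a StatefulSet pod, e.g. 'ovsdb-0.ovsdb.svc'.
raft_host() {
    pod="$1"
    service="$2"
    printf '%s.%s.svc\n' "$pod" "$service"
}

# Build a full raft address, e.g. 'tcp:ovsdb-0.ovsdb.svc:6644'.
raft_addr() {
    pod="$1"
    service="$2"
    port="$3"
    printf 'tcp:%s:%s\n' "$(raft_host "$pod" "$service")" "$port"
}

# Assumption: HOSTNAME holds the pod name inside the container; fall back
# to 'ovsdb-0' when it is unset.
raft_addr "${HOSTNAME:-ovsdb-0}" ovsdb 6644
```

With ovn-ctl invoked as in the quoted message, the protocol is given
separately via --db-nb-cluster-local-proto, so only the output of
raft_host would go into --db-nb-cluster-local-addr in place of
${LOCAL_IP}.
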