On Mon, Oct 09, 2017 at 01:13:56PM -0400, Russell Bryant wrote:
> On Mon, Sep 25, 2017 at 2:29 PM, Ben Pfaff <b...@ovn.org> wrote:
> > On Mon, Sep 25, 2017 at 11:09:49AM -0700, Han Zhou wrote:
> >> On Mon, Sep 25, 2017 at 2:36 AM, Miguel Angel Ajo Pelayo <
> >> majop...@redhat.com> wrote:
> >> >
> >> > I believe Lucas Alvares could give you valuable feedback on this, as
> >> > he was planning to use this as a mechanism for synchronization on
> >> > the networking-ovn side (if I didn't get it wrong).
> >> >
> >> > I believe he's back by October.
> >> >
> >> > Best regards.
> >> > Miguel Ángel.
> >> >
> >> > On Fri, Sep 22, 2017 at 6:58 PM, Ben Pfaff <b...@ovn.org> wrote:
> >> >
> >> > > We've had a couple of brief discussions during the OVN meeting about
> >> > > locks in OVSDB. As I understand it, a few services use OVSDB locks
> >> > > to avoid duplicating work. The question is whether and how to extend
> >> > > OVSDB locks to a distributed context.
> >> > >
> >> > > First, I think it's worth reviewing how OVSDB locks work, filling in
> >> > > some of the implications that aren't covered by RFC 7047. OVSDB
> >> > > locks are server-level (not database-level) objects that can be
> >> > > owned by at most one client at a time. Clients can obtain them
> >> > > either through a "lock" operation, in which case they are queued to
> >> > > obtain the lock when it's no longer owned by anyone else, or through
> >> > > a "steal" operation that always succeeds immediately, kicking out
> >> > > whoever (if anyone) previously owned the lock. A client loses a lock
> >> > > whenever it releases it with an "unlock" operation or whenever its
> >> > > connection to the server drops. The server notifies a client
> >> > > whenever it acquires a lock or whenever the lock is stolen by
> >> > > another client.
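[The "lock"/"steal"/"unlock" semantics above correspond to JSON-RPC methods defined in RFC 7047 section 4.1.8. A minimal sketch of what those requests look like on the wire follows; the lock name "example_lock" is just an illustration:]

```python
import json

def make_lock_request(msg_id, method, lock_name):
    """Build an OVSDB lock-related JSON-RPC request (RFC 7047 sec. 4.1.8).

    Each of "lock", "steal", and "unlock" takes the lock name as its
    single parameter.
    """
    assert method in ("lock", "steal", "unlock")
    return {"id": msg_id, "method": method, "params": [lock_name]}

# A client asking to be queued for the lock named "example_lock":
request = make_lock_request(0, "lock", "example_lock")
print(json.dumps(request))

# Per RFC 7047, the server replies {"id": 0, "result": {"locked": true}}
# if the lock was free; otherwise {"locked": false}, and a "locked"
# notification arrives later once the client reaches the front of the
# queue. A "stolen" notification tells a client it has lost the lock.
```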
> >> > >
> >> > > This scheme works perfectly for one particular scenario: where the
> >> > > resource protected by the lock is an OVSDB database (or part of
> >> > > one) on the same server as the lock. This is because OVSDB
> >> > > transactions include an "assert" operation that names a lock and
> >> > > aborts the transaction if the client does not hold the lock. Since
> >> > > the server is both the lock manager and the implementer of the
> >> > > transaction, it can always make the correct decision. This scenario
> >> > > could be extended to distributed locks with the same guarantee.
> >> > >
> >> > > Another scenario that could work acceptably with distributed OVSDB
> >> > > locks is one where the lock guards against duplicated work. For
> >> > > example, suppose a couple of ovn-northd instances both try to grab
> >> > > a lock, with only the winner actually running, to avoid having both
> >> > > of them spend a lot of CPU time recomputing the southbound flow
> >> > > table. A distributed version of OVSDB locks would probably work
> >> > > fine in practice for this, although occasionally, due to network
> >> > > propagation delays, "steal" operations, or different ideas between
> >> > > client and server of when a session has dropped, both ovn-northd
> >> > > instances might think they have the lock. (If, however, they
> >> > > combined this with "assert" when they actually committed their
> >> > > changes to the southbound database, then they would never actually
> >> > > interfere with each other in database commits.)
> >> > >
> >> > > A scenario that would not work acceptably with distributed OVSDB
> >> > > locks, without a change to the model, is one where the lock ensures
> >> > > correctness, that is, where bad things happen if two clients both
> >> > > think they have the lock. I believe that this requires clients to
> >> > > understand a concept of leases, which OVSDB doesn't currently have.
> >> > > The "steal" operation is also problematic in this model, since it
> >> > > would require canceling a lease. (This scenario also does not work
> >> > > acceptably with single-server OVSDB locks.)
> >> > >
> >> > > I'd appreciate anyone's thoughts on the topic.
> >> > >
> >> > > This webpage is good reading:
> >> > > https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Ben.
> >>
> >> Hi Ben,
> >>
> >> If I understand correctly, you are saying that clustering wouldn't
> >> introduce any new restriction to the locking mechanism, compared with
> >> the current single-node implementation. Both the new and old
> >> approaches support avoiding redundant work, but not correctness
> >> (unless "assert" or some other "fence" is used). Is this correct?
> >
> > It's accurate that clustering would not technically introduce new
> > restrictions. It will increase race windows, especially over Unix
> > sockets, so anyone who is currently (incorrectly) relying on OVSDB
> > locking for correctness will probably start seeing failures that they
> > did not see before. I'd be pleased to hear that no one is doing this.
> > You discussed the ovn-northd use case in your original post (thanks!).
>
> The existing Neutron integration use case should be fine. In that
> case, it's not committing any transactions. The lock is only used to
> ensure that only one server is processing logical switch port "up"
> state. If more than one thinks it has the lock, the worst that can
> happen is that we send the same port event through OpenStack more than
> once. That's mostly harmless, aside from a log message.
>
> Miguel mentioned that it might be used for an additional use case that
> Lucas is working on, but OVSDB locks are not used there.
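[The correctness problem quoted above is the subject of the Kleppmann article linked in the thread. Its suggested remedy, "fencing tokens", can be sketched as a toy model; all class and method names here are invented for illustration, not part of OVSDB:]

```python
class FencedLock:
    """Toy lock service following the fencing-token idea from the
    Kleppmann article: each lease comes with a monotonically increasing
    token, which the protected resource uses to reject stale writers."""
    def __init__(self):
        self._token = 0

    def acquire(self):
        self._token += 1
        return self._token        # token must accompany every write

class FencedStore:
    """Toy protected resource that remembers the newest token seen."""
    def __init__(self):
        self._highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self._highest_token:
            return False          # stale lease holder: write fenced off
        self._highest_token = token
        self.value = value
        return True

lock = FencedLock()
store = FencedStore()
t1 = lock.acquire()                   # client A gets the lock...
t2 = lock.acquire()                   # ...its lease lapses, B takes over
assert store.write(t2, "from B")      # B's write lands
assert not store.write(t1, "from A")  # A's delayed write is rejected
assert store.value == "from B"
```

With plain locks, client A's delayed write would silently clobber B's; the token makes the resource itself enforce the lease ordering, which is the model OVSDB would need for correctness-critical locking.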
OK, thanks. My current patch series does not implement distributed
locks, but now I can start designing the feature.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev