On 17/06/14 00:28, Joshua Harlow wrote:
> So this is a reader/writer lock then?
>
> I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
> kazoo (zookeeper python library) but there was a lack of a maintainer for
> that 'recipe', perhaps if we really find this needed we can help get that
> pull request 'sponsored' so that it can be used for this purpose?
>
> As far as resiliency, the thing I was thinking about was how correct do u
> want this lock to be?
>
> If u say go with memcached and a locking mechanism using it, this will not
> be correct but it might work well enough under normal usage. So that's why
> I was wondering about what level of correctness you want and what you
> want to happen if a server that is maintaining the lock record dies.
> In memcached's case this will literally be 1 server, even if sharding is
> being used, since a key hashes to one server. So if that one server goes
> down (or a network split happens) then it is possible for two entities to
> believe they own the same lock (and if the network split recovers this
> gets even weirder); so that's what I was wondering about when mentioning
> resiliency and how much incorrectness you are willing to tolerate.
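
(To make the above concrete: the usual memcached "lock" is little more than
an atomic add with a TTL. A minimal, untested sketch, assuming the
python-memcached client; the server address, key prefix and timeout below
are invented for illustration:)

    # Sketch only: a memcached-style "lock" via atomic add with a TTL.
    # add() only stores the key if it doesn't already exist, so at most one
    # caller wins per expiry window; a dead server or a network split can
    # still let two nodes believe they hold the same lock.
    import memcache

    mc = memcache.Client(['192.0.2.1:11211'])   # hypothetical server

    def try_lock(name, holder, ttl=30):
        # True means we "own" the lock until the TTL expires.
        return bool(mc.add('lock/' + name, holder, time=ttl))

    def unlock(name):
        mc.delete('lock/' + name)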
From my POV, the most important things are:

* 2 nodes must never believe they hold the same lock
* A node must eventually get the lock

I was expecting to implement locking on all three backends as long as
they support it. I haven't looked closely at memcached, but if it can
detect a split it should be able to have a fencing race with the
possible lock holder before continuing. This is obviously undesirable,
as you will probably be fencing an otherwise correctly functioning
node, but it will be correct.

Matt

>
> -----Original Message-----
> From: Matthew Booth <mbo...@redhat.com>
> Organization: Red Hat
> Date: Friday, June 13, 2014 at 1:40 AM
> To: Joshua Harlow <harlo...@yahoo-inc.com>, "OpenStack Development Mailing
> List (not for usage questions)" <openstack-dev@lists.openstack.org>
> Subject: Re: [openstack-dev] [nova] Distributed locking
>
>> On 12/06/14 21:38, Joshua Harlow wrote:
>>> So just a few thoughts before going too far down this path,
>>>
>>> Can we make sure we really really understand the use-case where we think
>>> this is needed. I think it's fine that this use-case exists, but I just
>>> want to make it very clear to others why it's needed and why distributed
>>> locking is the only *correct* way.
>>
>> An example use of this would be side-loading an image from another
>> node's image cache rather than fetching it from glance, which would have
>> very significant performance benefits in the VMware driver, and possibly
>> other places. The copier must take a read lock on the image to prevent
>> the owner from ageing it during the copy. Holding a read lock would also
>> assure the copier that the image it is copying is complete.
>>
>>> This helps set a good precedent for others that may follow down this
>>> path, that they also clearly explain the situation, how distributed
>>> locking fixes it and all the corner cases that now pop up with
>>> distributed locking.
>>>
>>> Some of the questions that I can think of at the current moment:
>>>
>>> * What happens when a node goes down that owns the lock, how does the
>>> software react to this?
>>
>> This can be well defined according to the behaviour of the backend. For
>> example, it is well defined in zookeeper when a node's session expires.
>> If the lock holder is no longer a valid node, it would be fenced before
>> deleting its lock, allowing other nodes to continue.
>>
>> Without fencing it would not be possible to safely continue in this case.
>>
>>> * What resources are being locked; what is the lock target, what is its
>>> lifetime?
>>
>> These are not questions for a locking implementation. A lock would be
>> held on a name, and it would be up to the api user to ensure that the
>> protected resource is only used while correctly locked, and that the
>> lock is not held longer than necessary.
>>
>>> * What resiliency do you want this lock to provide (this becomes a
>>> critical question when considering memcached, since memcached is not
>>> really the best choice for a resilient distributed locking backend)?
>>
>> What does resiliency mean in this context? We really just need the lock
>> to be correct.
>>
>>> * What do entities that try to acquire a lock do when they can't acquire
>>> it?
>>
>> Typically block, but if a use case emerged for trylock() it would be
>> simple to implement. For example, in the image side-loading case we may
>> decide that if it isn't possible to immediately acquire the lock it
>> isn't worth waiting, and we just fetch it from glance anyway.
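
(For illustration only, here is what blocking vs. trylock()-style behaviour
could look like on the zookeeper backend, using kazoo's existing Lock
recipe. The hosts, path and identifier below are invented, and this is a
rough sketch rather than the proposed Nova api:)

    # Sketch only: non-blocking ("trylock") acquisition with kazoo.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')  # hypothetical ensemble
    zk.start()

    lock = zk.Lock('/nova/image-cache/some-image-id', identifier='node-1')

    # acquire(blocking=False) returns immediately instead of waiting.
    if lock.acquire(blocking=False):
        try:
            print('lock held: side-load the image from the peer cache here')
        finally:
            lock.release()
    else:
        print('lock busy: fall back to fetching the image from glance')

    # The default, lock.acquire(), would simply block until the lock is
    # granted.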
>>
>>> A useful thing I wrote up a while ago, might still be useful:
>>>
>>> https://wiki.openstack.org/wiki/StructuredWorkflowLocks
>>>
>>> Feel free to move that wiki if u find it useful (it's sorta a high-level
>>> doc on the different strategies and such).
>>
>> Nice list of implementation pros/cons.
>>
>> Matt
>>
>>>
>>> -Josh
>>>
>>> -----Original Message-----
>>> From: Matthew Booth <mbo...@redhat.com>
>>> Organization: Red Hat
>>> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev@lists.openstack.org>
>>> Date: Thursday, June 12, 2014 at 7:30 AM
>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev@lists.openstack.org>
>>> Subject: [openstack-dev] [nova] Distributed locking
>>>
>>>> We have a need for a distributed lock in the VMware driver, which I
>>>> suspect isn't unique. Specifically, it is possible for a VMware
>>>> datastore to be accessed via multiple nova nodes if it is shared
>>>> between clusters[1]. Unfortunately the vSphere API doesn't provide us
>>>> with the primitives to implement robust locking using the storage
>>>> layer itself, so we're looking elsewhere.
>>>>
>>>> The closest we seem to have in Nova currently are service groups, which
>>>> currently have 3 implementations: DB, Zookeeper and Memcached. The
>>>> service group api currently provides simple membership, but for locking
>>>> we'd be looking for something more.
>>>>
>>>> I think the api we'd be looking for would be something along the lines
>>>> of:
>>>>
>>>> Foo.lock(name, fence_info)
>>>> Foo.unlock(name)
>>>>
>>>> Bar.fence(fence_info)
>>>>
>>>> Note that fencing would be required in this case. We believe we can
>>>> fence by terminating the other Nova's vSphere session, but other
>>>> options might include killing a Nova process, or STONITH. These would
>>>> be implemented as fencing drivers.
>>>>
>>>> Although I haven't worked through the detail, I believe lock and unlock
>>>> would be implementable in all 3 of the current service group drivers.
>>>> Fencing would be implemented separately.
>>>>
>>>> My questions:
>>>>
>>>> * Does this already exist, or does anybody have patches pending to do
>>>> something like this?
>>>> * Are there other users for this?
>>>> * Would service groups be an appropriate place, or a new distributed
>>>> locking class?
>>>> * How about if we just used zookeeper directly in the driver?
>>>>
>>>> Matt
>>>>
>>>> [1] Cluster ~= hypervisor
>>>> --
>>>> Matthew Booth
>>>> Red Hat Engineering, Virtualisation Team
>>>>
>>>> Phone: +442070094448 (UK)
>>>> GPG ID:  D33C3490
>>>> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev@lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>> --
>> Matthew Booth
>> Red Hat Engineering, Virtualisation Team
>>
>> Phone: +442070094448 (UK)
>> GPG ID:  D33C3490
>> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
>

--
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev