Re: [openstack-dev] [nova] Distributed locking
Could u expand on this and how it would work? I'm pretty skeptical of new ad-hoc locking implementations so just want to ensure it's fleshed out in detail. What would the two local locks be, where would they be, what would the 'conducting' be doing to coordinate? -Original Message- From: John Garbutt Reply-To: "OpenStack Development Mailing List (not for usage questions)" Date: Wednesday, June 25, 2014 at 1:08 AM To: "OpenStack Development Mailing List (not for usage questions)" Subject: Re: [openstack-dev] [nova] Distributed locking >So just to keep the ML up with some of the discussion we had in IRC >the other day... > >Most resources in Nova are owned by a particular nova-compute. So the >locks on the resources are effectively held by the nova-compute that >owns the resource. > >We already effectively have a cross nova-compute "lock" holding in the >capacity reservations during migrate/resize. > >But to cut a long story short, if the image cache is actually just a >copy from one of the nova-compute nodes that already have that image >into the local (shared) folder for another nova-compute, then we can >get away without a global lock, and just have two local locks on >either end and some "conducting" to co-ordinate things. > >It's not perfect, but it's an option. > >Thanks, >John > > >On 17 June 2014 18:18, Clint Byrum wrote: >> Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700: >>> On 17/06/14 00:28, Joshua Harlow wrote: >>> > So this is a reader/writer lock then? >>> > >>> > I have seen https://github.com/python-zk/kazoo/pull/141 come up in >>> the >>> > kazoo (zookeeper python library) but there was a lack of a >>> maintainer for >>> > that 'recipe', perhaps if we really find this needed we can help get >>> that >>> > pull request 'sponsored' so that it can be used for this purpose? >>> > >>> > >>> > As far as resiliency, the thing I was thinking about was how correct >>> do u >>> > want this lock to be? >>> > >>> > If u say go with memcached and a locking mechanism using it this >>> will not >>> > be correct but it might work good enough under normal usage. So >>> that's why >>> > I was wondering about what level of correctness do you want and what >>> do >>> > you want to happen if a server that is maintaining the lock record >>> dies. >>> > In memcached's case this will literally be 1 server, even if sharding >>> is >>> > being used, since a key hashes to one server. So if that one server >>> goes >>> > down (or a network split happens) then it is possible for two >>> entities to >>> > believe they own the same lock (and if the network split recovers >>> this >>> > gets even weirder); so that's what I was wondering about when >>> mentioning >>> > resiliency and how much incorrectness you are willing to tolerate. >>> >>> From my POV, the most important things are: >>> >>> * 2 nodes must never believe they hold the same lock >>> * A node must eventually get the lock >>> >> >> If these are musts, then memcache is a no-go for locking. memcached is >> likely to delete anything it is storing in its RAM, at any time. Also >> if you have several memcache servers, a momentary network blip could >> lead to acquiring the lock erroneously. >> >> The only thing it is useful for is coalescing, where a broken lock just >> means wasted resources, spurious errors, etc. If consistency is needed, >> then you need a consistent backend.
Re: [openstack-dev] [nova] Distributed locking
So just to keep the ML up with some of the discussion we had in IRC the other day... Most resources in Nova are owned by a particular nova-compute. So the locks on the resources are effectively held by the nova-compute that owns the resource. We already effectively have a cross nova-compute "lock" holding in the capacity reservations during migrate/resize. But to cut a long story short, if the image cache is actually just a copy from one of the nova-compute nodes that already have that image into the local (shared) folder for another nova-compute, then we can get away without a global lock, and just have two local locks on either end and some "conducting" to co-ordinate things. It's not perfect, but it's an option. Thanks, John On 17 June 2014 18:18, Clint Byrum wrote: > Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700: >> On 17/06/14 00:28, Joshua Harlow wrote: >> > So this is a reader/writer lock then? >> > >> > I have seen https://github.com/python-zk/kazoo/pull/141 come up in the >> > kazoo (zookeeper python library) but there was a lack of a maintainer for >> > that 'recipe', perhaps if we really find this needed we can help get that >> > pull request 'sponsored' so that it can be used for this purpose? >> > >> > >> > As far as resiliency, the thing I was thinking about was how correct do u >> > want this lock to be? >> > >> > If u say go with memcached and a locking mechanism using it this will not >> > be correct but it might work good enough under normal usage. So that's why >> > I was wondering about what level of correctness do you want and what do >> > you want to happen if a server that is maintaining the lock record dies. >> > In memcached's case this will literally be 1 server, even if sharding is >> > being used, since a key hashes to one server. So if that one server goes >> > down (or a network split happens) then it is possible for two entities to >> > believe they own the same lock (and if the network split recovers this >> > gets even weirder); so that's what I was wondering about when mentioning >> > resiliency and how much incorrectness you are willing to tolerate. >> >> From my POV, the most important things are: >> >> * 2 nodes must never believe they hold the same lock >> * A node must eventually get the lock >> > > If these are musts, then memcache is a no-go for locking. memcached is > likely to delete anything it is storing in its RAM, at any time. Also > if you have several memcache servers, a momentary network blip could > lead to acquiring the lock erroneously. > > The only thing it is useful for is coalescing, where a broken lock just > means wasted resources, spurious errors, etc. If consistency is needed, > then you need a consistent backend.
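For illustration only, here is a rough sketch of how John's "two local locks plus conducting" flow might look. The lock helper is oslo's lockutils; every other name (the streaming helpers and the RPC call) is a hypothetical placeholder, not an existing Nova API.

```python
# Sketch, not Nova code: per-node local locks plus a conductor that
# sequences the copy. stream_image_to() and rpc_stream_from_source()
# are illustrative placeholders.
from oslo_concurrency import lockutils

def serve_image_copy(image_id, dest_node):
    # Source nova-compute: hold a lock local to this node so the image
    # cache manager cannot age the image out mid-copy.
    with lockutils.lock('image-cache-%s' % image_id):
        stream_image_to(image_id, dest_node)

def receive_image_copy(image_id, source_node):
    # Destination nova-compute: its own local lock stops two requests
    # from populating the same cache entry concurrently.
    with lockutils.lock('image-cache-%s' % image_id):
        rpc_stream_from_source(source_node, image_id)

# The "conducting" is just ordering: the conductor picks a source that
# has the image and asks the destination to pull it. No lock is ever
# held across nodes; each side only locks its own cache entry.
```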
Re: [openstack-dev] [nova] Distributed locking
Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700: > On 17/06/14 00:28, Joshua Harlow wrote: > > So this is a reader/writer lock then? > > > > I have seen https://github.com/python-zk/kazoo/pull/141 come up in the > > kazoo (zookeeper python library) but there was a lack of a maintainer for > > that 'recipe', perhaps if we really find this needed we can help get that > > pull request 'sponsored' so that it can be used for this purpose? > > > > > > As far as resiliency, the thing I was thinking about was how correct do u > > want this lock to be? > > > > If u say go with memcached and a locking mechanism using it this will not > > be correct but it might work good enough under normal usage. So that's why > > I was wondering about what level of correctness do you want and what do > > you want to happen if a server that is maintaining the lock record dies. > > In memcached's case this will literally be 1 server, even if sharding is > > being used, since a key hashes to one server. So if that one server goes > > down (or a network split happens) then it is possible for two entities to > > believe they own the same lock (and if the network split recovers this > > gets even weirder); so that's what I was wondering about when mentioning > > resiliency and how much incorrectness you are willing to tolerate. > > From my POV, the most important things are: > > * 2 nodes must never believe they hold the same lock > * A node must eventually get the lock > If these are musts, then memcache is a no-go for locking. memcached is likely to delete anything it is storing in its RAM, at any time. Also if you have several memcache servers, a momentary network blip could lead to acquiring the lock erroneously. The only thing it is useful for is coalescing, where a broken lock just means wasted resources, spurious errors, etc. If consistency is needed, then you need a consistent backend.
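To make Clint's point concrete, here is what the usual memcached lock recipe looks like and where it breaks; a sketch using python-memcached, with illustrative key names and TTL.

```python
# The classic add()-based memcached "lock". add() only succeeds if the
# key is absent, which superficially behaves like lock acquisition.
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def try_acquire(name, owner, ttl=30):
    return bool(mc.add('lock/%s' % name, owner, time=ttl))

def release(name):
    mc.delete('lock/%s' % name)

# Why this fails the "musts" above:
# * memcached may evict the key under memory pressure at any time,
#   silently freeing a lock its holder still believes it owns.
# * the key hashes to exactly one server; if that server dies or is
#   partitioned away, a second client can "acquire" the same lock.
# Fine for coalescing, where a broken lock only wastes work.
```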
Re: [openstack-dev] [nova] Distributed locking
On Tue, Jun 17, 2014 at 4:36 AM, Matthew Booth wrote: > On 17/06/14 00:28, Joshua Harlow wrote: >> So this is a reader/writer lock then? >> >> I have seen https://github.com/python-zk/kazoo/pull/141 come up in the >> kazoo (zookeeper python library) but there was a lack of a maintainer for >> that 'recipe', perhaps if we really find this needed we can help get that >> pull request 'sponsored' so that it can be used for this purpose? >> >> >> As far as resiliency, the thing I was thinking about was how correct do u >> want this lock to be? >> >> If u say go with memcached and a locking mechanism using it this will not >> be correct but it might work good enough under normal usage. So that's why >> I was wondering about what level of correctness do you want and what do >> you want to happen if a server that is maintaining the lock record dies. >> In memcached's case this will literally be 1 server, even if sharding is >> being used, since a key hashes to one server. So if that one server goes >> down (or a network split happens) then it is possible for two entities to >> believe they own the same lock (and if the network split recovers this >> gets even weirder); so that's what I was wondering about when mentioning >> resiliency and how much incorrectness you are willing to tolerate. > > From my POV, the most important things are: > > * 2 nodes must never believe they hold the same lock > * A node must eventually get the lock > > I was expecting to implement locking on all three backends as long as > they support it. I haven't looked closely at memcached, but if it can > detect a split it should be able to have a fencing race with the > possible lock holder before continuing. This is obviously undesirable, > as you will probably be fencing an otherwise correctly functioning node, > but it will be correct. There's a team working on a pluggable library for distributed coordination: http://git.openstack.org/cgit/stackforge/tooz Doug > > Matt > >> >> -Original Message----- >> From: Matthew Booth >> Organization: Red Hat >> Date: Friday, June 13, 2014 at 1:40 AM >> To: Joshua Harlow , "OpenStack Development Mailing >> List (not for usage questions)" >> Subject: Re: [openstack-dev] [nova] Distributed locking >> >>> On 12/06/14 21:38, Joshua Harlow wrote: >>>> So just a few thoughts before going too far down this path, >>>> >>>> Can we make sure we really really understand the use-case where we think >>>> this is needed. I think it's fine that this use-case exists, but I just >>>> want to make it very clear to others why it's needed and why distributed >>>> locking is the only *correct* way. >>> >>> An example use of this would be side-loading an image from another >>> node's image cache rather than fetching it from glance, which would have >>> very significant performance benefits in the VMware driver, and possibly >>> other places. The copier must take a read lock on the image to prevent >>> the owner from ageing it during the copy. Holding a read lock would also >>> assure the copier that the image it is copying is complete. >>> >>>> This helps set a good precedent for others that may follow down this >>>> path >>>> that they also clearly explain the situation, how distributed locking >>>> fixes it and all the corner cases that now pop-up with distributed >>>> locking. >>>> >>>> Some of the questions that I can think of at the current moment: >>>> >>>> * What happens when a node goes down that owns the lock, how does the >>>> software react to this?
>>> >>> This can be well defined according to the behaviour of the backend. For >>> example, it is well defined in zookeeper when a node's session expires. >>> If the lock holder is no longer a valid node, it would be fenced before >>> deleting its lock, allowing other nodes to continue. >>> >>> Without fencing it would not be possible to safely continue in this case. >>> >>>> * What resources are being locked; what is the lock target, what is its >>>> lifetime? >>> >>> These are not questions for a locking implementation. A lock would be >>> held on a name, and it would be up to the api user to ensure that the >>> protected resource is only used while correctly locked, and that the >>> lock is not held longer than necessary. >>> >>>> * What resiliency do you want this lock to provide (this becomes a >>>> critical question when considering memcached, since memcached is not >>>> really the best choice for a resilient distributed locking backend)?
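For anyone who hasn't seen tooz yet, a minimal sketch of its locking API follows. The zake:// backend is an in-memory fake handy for tests; a real deployment would use the ZooKeeper or memcached driver, and exact argument types may differ between tooz releases.

```python
from tooz import coordination

# One coordinator per service process, identified by a member id.
coordinator = coordination.get_coordinator('zake://', 'compute-1')
coordinator.start()

# Locks are named; only one member of the coordination group can hold
# a given name at a time.
lock = coordinator.get_lock('image-1234')
if lock.acquire():
    try:
        pass  # critical section
    finally:
        lock.release()

coordinator.stop()
```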
Re: [openstack-dev] [nova] Distributed locking
On 17/06/14 00:28, Joshua Harlow wrote: > So this is a reader/writer lock then? > > I have seen https://github.com/python-zk/kazoo/pull/141 come up in the > kazoo (zookeeper python library) but there was a lack of a maintainer for > that 'recipe', perhaps if we really find this needed we can help get that > pull request 'sponsored' so that it can be used for this purpose? > > > As far as resiliency, the thing I was thinking about was how correct do u > want this lock to be? > > If u say go with memcached and a locking mechanism using it this will not > be correct but it might work good enough under normal usage. So that's why > I was wondering about what level of correctness do you want and what do > you want to happen if a server that is maintaining the lock record dies. > In memcached's case this will literally be 1 server, even if sharding is > being used, since a key hashes to one server. So if that one server goes > down (or a network split happens) then it is possible for two entities to > believe they own the same lock (and if the network split recovers this > gets even weirder); so that's what I was wondering about when mentioning > resiliency and how much incorrectness you are willing to tolerate. From my POV, the most important things are: * 2 nodes must never believe they hold the same lock * A node must eventually get the lock I was expecting to implement locking on all three backends as long as they support it. I haven't looked closely at memcached, but if it can detect a split it should be able to have a fencing race with the possible lock holder before continuing. This is obviously undesirable, as you will probably be fencing an otherwise correctly functioning node, but it will be correct. Matt > > -Original Message- > From: Matthew Booth > Organization: Red Hat > Date: Friday, June 13, 2014 at 1:40 AM > To: Joshua Harlow , "OpenStack Development Mailing > List (not for usage questions)" > Subject: Re: [openstack-dev] [nova] Distributed locking > >> On 12/06/14 21:38, Joshua Harlow wrote: >>> So just a few thoughts before going too far down this path, >>> >>> Can we make sure we really really understand the use-case where we think >>> this is needed. I think it's fine that this use-case exists, but I just >>> want to make it very clear to others why it's needed and why distributed >>> locking is the only *correct* way. >> >> An example use of this would be side-loading an image from another >> node's image cache rather than fetching it from glance, which would have >> very significant performance benefits in the VMware driver, and possibly >> other places. The copier must take a read lock on the image to prevent >> the owner from ageing it during the copy. Holding a read lock would also >> assure the copier that the image it is copying is complete. >> >>> This helps set a good precedent for others that may follow down this >>> path >>> that they also clearly explain the situation, how distributed locking >>> fixes it and all the corner cases that now pop-up with distributed >>> locking. >>> >>> Some of the questions that I can think of at the current moment: >>> >>> * What happens when a node goes down that owns the lock, how does the >>> software react to this? >> >> This can be well defined according to the behaviour of the backend. For >> example, it is well defined in zookeeper when a node's session expires. >> If the lock holder is no longer a valid node, it would be fenced before >> deleting its lock, allowing other nodes to continue.
>> >> Without fencing it would not be possible to safely continue in this case. >> >>> * What resources are being locked; what is the lock target, what is its >>> lifetime? >> >> These are not questions for a locking implementation. A lock would be >> held on a name, and it would be up to the api user to ensure that the >> protected resource is only used while correctly locked, and that the >> lock is not held longer than necessary. >> >>> * What resiliency do you want this lock to provide (this becomes a >>> critical question when considering memcached, since memcached is not >>> really the best choice for a resilient distributed locking backend)? >> >> What does resiliency mean in this context? We really just need the lock >> to be correct. >> >>> * What do entities that try to acquire a lock do when they can't acquire >>> it? >> >> Typically block, but if a use case emerged for trylock() it would be >> simple to implement. For example, in the image side-loading case we may >> decide that if it isn't possible to immediately acquire the lock it >> isn't worth waiting, and we just fetch it from glance anyway.
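A sketch of the trylock() pattern just described for image side-loading; the lock object and both fetch helpers are illustrative placeholders (tooz and kazoo locks both accept a non-blocking acquire of this shape).

```python
def fetch_image(image_id, read_lock, peer_node):
    # Non-blocking acquire: if another node holds a conflicting lock,
    # don't wait around, just take the slower-but-safe path.
    if read_lock.acquire(blocking=False):
        try:
            return copy_from_peer_cache(peer_node, image_id)
        finally:
            read_lock.release()
    return download_from_glance(image_id)
```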
Re: [openstack-dev] [nova] Distributed locking
So this is a reader/writer lock then? I have seen https://github.com/python-zk/kazoo/pull/141 come up in the kazoo (zookeeper python library) but there was a lack of a maintainer for that 'recipe', perhaps if we really find this needed we can help get that pull request 'sponsored' so that it can be used for this purpose? As far as resiliency, the thing I was thinking about was how correct do u want this lock to be? If u say go with memcached and a locking mechanism using it this will not be correct but it might work good enough under normal usage. So that's why I was wondering about what level of correctness do you want and what do you want to happen if a server that is maintaining the lock record dies. In memcached's case this will literally be 1 server, even if sharding is being used, since a key hashes to one server. So if that one server goes down (or a network split happens) then it is possible for two entities to believe they own the same lock (and if the network split recovers this gets even weirder); so that's what I was wondering about when mentioning resiliency and how much incorrectness you are willing to tolerate. -Original Message- From: Matthew Booth Organization: Red Hat Date: Friday, June 13, 2014 at 1:40 AM To: Joshua Harlow , "OpenStack Development Mailing List (not for usage questions)" Subject: Re: [openstack-dev] [nova] Distributed locking >On 12/06/14 21:38, Joshua Harlow wrote: >> So just a few thoughts before going too far down this path, >> >> Can we make sure we really really understand the use-case where we think >> this is needed. I think it's fine that this use-case exists, but I just >> want to make it very clear to others why it's needed and why distributed >> locking is the only *correct* way. > >An example use of this would be side-loading an image from another >node's image cache rather than fetching it from glance, which would have >very significant performance benefits in the VMware driver, and possibly >other places. The copier must take a read lock on the image to prevent >the owner from ageing it during the copy. Holding a read lock would also >assure the copier that the image it is copying is complete. > >> This helps set a good precedent for others that may follow down this >> path >> that they also clearly explain the situation, how distributed locking >> fixes it and all the corner cases that now pop-up with distributed >> locking. >> >> Some of the questions that I can think of at the current moment: >> >> * What happens when a node goes down that owns the lock, how does the >> software react to this? > >This can be well defined according to the behaviour of the backend. For >example, it is well defined in zookeeper when a node's session expires. >If the lock holder is no longer a valid node, it would be fenced before >deleting its lock, allowing other nodes to continue. > >Without fencing it would not be possible to safely continue in this case. > >> * What resources are being locked; what is the lock target, what is its >> lifetime? > >These are not questions for a locking implementation. A lock would be >held on a name, and it would be up to the api user to ensure that the >protected resource is only used while correctly locked, and that the >lock is not held longer than necessary. > >> * What resiliency do you want this lock to provide (this becomes a >> critical question when considering memcached, since memcached is not >> really the best choice for a resilient distributed locking backend)? > >What does resiliency mean in this context?
>We really just need the lock >to be correct. > >> * What do entities that try to acquire a lock do when they can't acquire >> it? > >Typically block, but if a use case emerged for trylock() it would be >simple to implement. For example, in the image side-loading case we may >decide that if it isn't possible to immediately acquire the lock it >isn't worth waiting, and we just fetch it from glance anyway. > >> A useful thing I wrote up a while ago, might still be useful: >> >> https://wiki.openstack.org/wiki/StructuredWorkflowLocks >> >> Feel free to move that wiki if u find it useful (it's sorta a high-level >> doc on the different strategies and such). > >Nice list of implementation pros/cons. > >Matt > >> >> -Josh >> >> -Original Message- >> From: Matthew Booth >> Organization: Red Hat >> Reply-To: "OpenStack Development Mailing List (not for usage questions)" >> >> Date: Thursday, June 12, 2014 at 7:30 AM >> To: "OpenStack Development Mailing List (not for usage questions)"
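For reference, kazoo already ships an exclusive lock recipe; it was only the shared read/write variant from PR #141 that lacked a maintainer. A minimal sketch with the existing recipe, where host, path and identifier are illustrative:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

# An exclusive lock; the identifier is advisory metadata visible to
# anyone inspecting the lock's contenders.
lock = zk.Lock('/nova/locks/image-1234', identifier='compute-1')
with lock:  # blocks until acquired, releases on exit
    pass

zk.stop()
```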
Re: [openstack-dev] [nova] Distributed locking
On 06/13/2014 05:01 AM, Julien Danjou wrote: > On Thu, Jun 12 2014, Jay Pipes wrote: >> This is news to me. When was this decided and where can I read about it? > Originally https://wiki.openstack.org/wiki/Oslo/blueprints/service-sync was proposed, presented and accepted back at the Icehouse summit in HKG. That's what led to tooz creation and development since then. Thanks, Julien, that's a helpful link. Appreciated! Best, -jay
Re: [openstack-dev] [nova] Distributed locking
On Fri, 13 Jun 2014 09:40:30 AM Matthew Booth wrote: > On 12/06/14 21:38, Joshua Harlow wrote: > > So just a few thoughts before going too far down this path, > > > > Can we make sure we really really understand the use-case where we think > > this is needed. I think it's fine that this use-case exists, but I just > > want to make it very clear to others why it's needed and why distributed > > locking is the only *correct* way. > > An example use of this would be side-loading an image from another > node's image cache rather than fetching it from glance, which would have > very significant performance benefits in the VMware driver, and possibly > other places. The copier must take a read lock on the image to prevent > the owner from ageing it during the copy. Holding a read lock would also > assure the copier that the image it is copying is complete. For this particular example, taking a lock every time seems expensive. An alternative would be to just try to read from another node, and if the result wasn't complete+valid for whatever reason then fall back to reading from glance. > > * What happens when a node goes down that owns the lock, how does the > > software react to this? > > This can be well defined according to the behaviour of the backend. For > example, it is well defined in zookeeper when a node's session expires. > If the lock holder is no longer a valid node, it would be fenced before > deleting its lock, allowing other nodes to continue. > > Without fencing it would not be possible to safely continue in this case. So I'm sorry for explaining myself poorly in my earlier post. I think you've just described waiting for the lock to expire before another node can take it, which is just regular lock behaviour. What additional steps do you want fence() to perform at this point? (I can see if the resource provider had some form of fencing, then it could do all sorts of additional things - but I gather your original use case is exactly where that *isn't* an option) "If the lock was allowed to go stale and not released cleanly, then we should forcibly reboot the stale instance before allowing the lock to be held again" shouldn't be too hard to add. - Is just rebooting the instance sufficient for similar situations, or would we need configurable "actions"? - Which bot do we trust to issue the reboot command? From the locking service pov, I can think of several ways to implement this, so we probably want to export a high-level operation and allow the details to vary to suit the underlying locking implementation. -- - Gus
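Gus's lock-free alternative might be sketched like this: copy optimistically, validate the result, and fall back to glance on any failure. The checksum source and helper functions are hypothetical; the point is that a verifiable copy makes the race with image ageing harmless.

```python
import hashlib

def fetch_image_optimistic(image_id, peer_node, expected_sha256):
    try:
        data = read_from_peer_cache(peer_node, image_id)
        # The copy may have raced with the owner ageing the image, so
        # verify before trusting it.
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    except IOError:
        pass  # peer copy vanished mid-read; fall through
    return download_from_glance(image_id)  # authoritative source
```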
Re: [openstack-dev] [nova] Distributed locking
Excerpts from Matthew Booth's message of 2014-06-13 01:40:30 -0700: > On 12/06/14 21:38, Joshua Harlow wrote: > > So just a few thoughts before going too far down this path, > > > > Can we make sure we really really understand the use-case where we think > > this is needed. I think it's fine that this use-case exists, but I just > > want to make it very clear to others why it's needed and why distributed > > locking is the only *correct* way. > > An example use of this would be side-loading an image from another > node's image cache rather than fetching it from glance, which would have > very significant performance benefits in the VMware driver, and possibly > other places. The copier must take a read lock on the image to prevent > the owner from ageing it during the copy. Holding a read lock would also > assure the copier that the image it is copying is complete. Really? Usually in the unix-inspired world we just open a file and it stays around until we close it.
Re: [openstack-dev] [nova] Distributed locking
Are the details of that implementation described on wiki or elsewhere? (Partially for my own curiosity.) I think I understand how it works but write-ups usually clear that right up. Sent from my really tiny device... > On Jun 14, 2014, at 12:15 AM, "Robert Collins" > wrote: > >> On 13 June 2014 02:30, Matthew Booth wrote: >> We have a need for a distributed lock in the VMware driver, which I >> suspect isn't unique. Specifically it is possible for a VMware datastore >> to be accessed via multiple nova nodes if it is shared between >> clusters[1]. Unfortunately the vSphere API doesn't provide us with the >> primitives to implement robust locking using the storage layer itself, >> so we're looking elsewhere. > > Perhaps I'm missing something, but I didn't see anything in your > description about actually needing a *distributed* lock, just needing > a local lock that can be held by remote systems. As Devananda says, a > centralised lock that can be held by agents has been implemented in > Ironic - such a thing is very simple and quite easy to reason about... > but it's not suitable for all problems. HA and consistency requirements > for such a thing are delivered through e.g. galera in the DB layer. > > -Rob > > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud
Re: [openstack-dev] [nova] Distributed locking
On 13 June 2014 02:30, Matthew Booth wrote: > We have a need for a distributed lock in the VMware driver, which I > suspect isn't unique. Specifically it is possible for a VMware datastore > to be accessed via multiple nova nodes if it is shared between > clusters[1]. Unfortunately the vSphere API doesn't provide us with the > primitives to implement robust locking using the storage layer itself, > so we're looking elsewhere. Perhaps I'm missing something, but I didn't see anything in your description about actually needing a *distributed* lock, just needing a local lock that can be held by remote systems. As Devananda says, a centralised lock that can be held by agents has been implemented in Ironic - such a thing is very simple and quite easy to reason about... but it's not suitable for all problems. HA and consistency requirements for such a thing are delivered through e.g. galera in the DB layer. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud
Re: [openstack-dev] [nova] Distributed locking
On Thu, Jun 12 2014, Jay Pipes wrote: > This is news to me. When was this decided and where can I read about > it? Originally https://wiki.openstack.org/wiki/Oslo/blueprints/service-sync was proposed, presented and accepted back at the Icehouse summit in HKG. That's what led to tooz creation and development since then. -- Julien Danjou // Free Software hacker // http://julien.danjou.info
Re: [openstack-dev] [nova] Distributed locking
On 12/06/14 21:38, Joshua Harlow wrote: > So just a few thoughts before going too far down this path, > > Can we make sure we really really understand the use-case where we think > this is needed. I think it's fine that this use-case exists, but I just > want to make it very clear to others why it's needed and why distributed > locking is the only *correct* way. An example use of this would be side-loading an image from another node's image cache rather than fetching it from glance, which would have very significant performance benefits in the VMware driver, and possibly other places. The copier must take a read lock on the image to prevent the owner from ageing it during the copy. Holding a read lock would also assure the copier that the image it is copying is complete. > This helps set a good precedent for others that may follow down this path > that they also clearly explain the situation, how distributed locking > fixes it and all the corner cases that now pop-up with distributed locking. > > Some of the questions that I can think of at the current moment: > > * What happens when a node goes down that owns the lock, how does the > software react to this? This can be well defined according to the behaviour of the backend. For example, it is well defined in zookeeper when a node's session expires. If the lock holder is no longer a valid node, it would be fenced before deleting its lock, allowing other nodes to continue. Without fencing it would not be possible to safely continue in this case. > * What resources are being locked; what is the lock target, what is its > lifetime? These are not questions for a locking implementation. A lock would be held on a name, and it would be up to the api user to ensure that the protected resource is only used while correctly locked, and that the lock is not held longer than necessary. > * What resiliency do you want this lock to provide (this becomes a > critical question when considering memcached, since memcached is not > really the best choice for a resilient distributed locking backend)? What does resiliency mean in this context? We really just need the lock to be correct. > * What do entities that try to acquire a lock do when they can't acquire > it? Typically block, but if a use case emerged for trylock() it would be simple to implement. For example, in the image side-loading case we may decide that if it isn't possible to immediately acquire the lock it isn't worth waiting, and we just fetch it from glance anyway. > A useful thing I wrote up a while ago, might still be useful: > > https://wiki.openstack.org/wiki/StructuredWorkflowLocks > > Feel free to move that wiki if u find it useful (it's sorta a high-level > doc on the different strategies and such). Nice list of implementation pros/cons. Matt > > -Josh > > -Original Message- > From: Matthew Booth > Organization: Red Hat > Reply-To: "OpenStack Development Mailing List (not for usage questions)" > > Date: Thursday, June 12, 2014 at 7:30 AM > To: "OpenStack Development Mailing List (not for usage questions)" > > Subject: [openstack-dev] [nova] Distributed locking > >> We have a need for a distributed lock in the VMware driver, which I >> suspect isn't unique. Specifically it is possible for a VMware datastore >> to be accessed via multiple nova nodes if it is shared between >> clusters[1]. Unfortunately the vSphere API doesn't provide us with the >> primitives to implement robust locking using the storage layer itself, >> so we're looking elsewhere.
>> >> The closest we seem to have in Nova currently are service groups, which >> currently have 3 implementations: DB, Zookeeper and Memcached. The >> service group api currently provides simple membership, but for locking >> we'd be looking for something more. >> >> I think the api we'd be looking for would be something along the lines of: >> >> Foo.lock(name, fence_info) >> Foo.unlock(name) >> >> Bar.fence(fence_info) >> >> Note that fencing would be required in this case. We believe we can >> fence by terminating the other Nova's vSphere session, but other options >> might include killing a Nova process, or STONITH. These would be >> implemented as fencing drivers. >> >> Although I haven't worked through the detail, I believe lock and unlock >> would be implementable in all 3 of the current service group drivers. >> Fencing would be implemented separately. >> >> My questions: >> >> * Does this already exist, or does anybody have patches pending to do >> something like this? >> * Are there other users for this? >> * Would service groups be an appropriate place, or a new distributed >> locking class? >> * How about if we just used zookeeper directly in the driver? >> >> Matt >> >> [1] Cluster ~= hypervisor >> -- >> Matthew Booth >> Red Hat Engineering, Virtualisation Team >> >> Phone: +442070094448 (UK) >> GPG ID: D33C3490 >> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
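As a sketch of what Matt's proposed interface could look like as pluggable driver classes; the names mirror the Foo/Bar pseudocode above, and nothing here is an existing Nova API.

```python
import abc

class DistributedLockDriver(object):
    __metaclass__ = abc.ABCMeta  # nova was python 2 at the time

    @abc.abstractmethod
    def lock(self, name, fence_info):
        """Block until the named lock is held, recording fence_info so
        a dead holder can be fenced before its lock is reclaimed."""

    @abc.abstractmethod
    def unlock(self, name):
        """Release the named lock."""

class FencingDriver(object):
    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def fence(self, fence_info):
        """Forcibly cut the described holder off from the shared
        resource, e.g. by terminating its vSphere session."""
```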
Re: [openstack-dev] [nova] Distributed locking
On 13/06/14 05:27, Angus Lees wrote: > On Thu, 12 Jun 2014 05:06:38 PM Julien Danjou wrote: >> On Thu, Jun 12 2014, Matthew Booth wrote: >>> This looks interesting. It doesn't have hooks for fencing, though. >>> >>> What's the status of tooz? Would you be interested in adding fencing >>> hooks? >> >> It's maintained and developed; we plan to use it in Ceilometer and >> other projects. Joshua also wants to use it for Taskflow. >> >> We are blocked for now by https://review.openstack.org/#/c/93443/ and by >> the lack of resource to complete that request obviously, so help >> appreciated. :) >> >> As for fencing hooks, it sounds like a good idea. > > As far as I understand these things, in distributed-locking-speak "fencing" > just means "breaking someone else's lock". No. It means forcibly preventing a delinquent node from further use of the shared resource. An example of a fencing solution is a remote power switch. No trust is required. Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
Re: [openstack-dev] [nova] Distributed locking
On Thu, 12 Jun 2014 05:06:38 PM Julien Danjou wrote: > On Thu, Jun 12 2014, Matthew Booth wrote: > > This looks interesting. It doesn't have hooks for fencing, though. > > > > What's the status of tooz? Would you be interested in adding fencing > > hooks? > > It's maintained and developed; we plan to use it in Ceilometer and > other projects. Joshua also wants to use it for Taskflow. > > We are blocked for now by https://review.openstack.org/#/c/93443/ and by > the lack of resource to complete that request obviously, so help > appreciated. :) > > As for fencing hooks, it sounds like a good idea. As far as I understand these things, in distributed-locking-speak "fencing" just means "breaking someone else's lock". I think your options here are (and apologies if I'm repeating things that are obvious): 1. Have a "force unlock" protocol (numerous alternatives exist). Assume the lock holder implements it properly and stops accessing the shared resource when asked. 2. Kill the lock holder using some method unrelated to the locking service and wait for the locking protocol to realise the ex-holder is dead through the usual liveness tests. Assume not being able to hold the lock implies no longer being able to access the shared resource. The "liveness test" usually involves the holder pinging the lock service periodically, and everyone has to wait for some agreed timeout before assuming a client is dead. (1) involves a lot of trust - and seems particularly bad if the reason you are breaking the lock is because the holder is misbehaving. Assuming (2) is the only reasonable choice, I don't think the lock service needs explicit support for fencing, since the exact method for killing the holder is unrelated, and relatively uninteresting (probably always going to be an instance delete in OS). Perhaps more interesting is exactly what conditions you require before attempting to kill the lock holder - you wouldn't want just any job deciding it was warranted, or else a misbehaving client would cause mayhem. Again, I suggest your options here are: 1. Require human judgement. ie: provide monitoring for whatever is misbehaving and make it obvious that one mitigation is to nuke the apparent holder. 2. Require the lock breaker to be able to reach a majority of nodes as some proof of "I'm working, my opinion must be right". In a paxos system, reaching a majority of nodes basically becomes "hold a lock", and we end up back at "my liveness test is better than yours somehow", and I'm not sure how to resolve that without human judgement (but I'm not familiar with existing approaches). Again, I don't think this needs additional support from the lock service, beyond a liveness test (which zookeeper, for example, has). tl;dr: I'm interested in what sort of automated fencing behaviour you'd like. -- - Gus
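In ZooKeeper terms, the liveness test that option 2 relies on falls out of ephemeral znodes: a holder's presence node disappears automatically once its session times out, however the holder died. A sketch with illustrative paths and a hypothetical callback:

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

# Holder: this znode's existence means "I am alive". ZooKeeper itself
# deletes it when the holder's session expires.
zk.create('/nova/holders/compute-1', b'', ephemeral=True, makepath=True)

# Observer: learn (after the agreed session timeout) that the holder
# is gone and the lock can safely pass to the next waiter.
@zk.DataWatch('/nova/holders/compute-1')
def holder_watch(data, stat):
    if stat is None:
        handle_holder_death()  # illustrative callback
```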
Re: [openstack-dev] [nova] Distributed locking
Ironic has a simple lock mechanism for nodes to ensure that, if the hash ring rebalances while an operation is in progress, the second conductor doesn't trample on the work that the first conductor is doing until it's finished (and releases the lock). Right now, it's got a simple DB backing. We've discussed making it more pluggable. I'd be all for there being a common openstack way to do this, preferably in oslo. /me adds tooz to my list of things to read Cheers, -Deva On Thu, Jun 12, 2014 at 9:46 AM, Jay Pipes wrote: > On 06/12/2014 10:35 AM, Julien Danjou wrote: >> >> On Thu, Jun 12 2014, Matthew Booth wrote: >> >>> We have a need for a distributed lock in the VMware driver, which I >>> suspect isn't unique. Specifically it is possible for a VMware datastore >>> to be accessed via multiple nova nodes if it is shared between >>> clusters[1]. Unfortunately the vSphere API doesn't provide us with the >>> primitives to implement robust locking using the storage layer itself, >>> so we're looking elsewhere. >> >> >> The tooz library has been created for this purpose: >> >>https://pypi.python.org/pypi/tooz >> >>https://git.openstack.org/cgit/stackforge/tooz/ >> >>> Although I haven't worked through the detail, I believe lock and unlock >>> would be implementable in all 3 of the current service group drivers. >>> Fencing would be implemented separately. >> >> >> The plan is to leverage tooz to replace the Nova service group drivers, >> as this is also usable in a lot of other OpenStack services. > > > This is news to me. When was this decided and where can I read about it? > > Thanks, > -jay
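The shape of that Ironic-style DB lock is a single compare-and-swap UPDATE, so two conductors can never both win. A sketch with SQLAlchemy, where the table and column names are illustrative rather than Ironic's actual schema:

```python
import sqlalchemy as sa

def reserve(conn, node_id, conductor):
    # Atomically take the reservation only if nobody holds it.
    res = conn.execute(
        sa.text("UPDATE nodes SET reservation = :who "
                "WHERE id = :node AND reservation IS NULL"),
        {'who': conductor, 'node': node_id})
    return res.rowcount == 1  # True iff this conductor won

def release(conn, node_id, conductor):
    # Only the current holder may release.
    conn.execute(
        sa.text("UPDATE nodes SET reservation = NULL "
                "WHERE id = :node AND reservation = :who"),
        {'node': node_id, 'who': conductor})
```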
Re: [openstack-dev] [nova] Distributed locking
So just a few thoughts before going too far down this path, Can we make sure we really really understand the use-case where we think this is needed. I think it's fine that this use-case exists, but I just want to make it very clear to others why it's needed and why distributed locking is the only *correct* way. This helps set a good precedent for others that may follow down this path that they also clearly explain the situation, how distributed locking fixes it and all the corner cases that now pop-up with distributed locking. Some of the questions that I can think of at the current moment: * What happens when a node goes down that owns the lock, how does the software react to this? * What resources are being locked; what is the lock target, what is its lifetime? * What resiliency do you want this lock to provide (this becomes a critical question when considering memcached, since memcached is not really the best choice for a resilient distributed locking backend)? * What do entities that try to acquire a lock do when they can't acquire it? A useful thing I wrote up a while ago, might still be useful: https://wiki.openstack.org/wiki/StructuredWorkflowLocks Feel free to move that wiki if u find it useful (it's sorta a high-level doc on the different strategies and such). -Josh -Original Message- From: Matthew Booth Organization: Red Hat Reply-To: "OpenStack Development Mailing List (not for usage questions)" Date: Thursday, June 12, 2014 at 7:30 AM To: "OpenStack Development Mailing List (not for usage questions)" Subject: [openstack-dev] [nova] Distributed locking >We have a need for a distributed lock in the VMware driver, which I >suspect isn't unique. Specifically it is possible for a VMware datastore >to be accessed via multiple nova nodes if it is shared between >clusters[1]. Unfortunately the vSphere API doesn't provide us with the >primitives to implement robust locking using the storage layer itself, >so we're looking elsewhere. > >The closest we seem to have in Nova currently are service groups, which >currently have 3 implementations: DB, Zookeeper and Memcached. The >service group api currently provides simple membership, but for locking >we'd be looking for something more. > >I think the api we'd be looking for would be something along the lines of: > >Foo.lock(name, fence_info) >Foo.unlock(name) > >Bar.fence(fence_info) > >Note that fencing would be required in this case. We believe we can >fence by terminating the other Nova's vSphere session, but other options >might include killing a Nova process, or STONITH. These would be >implemented as fencing drivers. > >Although I haven't worked through the detail, I believe lock and unlock >would be implementable in all 3 of the current service group drivers. >Fencing would be implemented separately. > >My questions: > >* Does this already exist, or does anybody have patches pending to do >something like this? >* Are there other users for this? >* Would service groups be an appropriate place, or a new distributed >locking class? >* How about if we just used zookeeper directly in the driver?
> >Matt > >[1] Cluster ~= hypervisor >-- >Matthew Booth >Red Hat Engineering, Virtualisation Team > >Phone: +442070094448 (UK) >GPG ID: D33C3490 >GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
Re: [openstack-dev] [nova] Distributed locking
On 06/12/2014 10:35 AM, Julien Danjou wrote: > On Thu, Jun 12 2014, Matthew Booth wrote: >> We have a need for a distributed lock in the VMware driver, which I >> suspect isn't unique. Specifically it is possible for a VMware datastore >> to be accessed via multiple nova nodes if it is shared between >> clusters[1]. Unfortunately the vSphere API doesn't provide us with the >> primitives to implement robust locking using the storage layer itself, >> so we're looking elsewhere. > The tooz library has been created for this purpose: > https://pypi.python.org/pypi/tooz > https://git.openstack.org/cgit/stackforge/tooz/ >> Although I haven't worked through the detail, I believe lock and unlock >> would be implementable in all 3 of the current service group drivers. >> Fencing would be implemented separately. > The plan is to leverage tooz to replace the Nova service group drivers, > as this is also usable in a lot of other OpenStack services. This is news to me. When was this decided and where can I read about it? Thanks, -jay
Re: [openstack-dev] [nova] Distributed locking
On Thu, Jun 12 2014, Matthew Booth wrote: > This looks interesting. It doesn't have hooks for fencing, though. > > What's the status of tooz? Would you be interested in adding fencing > hooks? It's maintained and developed; we plan to use it in Ceilometer and other projects. Joshua also wants to use it for Taskflow. We are blocked for now by https://review.openstack.org/#/c/93443/ and by the lack of resource to complete that request obviously, so help appreciated. :) As for fencing hooks, it sounds like a good idea. -- Julien Danjou /* Free Software hacker http://julien.danjou.info */
Re: [openstack-dev] [nova] Distributed locking
On 12/06/14 15:35, Julien Danjou wrote: > On Thu, Jun 12 2014, Matthew Booth wrote: > >> We have a need for a distributed lock in the VMware driver, which >> I suspect isn't unique. Specifically it is possible for a VMware >> datastore to be accessed via multiple nova nodes if it is shared >> between clusters[1]. Unfortunately the vSphere API doesn't >> provide us with the primitives to implement robust locking using >> the storage layer itself, so we're looking elsewhere. > > The tooz library has been created for this purpose: > > https://pypi.python.org/pypi/tooz > > https://git.openstack.org/cgit/stackforge/tooz/ > >> Although I haven't worked through the detail, I believe lock and >> unlock would be implementable in all 3 of the current service >> group drivers. Fencing would be implemented separately. > > The plan is to leverage tooz to replace the Nova service group > drivers, as this is also usable in a lot of other OpenStack > services. This looks interesting. It doesn't have hooks for fencing, though. What's the status of tooz? Would you be interested in adding fencing hooks? Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
Re: [openstack-dev] [nova] Distributed locking
On Thu, Jun 12 2014, Matthew Booth wrote: > We have a need for a distributed lock in the VMware driver, which I > suspect isn't unique. Specifically it is possible for a VMware datastore > to be accessed via multiple nova nodes if it is shared between > clusters[1]. Unfortunately the vSphere API doesn't provide us with the > primitives to implement robust locking using the storage layer itself, > so we're looking elsewhere. The tooz library has been created for this purpose: https://pypi.python.org/pypi/tooz https://git.openstack.org/cgit/stackforge/tooz/ > Although I haven't worked through the detail, I believe lock and unlock > would be implementable in all 3 of the current service group drivers. > Fencing would be implemented separately. The plan is to leverage tooz to replace the Nova service group drivers, as this is also usable in a lot of other OpenStack services. -- Julien Danjou ;; Free Software hacker ;; http://julien.danjou.info