Excerpts from Duncan Thomas's message of 2015-08-04 00:32:40 -0700:
> On 3 August 2015 at 20:53, Clint Byrum <cl...@fewbar.com> wrote:
>
> > Excerpts from Devananda van der Veen's message of 2015-08-03 08:53:21
> > -0700:
> > Also on a side note, I think Cinder's need for this is really subtle,
> > and one could just accept that sometimes it's going to break when it does
> > two things to one resource from two hosts. The error rate there might
> > even be lower than the false-error rate that would be caused by a twitchy
> > DLM with timeouts a little low. So there's a core cinder discussion that
> > keeps losing to the shiny DLM discussion, and I'd like to see it played
> > out fully: Could Cinder just not do anything, and let the few drivers
> > that react _really_ badly, implement their own concurrency controls?
> >
>
> So the problem here is data corruption. Lots of our races can cause data
> corruption. Not 'my instance didn't come up', not 'my network is screwed
> and I need to tear everything down and do it again', but 'My 1tb of
> customer database is now missing the second half'. This means that we
> *really* need some confidence and understanding in whatever we do. The idea
> of locks timing out and being stolen without fencing is frankly scary and
> begging for data corruption unless we're very careful. I'd rather use a
> persistent lock (e.g. a db record change) and manual recovery than a lock
> timeout that might cause corruption.
Thanks Duncan. Can you be more specific about a known data-corrupting race
that a) isn't handled simply by serialization in the database, and b) isn't
specific to a single driver?
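
For concreteness, by "serialization in the database" I mean something like the
conditional state transition below: a single UPDATE that only one caller can
win. This is just a rough sketch (SQLAlchemy-flavoured, with a made-up Volume
model and illustrative status values), not the actual Cinder schema or manager
code:

    from sqlalchemy import Column, String, update
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()


    class Volume(Base):
        # Illustrative model only; the real volumes table is richer.
        __tablename__ = 'volumes'
        id = Column(String(36), primary_key=True)
        status = Column(String(32), nullable=False)


    def begin_attach(session: Session, volume_id: str) -> bool:
        """Atomically move a volume from 'available' to 'attaching'.

        The WHERE clause is the serialization point: only one concurrent
        caller sees rowcount == 1; everyone else gets 0 rows and backs
        off, so two hosts can't start acting on the same volume at once.
        """
        result = session.execute(
            update(Volume)
            .where(Volume.id == volume_id, Volume.status == 'available')
            .values(status='attaching')
        )
        session.commit()
        return result.rowcount == 1

If the UPDATE touches zero rows, somebody else already owns the transition and
the caller backs off; and because the "lock" is just a row in the database, it
survives a service restart and can be cleaned up by hand, which looks a lot
like the persistent-lock-plus-manual-recovery behaviour you describe, without
introducing a DLM or lock timeouts.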