Hi, I would like to shed some additional light on this for those who were not there. So, SELECT ... FOR UPDATE does lock on a single node, as it is pointed out earlier in this thread, a simple solution is to write only one node at a time. Haproxy can be set up with both backends, see this blog post for example. http://www.mysqlperformanceblog.com/2012/06/20/percona-xtradb-cluster-reference-architecture-with-haproxy/
In a nutshell, and with a bit of an oversimplification, galera replicates in write sets. A write set is practically a row based binary log event + some metadata which is good for 2 things: you can take a look at 2 write sets and tell if they are conflicting or not, and you can take a look at a writeset and a database, and tell if the write set is applicable to the database. At the time of commit, the transaction is transferred to all the other cluster nodes in parallel. On the remote node, the new transaction is compared to each other transaction waiting in the queue to be applied, and it's checked if it's applicable to the database. If the transaction if not conflicting, and it's applicable, it's queued, and the node signals back that the commit can proceed. There is a nice drawing about this here: http://www.percona.com/doc/percona-xtradb-cluster/5.6/features/multimaster-replication.html So, because of this, the locks of SELECT FOR UPDATE won't replicate. Between nodes, galera uses optimistic locking. This means that we assume that during the certification process (described above), there will be no conflicts. If there are conflicts, the transaction is rolled back on the originating node, and this is when you receive the error message in question. A failed transaction is something which can happen any time with any database engine with any "interesting feature", and when a transaction failed, the application should now what to do with it. In case of galera, a conflict in a single node case was a wait on row locks, in case of galera replication, this will be a rollback. A rollback is a much more expensive operation (data has to be copied back from undo), so if there are lots of failures like this, performance will suffer. So, this is not a deadlock in the classical sense. Yet, InnoDB can roll back a transaction any time because of a deadlock (any database engine can do that, including PostgreSQL), and the application should be able to handle this. As it was noted earlier, writing to a single node only at a time is a good solution for avoiding this. With multiple nodes written, storage engine level writes will still happen on every node, because every node has the whole data set. Writing on multiple nodes can be beneficial because parsing SQL is much more expensive than just applying a row based binary log event, so you can see some performance improvement if all nodes are written. I would discourage using any type of multi-master replication without understanding how conflict resolution works in case of the chosen solution. In case of galera, if row locks were replicated over the network, it would act the same way as a single server, but it would be really slow. If SELECT FOR UPDATE is only used to achieve consistent reads (read your own writes), that can be achieved with wsrep_causal_reads. I am happy to help to avoid SELECT FOR UPDATE if somebody can tell me the use cases. On Tue, May 20, 2014 at 10:53 AM, Julien Danjou <jul...@danjou.info> wrote: > On Mon, May 19 2014, Jay Pipes wrote: > >> I think at that point I mentioned that there were a number of places that >> were using the SELECT ... FOR UPDATE construct in Nova (in SQLAlchemy, it's >> the with_lockmode('update') modification of the query object). Peter >> promptly said that was a problem. MySQL Galera does not support SELECT ... >> FOR UPDATE, since it has no concept of cross-node locking of records and >> results are non-deterministic. > > So you send a command that's not supported and the whole software > deadlocks? Is there a bug number about that or something? I cannot > understand how this can be possible and considered as something normal > (that's the feeling I have reading your mail, I may be wrong). > >> We have a number of options: >> >> 1) Stop using MySQL Galera for databases of projects that contain >> with_lockmode('update') >> >> 2) Put a big old warning in the docs somewhere about the problem of >> potential deadlocks or odd behaviour with Galera in these projects >> >> 3) For Nova and Neutron, remove the use of with_lockmode('update') and >> instead use a coarse-grained file lock or a distributed lock manager for >> those areas where we need deterministic reads or quiescence. >> >> 4) For the Nova db quota driver, refactor the driver to either use a >> non-locking method for reservation and quota queries or move the driver out >> into its own projects (or use something like Climate and make sure that >> Climate uses a non-blocking algorithm for those queries...) >> >> Thoughts? > > 5) Stop leveling down our development, and rely and leverage a powerful > RDBMS that provides interesting feature, such as PostgreSQL. > > Sorry, had to say it, but it's pissing me off to see the low quality of > the work that is done around SQL in OpenStack. > > -- > Julien Danjou > /* Free Software hacker > http://julien.danjou.info */ -- Peter Boros, Consultant, Percona Telephone: +1 888 401 3401 ext 546 Emergency: +1 888 401 3401 ext 911 Skype: percona.pboros _______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev