On 02/04/2015 07:59 PM, Angus Lees wrote:
> On Thu Feb 05 2015 at 9:02:49 AM Robert Collins <robe...@robertcollins.net> wrote:
>> On 5 February 2015 at 10:24, Joshua Harlow <harlo...@outlook.com> wrote:
>>> How interesting,
>>>
>>> Why are people using galera if it behaves like this? :-/
>>
>> Because it's actually fairly normal. In fact it's an instance of point 7 on
>> https://wiki.openstack.org/wiki/BasicDesignTenets - one of our oldest wiki
>> pages :).
>>
>> In more detail, consider what happens in full isolation when you have the
>> A and B example given, but B starts its transaction before A:
>>
>>   B BEGIN
>>   A BEGIN
>>   A INSERT foo
>>   A COMMIT
>>   B SELECT foo -> NULL
>>
>> Note that this still makes sense from each of A and B's individual view of
>> the world.
>
> If I understood correctly, the big change with Galera that Matthew is
> highlighting is that read-after-write may not be consistent from the pov of
> a single thread.
No, this is not correct. There is nothing different about Galera here versus any asynchronously replicated database. A single thread, issuing statements in two entirely *separate sessions*, load-balanced across an entire set of database cluster nodes, may indeed see older data if the second session gets balanced to a slave node.
Nothing has changed about this with Galera. The exact same patterns that you would use to ensure that you are able to read the data that you previously wrote can be used with Galera. Just have the thread start a transactional session and ensure all queries are executed in the context of that session. Done. Nothing about Galera changes anything here.
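The pattern described here can be sketched in a few lines; again SQLite is only a stand-in for whatever database sits behind the session:

```python
import sqlite3

# Sketch: do the write and the subsequent read in the same session and the
# same transaction, so read-after-write holds by construction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE foo (x INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO foo VALUES (1)")
# Same session, same transaction: this read is guaranteed to see the write
# above, regardless of what replication is doing behind the scenes.
rows = conn.execute("SELECT * FROM foo").fetchall()
conn.execute("COMMIT")
print(rows)  # -> [(1,)]
```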
> Not having read-after-write is *really* hard to code to (see for example
> x86 SMP cache coherency, C++ threading semantics, etc, which all provide
> read-after-write for this reason). This is particularly true when the
> affected operations are hidden behind an ORM - it isn't clear what might
> involve a database call, and sequencers (or logical clocks, etc) aren't
> made explicit in the API.
>
> I strongly suggest just enabling wsrep_causal_reads on all galera sessions,
> unless you can guarantee that the high-level task is purely read-only, and
> then moving on to something else ;)
>
> If we choose performance over correctness here then we're just signing up
> for lots of debugging of hard-to-reproduce race conditions, and the fixes
> are going to look like what wsrep_causal_reads does anyway.
>
> (Mind you, exposing sequencers at every API interaction would be awesome,
> and I look forward to a future framework and toolchain that makes that easy
> to do correctly)
IMHO, you all are reading WAY too much into this. The behaviour that Matthew is describing is the kind of thing that has been around for decades now with asynchronous slave replication. Applications have traditionally handled it by sending reads that can tolerate slave lag to a slave machine, and reads that cannot to the same machine that was written to.
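That traditional routing approach can be sketched roughly as follows (hypothetical names, not an OpenStack or oslo.db API):

```python
# Hypothetical sketch of the classic read/write routing pattern: writes and
# lag-intolerant reads go to the primary; lag-tolerant reads are spread
# across the slaves.
class RoutingSession:
    def __init__(self, primary, slaves):
        self.primary = primary
        self.slaves = slaves
        self._turn = 0

    def execute_write(self, sql):
        return self.primary.execute(sql)

    def execute_read(self, sql, tolerate_lag=False):
        if not tolerate_lag or not self.slaves:
            # Must see our own writes: go back to the machine we wrote to.
            return self.primary.execute(sql)
        # Round-robin lag-tolerant reads across the slaves.
        node = self.slaves[self._turn % len(self.slaves)]
        self._turn += 1
        return node.execute(sql)
```

The only decision the application makes per query is whether it can tolerate slave lag; everything else is plumbing.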
Galera doesn't change anything here. I'm really not sure what the fuss is about, frankly.
I don't recommend mucking with wsrep_causal_reads if we don't have to. And, IMO, we don't have to muck with it at all.
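(For reference only: if one did choose to enable it, it is a per-session variable. A sketch, where `conn` stands for any MySQL DB-API connection, e.g. PyMySQL, to a Galera node; this is not OpenStack code:)

```python
# Sketch: enable causal reads for one Galera session. Requires a real
# Galera node to have any effect; `conn` is a placeholder DB-API connection.
def enable_causal_reads(conn):
    # With wsrep_causal_reads ON, reads on this session wait until the node
    # has applied the write-sets known cluster-wide when the query arrived.
    conn.cursor().execute("SET SESSION wsrep_causal_reads = ON")
```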
Best,
-jay