On 05/23/2017 01:10 PM, Octave J. Orgeron wrote:
Comments below...
On 5/21/2017 1:38 PM, Monty Taylor wrote:
For example: an HA strategy using slave promotion and a VIP that
points at the current write master, paired with an application
incorrectly configured for that setup, can lead to writes to the
wrong host after a failover event, and to an application that seems to
be running fine until the data turns up weird after a while.
This is definitely a more complicated area that becomes more and more
specific to the clustering technology being used. Galera vs. MySQL
Cluster is a good example. Galera has an active/passive architecture
where the above issues become a concern for sure.
This is not my understanding; Galera is multi-master, and if you lose a
node, you don't lose any committed transactions; the writesets are
validated as acceptable by, and pushed out to, all nodes before your
commit succeeds. There's an option to make it wait until all those
writesets are fully written to disk as well, but even with that option
flipped off, if you COMMIT to one node and then that node explodes, you
lose nothing: your writesets have already been verified as acceptable to
all the other nodes.
Active/active is the second bullet point on the main homepage:
http://galeracluster.com/products/
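To illustrate the ordering being described, here is a toy model of certification-based commit: the writeset is accepted by every node before COMMIT returns, so losing the originating node afterwards loses no committed data. This is only a sketch of the sequencing, not of the real wsrep protocol; the class and function names are made up for illustration.

```python
# Toy model: COMMIT succeeds only after every node has certified the
# writeset, so a node failure after COMMIT loses nothing.

class Node:
    def __init__(self, name):
        self.name = name
        self.applied = []

    def certify(self, writeset):
        # Real certification checks the writeset for conflicts against
        # in-flight transactions; here we simply accept and record it.
        self.applied.append(writeset)
        return True

def commit(writeset, nodes):
    # The writeset is pushed to all nodes before the commit succeeds.
    if all(n.certify(writeset) for n in nodes):
        return "committed"
    return "rolled back"

cluster = [Node("a"), Node("b"), Node("c")]
print(commit({"id": 1}, cluster))  # "committed"
# If node "a" now explodes, nodes "b" and "c" still hold the writeset.
cluster.pop(0)
print(all({"id": 1} in n.applied for n in cluster))  # True
```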
In the "active" approach, we still document expectations, but we also
validate them. If they are not what we expect but can be changed at
runtime, we change them, overriding any conflicting environmental
config; and if we can't, we hard-stop, indicating an unsuitable
environment.
Rather than providing helper tools, we perform the steps needed
ourselves, in the order they need to be performed, ensuring that they
are done in the manner in which they need to be done.
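The validate-override-or-hard-stop logic described above could be sketched roughly as follows. The particular variable names, expected values, and the set of variables treated as dynamic are illustrative assumptions, not a real list for any MySQL version.

```python
# Minimal sketch of the "active" approach: validate settings against
# documented expectations, override what can be changed at runtime,
# hard-stop on anything that can't be.

EXPECTED = {"sql_mode": "TRADITIONAL", "innodb_file_per_table": "ON"}
DYNAMIC = {"sql_mode"}  # assumed settable at runtime via SET GLOBAL

def validate_environment(current, expected=EXPECTED, dynamic=DYNAMIC):
    """Return SET statements to issue, or raise on a hard stop."""
    overrides = []
    for name, want in expected.items():
        have = current.get(name)
        if have == want:
            continue
        if name in dynamic:
            # Override the conflicting environmental config at runtime.
            overrides.append("SET GLOBAL %s = '%s'" % (name, want))
        else:
            # Static variable: refuse to run in an unsuitable environment.
            raise RuntimeError(
                "unsuitable environment: %s is %r, need %r"
                % (name, have, want))
    return overrides

# sql_mode is wrong but dynamic, so we get an override statement back.
print(validate_environment(
    {"sql_mode": "ANSI", "innodb_file_per_table": "ON"}))
```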
This might be a trickier situation, especially if the database(s) are in
a separate or dedicated environment that the OpenStack service processes
don't have access to. Of course for SQL commands, this isn't a problem.
But changing the configuration files and restarting the database may be
a harder thing to expect.
Nevertheless, the HA setup within TripleO does do this, currently using
Pacemaker and resource agents. This is within the scope of at least
parts of OpenStack.
In either approach the OpenStack service has to be able to talk to
both old and new versions of the schema. And in either approach we
need to make sure to limit the schema change operations to the set
that can be accomplished in an online fashion. We also have to be
careful to not start writing values to new columns until all of the
nodes have been updated, because the replication stream can't
replicate the new column value to nodes that don't have the new column.
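The "don't write the new column until every node is upgraded" rule could be gated in code along these lines. The schema-version registry and the column names here are hypothetical, purely to show the shape of the check.

```python
# Sketch: withhold values for a newly added column until every node in
# the cluster runs a schema version that knows about that column, so
# the replication stream never carries a value a node can't apply.

NEW_COLUMN_MIN_VERSION = 2  # hypothetical schema version adding the column

def build_insert(row, node_versions):
    """Build the column dict for an INSERT, gated on cluster versions."""
    all_upgraded = min(node_versions) >= NEW_COLUMN_MIN_VERSION
    columns = {"id": row["id"], "name": row["name"]}
    if all_upgraded:
        # Safe: every node has the column, so replication can deliver it.
        columns["new_col"] = row.get("new_col")
    return columns

# One node still on version 1: the new column is withheld.
print(build_insert({"id": 1, "name": "x", "new_col": "y"}, [1, 2, 2]))
# All nodes on version 2: the new column is written.
print(build_insert({"id": 1, "name": "x", "new_col": "y"}, [2, 2, 2]))
```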
This is another area where something like MySQL Cluster (NDB) would
operate differently because of its active/active architecture. So
limiting the number of online changes while a table is locked across the
cluster would be very important. There are also the application timeouts
to consider, something that could again be abstracted by oslo.db.
So the DDL we do on Galera, to confirm but also clarify Monty's point,
runs under "total order isolation", which means it holds up the whole
cluster while the DDL is applied to all nodes. Monty says this
disqualifies it as an "online upgrade", because if you emitted DDL that
had to write default values into a million rows, your whole cluster
would temporarily have to wait for that to happen; we handle that by
making sure we don't do migrations with that kind of data requirement,
and while yes, the DB has to wait for a schema change to apply, the
waits are at least very short (in theory). For practical purposes, it
is *mostly* an "online" style of migration, because all the services
that talk to the database can keep on talking to it without being
stopped, upgraded to a new software version, and restarted, which IMO
is what's really hard about "online" upgrades. It does mean that
services will see a little more latency while operations proceed.
Maybe we need a new term called "quasi-online" or something like that.
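Screening migrations for that "short under total order isolation" property might look something like the sketch below. The pattern list is a deliberately simplified assumption; which ALTERs actually rewrite a table varies by MySQL version and storage engine, so a real implementation would need a much more careful classification.

```python
# Rough illustration: allow DDL expected to apply quickly under total
# order isolation, reject DDL that would rewrite large amounts of data
# while the whole cluster waits. The patterns are illustrative only.

import re

SLOW_PATTERNS = [
    # May backfill a default into every existing row.
    r"ALTER TABLE .* ADD COLUMN .* NOT NULL DEFAULT",
    # May rewrite the whole table to change a column type.
    r"ALTER TABLE .* MODIFY ",
]

def toi_safe(ddl):
    """Return True if the DDL looks safe to run under TOI."""
    return not any(
        re.search(p, ddl, re.IGNORECASE) for p in SLOW_PATTERNS)

print(toi_safe("ALTER TABLE instances ADD COLUMN note TEXT"))       # True
print(toi_safe(
    "ALTER TABLE instances ADD COLUMN flag INT NOT NULL DEFAULT 0"))  # False
```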
Facebook has released a Python version of their "online" schema
migration tool for MySQL, which takes the full-blown "create a new,
blank table" approach: it builds a second table containing the newer
version of the schema, so that nothing at all stops or slows down. To
manage between the two tables while everything is running, it also
makes a "change capture" table to keep track of what's going on, and
then wires it all together using...triggers!
https://github.com/facebookincubator/OnlineSchemaChange/wiki/How-OSC-works.
Crazy Facebook kids. As for how we'd know that "make two more tables
and wire it all together with new triggers" is in fact more performant
than just "add a column to the table", I'm not sure how or when they
make that determination. I don't see an OpenStack cluster as quite the
same thing as hosting a site like Facebook, so I lean towards the more
liberal interpretation of "online upgrades".
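The shadow-table-plus-triggers technique that wiki page describes can be condensed into roughly the sequence below, written out as the SQL an operator would run by hand. The table and trigger names (users_new, users_delta, users_ins) are made up for illustration, and the chunked bulk-copy and delta-replay steps are elided.

```python
# Condensed sketch of the shadow-table / change-capture approach,
# expressed as an ordered list of illustrative SQL statements.

steps = [
    # 1. New, blank table carrying the target schema.
    "CREATE TABLE users_new LIKE users",
    "ALTER TABLE users_new ADD COLUMN email VARCHAR(255)",
    # 2. Change-capture table recording rows touched during the copy.
    "CREATE TABLE users_delta (id BIGINT PRIMARY KEY)",
    # 3. Triggers on the live table feed the capture table.
    "CREATE TRIGGER users_ins AFTER INSERT ON users "
    "FOR EACH ROW INSERT IGNORE INTO users_delta VALUES (NEW.id)",
    # 4. Bulk-copy existing rows in chunks, replay the delta (elided
    #    here), then atomically swap the tables.
    "RENAME TABLE users TO users_old, users_new TO users",
]

for statement in steps:
    print(statement)
```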
* Versions
It's worth noting that behavior for schema updates and other things
changes over time with the backend database version. We set minimum
versions of other things, like libvirt and OVS - so we might also want
to set minimum versions for what we can support in the database. That
way we can know for a given release of OpenStack what DDL operations
are safe to use for a rolling upgrade and what are not. That means
detecting such a version and potentially refusing to perform an
upgrade if the version isn't acceptable. That reduces the operator's
ability to choose what version of the database software to run, but
increases our ability to be able to provide tooling and operations
that we can be confident will work.
Validating the MySQL database version is a good idea. Features do
change over time. A good example is how, in 5.7, you'll get warnings
that duplicate indexes will be dropped in a future release, which
definitely affects multiple services today.
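A minimum-version gate in the spirit of the libvirt/OVS minimums mentioned above could be sketched like this. The chosen minimum (5.7) is just an example, not a project decision, and real version strings can be messier than this parser assumes.

```python
# Sketch: parse the server version string and refuse to perform a
# schema upgrade below a supported minimum.

MIN_MYSQL = (5, 7, 0)  # example minimum, not a real project policy

def parse_version(version_string):
    """Turn e.g. '5.7.17-log' into the tuple (5, 7, 17)."""
    numeric = version_string.split("-")[0]
    return tuple(int(part) for part in numeric.split("."))

def check_minimum(version_string, minimum=MIN_MYSQL):
    """Return the parsed version, or raise if below the minimum."""
    found = parse_version(version_string)
    if found < minimum:
        raise RuntimeError(
            "MySQL %s is below the supported minimum %s; refusing to "
            "perform schema upgrade"
            % (version_string, ".".join(map(str, minimum))))
    return found

print(check_minimum("5.7.17-log"))  # (5, 7, 17)
```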
== Summary ==
These are just a couple of examples - but I hope they're at least
mildly useful to explain some of the sorts of issues at hand - and why
I think we need to clarify what our intent is separate from the issue
of what databases we "support".
Some operations have one and only one "right" way to be done. For
those operations if we take an 'active' approach, we can implement
them once and not make all of our deployers and distributors each
implement and run them. However, there is a cost to that. Automatic
and prescriptive behavior has a higher dev cost that is proportional
to the number of supported architectures. This then implies a need to
limit deployer architecture choices.
On the other hand, taking an 'external' approach allows us to federate
the work of supporting the different architectures to the deployers.
This means more work on the deployer's part, but also potentially a
greater amount of freedom on their part to deploy supporting services
the way they want. It means that some of the things that have been
requested of us - such as easier operation and an increase in the
number of things that can be upgraded with no-downtime - might become
prohibitively costly for us to implement.
I honestly think that both are acceptable choices we can make and that
for any given topic there are middle grounds to be found at any given
moment in time.
BUT - without a decision as to what our long-term philosophical intent
in this space is that is clear and understandable to everyone, we
cannot have successful discussions about the impact of implementation
choices, since we will not have a shared understanding of the problem
space or the solutions we're talking about.
For my part - I hear complaints that OpenStack is 'difficult' to
operate and requests for us to make it easier. This is why I have been
advocating some actions that are clearly rooted in an 'active' worldview.
Finally, this is focused on the database layer, but similar questions
arise in other places. What is our philosophy on prescriptive/active
choices on our part, coupled with automated action and ease of
operation, vs. expanded choices for the deployer at the expense of
configuration and operational complexity? For now let's see if we can
answer it for databases, and see where that gets us.
Thanks for reading.
Monty
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev