Re: [openstack-dev] [oslo][oslo.db] MySQL Cluster support

Octave J. Orgeron Thu, 02 Feb 2017 13:55:14 -0800

Hi Doug,

Comments below..


Thanks,
Octave

On 2/2/2017 12:52 PM, Mike Bayer wrote:



On 02/02/2017 02:16 PM, Octave J. Orgeron wrote:

Hi Doug,

Comments below..

Thanks,
Octave

On 2/2/2017 11:27 AM, Doug Hellmann wrote:

It sounds like part of the plan is to use the configuration setting
to control how the migration scripts create tables. How will that
work? Does each migration need custom logic, or can we build helpers
into oslo.db somehow? Or will the option be passed to the database
to change its behavior transparently?


These are good questions. For each service, when the db sync or db
manage operation is done it will call into SQL Alchemy or Alembic
depending on the methods used by the given service. For example, most
use SQL Alchemy, but there are services like Ironic and Neutron that use
Alembic. It is within these scripts under the <service>/db/* hierarchy
that the logic exist today to configure the database schema for any
given service. Both approaches will look at the schema version in the
database to determine where to start the create, upgrade, heal, etc.
operations. What my patches do is that in the scripts where a table
needs to be modified, there will be custom IF/THEN logic to check the
cfg.CONF.database.mysql_storage_engine setting to make the required
modifications. There are also use cases where the api.py or model(s).py
under the <service>/db/ hierarchy needs to look at this setting as well
for API and CLI operations where mysql_engine is auto-inserted into DB
operations. In those use cases, I replace the hard coded "InnoDB" with
the mysql_storage_engine variable.

can you please clarify "replace the hard coded "InnoDB" " ? Are youproposing to send reviews for patches against all occurrences of"InnoDB" in files likehttps://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py? The "InnoDB" keyword is hardcoded in hundreds of migration filesacross all openstack projects that use MySQL. Are all of these goingto be patched with some kind of conditional?

Yes, that is the plan to patch each of the scripts that has these andany other issues that need to be addressed.

It would be interesting if we could develop some helpers to automate
this, but it would probably have to be at the SQL Alchemy or Alembic
levels.
not really, you can build a hook that intercepts operations likeCreateTable, or that intercepts SQL as it is emitted over aconnection, in order to modify these values on the fly. But that is aspecific kind of approach with it's own set of surprises.Alternatively you can make an alternate SQLAlchemy dialect that nolonger recognizes "mysql_*" as the prefix for these arguments. There'sways to do this part.
But more critically I noticed you referred to altering the names ofcolumns to suit NDB. How will this be accomplished? Changing acolumn name in an openstack application is no longer trivial, becauseonline upgrades must be supported for applications like Nova andNeutron. A column name can't just change to a new name, both columnshave to exist and logic must be added to keep these columns synchronized.

Putting the hooks into SQL Alchemy dialect would only solve things likethe mysql_engine=, savepoints, and nested operations. It won't solvefor the row length issues or be able to determine which ones to targetsince we don't have some method of specifying the potential lengths ofcontents. We also have to consider that Alembic doesn't have the samecapabilities as SQL Alchemy, so if we invest in making enhancementsthere we still have Neutron and Ironic that wouldn't be able to benefit.I think being consistent is important as well here.

The patches don't change the names of columns, they only change the sizeor type. There is only a single occurrence that I've seen where a columnname causes problems because it's using a reserved name in the SQL. Ihave a patch for that issue, which I believe is in Heat if I remembercorrectly.

Unfortunately, throughout all of the OpenStack services today we
are hard coding things like mysql_engine, using InnoDB specific features
(savepoints, nested operations, etc.), and not following the strict SQL
orders for modifying table elements (foreign keys, constraints, and
indexes).
Savepoints aren't InnoDB specific, they are a standard SQL feature andalso their use is not widespread right now. I'm not sure what youmean by "the strict SQL orders", we use ALTER TABLE as is standard inMySQL for this and it's behind an abstraction layer that supportsother databases such as Postgresql.

Savepoints are not implemented yet in MySQL Cluster, but it's on theroadmap. As for the SQL ordering, what I'm talking about is the way someservices will drop or modify foreign keys, constraints, or indexes inthe wrong operation order. These have to be unfurled in the correctorder and put back in the right order. InnoDB does not enforce this,but NDB does. So this isn't about doing an ALTER TABLE on a column. Someservices like Keystone don't suffer from these problems as they are morecareful about operation orders. Other services like Glance ignore theoperation ordering and that's where more work is required to fix things.


  * Many of the SQL Alchemy and Alembic scripts only import the minimal
    set of python modules. If we imported others, we would also have to
    initialize those name spaces which means a lot more code :(


I'm not sure what this means, can you clarify ?

When the scripts run, they don't import the whole namespace for a givenservice. Most of them will just import the modules for the SQL Alchemyor Alembic namespaces with the cfg.CONF.database.* from oslo.*. If wewanted to inject the setting elsewhere, like within the servicenamespace, we have to import those modules and initialize the fullnamespace. This means considerably more lines of code and context issuesto resolve. Instead, by using oslo.db under the cfg.CONF.database, whichalready gets imported for SQL Alchemy and Alembic operations, we canquickly check for mysql_storage_engine.


Does that make it clear?



   * Reduces the amount of overhead required to make these changes.

What sort of "overhead", do you mean code complexity, performance ?

The overhead would be importing another module from the service andinitializing the namespace for each SQL Alchemy or Alembic script.Again, I don't want to re-architect the way those processes work todayas that would be a major undertaking. By using the same methods they usetoday to get to the cfg.CONF.database namespace, I cut down on thecoding overhead and the performance overhead of multiple scriptsimporting modules and initializing namespaces. It would increase theamount of reads against the *.conf files and all of the modules required.


Keep in mind that we do not encourage code outside of libraries to
rely on configuration settings defined within libraries, because
that limits our ability to change the names and locations of the
configuration variables.  If migration scripts need to access the
configuration setting we will need to add some sort of public API
to oslo.db to query the value. The function can simply return the
configured value.


Configuration parameters within any given service will make use of a
large namespace that pulls in things from oslo and the .conf files for a
given service. So even when an API, CLI, or DB related call is made,
these namespaces are key for things to work. In the case of the SQL
Alchemy and Alembic scripts, they also make use of this namespace with
oslo, oslo.db, etc. to figure out how to connect to the database and
other database settings. I don't think we need a public API for these
kinds of calls as the community already makes use of the libraries to
build the namespace. My oslo.db setting and patches for each service
just make use of the cfg.CONF.database namespace to determine the
correct behavior to execute.


What other behaviors are likely to be changed by the new option?
Will application runtime behavior need to know about the storage
engine?


The changes will be transparent to the application runtime behavior. The
APIs and CLI tools call into the <service>/db/api.py as the entry point
for database calls. Behind this you usually have a models.py that is
aware of the database schema to understand the layout of things. So the
underlining structure is abstracted away from the run-time. These entry
points sometimes do require minor modifications to handle any hard coded
issues or intercept functions like savepoints and nested operations.
Again I use the cfg.CONF.database namespace to check for the appropriate
behavior and implement IF/THEN logic to do the right thing.

Some of my design objectives for all of these patches are:

  * Zero impact on OpenStack functionality and usability (API, CLI, user
    experience)
  * No loss in database structure. Consistent foreign keys, constraints,
    indexes, etc.
  * Minimal impact on column size and/or types to fit within NDB table
    row limits. Many columns are over-sized today.
  * Validate functionality of APIs, service processes, and CLI. Tempest
    is our friend :)
  * Zero impact for users not using MySQL Cluster (NDB).

Doug
Thanks,
Octave

On 2/2/2017 6:46 AM, Doug Hellmann wrote:
Excerpts from Octave J. Orgeron's message of 2017-02-01 20:33:38-0700:
Hi Folks,
I'm working on adding support for MySQL Cluster to the coreOpenStack
services. This will enable the community to benefit from an
active/active, auto-sharding, and scale-out MySQL database. Myapproachis to have a single configuration setting in each core OpenStackservicein the oslo.db configuration section called mysql_storage_enginethatwill enable the logic in the SQL Alchemy or Alembic upgradescripts to
handle the differences between InnoDB and NDB storage engines
respectively. When enabled, this logic will make the required table
schema changes around:

   * Row character length limits 65k -> 14k
* Proper SQL ordering of foreign key, constraints, and indexoperations
   * Interception of savepoint and nested operations

By default this functionality will not be enabled and will have no
impact on the default InnoDB functionality. These changes have been
tested on Kilo and Mitaka in previous releases of our OpenStack
distributions with Tempest. I'm working on updating these patchesfor
upstream consumption. We are also working on a 3rd party CI for
regression testing against MySQL Cluster for the community.

The first change set is for oslo.db and can be reviewed at:

https://review.openstack.org/427970

Thanks,
Octave
Is it possible to detect the storage engine at runtime, instead of
having the operator configure it?

Doug
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


--

Oracle <http://www.oracle.com/>
Octave J. Orgeron | Sr. Principal Architect and Software Engineer
Oracle Linux OpenStack
Mobile: +1-720-616-1550 <tel:+17206161550>
500 Eldorado Blvd. | Broomfield, CO 80021
Certified Oracle Enterprise Architect: Systems Infrastructure
<http://www.oracle.com/us/solutions/enterprise-architecture/index.html>
Green Oracle <http://www.oracle.com/commitment> Oracle is committed to
developing practices and products that help protect the environment

__________________________________________________________________________

OpenStack Development Mailing List (not for usage questions)

Unsubscribe:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__________________________________________________________________________

OpenStack Development Mailing List (not for usage questions)

Unsubscribe:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


--

Oracle <http://www.oracle.com/>
Octave J. Orgeron | Sr. Principal Architect and Software Engineer
Oracle Linux OpenStack
Mobile: +1-720-616-1550 <tel:+17206161550>
500 Eldorado Blvd. | Broomfield, CO 80021

Certified Oracle Enterprise Architect: Systems Infrastructure<http://www.oracle.com/us/solutions/enterprise-architecture/index.html>Green Oracle <http://www.oracle.com/commitment> Oracle is committed todeveloping practices and products that help protect the environment

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo][oslo.db] MySQL Cluster support

Reply via email to