[openstack-dev] [keystone] [tripleo] multi-region Keystone via stretched Galera cluster

2018-06-04 Thread Michael Bayer
Hey list -

As mentioned in the May 11 Keystone meeting, I am working on one of
the canonical approaches to producing a multi-region Keystone service:
having overcloud-specific Keystone services interact with a Galera
database that runs masters across multiple overclouds.  The work is
to be integrated into TripleO, and at [1] we discuss the production
of a multi-region Keystone service deployable across multiple TripleO
overclouds.

While the spec is being discussed, I continue to work on the proof of
concept [2], whose master branch is being developed to deploy the
second Galera DB, as well as HAProxy/VIP/Keystone, entirely from
TripleO Heat templates; the changes patched there are to be proposed
against TripleO itself once this version of the POC is working.

The "standard_tripleo_version" branch is the previous iteration, which
provides a fully working proof of concept that adds a second Galera
instance to a pair of already deployed overclouds.

Comments on the review welcome.

[1] https://review.openstack.org/#/c/566448/

[2] https://github.com/zzzeek/stretch_cluster

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db] innodb OPTIMIZE TABLE ?

2018-04-09 Thread Michael Bayer
On Mon, Apr 9, 2018 at 5:53 AM, Gorka Eguileor  wrote:
> On 06/04, Michael Bayer wrote:
>> On Wed, Apr 4, 2018 at 5:00 AM, Gorka Eguileor  wrote:
>> > On 03/04, Jay Pipes wrote:
>> >> On 04/03/2018 11:07 AM, Michael Bayer wrote:
>> >> > The MySQL / MariaDB variants we use nowadays default to
>> >> > innodb_file_per_table=ON and we also set this flag to ON in installer
>> >> > tools like TripleO. The reason we like file per table is so that
>> >> > we don't grow an enormous ibdata file that can't be shrunk without
>> >> > rebuilding the database.  Instead, we have lots of little .ibd
>> >> > datafiles for each table throughout each openstack database.
>> >> >
>> >> > But now we have the issue that these files also can benefit from
>> >> > periodic optimization which can shrink them and also have a beneficial
>> >> > effect on performance.   The OPTIMIZE TABLE statement achieves this,
>> >> > but as would be expected it itself can lock tables for potentially a
>> >> > long time.   Googling around reveals a lot of controversy, as various
>> >> > users and publications suggest that OPTIMIZE is never needed and would
>> >> > have only a negligible effect on performance.   However here we seek
>> >> > to use OPTIMIZE so that we can reclaim disk space on tables that have
>> >> > lots of DELETE activity, such as keystone "token" and ceilometer
>> >> > "sample".
>> >> >
>> >> > Questions for the group:
>> >> >
>> >> > 1. is OPTIMIZE table worthwhile to be run for tables where the
>> >> > datafile has grown much larger than the number of rows we have in the
>> >> > table?
>> >>
>> >> Possibly, though it's questionable to use MySQL/InnoDB for storing transient
>> >> data that is deleted often like ceilometer samples and keystone tokens. A
>> >> much better solution is to use RDBMS partitioning so you can simply ALTER
>> >> TABLE .. DROP PARTITION those partitions that are no longer relevant (and
>> >> don't even bother DELETEing individual rows) or, in the case of Ceilometer
>> >> samples, don't use a traditional RDBMS for timeseries data at all...
>> >>
>> >> But since that is unfortunately already the case, yes it is probably a good
>> >> idea to OPTIMIZE TABLE on those tables.
>> >>
>> >> > 2. from people's production experience how safe is it to run OPTIMIZE,
>> >> > e.g. how long is it locking tables, etc.
>> >>
>> >> Is it safe? Yes.
>> >>
>> >> Does it lock the entire table for the duration of the operation? No. It uses
>> >> online DDL operations:
>> >>
>> >> https://dev.mysql.com/doc/refman/5.7/en/innodb-file-defragmenting.html
>> >>
>> >> Note that OPTIMIZE TABLE is mapped to ALTER TABLE tbl_name FORCE for InnoDB
>> >> tables.
>> >>
>> >> > 3. is there a heuristic we can use to measure when we might run this
>> >> > -.e.g my plan is we measure the size in bytes of each row in a table
>> >> > and then compare that in some ratio to the size of the corresponding
>> >> > .ibd file, if the .ibd file is N times larger than the logical data
>> >> > size we run OPTIMIZE ?
>> >>
>> >> I don't believe so, no. Most things I see recommended is to simply run
>> >> OPTIMIZE TABLE in a cron job on each table periodically.
>> >>
>> >> > 4. I'd like to propose this job of scanning table datafile sizes in
>> >> > ratio to logical data sizes, then running OPTIMIZE, be a utility
>> >> > script that is delivered via oslo.db, and would run for all innodb
>> >> > tables within a target MySQL/ MariaDB server generically.  That is, I
>> >> > really *dont* want this to be a script that Keystone, Nova, Ceilometer
>> >> > etc. are all maintaining delivering themselves.   this should be done
>> >> > as a generic pass on a whole database (noting, again, we are only
>> >> > running it for very specific InnoDB tables that we observe have a poor
>> >> > logical/physical size ratio).
>> >>
>> >> I don't believe this should be in oslo.db. This is strictly the purview of
>> >> deployment tools and should stay there, IMHO.

Re: [openstack-dev] [oslo.db] innodb OPTIMIZE TABLE ?

2018-04-06 Thread Michael Bayer
On Wed, Apr 4, 2018 at 5:00 AM, Gorka Eguileor  wrote:
> On 03/04, Jay Pipes wrote:
>> On 04/03/2018 11:07 AM, Michael Bayer wrote:
>> > The MySQL / MariaDB variants we use nowadays default to
>> > innodb_file_per_table=ON and we also set this flag to ON in installer
>> > tools like TripleO. The reason we like file per table is so that
>> > we don't grow an enormous ibdata file that can't be shrunk without
>> > rebuilding the database.  Instead, we have lots of little .ibd
>> > datafiles for each table throughout each openstack database.
>> >
>> > But now we have the issue that these files also can benefit from
>> > periodic optimization which can shrink them and also have a beneficial
>> > effect on performance.   The OPTIMIZE TABLE statement achieves this,
>> > but as would be expected it itself can lock tables for potentially a
>> > long time.   Googling around reveals a lot of controversy, as various
>> > users and publications suggest that OPTIMIZE is never needed and would
>> > have only a negligible effect on performance.   However here we seek
>> > to use OPTIMIZE so that we can reclaim disk space on tables that have
>> > lots of DELETE activity, such as keystone "token" and ceilometer
>> > "sample".
>> >
>> > Questions for the group:
>> >
>> > 1. is OPTIMIZE table worthwhile to be run for tables where the
>> > datafile has grown much larger than the number of rows we have in the
>> > table?
>>
>> Possibly, though it's questionable to use MySQL/InnoDB for storing transient
>> data that is deleted often like ceilometer samples and keystone tokens. A
>> much better solution is to use RDBMS partitioning so you can simply ALTER
>> TABLE .. DROP PARTITION those partitions that are no longer relevant (and
>> don't even bother DELETEing individual rows) or, in the case of Ceilometer
>> samples, don't use a traditional RDBMS for timeseries data at all...
>>
>> But since that is unfortunately already the case, yes it is probably a good
>> idea to OPTIMIZE TABLE on those tables.
>>
>> > 2. from people's production experience how safe is it to run OPTIMIZE,
>> > e.g. how long is it locking tables, etc.
>>
>> Is it safe? Yes.
>>
>> Does it lock the entire table for the duration of the operation? No. It uses
>> online DDL operations:
>>
>> https://dev.mysql.com/doc/refman/5.7/en/innodb-file-defragmenting.html
>>
>> Note that OPTIMIZE TABLE is mapped to ALTER TABLE tbl_name FORCE for InnoDB
>> tables.
>>
>> > 3. is there a heuristic we can use to measure when we might run this
>> > -.e.g my plan is we measure the size in bytes of each row in a table
>> > and then compare that in some ratio to the size of the corresponding
>> > .ibd file, if the .ibd file is N times larger than the logical data
>> > size we run OPTIMIZE ?
>>
>> I don't believe so, no. Most things I see recommended is to simply run
>> OPTIMIZE TABLE in a cron job on each table periodically.
>>
>> > 4. I'd like to propose this job of scanning table datafile sizes in
>> > ratio to logical data sizes, then running OPTIMIZE, be a utility
>> > script that is delivered via oslo.db, and would run for all innodb
>> > tables within a target MySQL/ MariaDB server generically.  That is, I
>> > really *dont* want this to be a script that Keystone, Nova, Ceilometer
>> > etc. are all maintaining delivering themselves.   this should be done
>> > as a generic pass on a whole database (noting, again, we are only
>> > running it for very specific InnoDB tables that we observe have a poor
>> > logical/physical size ratio).
>>
>> I don't believe this should be in oslo.db. This is strictly the purview of
>> deployment tools and should stay there, IMHO.
>>
>
> Hi,
>
> As far as I know most projects do "soft deletes" where we just flag the
> rows as deleted and don't remove them from the DB, so it's only when we
> use a management tool and run the "purge" command that we actually
> remove these rows.
>
> Since running the optimize without purging would be meaningless, I'm
> wondering if we should trigger the OPTIMIZE also within the purging
> code.  This way we could avoid ineffective runs of the optimize command
> when no purge has happened, and even when we do the optimization we could
> skip the ratio calculation altogether for tables where no rows have been
> deleted (the ratio hasn't changed).

Re: [openstack-dev] [oslo.db] innodb OPTIMIZE TABLE ?

2018-04-03 Thread Michael Bayer
On Tue, Apr 3, 2018 at 11:41 AM, Jay Pipes  wrote:
> On 04/03/2018 11:07 AM, Michael Bayer wrote:
>>
>
> Yes.
>
>> b. oslo.db script to run generically, yes or no?
>
>
> No. Just have Triple-O install galera_innoptimizer and run it in a cron job.

OK, here are the issues I have with galera_innoptimizer:

1. It only runs on Galera.  This should work on a non-Galera DB as well.

2. It is hardcoded to MySQLdb / mysqlclient.  We don't install that
driver anymore.

3. It just runs OPTIMIZE on every table across the board; at best you
can give it a list of tables.  I was hoping not to add more hardcoded
cross-dependencies to tripleo, as this means individual projects would
need to affect how the script is run, which means we again have to
start shipping individual per-app crons that require eternal
babysitting.

What failures do you foresee if I tried to make it compare the logical
data size to the physical file size, since I'm after file-size
optimization here only?  Or is that just too complicated / brittle?
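For what it's worth, the comparison itself can stay trivially simple and separate from any database access. Below is a minimal sketch of the decision logic only; the function name and both thresholds are my own illustrative choices, not from any existing tool, and in practice the logical size would come from `information_schema.TABLES` (`data_length + index_length`) while the physical size comes from `os.path.getsize()` on the table's `.ibd` file:

```python
def should_optimize(physical_bytes, logical_bytes,
                    ratio=2.0, min_waste=64 * 1024 * 1024):
    """Decide whether a table's .ibd file looks bloated enough to OPTIMIZE.

    physical_bytes: size of the .ibd datafile on disk.
    logical_bytes:  data_length + index_length from information_schema.
    Both the ratio and the minimum wasted-space floor are arbitrary
    illustrative thresholds, not recommendations.
    """
    # Skip tables where even a perfect shrink would reclaim little space.
    if physical_bytes - logical_bytes < min_waste:
        return False
    # Flag tables whose datafile is N times larger than the logical data.
    return physical_bytes >= ratio * max(logical_bytes, 1)
```

A periodic job would run this per table and issue `OPTIMIZE TABLE` only for the tables that return True, which avoids blanket runs across the whole schema.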

>
> Best,
> -jay
>
>> thanks for your thoughts!
>>
>>
>>
>> [1] https://github.com/deimosfr/galera_innoptimizer
>>



[openstack-dev] [oslo.db] innodb OPTIMIZE TABLE ?

2018-04-03 Thread Michael Bayer
The MySQL / MariaDB variants we use nowadays default to
innodb_file_per_table=ON and we also set this flag to ON in installer
tools like TripleO. The reason we like file per table is so that
we don't grow an enormous ibdata file that can't be shrunk without
rebuilding the database.  Instead, we have lots of little .ibd
datafiles for each table throughout each openstack database.

But now we have the issue that these files can also benefit from
periodic optimization, which can shrink them and also have a
beneficial effect on performance.  The OPTIMIZE TABLE statement
achieves this, but, as would be expected, it can itself lock tables
for a potentially long time.  Googling around reveals a lot of
controversy, as various users and publications suggest that OPTIMIZE
is never needed and would have only a negligible effect on
performance.  However, here we seek to use OPTIMIZE so that we can
reclaim disk space on tables that have lots of DELETE activity, such
as keystone "token" and ceilometer "sample".

Questions for the group:

1. Is OPTIMIZE TABLE worthwhile to run for tables where the datafile
has grown much larger than warranted by the number of rows in the
table?

2. From people's production experience, how safe is it to run
OPTIMIZE, e.g. how long does it lock tables, etc.?

3. Is there a heuristic we can use to decide when to run this?  E.g.,
my plan is that we measure the size in bytes of each row in a table
and then compare that, as a ratio, to the size of the corresponding
.ibd file; if the .ibd file is N times larger than the logical data
size, we run OPTIMIZE.

4. I'd like to propose that this job of scanning table datafile sizes
in ratio to logical data sizes, then running OPTIMIZE, be a utility
script delivered via oslo.db that would run generically for all InnoDB
tables within a target MySQL / MariaDB server.  That is, I really
*don't* want this to be a script that Keystone, Nova, Ceilometer,
etc. are each maintaining and delivering themselves.  This should be
done as a generic pass on a whole database (noting, again, we would
only run it for the specific InnoDB tables that we observe have a poor
logical/physical size ratio).

5. For Galera this gets trickier, as we might want to run OPTIMIZE on
individual nodes directly.  The script at [1] illustrates how to run
it on individual nodes one at a time.
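On the Galera point, the per-node approach generally means taking replication out of the picture for the DDL on each node in turn. A rough sketch of the statement sequence, assuming the session-level wsrep_on approach that per-node tools like galera_innoptimizer rely on (the real tools also desync the node and check cluster health between steps, which is omitted here):

```python
def per_node_optimize_statements(tables):
    """Build the SQL to run against ONE Galera node at a time.

    Turning wsrep replication off for the session keeps the
    (potentially long) OPTIMIZE from being replicated cluster-wide;
    the operator then repeats this node by node.  Sketch only.
    """
    stmts = ["SET SESSION wsrep_on = OFF"]
    stmts.extend("OPTIMIZE TABLE `%s`" % t for t in tables)
    stmts.append("SET SESSION wsrep_on = ON")
    return stmts
```

The point of building the list per node rather than issuing a single cluster-wide OPTIMIZE is that only one node at a time pays the rebuild cost, while the others keep serving traffic.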

More succinctly, the Q is:

a. OPTIMIZE, yes or no?
b. oslo.db script to run generically, yes or no?

thanks for your thoughts!



[1] https://github.com/deimosfr/galera_innoptimizer



Re: [openstack-dev] [oslo.db] upcoming warnings in MySQL 5.6, 5.7 for BLOB columns

2018-03-14 Thread Michael Bayer
Forgot the links:

[1] https://bugs.mysql.com/bug.php?id=79317
[2] https://github.com/PyMySQL/PyMySQL/issues/644

On Wed, Mar 14, 2018 at 11:53 AM, Michael Bayer  wrote:
> hey all -
>
> Just looking to see if we think this will impact openstack.  MySQL 5.6
> and 5.7, but not yet MariaDB, now emits an erroneous warning when you
> try to send a binary value to the database, because it sees the client
> connection is supposed to use the utf8 or utf8mb4 charsets, assumes
> all data must be in that charset, then warns because the binary data
> does not necessarily conform to utf8 (which it has no need to, it's
> binary).
>
> Sounds weird, right, to make it easier the demo looks just like this:
>
> import pymysql
> import uuid
>
> conn = pymysql.connect(
> user="scott", passwd="tiger", host="mysql56",
> db="test", charset="utf8mb4")
> cursor = conn.cursor()
> cursor.execute("""
> CREATE TABLE IF NOT EXISTS `profiles` (
>   `id` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
>   `city` blob NOT NULL
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
> """)
> cursor.execute(
> "INSERT INTO profiles (id, city) VALUES (%(id)s, %(city)s)",
> {
> 'id': uuid.uuid4().hex,
> 'city': pymysql.Binary(
> 
> b'z\xf9\x87jS?\xd4i\xa5\xa3\r\xa7\x1e\xed\x16\xe0\xb5\x05R\xa4\xec\x16\x8f\x06\xb5\xea+\xaf<\x00\\\x94I9A\xe0\x82\xa7\x13\x0c\x8c'
> )
> }
> )
>
> when using PyMySQL 0.8.0 (not 0.7.1) you then get a warning
>
> Warning: (1300, "Invalid utf8mb4 character string: 'F9876A'").
>
>
> So, Oracle upstream clearly is never going to fix this if you look at
> the typically dismal discussion at [1].   I poked the PyMySQL project
> at [2] to see what we can do.   Long term is that SQLAlchemy will add
> the special "_binary" prefix to binary-bound parameter tokens to avoid
> the warning, however right now PyMySQL supports a flag "binary_prefix"
> that will do it for us on the driver side.
>
> For Openstack, i need to know if we are in fact passing binary data to
> databases in some project or another.   What we can do is add the
> supply of this flag to oslo.db so that it is present automatically for
> the PyMySQL driver, as well as checking the PyMySQL version for
> compatibility.
>
> If folks are seeing this warning already or are using BLOB / binary
> columns in their project please ping me and we will get this added to
> oslo.db.



[openstack-dev] [oslo.db] upcoming warnings in MySQL 5.6, 5.7 for BLOB columns

2018-03-14 Thread Michael Bayer
hey all -

Just looking to see if we think this will impact OpenStack.  MySQL
5.6 and 5.7, though not yet MariaDB, now emit an erroneous warning
when you send a binary value to the database: the server sees that the
client connection is supposed to use the utf8 or utf8mb4 charset,
assumes all data must be in that charset, and warns because the binary
data does not necessarily conform to utf8 (which it has no need to;
it's binary).

Sounds weird, right?  To make it easier, the demo looks just like this:

import pymysql
import uuid

conn = pymysql.connect(
user="scott", passwd="tiger", host="mysql56",
db="test", charset="utf8mb4")
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE IF NOT EXISTS `profiles` (
  `id` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `city` blob NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
""")
cursor.execute(
"INSERT INTO profiles (id, city) VALUES (%(id)s, %(city)s)",
{
'id': uuid.uuid4().hex,
        'city': pymysql.Binary(
            b'z\xf9\x87jS?\xd4i\xa5\xa3\r\xa7\x1e\xed\x16\xe0\xb5\x05R\xa4\xec\x16\x8f\x06\xb5\xea+\xaf<\x00\\\x94I9A\xe0\x82\xa7\x13\x0c\x8c'
        )
}
)

When using PyMySQL 0.8.0 (but not 0.7.1), you then get a warning:

Warning: (1300, "Invalid utf8mb4 character string: 'F9876A'").


So, Oracle upstream is clearly never going to fix this, if you look at
the typically dismal discussion at [1].  I poked the PyMySQL project
at [2] to see what we can do.  Long term, SQLAlchemy will add the
special "_binary" prefix to binary-bound parameter tokens to avoid
the warning; right now, PyMySQL supports a flag "binary_prefix" that
will do it for us on the driver side.

For OpenStack, I need to know if we are in fact passing binary data to
databases in some project or another.  What we can do is have oslo.db
supply this flag so that it is present automatically for the PyMySQL
driver, as well as check the PyMySQL version for compatibility.

If folks are seeing this warning already or are using BLOB / binary
columns in their project please ping me and we will get this added to
oslo.db.
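For illustration, the driver-side workaround is just the extra connect argument, i.e. `pymysql.connect(..., binary_prefix=True)`. What the flag does is prepend the `_binary` charset introducer when escaping bytes parameters; the helper below is my own standalone sketch of that rendering (using a hex literal form rather than PyMySQL's actual escaped-string form), not PyMySQL's real escaping code:

```python
def binary_prefixed_literal(raw):
    """Render bytes as a MySQL hex literal with the _binary introducer.

    The introducer tells the server the literal is binary data, so it
    skips the utf8/utf8mb4 validity check that produces warning 1300.
    PyMySQL's binary_prefix=True does the equivalent internally when
    binding bytes parameters.
    """
    return "_binary X'%s'" % raw.hex().upper()

print(binary_prefixed_literal(b'z\xf9\x87j'))  # _binary X'7AF9876A'
```

Wiring `binary_prefix=True` into oslo.db's engine creation would give every consuming project the fix without touching their own connect code, which is the appeal of doing it there.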



Re: [openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
Summarizing all the reviews:

Doug's proposed check in oslo_db.tests: https://review.openstack.org/#/c/545859/
Mark oslo_db internal fixtures private:   https://review.openstack.org/545862
Cinder: https://review.openstack.org/545860
Neutron: https://review.openstack.org/545868
Ironic: https://review.openstack.org/545874
Glance: https://review.openstack.org/545878
Glare: https://review.openstack.org/545883
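The shape of the proposed check is roughly a guard in the private package that makes the import itself fail unless oslo.db's own test runner is driving it; here is a sketch under that assumption (the environment-flag mechanism and names are mine, see the review for the real approach):

```python
import os

def guard_private_import(pkg="oslo_db.tests", env_flag="OSLO_DB_TEST"):
    """Raise if a private test package is imported outside its own suite.

    Intended to sit in the package's __init__.py so that any downstream
    'from oslo_db.tests... import ...' fails loudly at import time,
    instead of failing later when the distribution omits the package.
    """
    if not os.environ.get(env_flag):
        raise ImportError(
            "%s is private and not shipped in distributions; "
            "use the public fixtures instead" % pkg)
```

The advantage over relying on naming conventions alone is that the failure mode moves from "works in the gate, breaks on packaged installs" to an immediate, explanatory error.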



On Mon, Feb 19, 2018 at 11:32 AM, Doug Hellmann  wrote:
> Excerpts from Michael Bayer's message of 2018-02-19 10:55:52 -0500:
>> wow that's heavy-handed.  should that be in an oslo utility package of
>> some kind ?
>
> I thought about that, but figured we should wait and see whether we
> actually want to take the approach before polishing it. If we do we can
> add a function in oslo.utils.
>
>>
>> On Mon, Feb 19, 2018 at 10:52 AM, Doug Hellmann  
>> wrote:
>> > Excerpts from Doug Hellmann's message of 2018-02-19 10:15:34 -0500:
>> >> Excerpts from Michael Bayer's message of 2018-02-19 10:00:59 -0500:
>> >> > Hi list -
>> >> >
>> >> > Apparently Cinder was misled by my deprecations within the
>> >> > oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
>> >> > in https://review.openstack.org/#/c/522290/ the assumption was made
>> >> > that these should be imported from oslo_db.tests.sqlalchemy.This
>> >> > is an immense mistake on my part that I did not expect people to go
>> >> > looking for the same names elsewhere in private packages and now we
>> >> > have a serious downstream issue as these modules are not packaged, as
>> >> > well as the possibility that the oslo_db.tests. package is now locked
>> >> > in time and I have to add deprecations there also.
>> >> >
>> >> > If anyone knows of projects (or feels like helping me search) that are
>> >> > importing *anything* from oslo_db.tests these must be reverted ASAP.
>> >> >
>> >>
>> >> If we have modules or classes we don't expect people to be importing
>> >> directly, we need to prefix the names with _ to comply with the naming
>> >> conventions we have previously told everyone to look for to recognize
>> >> private code.
>> >>
>> >> I think it's safe to treat "tests" as an exception (after resolving
>> >> this case) but we should probably document that.
>> >>
>> >> Doug
>> >
>> > Once we resolve the current set of imports, we can land a patch like
>> > https://review.openstack.org/545859 to prevent this from happening in
>> > the future.
>> >
>> > Doug
>> >



Re: [openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
On Mon, Feb 19, 2018 at 10:57 AM, Andrey Kurilin  wrote:
> As for downstream you can do whatever you want, but it looks like this issue
> should be solved in upstream. I mean if "tests" directory is located at the
> top level of the repo, no one will use it.

Again, the search at
http://codesearch.openstack.org/?q=oslo_db.tests&i=nope&files=&repos=
shows four downstream projects using it.  I am now submitting gerrits
for all four and also getting internal downstream patches to fix this
internally.  This is as bad as it gets.




> Also, setuptools supports `exclude` option which should solve the issue as
> well.
>
>
>
> 2018-02-19 17:41 GMT+02:00 Michael Bayer :
>>
>> On Mon, Feb 19, 2018 at 10:39 AM, Andrey Kurilin 
>> wrote:
>> > Can someone explain me the reason for including "tests" module into
>> > packages?
>>
>> the "tests" module should not be inside packages.   Downstream we have
>> CI running Cinder's test suite against packaged dependencies, which
>> fails because we don't package oslo_db.tests.
>>
>>
>> >
>> >
>> > 2018-02-19 17:00 GMT+02:00 Michael Bayer :
>> >>
>> >> Hi list -
>> >>
>> >> Apparently Cinder was misled by my deprecations within the
>> >> oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
>> >> in https://review.openstack.org/#/c/522290/ the assumption was made
>> >> that these should be imported from oslo_db.tests.sqlalchemy.This
>> >> is an immense mistake on my part that I did not expect people to go
>> >> looking for the same names elsewhere in private packages and now we
>> >> have a serious downstream issue as these modules are not packaged, as
>> >> well as the possibility that the oslo_db.tests. package is now locked
>> >> in time and I have to add deprecations there also.
>> >>
>> >> If anyone knows of projects (or feels like helping me search) that are
>> >> importing *anything* from oslo_db.tests these must be reverted ASAP.
>> >>
>> >>
>> >
>> >
>> >
>> >
>> > --
>> > Best regards,
>> > Andrey Kurilin.
>> >
>> >
>> >
>>
>
>
>
>
> --
> Best regards,
> Andrey Kurilin.
>
>



Re: [openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
Wow, that's heavy-handed.  Should that be in an oslo utility package
of some kind?

On Mon, Feb 19, 2018 at 10:52 AM, Doug Hellmann  wrote:
> Excerpts from Doug Hellmann's message of 2018-02-19 10:15:34 -0500:
>> Excerpts from Michael Bayer's message of 2018-02-19 10:00:59 -0500:
>> > Hi list -
>> >
>> > Apparently Cinder was misled by my deprecations within the
>> > oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
>> > in https://review.openstack.org/#/c/522290/ the assumption was made
>> > that these should be imported from oslo_db.tests.sqlalchemy.This
>> > is an immense mistake on my part that I did not expect people to go
>> > looking for the same names elsewhere in private packages and now we
>> > have a serious downstream issue as these modules are not packaged, as
>> > well as the possibility that the oslo_db.tests. package is now locked
>> > in time and I have to add deprecations there also.
>> >
>> > If anyone knows of projects (or feels like helping me search) that are
>> > importing *anything* from oslo_db.tests these must be reverted ASAP.
>> >
>>
>> If we have modules or classes we don't expect people to be importing
>> directly, we need to prefix the names with _ to comply with the naming
>> conventions we have previously told everyone to look for to recognize
>> private code.
>>
>> I think it's safe to treat "tests" as an exception (after resolving
>> this case) but we should probably document that.
>>
>> Doug
>
> Once we resolve the current set of imports, we can land a patch like
> https://review.openstack.org/545859 to prevent this from happening in
> the future.
>
> Doug
>



Re: [openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
On Mon, Feb 19, 2018 at 10:39 AM, Andrey Kurilin  wrote:
> Can someone explain me the reason for including "tests" module into
> packages?

The "tests" module should not be inside packages.  Downstream we have
CI running Cinder's test suite against packaged dependencies, which
fails because we don't package oslo_db.tests.
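The packaging-side fix suggested elsewhere in this thread is the standard setuptools exclusion; in a setup.py that calls `find_packages()` directly (oslo.db actually goes through pbr, so take this as a generic sketch) it looks like `find_packages(exclude=["oslo_db.tests", "oslo_db.tests.*"])`. The exclude patterns are matched fnmatch-style against dotted package names, which the helper below mirrors:

```python
from fnmatch import fnmatch

# In a plain setup.py the fix would be:
#   from setuptools import setup, find_packages
#   setup(..., packages=find_packages(
#       exclude=["oslo_db.tests", "oslo_db.tests.*"]))
#
# find_packages matches exclude patterns fnmatch-style against dotted
# package names; this helper demonstrates that matching behavior.
def excluded(pkg, patterns=("oslo_db.tests", "oslo_db.tests.*")):
    return any(fnmatch(pkg, pat) for pat in patterns)
```

With the tests subpackage excluded, built sdists and wheels simply do not contain it, so there is nothing downstream code could accidentally import from an installed package.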


>
>
> 2018-02-19 17:00 GMT+02:00 Michael Bayer :
>>
>> Hi list -
>>
>> Apparently Cinder was misled by my deprecations within the
>> oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
>> in https://review.openstack.org/#/c/522290/ the assumption was made
>> that these should be imported from oslo_db.tests.sqlalchemy.This
>> is an immense mistake on my part that I did not expect people to go
>> looking for the same names elsewhere in private packages and now we
>> have a serious downstream issue as these modules are not packaged, as
>> well as the possibility that the oslo_db.tests. package is now locked
>> in time and I have to add deprecations there also.
>>
>> If anyone knows of projects (or feels like helping me search) that are
>> importing *anything* from oslo_db.tests these must be reverted ASAP.
>>
>
>
>
>
> --
> Best regards,
> Andrey Kurilin.
>
>



Re: [openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
On Mon, Feb 19, 2018 at 10:15 AM, Doug Hellmann  wrote:
> Excerpts from Michael Bayer's message of 2018-02-19 10:00:59 -0500:
>> Hi list -
>>
>> Apparently Cinder was misled by my deprecations within the
>> oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
>> in https://review.openstack.org/#/c/522290/ the assumption was made
>> that these should be imported from oslo_db.tests.sqlalchemy.This
>> is an immense mistake on my part that I did not expect people to go
>> looking for the same names elsewhere in private packages and now we
>> have a serious downstream issue as these modules are not packaged, as
>> well as the possibility that the oslo_db.tests. package is now locked
>> in time and I have to add deprecations there also.
>>
>> If anyone knows of projects (or feels like helping me search) that are
>> importing *anything* from oslo_db.tests these must be reverted ASAP.
>>
>
> If we have modules or classes we don't expect people to be importing
> directly, we need to prefix the names with _ to comply with the naming
> conventions we have previously told everyone to look for to recognize
> private code.

Doing that now.

>
> I think it's safe to treat "tests" as an exception (after resolving
> this case) but we should probably document that.

The example of three projects that did this without any of us knowing
should illustrate that we really can't make that assumption.


>
> Doug
>



[openstack-dev] [oslo.db] [all] please DO NOT IMPORT from oslo_db.tests.* ! projects doing this need to revert ASAP

2018-02-19 Thread Michael Bayer
Hi list -

Apparently Cinder was misled by my deprecations within the
oslo_db.sqlalchemy.test_base package of DbFixture and DbTestCase, and
in https://review.openstack.org/#/c/522290/ the assumption was made
that these should be imported from oslo_db.tests.sqlalchemy.   This
is an immense mistake on my part that I did not expect people to go
looking for the same names elsewhere in private packages and now we
have a serious downstream issue as these modules are not packaged, as
well as the possibility that the oslo_db.tests. package is now locked
in time and I have to add deprecations there also.

If anyone knows of projects (or feels like helping me search) that are
importing *anything* from oslo_db.tests these must be reverted ASAP.



Re: [openstack-dev] [neutron][l2gw] Preventing DB out-of-sync

2017-12-08 Thread Michael Bayer
On Wed, Dec 6, 2017 at 3:46 AM, Peng Liu  wrote:
> Hi,
>
> During working on this patch[0], I encounter some DB out-of-sync problem. I
> think maybe the design can be improved. Here is my thought, all comments are
> welcome.


see also https://review.openstack.org/#/c/490834/ which I think is
dealing with a similar (if not the same) issue.

>
> In plugin code, I found all the resource operations follow the pattern in
> [1]. It is a very misleading design compared to [2].
>
> "For every action that can be taken on a resource, the mechanism driver
> exposes two methods - ACTION_RESOURCE_precommit, which is called within the
> database transaction context, and ACTION_RESOURCE_postcommit, called after
> the database transaction is complete."
>
> In result, if we focus on the out-of-sync between plugin DB and
> driver/backend DB:
>
> 1) In the RPC driver, only the Action_Resource methods are implemented, which
> means the action is taken before it is written to the plugin DB. If the
> action partially succeeds (especially in the update case) or the plugin DB
> operation fails, the DBs go out of sync.
> 2) In the ODL driver, only the Action_Resource_postcommit methods are
> implemented, which means there is no validation at the ODL level before the
> record is written to the plugin DB. In case of an ODL-side failure, there is
> no rollback of the plugin DB.
>
> So fixing this issue is costly: both the plugin and driver sides need to be
> altered.
>
> The other side of this issue is a periodic DB monitoring mechanism between
> the plugin and drivers, but that is another story.
>
>
> [0] https://review.openstack.org/#/c/516256
> [1]
> ...
> def Action_Resource:
>     self.validate_Resource_for_Action
>     self.driver.Action_Resource
>     with context.session.begin(subtransactions=True):
>         super.Action_Resource
>         self.driver.Action_Resource_precommit
>     try:
>         self.driver.Action_Resource_postcommit
> ...
> [2] https://wiki.openstack.org/wiki/Neutron/ML2
>
> --
> Peng Liu
>
>
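For reference, the ML2 contract quoted from [2] — precommit inside the DB transaction, postcommit only after it completes — can be sketched as follows; the transaction here is simulated with a context manager standing in for a SQLAlchemy session:

```python
class FakeTransaction(object):
    """Stand-in for ``session.begin()``; records begin/commit/rollback."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("begin")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("rollback" if exc_type else "commit")
        return False  # let exceptions propagate


class Driver(object):
    def create_resource_precommit(self, log):
        # raising here rolls back the plugin DB write as well
        log.append("precommit")

    def create_resource_postcommit(self, log):
        # only reached once the plugin DB row is durably committed
        log.append("postcommit")


def create_resource(driver, log):
    with FakeTransaction(log):
        log.append("write plugin row")
        driver.create_resource_precommit(log)  # inside the transaction
    driver.create_resource_postcommit(log)     # after commit


log = []
create_resource(Driver(), log)
# order: begin, write plugin row, precommit, commit, postcommit
```

Contrast with the pattern in [1], where the driver call happens before the plugin DB write, so a partial failure leaves the two sides out of sync.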



Re: [openstack-dev] [neutron][oslo.db] db retry madness

2017-09-08 Thread Michael Bayer
>> 4b. Does this deepcopy() approach have any support for functions that are
>> being passed not just simple structures but ORM-mapped objects?   Was this
>> use case considered?
>
> No, it doesn't support ORM objects. Given that the target use case of the
> retry decorator was for functions not in an ongoing session, we didn't have
> a pattern to support in the main tree where an ORM object would be passed as
> an argument to the function that needed to be protected.

right, makes sense, and also that is still how it's going to be here
because you have to be at the top of the transaction (unless using
SAVEPOINT).


>
>>4c. Does Neutron's model base have any support for deepcopy()?  I notice
>> that the base classes in model_base do **not** seem to have a __deepcopy__
>> present.  This would mean that this deepcopy() will definitely break any ORM
>> mapped objects because you need to use an approach like what Nova uses for
>> copy():
>> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L46
>> . This would be why the developer's objects are all getting kicked
> out of the session.   (this is the "is this a bug?" part)
>
> That's definitely why the objects are being kicked out of the session. I'm
> not sure if it makes sense to attempt to support copying an ORM object when
> the decorator is meant for functions starting a completely new session. If
> there is a use case I'm missing that we do want to support, it probably
> makes more sense to use a different decorator than to adjust the existing
> one.

you can put a __copy__() and __deepcopy__() on the base that raises an
exception with an informative message, perhaps.   (maybe this should
be an oslo.db thing).
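as a sketch of that idea (class names here are illustrative, this is not neutron's or oslo.db's actual model base):

```python
import copy


class ModelBase(object):
    """A base that refuses to be copied, with a message explaining why."""

    def __copy__(self):
        raise TypeError(
            "ORM-mapped objects can't be copied safely; the copy would be "
            "detached from its Session. Re-query the object instead.")

    def __deepcopy__(self, memo):
        return self.__copy__()


class Port(ModelBase):
    pass


try:
    copy.deepcopy(Port())
except TypeError as err:
    print("refused: %s" % err)
```

so instead of silently handing back a detached object, the retry decorator's deepcopy() fails fast with a hint.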


>
>>4d. How frequent is the problem of mutable lists and dicts, and should this
>> complex and actually pretty expensive behavior be optional, only for those
>> functions that are known to receive mutable structures and change them in
>> place?
>
> It's difficult to tell. The issue was that the retry logic was added much
> later to the API level so we needed a generically safe solution to protect
> every possible API operation. It's copying things passed into the API
> request so they are usually dictionaries with fewer than 10 keys and values
> with small strings or lists of only a few entries, making the copy
> operations cheap.
>
> If we did want to get rid of the copy operation, it would probably be
> quicker to add a new decorator that doesn't copy and then slowly opt into
> for each function that we validate as 'side-effect free' on the inputs.

+1
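a minimal sketch of what such an opt-in, no-copy decorator could look like — DBDeadlock here is a stand-in defined locally, not imported from oslo_db, and the decorator name is made up:

```python
import functools
import time


class DBDeadlock(Exception):
    """Stand-in for a transient DB error worth retrying."""


def retry_db_errors_no_copy(max_retries=3, interval=0):
    # Retries a side-effect-free function on transient DB errors
    # WITHOUT deep-copying its arguments, unlike the existing decorator.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return fn(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(interval)
        return wrapper
    return decorator


calls = []


@retry_db_errors_no_copy(max_retries=3)
def flaky_create(ctx):
    calls.append(ctx)
    if len(calls) < 2:
        raise DBDeadlock()
    return "created"


assert flaky_create({"tenant": "demo"}) == "created"
assert calls[0] is calls[1]  # same object on each attempt: no copying
```

Functions opting into this would have to be validated as side-effect free on their inputs, as Kevin describes.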


>
>
> I hope that makes sense. Let me know if you have any other questions.

That covers it and also I should have put together the "retry is
outside the transaction == there are no ORM objects at that level"
thing.

>
> Cheers,
> Kevin Benton
>
> On Thu, Sep 7, 2017 at 8:11 AM, Michael Bayer  wrote:
>>
>> I'm trying to get a handle on neutron's retry decorators.   I'm
>> assisting a developer in the networking_odl project who wishes to make
>> use of neutron's retry_if_session_inactive decorator.
>>
>> The developer has run into several problems, including that
>> networking_odl methods here seem to accept a "session" directly,
>> rather than the "context" as is standard throughout neutron, and
>> retry_if_session_inactive assumes the context calling pattern.
>> Additionally, the retry_db_errors() decorator which is also used by
>> retry_if_session_inactive is doing a deepcopy operation on arguments,
>> which when applied to ORM-mapped entities, makes a copy of them which
>> are then detached from the Session and they fail to function beyond
>> that.
>>
>> For context, the developer's patch is at:
>> https://review.openstack.org/#/c/500584/3
>>
>> and the neutron features I'm looking at can be seen when they were
>> reviewed at: https://review.openstack.org/#/c/356530/22
>>
>>
>> So I have a bunch of questions, many are asking the same things: "is
>> networking_odl's pattern legacy or not?" and "is this retry decorator
>> buggy?"   There's overlap in these questions.  But "yes" / "no" to
>> each would still help me figure out that this is a giraffe and not a
>> duck :) ("does it have a long neck?" "does it quack?" etc)
>>
>> 1. since the retry_if_session_inactive decorator assumes functions
>> that receive a "context", what does it mean that networking_odl uses
>> functions that receive a "session" and not a "context" ?   Should
>> networking_odl be modified such that it accepts the same "context"
>> that Neutron passes around?

[openstack-dev] [neutron][oslo.db] db retry madness

2017-09-07 Thread Michael Bayer
I'm trying to get a handle on neutron's retry decorators.   I'm
assisting a developer in the networking_odl project who wishes to make
use of neutron's retry_if_session_inactive decorator.

The developer has run into several problems, including that
networking_odl methods here seem to accept a "session" directly,
rather than the "context" as is standard throughout neutron, and
retry_if_session_inactive assumes the context calling pattern.
Additionally, the retry_db_errors() decorator which is also used by
retry_if_session_inactive is doing a deepcopy operation on arguments,
which when applied to ORM-mapped entities, makes a copy of them which
are then detached from the Session and they fail to function beyond
that.

For context, the developer's patch is at:
https://review.openstack.org/#/c/500584/3

and the neutron features I'm looking at can be seen when they were
reviewed at: https://review.openstack.org/#/c/356530/22


So I have a bunch of questions, many are asking the same things: "is
networking_odl's pattern legacy or not?" and "is this retry decorator
buggy?"   There's overlap in these questions.  But "yes" / "no" to
each would still help me figure out that this is a giraffe and not a
duck :) ("does it have a long neck?" "does it quack?" etc)

1. since the retry_if_session_inactive decorator assumes functions
that receive a "context", what does it mean that networking_odl uses
functions that receive a "session" and not a "context" ?   Should
networking_odl be modified such that it accepts the same "context"
that Neutron passes around?

2. Alternatively, should the retry_if_session_inactive decorator be
modified (or a new decorator added) that provides the identical
behavior for functions that receive a "session" argument rather than a
"context" argument ?  (or should this be done anyway even if
networking_odl's pattern is legacy?)

3. I notice that the session.is_active flag is considered to be
meaningful.   Normally, this flag is of no use, as the enginefacade
pattern only provides for a Session that has had the begin() method
called.  However, it looks like in neutron_lib/context.py, the Context
class is providing for a default session
"db_api.get_writer_session()", which I would assume is how we get a
Session with is_active is false:

3a.  Is this a temporary measure to work around legacy patterns?
3b.  If 3a., is the pattern in use by networking_odl, receiving a
"session" and not "context", the specific legacy pattern in question?
3c.  if 3a and 3b, are we expecting all the various plugins to
start using "context" at some point ?  (and when they come to me with
breakage, I can push them in that direction?)

4. What is the bug we are fixing by using deepcopy() with the
arguments passed to the decorated function:

4a.  Is it strictly that simple mutable lists and dicts of simple
types (ints, strings, etc.) are preserved across retries, assuming the
receiving functions might be modifying them?

4b. Does this deepcopy() approach have any support for functions
that are being passed not just simple structures but ORM-mapped
objects?   Was this use case considered?

4c. Does Neutron's model base have any support for deepcopy()?  I
notice that the base classes in model_base do **not** seem to have a
__deepcopy__ present.  This would mean that this deepcopy() will
definitely break any ORM mapped objects because you need to use an
approach like what Nova uses for copy():
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L46
.  This would be why the developer's objects are all getting kicked
out of the session.   (this is the "is this a bug?" part)

4d. How frequent is the problem of mutable lists and dicts, and
should this complex and actually pretty expensive behavior be
optional, only for those functions that are known to receive mutable
structures and change them in place?



Re: [openstack-dev] [oslo][oslo.db][keystone] A POC of Keystone over CockroachDB

2017-09-06 Thread Michael Bayer
On Mon, Sep 4, 2017 at 12:06 PM, Ronan-Alexandre Cherrueau
 wrote:
> Hi folks,
>
> Recently in the Inria's Discovery initiative[1], we got in touch with
> CockroachLabs guys with an idea: make Keystone supports CockorachDB. So
> we give it a try and you can find a very first result on our GitHub[2].
> The GitHub project consists of a Vagrant file that spawns a VM with a
> CockroachDB database and then installs Keystone with Devstack using
> CockroachDB as backend.
>
> CockroachDB claims to support the PostgreSQL protocol. It also provides
> support for SQLAlchemy that mostly inherits from the PostgreSQL one. So,
> making Keystone working with CockroachDB should be easy peasy right?
> Almost! The rest of this email describes what we have done, to make it
> works.
>
>
> sqlalchemy-migrate doesn't work with CockroachDB
> ================================================
>
> Keystone uses a database migration tool named sqlalchemy-migrate[3].
> This tool is called during the deployment of Keystone to migrate the
> database from its initial versions to its actual version.
> Unfortunately, migration failed with the following (partial)
> stacktrace:
>
> ,
> | DEBUG migrate.versioning.repository [-] Config:
> OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'),
> ('repository_id', 'keystone'), ('version_table', 'migrate_version'),
> ('required_dbs', '[]'), ('use_timestamp_numbering', 'False')]))])
> __init__ 
> /usr/local/lib/python2.7/dist-packages/migrate/versioning/repository.py:83
> | INFO migrate.versioning.api [-] 66 -> 67...
> | CRITICAL keystone [-] KeyError: 'cockroachdb'
> | ...
> | TRACE keystone visitorcallable = get_engine_visitor(engine, 
> visitor_name)
> | TRACE keystone   File
> "/usr/local/lib/python2.7/dist-packages/migrate/changeset/databases/visitor.py",
> line 47, in get_engine_visitor
> | TRACE keystone return get_dialect_visitor(engine.dialect, name)
> | TRACE keystone   File
> "/usr/local/lib/python2.7/dist-packages/migrate/changeset/databases/visitor.py",
> line 62, in get_dialect_visitor
> | TRACE keystone migrate_dialect_cls = DIALECTS[sa_dialect_name]
> | TRACE keystone KeyError: 'cockroachdb'
> `
>
> As we understand it, sqlalchemy-migrate contains dedicated SQL backend
> codes and there is no such code for CockroachDB. As a workaround, we
> have performed a second OS deployment with PostgreSQL as backend and
> made a dump of the database after migrations. We then bypassed
> migrations in our first deployment by setting up CockroachDB with the
> dumped script. Note that we had to edit the pgdump script a little bit
> because some features are missing in CockroachDB.
>
> The workaround we have to go with represents an obstacle to test
> CockroachDB with other OpenStack core services. We would be grateful
> if some of you would help us with adding the support of CockroachDB in
> sqlalchemy-migrate.

Please don't pursue sqlalchemy-migrate, it is an ancient and
unmaintained project.   Keystone should be migrated to Alembic
migrations and this would be a much more useful way to spend the
efforts.  Alembic is fully extensible as well and I am highly
motivated to make it work as needed.


>
>
> oslo.db contains backend specific code for error handling
> =========================================================
>
> The `oslo.db' library contains one file with backend-specific codes
> (`oslo_db/sqlalchemy/exc_filters.py'[4]). This file aims at
> abstracting database exceptions that differ but target the same class
> of error (because each backend produces a specific exception with a
> specific error message). Intercepted exceptions are wrapped into a
> common OS exception to abstract backend errors in the rest of
> `oslo.db'. CockroachDB doesn't produce same errors than PostgreSQL, so
> we have to update this class. Note that our POC is not exhaustive
> since we only added error messages we saw during Tempest/Rally tests.
>
> You can look at the differences between OpenStack/oslo.db/stable/pike
> and our fork on GitHub[5]. We only add two lines!
>
>
> CockroachDB manages isolation without lock
> ==========================================
>
> CockroachDB lets you write transactions that respect ACID properties.
> Regarding the "I" (Isolation), CockroachDB offers snapshot and
> serializable isolation but doesn't rely on locks to do that. So every
> time there is concurrent editing transactions that end in a conflict,
> then you have to retry the transactions. Fortunately, `oslo.db'
> already offers a decorator[6] to do that automatically. But, based on
> tests we run with Tempest/Rally, we figured out that some Keystone's
> SQL requests needed the decorator.
>
> You can look at the differences between OpenStack/keystone/stable/pike
> and our fork on GitHub[7].
>
>
> What's next?
> ============
>
> You can drop yourself on the VM as a stack user and run Rally tests
> (see README). Tempest is disabled because we have an issue with its
> installation[8]. Note th

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-27 Thread Michael Bayer
proposed:

https://review.openstack.org/#/c/487902/

On Thu, Jul 27, 2017 at 9:46 AM, Michael Bayer  wrote:
> On Wed, Jul 26, 2017 at 8:06 PM, Jay Pipes  wrote:
>> Isn't that exactly what I'm proposing below? :)
>
> yes, I'm agreeing with you!



Re: [openstack-dev] [oslo][oslo.db] nominating Jay Pipes for oslo-db-core

2017-07-27 Thread Michael Bayer
Yeah if Jay has the time that's a big +1 from me!

On Thu, Jul 27, 2017 at 10:04 AM, Doug Hellmann  wrote:
> I have noticed that Jay has been very deeply involved in several
> recent design discussions about oslo.db, and he obviously has a
> great deal of experience in the area, so even though he hasn't been
> actively reviewing patches recently I think he would be a good
> addition to the team. I have asked him, and he is interested and will
> try to become more active as well.
>
> Please indicate your opinion with +1/-1.
>
> Doug
>



Re: [openstack-dev] [all] [oslo.db] [relational database users] heads up for a MariaDB issue that will affect most projects

2017-07-27 Thread Michael Bayer
On Thu, Jul 27, 2017 at 3:23 AM, ChangBo Guo  wrote:
> Thanks for the follow-up; maybe we need to document the issue and the
> workaround somewhere, perhaps in alembic?

So yes, there's a whole bunch implied by this:

1. SQLAlchemy knows how to reflect CHECK constraints.  Since MariaDB
supports these as of 10.2, SQLAlchemy itself would need to learn to
read these for MariaDB; on all MySQL/MariaDB versions right now there
is no logic to do this since CHECK constraints are not persisted.

The usual approach taken in Openstack migration scripts when they have
to navigate around constraints is to use inspector / reflection to
read things about constraints and indexes into the Python application,
make decisions, then emit statements as a result.This approach
would not be possible until SQLAlchemy implements this, or we add
helpers into oslo.db that do something similar.

2. Microsoft SQL Server has this same limitation, e.g. that you can't
drop a column that has a CHECK on it.   Over there, Alembic includes a
feature that generates SQL to detect the name of this constraint and
then drop the constraint of that name.  What is very useful about how
it is done in this case is that it is all within a server-side SQL
script that does the whole locate / drop without any round trips.   If
something similar could be implemented for MySQL/MariaDB, preferably
something that runs without error on all MySQL variants, it would be
just a simple bit of boilerplate that has to run before you drop a
column, and this is the kind of thing we could possibly bundle as a
"mysql_drop_check" flag for the drop_column() operation.

3. if the name of this CHECK constraint is known up front, it's much
easier to just emit "DROP CONSTRAINT xyz".   But it's not clear that
all openstack projects are sending out names for CHECK constraints in
particular the one that sneaks out when the Boolean datatype is used.
 Applying predictable names to the CHECK constraints retroactively of
course means you need to do something like #1 or #2 to find them.

4. *maybe* mariadb's CHECK constraint here already has a predictable
name.   I'd have to get a 10.2 instance running and see what it comes
up with.

>
> 2017-07-24 23:21 GMT+08:00 Michael Bayer :
>>
>> hey good news, the owner of the issue upstream found that the SQL
>> standard agrees with my proposed behavior.   So while this is current
>> MariaDB 10.2 / 10.3 behavior, hopefully it will be resolved in an
>> upcoming release within those series.   not sure of the timing though
>> so we may not be able to duck it.
>>
>> On Mon, Jul 24, 2017 at 11:16 AM, Michael Bayer  wrote:
>> > On Mon, Jul 24, 2017 at 10:37 AM, Doug Hellmann 
>> > wrote:
>> >> Excerpts from Michael Bayer's message of 2017-07-23 16:39:20 -0400:
>> >>> Hey list -
>> >>>
>> >>> It appears that MariaDB as of version 10.2 has made an enhancement
>> >>> that overall is great and fairly historic in the MySQL community,
>> >>> they've made CHECK constraints finally work.   For all of MySQL's
>> >>> existence, you could emit a CREATE TABLE statement that included CHECK
>> >>> constraint, but the CHECK phrase would be silently ignored; there are
>> >>> no actual CHECK constraints in MySQL.
>> >>>
>> >>> Mariadb 10.2 has now made CHECK do something!  However!  the bad news!
>> >>>  They have decided that the CHECK constraint against a single column
>> >>> should not be implicitly dropped if you drop the column [1].   In case
>> >>> you were under the impression your SQLAlchemy / oslo.db project
>> >>> doesn't use CHECK constraints, if you are using the SQLAlchemy Boolean
>> >>> type, or the "ENUM" type without using MySQL's native ENUM feature
>> >>> (less likely), there's a simple CHECK constraint in there.
>> >>>
>> >>> So far the Zun project has reported the first bug on Alembic [2] that
>> >>> they can't emit a DROP COLUMN for a boolean column.   In [1] I've
>> >>> made my complete argument for why this decision on the MariaDB side is
>> >>> misguided.   However, be on the lookout for boolean columns that can't
>> >>> be DROPPED on some environments using newer MariaDB.  Workarounds for
>> >>> now include:
>> >>>
>> >>> 1. when using Boolean(), set create_constraint=False
>> >>>
>> >>> 2. when using Boolean(), make sure it has a "name" to give the
>> >>> constraint, so that later you can DROP CONSTRAINT easily
>> >>>

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-27 Thread Michael Bayer
On Wed, Jul 26, 2017 at 8:06 PM, Jay Pipes  wrote:
> Isn't that exactly what I'm proposing below? :)

yes, I'm agreeing with you!



Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Jul 26, 2017 7:45 PM, "Jay Pipes"  wrote:

On 07/26/2017 07:06 PM, Octave J. Orgeron wrote:

> Hi Michael,
>
> On 7/26/2017 4:28 PM, Michael Bayer wrote:
>
>>
>> it at all.
>> thinking out loud
>>
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)
>>
>>
>> so if you don't have mysql_small_rowsize,  nothing happens.
>>
>>
> I think the mysql_small_rowsize is a bit misleading since in one case we
> are changing the size and the others the type. Perhaps:
>
> mysql_alt_size=64
> mysql_alt_type=sa.TINYTEXT
> mysql_alt_type=sa.TEXT
>
> alt standing for alternate. What do you think?
>

-1

I think it should be specific to NDB, since that's what the override is
for. I'd support something like:

 oslo_db.sqlalchemy.types.String(255, mysql_ndb_size=64)


OK, I give up on that fight, fine: mysql_ndb_xyz, but at least build it
into a nicely named type. I know I come off as crazy, changing my mind and
temporarily forgetting key details, but this is often how I internally come
up with things...




Octave, I understand due to the table row size limitations the desire to
reduce some column sizes for NDB. What I'm not entirely clear on is the
reason to change the column *type* specifically for NDB. There are
definitely cases where different databases have column types -- say,
PostgreSQL's INET column type -- that don't exist in other RDBMS. For those
cases, the standard approach in SQLAlchemy is to create a sqlalchemy
ColumnType concrete class that essentially translates the CREATE TABLE
statement (and type compilation/coercing) to specify the supported column
type in the RDBMS if it's supported otherwise defaults the column type to
something coerceable.

An example of this can be seen here for how this is done for IPv4 data in
the apiary project:

https://github.com/gmr/apiary/blob/master/apiary/types.py#L49

I'd certainly be open to doing things like this for NDB, but I'd first need
to understand why you chose to convert the column types for the columns
that you did. Any information you can provide about that would be great.

Best,
-jay




Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Jul 26, 2017 6:28 PM, "Michael Bayer"  wrote:

On Wed, Jul 26, 2017 at 6:19 PM, Michael Bayer  wrote:
> On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>>
>> There is a bigger problem with this entire series of changes, whether
>> or not the "ndb" keyword is present.  Which is that projects need to
>> add new columns, tables, and make datatype changes all the time, and
>> they will not have any idea about the requirements for ndb or even
>> that it exists, nor will anyone have access to this platform for
>> development nor should they be expected to worry about it.   If they
>> not only have to fill in dozens of special "ndb" or generic-but-needed
>> by ndb flags, and then if they even have to worry about the sum of all
>> the sizes in a row, that means the ndb implementation will be
>> continuously broken across many projects in every release unless ndb
>> developers are checking every database change in every project at all
>> times.   Is that level of effort part of the plan?
>
> OK, I apologize, you answered that here:
>
> https://review.openstack.org/#/c/427970/26
>
>
>> Now considering that my company is heavily invested in using MySQL
>> Cluster (NDB) and that we use the kolla framework, we have to keep an eye on
>> each of the services to make sure that it works. This is why you see lots
>> of other patches that I'm working on to fix services like Cinder, Neutron,
>> Nova, etc. So as time goes by, we will continue to make patches to enable
>> these services to work with NDB.
>
> If we were to approach this as real migrations that change the lengths
> of datatypes, that means the tables must be created at InnoDB first
> and migrated to NDB within the migrations.  Because NDB will disallow
> the initial creation, right?
>
> if the proposal is to modify the actual sizes in the original
> migration files, that's not something that can be done, unfortunately,
> it would be hugely risky because those migrations represent a snapshot
> of the actual schema.
>
> If we *do* need to keep doing something like the "AutoStringXYZ"
> approach I really want to change those names and not have any "ndb."
> in it at all.

thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.



Also, these flags in theory would *only* be in an old migration file.
Going forward, we'd hope that Oracle will be running its own CI and
ensuring new migrations don't go over the limits , right ?  New columns
would ideally not have conditional datatype rules at all.
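As a rough illustration of the "if you don't ask for it, nothing happens" idea, SQLAlchemy's existing with_variant() can already express a per-backend size override; SizedString and the mysql target below are illustrative stand-ins for whatever oslo.db would actually name:

```python
import sqlalchemy as sa


def SizedString(length, mysql_small=None):
    """String(length) on every backend; String(mysql_small) on MySQL only."""
    t = sa.String(length)
    if mysql_small is not None:
        t = t.with_variant(sa.String(mysql_small), "mysql")
    return t


# no mysql_small: plain VARCHAR(255) everywhere, nothing happens
plain = SizedString(255)

# with mysql_small: VARCHAR(255) on most backends, VARCHAR(64) on MySQL
small = SizedString(255, mysql_small=64)
```

A real implementation would presumably key the variant off an NDB-aware dialect rather than "mysql" at large, but the shape is the same.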












>
> But all the options here seem kind of icky.
>
>
>
>
>
>
>>
>>
>>
>>
>>
>>
>>
>>>
>>> I don't see a way of automating that and making it maintainable without a
>>> lot more overhead in code and people. If we really want to remove the
>>> complexity here, why don't we just change the sizes and types on these
>>> handful of table columns so that they fit within both InnoDB and NDB? That
>>> way we don't need these functions and the tables are exactly the same? That
>>> would only leave us with the createtable, savepoint/rollback, etc. stuff to
>>> address which is already taken care of in the ndb module in oslo.db? Then we
>>> just fix the foreign key stuff as I've been doing, since it has zero impact
>>> on InnoDB deployments and if anything ensures things are consistent. That
>>> would then leave us to really focus on fixing migrations to use oslo.db and
>>> pass the correct flags, which is a more lengthy process than the rest of
>>> this.
>>>
>>> I don't see the point in trying to make this stuff anymore complicated.
>>>
>>> Octave
>>>
>>>
>>> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>>>
>>>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>>>>>>
>>>>>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>>>>>> dbs, TINYTEXT for ndb
>>>>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>>>>> dbs, TEXT for ndb
>>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>>>>> VARCHAR(64) on ndb
>>>>>>
>>>>>> This way, we can override the String with TINYTEXT or TEXT or ch

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Wed, Jul 26, 2017 at 6:19 PM, Michael Bayer  wrote:
> On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>>
>> There is a bigger problem with this entire series of changes, whether
>> or not the "ndb" keyword is present.  Which is that projects need to
>> add new columns, tables, and make datatype changes all the time, and
>> they will not have any idea about the requirements for ndb or even
>> that it exists, nor will anyone have access to this platform for
>> development nor should they be expected to worry about it.   If they
>> not only have to fill in dozens of special "ndb" or generic-but-needed
>> by ndb flags, and then if they even have to worry about the sum of all
>> the sizes in a row, that means the ndb implementation will be
>> continuously broken across many projects in every release unless ndb
>> developers are checking every database change in every project at all
>> times.   Is that level of effort part of the plan?
>
> OK, I apologize, you answered that here:
>
> https://review.openstack.org/#/c/427970/26
>
>
>> Now considering that my company is heavily invested in using MySQL Cluster
>> (NDB) and that we use the kolla framework, we have to keep an eye on each of
>> the services to make sure that it works. This is why you see lots of other
>> patches that I'm working on to fix services like Cinder, Neutron, Nova, etc.
>> So as time goes by, we will continue to make patches to enable these
>> services to work with NDB.
>
> If we were to approach this as real migrations that change the lengths
> of datatypes, that means the tables must be created as InnoDB first
> and migrated to NDB within the migrations.  Because NDB will disallow
> the initial creation, right?
>
> if the proposal is to modify the actual sizes in the original
> migration files, that's not something that can be done, unfortunately,
> it would be hugely risky because those migrations represent a snapshot
> of the actual schema.
>
> If we *do* need to keep doing something like the "AutoStringXYZ"
> approach I really want to change those names and not have any "ndb."
> in it at all.

thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.
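The flag floated above could be sketched as a tiny factory, which makes the "nothing happens" default concrete. Hypothetical throughout: `mysql_small_rowsize` is only the name proposed in this thread, and `small_rowsize_enabled` stands in for whatever switch oslo.db would actually consult; this is not a released API.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql


def String(length, mysql_small_rowsize=None, small_rowsize_enabled=False):
    """Hypothetical oslo_db String wrapper sketched from the proposal.

    mysql_small_rowsize may be an int (shrink the VARCHAR) or a
    dialect type such as mysql.TINYTEXT / mysql.TEXT.  When the
    backend has no small-row constraint, nothing happens and a
    plain sa.String(length) comes back.
    """
    if not small_rowsize_enabled or mysql_small_rowsize is None:
        return sa.String(length)
    if isinstance(mysql_small_rowsize, int):
        return sa.String(mysql_small_rowsize)
    return mysql_small_rowsize()


# On an ordinary backend nothing changes:
plain = String(255, mysql_small_rowsize=64)    # VARCHAR(255)
# On a row-size-constrained backend the override kicks in:
small = String(255, mysql_small_rowsize=64,
               small_rowsize_enabled=True)     # VARCHAR(64)
```

A TINYTEXT/TEXT override works the same way: pass `mysql.TINYTEXT` instead of an integer.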










>
> But all the options here seem kind of icky.
>
>
>
>
>
>
>>
>>
>>
>>
>>
>>
>>
>>>
>>> I don't see a way of automating that and making it maintainable without a
>>> lot more overhead in code and people. If we really want to remove the
>>> complexity here, why don't we just change the sizes and types on these
>>> handful of table columns so that they fit within both InnoDB and NDB? That
>>> way we don't need these functions and the tables are exactly the same? That
>>> would only leave us with the createtable, savepoint/rollback, etc. stuff to
>>> address which is already taken care of in the ndb module in oslo.db? Then we
>>> just fix the foreign key stuff as I've been doing, since it has zero impact
>>> on InnoDB deployments and if anything ensures things are consistent. That
>>> would then leave us to really focus on fixing migrations to use oslo.db and
>>> pass the correct flags, which is a more lengthy process than the rest of
>>> this.
>>>
>>> I don't see the point in trying to make this stuff anymore complicated.
>>>
>>> Octave
>>>
>>>
>>> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>>>
>>>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>>>>>>
>>>>>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>>>>>> most
>>>>>> dbs, TINYTEXT for ndb
>>>>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>>>>> dbs, TEXT for ndb
>>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>>>>> VARCHAR(64) on ndb
>>>>>>
>>>>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>>>>> size for ndb.
>>>>>>>
>>>>>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>>>>>> TINYTEXT() on ndb
>>>>>>> oslo_db.sqlalchemy.String(255, ndb_s

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>
> There is a bigger problem with this entire series of changes, whether
> or not the "ndb" keyword is present.  Which is that projects need to
> add new columns, tables, and make datatype changes all the time, and
> they will not have any idea about the requirements for ndb or even
> that it exists, nor will anyone have access to this platform for
> development nor should they be expected to worry about it.   If they
> not only have to fill in dozens of special "ndb" or generic-but-needed
> by ndb flags, and then if they even have to worry about the sum of all
> the sizes in a row, that means the ndb implementation will be
> continuously broken across many projects in every release unless ndb
> developers are checking every database change in every project at all
> times.   Is that level of effort part of the plan?

OK, I apologize, you answered that here:

https://review.openstack.org/#/c/427970/26


> Now considering that my company is heavily invested in using MySQL Cluster
> (NDB) and that we use the kolla framework, we have to keep an eye on each of
> the services to make sure that it works. This is why you see lots of other
> patches that I'm working on to fix services like Cinder, Neutron, Nova, etc.
> So as time goes by, we will continue to make patches to enable these
> services to work with NDB.

If we were to approach this as real migrations that change the lengths
of datatypes, that means the tables must be created as InnoDB first
and migrated to NDB within the migrations.  Because NDB will disallow
the initial creation, right?

if the proposal is to modify the actual sizes in the original
migration files, that's not something that can be done, unfortunately,
it would be hugely risky because those migrations represent a snapshot
of the actual schema.

If we *do* need to keep doing something like the "AutoStringXYZ"
approach I really want to change those names and not have any "ndb."
in it at all.

But all the options here seem kind of icky.






>
>
>
>
>
>
>
>>
>> I don't see a way of automating that and making it maintainable without a
>> lot more overhead in code and people. If we really want to remove the
>> complexity here, why don't we just change the sizes and types on these
>> handful of table columns so that they fit within both InnoDB and NDB? That
>> way we don't need these functions and the tables are exactly the same? That
>> would only leave us with the createtable, savepoint/rollback, etc. stuff to
>> address which is already taken care of in the ndb module in oslo.db? Then we
>> just fix the foreign key stuff as I've been doing, since it has zero impact
>> on InnoDB deployments and if anything ensures things are consistent. That
>> would then leave us to really focus on fixing migrations to use oslo.db and
>> pass the correct flags, which is a more lengthy process than the rest of
>> this.
>>
>> I don't see the point in trying to make this stuff anymore complicated.
>>
>> Octave
>>
>>
>> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>>
>>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>>>>>
>>>>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>>>>> most
>>>>> dbs, TINYTEXT for ndb
>>>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>>>> dbs, TEXT for ndb
>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>>>> VARCHAR(64) on ndb
>>>>>
>>>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>>>> size for ndb.
>>>>>>
>>>>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>>>>> TINYTEXT() on ndb
>>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>>>>> most dbs, VARCHAR(64) on ndb
>>>>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>>>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs,
>>>>> TINYTEXT()
>>>>> on ndb
>>>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most
>>>>> dbs, VARCHAR(55) on ndb
>>>>>>
>>>>>> don't worry about implementation, can the above declaration ->
>>>>>> datatype mappin

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
I realize now that we are in fact going for a total "row size", when I
was under the impression that ndb had a simple limit of 64 characters
for a VARCHAR.

As I was going on the completely wrong assumptions, I'd like to
rethink the approach of datatypes.

I do think a real migration that simply reduces the sizes of selected
columns is the best approach in this case, and that the types like
AutoStringXYZ should go away completely.

To that end I've proposed reverting the one ndb patchset that has
merged which is the one in Cinder:

https://review.openstack.org/#/c/487603/

However, if Cinder declines to revert this, the "AutoXYZ" types in
oslo.db (which have also been released) will have to go through a
deprecation cycle.

Additionally, my concern that projects will not have any way to guard
against ever going over a 14K row size remains, and I still think that
checks need to be put in place in oslo.db that would sum the total row
size of any given table and raise an error if the limit is surpassed.
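Such a guard could be as small as summing the declared VARCHAR widths of a table. A rough sketch follows; the 14000-byte ceiling and the 3-bytes-per-character utf8 factor are assumptions here, and real NDB accounting also counts per-column overhead, indexes, and BLOB pointers:

```python
import sqlalchemy as sa

NDB_MAX_ROW_BYTES = 14000  # approximate row-size ceiling discussed above
BYTES_PER_CHAR = 3         # assuming a 3-byte utf8 charset


def check_row_size(table, limit=NDB_MAX_ROW_BYTES):
    """Estimate the fixed row size of a Table; raise if it exceeds limit."""
    total = 0
    for col in table.columns:
        if isinstance(col.type, sa.String) and col.type.length:
            total += col.type.length * BYTES_PER_CHAR
    if total > limit:
        raise ValueError("table %s: estimated row size %d > %d bytes"
                         % (table.name, total, limit))
    return total


meta = sa.MetaData()
volumes = sa.Table(
    "volumes", meta,
    sa.Column("display_name", sa.String(255)),
    sa.Column("os_type", sa.String(64)),
)
check_row_size(volumes)  # fits comfortably
```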




On Wed, Jul 26, 2017 at 5:40 PM, Octave J. Orgeron
 wrote:
> Hi Michael,
>
> Comments below..
>
> On 7/26/2017 1:08 PM, Michael Bayer wrote:
>
>
>
> On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
> wrote:
>
> Hi Michael,
>
> I understand that you want to abstract this completely away inside of
> oslo.db. However, the reality is that making column changes based purely on
> the size and type of that column, without understanding what that column is
> being used for is extremely dangerous. You could end up clobbering a column
> that needs a specific length for a value,
>
>
>
> Nowhere in my example is the current length truncated.   Also, if two
> distinct lengths truly must be maintained we add a field "minimum_length".
>
>
> prevent an index from working, etc.
>
>
> That's what the indexable flag would achieve.
>
> It wouldn't make sense to just do global changes on a column based on the
> size.
>
>
> This seems to be what your patches are doing, however.
>
>
> This is incorrect. I only change columns that meet my criteria for being
> changed. I'm not globally changing columns across every table and service.
> So to be clear and make sure we are on the same page..
>
> Are you proposing that we continue to select specific columns and adjust
> their size by using the below, instead of the ndb.Auto* functions?
>
> oslo_db.sqlalchemy.String(<length>, indexable=<bool>, ndb_size=<length>,
> ndb_type=<type>)
>
> i.e.
>
> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
> dbs, TINYTEXT for ndb
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
> dbs, TEXT for ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
> VARCHAR(64) on ndb
>
> So if I need to change a column that today says:
>
> sa.String(4096)
>
> I would modify it to:
>
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)
>
> OR
>
> Are you proposing that we change very single column across every single
> database blindly using some logic in oslo.db, where even if a column doesn't
> need to be changed, it gets changed based on the database engine type and
> the size of the column?
>
> So even if we have a table that doesn't need to be changed or touched, we
> would end up with:
>
> mysql_enable_ndb = True
>
> sa.String(255) -> TINYTEXT
>
> If that is the type of behavior you are aiming for, I don't think that
> makes sense.
>
>
>
>
>
> There are far more tables that fit in both InnoDB and NDB already than those
> that don't. As I've stated many times before, the columns that I make
> changes to are evaluated to understand:
>
> 1. What populates it?
> 2. Who consumes it?
> 3. What are the possible values and required lengths?
> 4. What is the impact of changing the size or type?
> 5. Evaluated against the other columns in the table, which one makes the
> most sense to adjust?
>
> I don't see a way of automating that and making it maintainable without a
> lot more overhead in code and people.
>
>
> My proposal is intended to *reduce* the great verbosity in the current
> patches I see and remove the burden of every project having to be aware of
> "ndb" every time a column is added.
>
>
> I agree with using as few arguments to the oslo.db.sqlalchemy.String
> function. But at the same time, if a column needs to be adjusted, someone
> has to put the right arguments there. As far as the burden goes, Oracle is
> already taking the ownership of making MySQL Cluster work across services,
> which means maintaining patches and creating new ones as 

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Tue, Jul 25, 2017 at 3:27 PM, Octave J. Orgeron
 wrote:
> 5. Evaluated against the other columns in the table, which one makes the
> most sense to adjust?


well, the above point is one I've been trying to get a straight answer
on for a long time.

"evaluated against other columns" suggests we cannot change the size
of a datatype in isolation; instead we are trying to constrain the
total length of the row.  Otherwise, the size of the "other" columns
should not matter.

Then, in fact yes I do see you aren't changing every size, in 216_havana I see:

Column('display_name', String(length=255))  -> no change
Column('display_description', String(length=255)), -> becomes TINYTEXT
Column('os_type', String(length=255)) -> becomes VARCHAR(64)

The "display_name" column will render VARCHAR(255).  Which means, ndb
can have a VARCHAR(255).  But in the case of os_type, you shrink it to
be VARCHAR(64) for ndb. Why?   What happens if it stays
VARCHAR(255) ?

There is a bigger problem with this entire series of changes, whether
or not the "ndb" keyword is present.  Which is that projects need to
add new columns, tables, and make datatype changes all the time, and
they will not have any idea about the requirements for ndb or even
that it exists, nor will anyone have access to this platform for
development nor should they be expected to worry about it.   If they
not only have to fill in dozens of special "ndb" or generic-but-needed
by ndb flags, and then if they even have to worry about the sum of all
the sizes in a row, that means the ndb implementation will be
continuously broken across many projects in every release unless ndb
developers are checking every database change in every project at all
times.   Is that level of effort part of the plan?







>
> I don't see a way of automating that and making it maintainable without a
> lot more overhead in code and people. If we really want to remove the
> complexity here, why don't we just change the sizes and types on these
> handful of table columns so that they fit within both InnoDB and NDB? That
> way we don't need these functions and the tables are exactly the same? That
> would only leave us with the createtable, savepoint/rollback, etc. stuff to
> address which is already taken care of in the ndb module in oslo.db? Then we
> just fix the foreign key stuff as I've been doing, since it has zero impact
> on InnoDB deployments and if anything ensures things are consistent. That
> would then leave us to really focus on fixing migrations to use oslo.db and
> pass the correct flags, which is a more lengthy process than the rest of
> this.
>
> I don't see the point in trying to make this stuff anymore complicated.
>
> Octave
>
>
> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>
>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>>>>
>>>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>>>> most
>>>> dbs, TINYTEXT for ndb
>>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>>> dbs, TEXT for ndb
>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>>> VARCHAR(64) on ndb
>>>>
>>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>>> size for ndb.
>>>>>
>>>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>>>> TINYTEXT() on ndb
>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>>>> most dbs, VARCHAR(64) on ndb
>>>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs,
>>>> TINYTEXT()
>>>> on ndb
>>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most
>>>> dbs, VARCHAR(55) on ndb
>>>>>
>>>>> don't worry about implementation, can the above declaration ->
>>>>> datatype mapping work ?
>>>>>
>>>>>
>>>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls
>>>> to
>>>> replace exceptionally long String columns (4096, 8192, and larger).
>>>
>>> MySQL supports large VARCHAR now, OK.   yeah this could be
>>> String(8192, ndb_type=TEXT) as well.
>>
>> OK, no, sorry each time I think of this I keep seeing the verbosity of
>> imports etc. in the code, because i

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Michael Bayer
On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
wrote:

Hi Michael,

I understand that you want to abstract this completely away inside of
oslo.db. However, the reality is that making column changes based purely on
the size and type of that column, without understanding what that column is
being used for is extremely dangerous. You could end up clobbering a column
that needs a specific length for a value,



Nowhere in my example is the current length truncated.   Also, if two
distinct lengths truly must be maintained we add a field "minimum_length".


prevent an index from working, etc.


That's what the indexable flag would achieve.

It wouldn't make sense to just do global changes on a column based on the
size.


This seems to be what your patches are doing, however.


There are far more tables that fit in both InnoDB and NDB already than
those that don't. As I've stated many times before, the columns that I make
changes to are evaluated to understand:

1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one makes the
most sense to adjust?

I don't see a way of automating that and making it maintainable without a
lot more overhead in code and people.


My proposal is intended to *reduce* the great verbosity in the current
patches I see and remove the burden of every project having to be aware of
"ndb" every time a column is added.


If we really want to remove the complexity here, why don't we just change the
sizes and types on these handful of table columns so that they fit within
both InnoDB and NDB?



Because that requires new migrations which are a great risk and
inconvenience to projects.

That way we don't need these functions and the tables are exactly the same?
That would only leave us with the createtable, savepoint/rollback, etc.
stuff to address which is already taken care of in the ndb module in
oslo.db? Then we just fix the foreign key stuff as I've been doing, since
it has zero impact on InnoDB deployments and if anything ensures things are
consistent. That would then leave us to really focus on fixing migrations
to use oslo.db and pass the correct flags, which is a more lengthy process
than the rest of this.

I don't see the point in trying to make this stuff anymore complicated.


The proposal is to make it simpler than it is right now.

Run through every column change you've proposed and show me which ones don't
fit into my proposed ruleset.   I will add additional declarative flags to
ensure those use cases are covered.





Octave


On 7/25/2017 12:20 PM, Michael Bayer wrote:

> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>>> dbs, TINYTEXT for ndb
>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>> dbs, TEXT for ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>> VARCHAR(64) on ndb
>>>
>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>> size for ndb.
>>>
>>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>>> TINYTEXT() on ndb
>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>>> most dbs, VARCHAR(64) on ndb
>>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs, TINYTEXT()
>>>> on ndb
>>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most
>>>> dbs, VARCHAR(55) on ndb
>>>>
>>>> don't worry about implementation, can the above declaration ->
>>>> datatype mapping work ?
>>>>
>>>>
>>>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls
>>> to
>>> replace exceptionally long String columns (4096, 8192, and larger).
>>>
>> MySQL supports large VARCHAR now, OK.   yeah this could be
>> String(8192, ndb_type=TEXT) as well.
>>
> OK, no, sorry each time I think of this I keep seeing the verbosity of
> imports etc. in the code, because if we had:
>
> String(80, ndb_type=TEXT)
>
> then we have to import both String and TEXT, and then what if there's
> ndb.TEXT, the code is still making an ndb-specific decision, etc.
>
> I still see that this can be mostly automated from a simple ruleset
> based on the size:
>
> length <= 64 :VARCHAR(length) on all backen

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-25 Thread Michael Bayer
On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>> dbs, TINYTEXT for ndb
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>> dbs, TEXT for ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>> VARCHAR(64) on ndb
>>
>> This way, we can override the String with TINYTEXT or TEXT or change the
>> size for ndb.
>
>>>
>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>> TINYTEXT() on ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>> most dbs, VARCHAR(64) on ndb
>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT()
>>> on ndb
>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
>>> dbs, VARCHAR(55) on ndb
>>>
>>> don't worry about implementation, can the above declaration ->
>>> datatype mapping work ?
>>>
>>>
>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls to
>> replace exceptionally long String columns (4096, 8192, and larger).
>
> MySQL supports large VARCHAR now, OK.   yeah this could be
> String(8192, ndb_type=TEXT) as well.

OK, no, sorry each time I think of this I keep seeing the verbosity of
imports etc. in the code, because if we had:

String(80, ndb_type=TEXT)

then we have to import both String and TEXT, and then what if there's
ndb.TEXT, the code is still making an ndb-specific decision, etc.

I still see that this can be mostly automated from a simple ruleset
based on the size:

length <= 64 :VARCHAR(length) on all backends
length > 64, length <= 255:   VARCHAR(length) for most backends,
TINYTEXT for ndb
length > 4096:  VARCHAR(length) for most backends, TEXT for ndb

the one case that seems outside of this is:

String(255)  where they have an index or key on the VARCHAR, and in
fact they only need < 64 characters to be indexed.  In that case you
don't want to use TINYTEXT, right?   So one exception:

oslo_db.sqlalchemy.types.String(255, indexable=True)

e.g. a declarative hint to the oslo_db backend to not use a LOB type.

then we just need oslo_db.sqlalchemy.types.String, and virtually
nothing except the import has to change, and a few keywords.

What we're trying to do in oslo_db is as much as possible state the
intent of a structure or datatype declaratively, and leave as much of
the implementation up to oslo_db itself.
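The ruleset sketched in this message (plus the `indexable` escape hatch) fits in a few lines. A hypothetical factory follows; `use_ndb` stands in for whatever oslo.db enable flag would actually be consulted, and the range between 255 and 4096 (unstated in the rules above) is resolved to TEXT here:

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql


def autosized_string(length, indexable=False, use_ndb=False):
    """Pick a type from the declared length alone, per the rules above."""
    if not use_ndb or length <= 64 or indexable:
        # indexable columns must stay VARCHAR: a LOB type would
        # break (or cripple) the index
        return sa.String(length)
    if length <= 255:
        return mysql.TINYTEXT()
    return mysql.TEXT()
```

Call sites then never mention ndb at all: `autosized_string(255)` stays VARCHAR(255) everywhere unless the backend flag flips it.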



Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-24 Thread Michael Bayer
On Mon, Jul 24, 2017 at 5:10 PM, Octave J. Orgeron
 wrote:
> I don't think it makes sense to make these global. We don't need to change
> all occurrences of String(255) to TinyText for example. We make that
> determination through understanding the table structure and usage. But I do
> like the idea of changing the second option to ndb_size=, I think that makes
> things very clear. If you want to collapse the use cases.. what about?:
>
> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
> dbs, TINYTEXT for ndb
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
> dbs, TEXT for ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
> VARCHAR(64) on ndb
>
> This way, we can override the String with TINYTEXT or TEXT or change the
> size for ndb.

OK.   See, originally when I was pushing for an ndb "dialect", that
hook lets us say String(255).with_variant(TEXT, "ndb") which is what I
was going for originally.  However, since we went with a special flag
and not a dialect, using ndb_type / ndb_size is *probably* fine.
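For reference, the `with_variant` form mentioned here is real SQLAlchemy API; only the "ndb" dialect name is hypothetical (no such dialect ships, it is the one this thread imagined registering):

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql

# VARCHAR(255) on every backend except a dialect named "ndb",
# where it would render as TEXT instead.
col = sa.Column(
    "display_description",
    sa.String(255).with_variant(mysql.TEXT(), "ndb"),
)

# A stock MySQL dialect is named "mysql", so the variant never fires:
rendered = col.type.compile(dialect=mysql.dialect())  # VARCHAR(255)
```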


>
>>
>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>> TINYTEXT() on ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>> most dbs, VARCHAR(64) on ndb
>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>> oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs, TINYTEXT()
>> on ndb
>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most
>> dbs, VARCHAR(55) on ndb
>>
>> don't worry about implementation, can the above declaration ->
>> datatype mapping work ?
>>
>> Also where are we using AutoStringText(), it sounds like this is just
>> what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
>> string type, comes out as CLOB etc).
>>
> In my patch for Neutron, you'll see a lot of the AutoStringText() calls to
> replace exceptionally long String columns (4096, 8192, and larger).

MySQL supports large VARCHAR now, OK.   yeah this could be
String(8192, ndb_type=TEXT) as well.


>
>
>
>
>>
>>
>>> In many cases, the use of these could be removed by simply changing the
>>> columns to more appropriate types and sizes. There is a tremendous amount
>>> of
>>> wasted space in many of the databases. I'm more than willing to help out
>>> with this if teams decide they would rather do that instead as the
>>> long-term
>>> solution. Until then, these functions enable the use of both with minimal
>>> impact.
>>>
>>> Another thing to keep in mind is that the only services that I've had to
>>> adjust column sizes for are:
>>>
>>> Cinder
>>> Neutron
>>> Nova
>>> Magnum
>>>
>>> The other services that I'm working on like Keystone, Barbican, Murano,
>>> Glance, etc. only need changes to:
>>>
>>> 1. Ensure that foreign keys are dropped and created in the correct order
>>> when changing things like indexes, constraints, etc. Many services do
>>> these
>>> proper steps already, there are just cases where this has been missed
>>> because InnoDB is very forgiving on this. But other databases are not.
>>> 2. Fixing the database migration and sync operations to use oslo.db, pass
>>> the right parameters, etc. Something that should have been done in the
>>> first
>>> place, but hasn't. So this more of a house cleaning step to insure that
>>> services are using oslo.db correctly.
>>>
>>> The only other oddball use case is dealing with disabling nested
>>> transactions,
>>> where Neutron is the only one that does this.
>>>
>>> On the flip side, here is a short list of services that I haven't had to
>>> make ANY changes for other than having oslo.db 4.24 or above:
>>>
>>> aodh
>>> gnocchi
>>> heat
>>> ironic
>>> manila
>>>
 3. it's not clear (I don't even know right now by looking at these
 reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
 For example in

 https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
 I see a list of String(255)'s changed to one type or the other without
 any clear notion why one would use one or the other.  Having names
 that define simply the declared nature of the type would be most
 appropriate.
>>>
>>>
>>> One has to look at what the column is being used for and decide what
>>> appropriate remediation steps are. This takes time and one must research
>>> what kind of data goes in the column, what puts it there, what consumes
>>> it,
>>> and what remediation would have the least amount of impact.
>>>
 I can add these names up to oslo.db and then we would just need to
 spread these out through all the open ndb reviews and then also patch
 up Cinder which seems to be the only ndb implementation that's been
 merged so far.

 Keep in mind this is really me trying to correct my own mistake, as I
 helped design and approved of the original approach here where
 

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-24 Thread Michael Bayer
On Mon, Jul 24, 2017 at 3:37 PM, Octave J. Orgeron
 wrote:
> For these, here is a brief synopsis:
>
> AutoStringTinyText, will convert a column to the TinyText type. This is used
> for cases where a 255 varchar string needs to be converted to a text blob to
> make the row fit within the NDB limits. If you are using ndb, it'll convert
> it to TinyText, otherwise it leaves it alone. The reason that TinyText type
> was chosen is because it'll hold the same 255 varchars and saves on space.
>
> AutoStringText, does the same as the above, but converts the type to Text
> and is meant for use cases where you need more than 255 varchar worth of
> space. Good examples of these uses are where outputs of hypervisor and OVS
> commands are dumped into the database.
>
> AutoStringSize, you pass two parameters, one being the non-NDB size and the
> second being the NDB size. The point here is where you need to reduce the
> size of the column to fit within the NDB limits, but you want to preserve
> the String varchar type because it might be used in a key, index, etc. I
> only use these in cases where the impacts are very low.. for example where a
> column is used for keeping track of status (up, down, active, inactive,
> etc.) that don't require 255 varchars.
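The three helpers described above might be sketched roughly like this. Illustrative only: the shipped oslo.db versions key off an enable-NDB configuration flag rather than taking a `use_ndb` parameter, and the real signatures may differ.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql


def AutoStringTinyText(length, use_ndb=False):
    # VARCHAR(length) normally; TINYTEXT under NDB -- still holds up
    # to 255 chars but with a smaller in-row footprint.
    return mysql.TINYTEXT() if use_ndb else sa.String(length)


def AutoStringText(length, use_ndb=False):
    # For columns needing well over 255 chars, e.g. dumps of
    # hypervisor or OVS command output.
    return mysql.TEXT() if use_ndb else sa.String(length)


def AutoStringSize(length, ndb_length, use_ndb=False):
    # Keep the VARCHAR type (it may back a key or index) but shrink
    # it under NDB to help the row fit within the size limit.
    return sa.String(ndb_length if use_ndb else length)
```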

Can the "auto" that is supplied by AutoStringTinyText and
AutoStringSize be merged?


oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
TINYTEXT() on ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
most dbs, VARCHAR(64) on ndb
oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs, TINYTEXT() on ndb
oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most
dbs, VARCHAR(55) on ndb

don't worry about implementation, can the above declaration ->
datatype mapping work ?

Also where are we using AutoStringText(), it sounds like this is just
what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
string type, comes out as CLOB etc).




>
> In many cases, the use of these could be removed by simply changing the
> columns to more appropriate types and sizes. There is a tremendous amount of
> wasted space in many of the databases. I'm more than willing to help out
> with this if teams decide they would rather do that instead as the long-term
> solution. Until then, these functions enable the use of both with minimal
> impact.
>
> Another thing to keep in mind is that the only services that I've had to
> adjust column sizes for are:
>
> Cinder
> Neutron
> Nova
> Magnum
>
> The other services that I'm working on like Keystone, Barbican, Murano,
> Glance, etc. only need changes to:
>
> 1. Ensure that foreign keys are dropped and created in the correct order
> when changing things like indexes, constraints, etc. Many services do these
> proper steps already, there are just cases where this has been missed
> because InnoDB is very forgiving on this. But other databases are not.
> 2. Fixing the database migration and sync operations to use oslo.db, pass
> the right parameters, etc. Something that should have been done in the first
> place, but hasn't. So this more of a house cleaning step to insure that
> services are using oslo.db correctly.
>
> The only other oddball use case is dealing with disabling nested transactions,
> where Neutron is the only one that does this.
>
> On the flip side, here is a short list of services that I haven't had to
> make ANY changes for other than having oslo.db 4.24 or above:
>
> aodh
> gnocchi
> heat
> ironic
> manila
>
>>
>> 3. it's not clear (I don't even know right now by looking at these
>> reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
>> For example in
>> https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
>> I see a list of String(255)'s changed to one type or the other without
>> any clear notion why one would use one or the other.  Having names
>> that define simply the declared nature of the type would be most
>> appropriate.
>
>
> One has to look at what the column is being used for and decide what
> appropriate remediation steps are. This takes time and one must research
> what kind of data goes in the column, what puts it there, what consumes it,
> and what remediation would have the least amount of impact.
>
>>
>> I can add these names up to oslo.db and then we would just need to
>> spread these out through all the open ndb reviews and then also patch
>> up Cinder which seems to be the only ndb implementation that's been
>> merged so far.
>>
>> Keep in mind this is really me trying to correct my own mistake, as I
>> helped design and approved of the original approach here where
>> projects would be consuming against the "ndb." namespace.  However,
>> after seeing it in reviews how prevalent the use of this extremely
>> backend-specific name is, I think the use of the name should be much

Re: [openstack-dev] [all] [oslo.db] [relational database users] heads up for a MariaDB issue that will affect most projects

2017-07-24 Thread Michael Bayer
hey good news, the owner of the issue upstream found that the SQL
standard agrees with my proposed behavior.   So while this is current
MariaDB 10.2 / 10.3 behavior, hopefully it will be resolved in an
upcoming release within those series.   not sure of the timing though
so we may not be able to duck it.

On Mon, Jul 24, 2017 at 11:16 AM, Michael Bayer  wrote:
> On Mon, Jul 24, 2017 at 10:37 AM, Doug Hellmann  wrote:
>> Excerpts from Michael Bayer's message of 2017-07-23 16:39:20 -0400:
>>> Hey list -
>>>
>>> It appears that MariaDB as of version 10.2 has made an enhancement
>>> that overall is great and fairly historic in the MySQL community,
>>> they've made CHECK constraints finally work.   For all of MySQL's
>>> existence, you could emit a CREATE TABLE statement that included CHECK
>>> constraint, but the CHECK phrase would be silently ignored; there are
>>> no actual CHECK constraints in MySQL.
>>>
>>> Mariadb 10.2 has now made CHECK do something!  However!  the bad news!
>>>  They have decided that the CHECK constraint against a single column
>>> should not be implicitly dropped if you drop the column [1].   In case
>>> you were under the impression your SQLAlchemy / oslo.db project
>>> doesn't use CHECK constraints, if you are using the SQLAlchemy Boolean
>>> type, or the "ENUM" type without using MySQL's native ENUM feature
>>> (less likely), there's a simple CHECK constraint in there.
>>>
>>> So far the Zun project has reported the first bug on Alembic [2] that
>>> they can't emit a DROP COLUMN for a boolean column.  In [1] I've
>>> made my complete argument for why this decision on the MariaDB side is
>>> misguided.   However, be on the lookout for boolean columns that can't
>>> be DROPPED on some environments using newer MariaDB.  Workarounds for
>>> now include:
>>>
>>> 1. when using Boolean(), set create_constraint=False
>>>
>>> 2. when using Boolean(), make sure it has a "name" to give the
>>> constraint, so that later you can DROP CONSTRAINT easily
>>>
>>> 3. if not doing #1 and #2, in order to drop the column you need to use
>>> the inspector (e.g. from sqlalchemy import inspect; inspector =
>>> inspect(engine)) and locate all the CHECK constraints involving the
>>> target column, and then drop them by name.
>>
>> Item 3 sounds like the description of a helper function we could add to
>> oslo.db for use in migration scripts.
>
> OK let me give a little bit more context, that if MariaDB holds steady
> here, I will have to implement #3 within Alembic itself (though yes,
> for SQLAlchemy-migrate, still needed :) ). MS SQL Server has the
> same limitation for CHECK constraints and Alembic provides for a
> SQL-only procedure that can run as a static SQL element on that
> backend; hopefully the same is possible for MySQL.
>
>
>
>>
>> Doug
>>
>>>
>>> [1] https://jira.mariadb.org/browse/MDEV-4
>>>
>>> [2] 
>>> https://bitbucket.org/zzzeek/alembic/issues/440/cannot-drop-boolean-column-in-mysql
>>>
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [all] [oslo.db] [relational database users] heads up for a MariaDB issue that will affect most projects

2017-07-24 Thread Michael Bayer
On Mon, Jul 24, 2017 at 10:37 AM, Doug Hellmann  wrote:
> Excerpts from Michael Bayer's message of 2017-07-23 16:39:20 -0400:
>> Hey list -
>>
>> It appears that MariaDB as of version 10.2 has made an enhancement
>> that overall is great and fairly historic in the MySQL community,
>> they've made CHECK constraints finally work.   For all of MySQL's
>> existence, you could emit a CREATE TABLE statement that included CHECK
>> constraint, but the CHECK phrase would be silently ignored; there are
>> no actual CHECK constraints in MySQL.
>>
>> Mariadb 10.2 has now made CHECK do something!  However!  the bad news!
>>  They have decided that the CHECK constraint against a single column
>> should not be implicitly dropped if you drop the column [1].   In case
>> you were under the impression your SQLAlchemy / oslo.db project
>> doesn't use CHECK constraints, if you are using the SQLAlchemy Boolean
>> type, or the "ENUM" type without using MySQL's native ENUM feature
>> (less likely), there's a simple CHECK constraint in there.
>>
>> So far the Zun project has reported the first bug on Alembic [2] that
>> they can't emit a DROP COLUMN for a boolean column.  In [1] I've
>> made my complete argument for why this decision on the MariaDB side is
>> misguided.   However, be on the lookout for boolean columns that can't
>> be DROPPED on some environments using newer MariaDB.  Workarounds for
>> now include:
>>
>> 1. when using Boolean(), set create_constraint=False
>>
>> 2. when using Boolean(), make sure it has a "name" to give the
>> constraint, so that later you can DROP CONSTRAINT easily
>>
>> 3. if not doing #1 and #2, in order to drop the column you need to use
>> the inspector (e.g. from sqlalchemy import inspect; inspector =
>> inspect(engine)) and locate all the CHECK constraints involving the
>> target column, and then drop them by name.
>
> Item 3 sounds like the description of a helper function we could add to
> oslo.db for use in migration scripts.

OK let me give a little bit more context, that if MariaDB holds steady
here, I will have to implement #3 within Alembic itself (though yes,
for SQLAlchemy-migrate, still needed :) ). MS SQL Server has the
same limitation for CHECK constraints and Alembic provides for a
SQL-only procedure that can run as a static SQL element on that
backend; hopefully the same is possible for MySQL.



>
> Doug
>
>>
>> [1] https://jira.mariadb.org/browse/MDEV-4
>>
>> [2] 
>> https://bitbucket.org/zzzeek/alembic/issues/440/cannot-drop-boolean-column-in-mysql
>>
>



Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-24 Thread Michael Bayer
On Mon, Jul 24, 2017 at 10:01 AM, Jay Pipes  wrote:

> I would much prefer to *add* a brand new schema migration that handles
> conversion of the entire InnoDB schema at a certain point to an
> NDB-compatible one *after* that point. That way, we isolate the NDB changes
> to one specific schema migration -- and can point users to that one specific
> migration in case bugs arise. This is the reason that every release we add a
> number of "placeholder" schema migration numbered files to handle situations
> such as these.
>
> I understand that Oracle wants to support older versions of OpenStack in
> their distribution and that's totally cool with me. But, the proper way IMHO
> to do this kind of thing is to take one of the placeholder migrations and
> use that as the NDB-conversion migration. I would posit that since Oracle
> will need to keep some not-insignificant amount of Python code in their
> distribution fork of Nova in order to bring in the oslo.db and Nova NDB
> support, that it will actually be *easier* for them to maintain a *separate*
> placeholder schema migration for all NDB conversion work instead of changing
> an existing schema migration with a new patch.

OK, if it is feasible for the MySQL engine to build out the whole
schema as InnoDB and then do a migrate that changes the storage engine
of all tables to NDB and then also changes all the datatypes, that can
work.   If you want to go that way, then fine.

However, I may be missing something but I'm not seeing the practical
difference.   This new "ndb" migration still goes into the source
tree, still gets invoked for all users, and if the "if ndb_enabled()"
flag is somehow broken, it breaks just as well if it's in a brand new
migration vs. if it's in an old migration.

Suppose "if ndb_enabled(engine)" is somehow broken.  Either it crashes
the migrations, or it runs inappropriately.

If the conditional is in a brand new migration file that's pushed out
in Queens, *everybody* runs it when they upgrade, as well as when they
do fresh installation, and they get the breakage.

if the conditional is in havana 216, *everybody* gets it when they do
a fresh installation, and they get the breakage.   Upgraders do not.

How is "new migration" better than "make old migration compatible"?

Again, fine by me if the other approach works, I'm just trying to see
where I'm being dense here.

Keep in mind that existing migrations *do* break and have to be fixed
- because while the migration files don't change, the databases they
talk to do.  The other thread I introduced about Mariadb 10.2 now
refusing to DROP columns that have a CHECK constraint is an example,
and will likely mean lots of old migration files across openstack
projects will need adjustments.








>
> All the best,
> -jay
>
>



Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-23 Thread Michael Bayer
On Sun, Jul 23, 2017 at 6:10 PM, Jay Pipes  wrote:
> Glad you brought this up, Mike. I was going to start a thread about this.
> Comments inline.
>
> On 07/23/2017 05:02 PM, Michael Bayer wrote:
> Well, besides that point (which I agree with), that is attempting to change
> an existing database schema migration, which is a no-no in my book ;)


OK this point has come up before and I always think there's a bit of
an over-broad kind of purism being applied (kind of like when someone
says, "never use global variables ever!" and I say, "OK so sys.modules
is out right ?" :)  ).

I agree with "never change a migration" to the extent that you should
never change the *SQL steps emitted for a database migration*.  That
is, if your database migration emitted "ALTER TABLE foobar foobar
foobar" on a certain target databse, that should never change.   No
problem there.

However, what we're doing here is adding new datatype logic for the
NDB backend which are necessarily different; not to mention that NDB
requires more manipulation of constraints to make certain changes
happen.  To make all that work, the *Python code that emits the SQL
for the migration* needs to have changes made, mostly (I would say
completely) in the form of new conditionals for NDB-specific concepts.
   In the case of the datatypes, the migrations will need to refer to
a SQLAlchemy type object that's been injected with the ndb-specific
logic when the NDB backend is present; I've made sure that when the
NDB backend is *not* present, the datatypes behave exactly the same as
before.

So basically, *SQL steps do not change*, but *Python code that emits
the SQL steps* necessarily has to change to accommodate the case where the
"ndb" flag is present - this because these migrations have to run on
brand new ndb installations in order to create the database.   If Nova
and others did the initial "create database" without using the
migration files and instead used a create_all(), things would be
different, but that's not how things work (and also it is fine that
the migrations are used to build up the DB).

There is also the option to override the compilation for the base
SQLAlchemy String type so that no change at all would be needed
to consuming projects in this area, but it seems like there is a need
to specify ndb-specific length arguments in some cases so keeping the
oslo_db-level API seems like it would be best.  (Note that the ndb
module in oslo_db *does* instrument the CreateTable construct globally
however, though it is very careful not to be involved unless the ndb
flag is present).




>
>> I can add these names up to oslo.db and then we would just need to
>> spread these out through all the open ndb reviews and then also patch
>> up Cinder which seems to be the only ndb implementation that's been
>> merged so far.
>
>
> +1
>
>> Keep in mind this is really me trying to correct my own mistake, as I
>> helped design and approved of the original approach here where
>> projects would be consuming against the "ndb." namespace.  However,
>> after seeing it in reviews how prevalent the use of this extremely
>> backend-specific name is, I think the use of the name should be much
>> less frequent throughout projects and only surrounding logic that is
>> purely to do with the ndb backend and no others.   At the datatype
>> level, the chance of future naming conflicts is very high and we
>> should fix this mistake (my mistake) before it gets committed
>> throughout many downstream projects.
>
>
> I had a private conversation with Octave on Friday. I had mentioned that I
> was upset I didn't know about the series of patches to oslo.db that added
> that module. I would certainly have argued against that approach. Please
> consider hitting me with a cluestick next time something of this nature pops
> up. :)
>
> Also, as I told Octave, I have no problem whatsoever with NDB Cluster. I
> actually think it's a pretty brilliant piece of engineering -- and have for
> over a decade since I worked at MySQL.
>
> My complaint regarding the code patch proposed to Nova was around the
> hard-coding of the ndb namespace into the model definitions.
>
> Best,
> -jay
>
>>
>> [1] https://review.openstack.org/#/c/427970/
>>
>> [2] https://review.openstack.org/#/c/446643/
>>
>> [3] https://review.openstack.org/#/c/446136/
>>

[openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-23 Thread Michael Bayer
I've been working with Octave Oregon in assisting with new rules and
datatypes that would allow projects to support the NDB storage engine
with MySQL.

To that end, we've made changes to oslo.db in [1] to support this, and
there are now a bunch of proposals such as [2] [3] to implement new
ndb-specific structures in projects.

The reviews for all downstream projects except Cinder are still under
review. While we have a chance to avoid a future naming problem, I am
making the following proposal:

Rather than having all the projects make use of
oslo_db.sqlalchemy.ndb.AutoStringTinyText / AutoStringSize, we add new
generic types to oslo.db :

oslo_db.sqlalchemy.types.SmallString
oslo_db.sqlalchemy.types.String

(or similar)

Internally, the ndb module would be mapping its implementation for
AutoStringTinyText and AutoStringSize to these types.   Functionality
would be identical, just the naming convention exported to downstream
consuming projects would no longer refer to "ndb." for
datatypes.

Reasons for doing so include:

1. openstack projects should be relying upon oslo.db to make the best
decisions for any given database backend, hardcoding as few
database-specific details as possible.   While it's unavoidable that
migration files will have some "if ndb:" kinds of blocks, for the
datatypes themselves, the "ndb." namespace defeats extensibility.  if
IBM wanted Openstack to run on DB2 (again?) and wanted to add a
"db2.String" implementation to oslo.db for example, the naming and
datatypes would need to be opened up as above in any case;  might as
well make the change now before the patch sets are merged.

2. The names "AutoStringTinyText" and "AutoStringSize" themselves are
confusing and inconsistent w/ each other (e.g. what is "auto"?  one is
"auto" if it's String or TinyText and the other is "auto" if it's
String, and..."size"?)

3. it's not clear (I don't even know right now by looking at these
reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
For example in 
https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
I see a list of String(255)'s changed to one type or the other without
any clear notion why one would use one or the other.  Having names
that define simply the declared nature of the type would be most
appropriate.

I can add these names up to oslo.db and then we would just need to
spread these out through all the open ndb reviews and then also patch
up Cinder which seems to be the only ndb implementation that's been
merged so far.

Keep in mind this is really me trying to correct my own mistake, as I
helped design and approved of the original approach here where
projects would be consuming against the "ndb." namespace.  However,
after seeing it in reviews how prevalent the use of this extremely
backend-specific name is, I think the use of the name should be much
less frequent throughout projects and only surrounding logic that is
purely to do with the ndb backend and no others.   At the datatype
level, the chance of future naming conflicts is very high and we
should fix this mistake (my mistake) before it gets committed
throughout many downstream projects.


[1] https://review.openstack.org/#/c/427970/

[2] https://review.openstack.org/#/c/446643/

[3] https://review.openstack.org/#/c/446136/



[openstack-dev] [all] [oslo.db] [relational database users] heads up for a MariaDB issue that will affect most projects

2017-07-23 Thread Michael Bayer
Hey list -

It appears that MariaDB as of version 10.2 has made an enhancement
that overall is great and fairly historic in the MySQL community,
they've made CHECK constraints finally work.   For all of MySQL's
existence, you could emit a CREATE TABLE statement that included CHECK
constraint, but the CHECK phrase would be silently ignored; there are
no actual CHECK constraints in MySQL.

Mariadb 10.2 has now made CHECK do something!  However!  the bad news!
 They have decided that the CHECK constraint against a single column
should not be implicitly dropped if you drop the column [1].   In case
you were under the impression your SQLAlchemy / oslo.db project
doesn't use CHECK constraints, if you are using the SQLAlchemy Boolean
type, or the "ENUM" type without using MySQL's native ENUM feature
(less likely), there's a simple CHECK constraint in there.

So far the Zun project has reported the first bug on Alembic [2] that
they can't emit a DROP COLUMN for a boolean column.  In [1] I've
made my complete argument for why this decision on the MariaDB side is
misguided.   However, be on the lookout for boolean columns that can't
be DROPPED on some environments using newer MariaDB.  Workarounds for
now include:

1. when using Boolean(), set create_constraint=False

2. when using Boolean(), make sure it has a "name" to give the
constraint, so that later you can DROP CONSTRAINT easily

3. if not doing #1 and #2, in order to drop the column you need to use
the inspector (e.g. from sqlalchemy import inspect; inspector =
inspect(engine)) and locate all the CHECK constraints involving the
target column, and then drop them by name.

[1] https://jira.mariadb.org/browse/MDEV-4

[2] 
https://bitbucket.org/zzzeek/alembic/issues/440/cannot-drop-boolean-column-in-mysql
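Workaround #3 amounts to emitting a DROP CONSTRAINT for each discovered CHECK constraint before the DROP COLUMN. A sketch of just the statement sequence; the table and constraint names here are hypothetical, and the discovery step (via SQLAlchemy's inspector, as described above) is left out:

```python
def drop_column_sql(table, column, check_constraints):
    """Build the statement sequence needed on MariaDB 10.2+ to drop a
    column that is referenced by single-column CHECK constraints.

    check_constraints is the list of constraint names involving the
    column, as would be found via the inspector; discovering them is
    out of scope for this sketch.
    """
    stmts = ["ALTER TABLE %s DROP CONSTRAINT %s" % (table, name)
             for name in check_constraints]
    # only once the CHECKs are gone can the column itself be dropped
    stmts.append("ALTER TABLE %s DROP COLUMN %s" % (table, column))
    return stmts
```

Workarounds #1 and #2 avoid the problem entirely by either not creating the constraint or naming it up front so the DROP CONSTRAINT is trivial.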



Re: [openstack-dev] [oslo.config] how to deprecate a name but still have it as conf.

2017-07-18 Thread Michael Bayer
On Tue, Jul 18, 2017 at 1:02 PM, Doug Hellmann  wrote:

> Option renaming was originally meant as an operatior-facing feature
> to handle renames for values coming from the config file, but not
> as they are used in code.  mtreinish added
> https://review.openstack.org/#/c/357987/ to address this for Tempest,
> so it's possible there's a bug in the logic in oslo.config somewhere
> (or that oslo.db's case is a new one).

OK, patch set 5 at
https://review.openstack.org/#/c/334182/5/oslo_db/options.py shows
what I'm trying to do to make this work, however the test case added
in test_options still fails.   If this is supposed to "just work" then
I hope someone can confirm that.

Alternatively, a simple flag in DeprecatedOpt  "alias_on_conf=True"
would be super easy here so that specific names in our DeprecatedOpt
could be mirrored because we know projects are consuming them on conf.


>
> That said, the options defined by a library are *NOT* part of its
> API, and should never be used by code outside of the library. The
> whole point of isolating options like that is to give operators a
> way to change the way an app uses a library (drivers, credentials,
> etc.) without the app having to know the details.  Ideally the nova
> tests that access oslo.db configuration options directly would
> instead use an API in oslo.db to do the same thing (that API may
> need to be written, if it doesn't already exist).

OK, that is I suppose an option, but clearly a long and arduous one at
this point (add new API to oslo.db, find all projects looking at
conf., submit gerrits, somehow make sure projects never talk to
conf. directly?   how would we ensure that?  shouldn't
oslo.config allow the library that defines the options to plug in its
own "private" namespace so that consuming projects don't make this
mistake?)



>
> At that point, only oslo.db code would refer to the option, and you
> could use the deprecated_name and deprecated_group settings to
> describe the move and change the references to oslo.db within the
> library using a single patch to oslo.db.
>
> Doug
>



[openstack-dev] [oslo.config] how to deprecate a name but still have it as conf.

2017-07-18 Thread Michael Bayer
In oslo.db, I'd like to rename the option "idle_timeout" to
"connection_recycle_time".

Following the pattern of using DeprecatedOpt, we get this:

cfg.IntOpt('connection_recycle_time',
   default=3600,
   deprecated_opts=[cfg.DeprecatedOpt('idle_timeout',
  group="DATABASE"),
cfg.DeprecatedOpt('idle_timeout',
  group="database"),
cfg.DeprecatedOpt('sql_idle_timeout',
  group='DEFAULT'),
cfg.DeprecatedOpt('sql_idle_timeout',
  group='DATABASE'),
cfg.DeprecatedOpt('idle_timeout',
  group='sql')],


However, Nova is accessing "conf.idle_timeout" directly in
nova/db/sqlalchemy/api.py -> _get_db_conf.  Tempest run fails.

Per the oslo.config documentation, the "deprecated_name" flag would
create an alias on the conf. namespace.  However, this has no effect,
even if I remove the other deprecated parameters completely:

cfg.IntOpt('connection_recycle_time',
   default=3600,
   deprecated_name='idle_timeout',

a simple unit test fails to see a value for
conf.connection_recycle_time, including if I add
"deprecated_group='DATABASE'" which is the group that's in this
specific test (however this would not be a solution anyway because
projects use different group names).

From this, it would appear that oslo.config has made it impossible to
deprecate the name of an option because DeprecatedOpt() has no means
of providing the value as an alias on the conf. object.   There's not
even a way I could have projects like nova make a forwards-compatible
change here.
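For what it's worth, the closest thing to a forwards-compatible change a consumer could make today is guarding its own attribute access; this is a stopgap sketch, not an oslo API, and `conf_group` stands in for something like conf.database:

```python
def connection_recycle_time(conf_group):
    # Prefer the new option name, fall back to the deprecated one.
    # Direct access to a library's options is itself discouraged, so
    # treat this purely as an illustration of the compatibility gap.
    value = getattr(conf_group, "connection_recycle_time", None)
    if value is None:
        value = getattr(conf_group, "idle_timeout", None)
    return value
```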

Is this a bug in oslo.config or in oslo.db's usage of oslo.config?



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Michael Bayer
On Thursday, September 1, 2016, Jeremy Stanley  wrote:

>
> I don't read that at all as suggesting "the problem is solved, go
> away" but rather "help us make it better for everyone, don't just
> take one project off in a new direction and leave the others
> behind."


I can clarify.  I don't work directly on glance or keystone, I do oslo.db,
sqlalchemy, and alembic development.   If it's decided that the approach is
"no special technique, just query more columns and tables in your data
access layer and straddle across API versions", that does not indicate any
new patterns or tools in Oslo or further up, hence "solved" in that the
techniques are already available.  If OTOH we are getting into triggers or
this idea I have to do Python level translation events at the write side,
that indicates the need for new library features and patterns.

I've been tasked with being ready to assist Nova and Neutron with online
migrations for over a year.   Other than helping Neutron get
expand/contract going, I've not been involved at all, and not with anything
related to data migrations.   There hasn't been any need.



> --
> Jeremy Stanley
>


Re: [openstack-dev] [nova] Distributed Database

2016-04-24 Thread Michael Bayer
On Sunday, April 24, 2016, Ed Leafe  wrote:

> On Apr 23, 2016, at 11:33 PM, Mike Bayer >
> wrote:
> >
> > Facebook and LinkedIn have built distributed database systems based on
> MySQL at profoundly massive scales. Openstack's problem I'm going to guess
> isn't as hard as that.
>
> That's always the problem with citing implementations, isn't it? Netflix
> and Apple have built amazing systems with Cassandra, but that doesn't mean
> it's right for any other implementation. What I find more interesting is
> what it took to make any particular technology work with the load placed
> upon it.
>
>

I'm only seeking to counter what appears to be the premise of your blog
post, which is that distributed and SQL are mutually exclusive.   When you
say, "why don't we just query the database?", you can make distributed SQL
look like that to the application as well, but it might not bring any
advantages.



> -- Ed
>


Re: [openstack-dev] [all][oslo] Dealing with database connection sharing issues

2015-02-22 Thread Michael Bayer




> On Feb 22, 2015, at 10:20 AM, Yuriy Taraday  wrote:
> 
> 
> 
>> On Sun Feb 22 2015 at 6:27:16 AM Michael Bayer  wrote:
>> 
>> 
>> 
>> > On Feb 21, 2015, at 9:49 PM, Joshua Harlow  wrote:
>> >
>> > Some comments/questions inline...
>> >
>> > Mike Bayer wrote:
>> >>
>> >> Yuriy Taraday  wrote:
>> >>
>> >>>> On Fri Feb 20 2015 at 9:14:30 PM Joshua Harlow  
>> >>>> wrote:
>> >>>> This feels like something we could do in the service manager base class,
>> >>>> maybe by adding a "post fork" hook or something.
>> >>> +1 to that.
>> >>>
>> >>> I think it'd be nice to have the service __init__() maybe be something 
>> >>> like:
>> >>>
>> >>>   def __init__(self, threads=1000, prefork_callbacks=None,
>> >>>postfork_callbacks=None):
>> >>>  self.postfork_callbacks = postfork_callbacks or []
>> >>>  self.prefork_callbacks = prefork_callbacks or []
>> >>>  # always ensure we are closing any left-open fds last...
>> >>>  self.prefork_callbacks.append(self._close_descriptors)
>> >>>  ...
>> >>>
>> >>> (you must've meant postfork_callbacks.append)
>> >>>
>> >>> Note that multiprocessing module already have 
>> >>> `multiprocessing.util.register_after_fork` method that allows to 
>> >>> register callback that will be called every time a Process object is 
>> >>> run. If we remove explicit use of `os.fork` in oslo.service (replace it 
>> >>> with Process class) we'll be able to specify any after-fork callbacks in 
>> >>> libraries that they need.
>> >>> For example, EngineFacade could register `pool.dispose()` callback there 
>> >>> (it should have some proper finalization logic though).
>> >>
>> >> +1 to use Process and the callback system for required initialization 
>> >> steps
>> >> and so forth, however I don’t know that an oslo lib should silently 
>> >> register
>> >> global events on the assumption of how its constructs are to be used.
>> >>
>> >> I think whatever Oslo library is responsible for initiating the 
>> >> Process/fork
>> >> should be where it ensures that resources from other Oslo libraries are 
>> >> set
>> >> up correctly. So oslo.service might register its own event handler with
>> >
>> > Sounds like some kind of new entrypoint + discovery service that 
>> > oslo.service (eck can we name it something else, something that makes it 
>> > usable for others on pypi...) would need to plug in to. It would seem 
>> > like this is a general python problem (who is to say that only oslo 
>> > libraries use resources that need to be fixed/closed after forking); are 
>> > there any recommendations that the python community has in general for 
>> > this (aka, a common entrypoint *all* libraries export that allows them to 
>> > do things when a fork is about to occur)?
>> >
>> >> oslo.db such that it gets notified of new database engines so that it can
>> >> associate a disposal with it; it would do something similar for
>> >> oslo.messaging and other systems that use file handles.   The end
>> >> result might be that it uses register_after_fork(), but the point is that
>> >> oslo.db.sqlalchemy.create_engine doesn’t do this; it lets oslo.service
>> >> apply a hook so that oslo.service can do it on behalf of oslo.db.
>> >
>> > Sounds sort of like global state/an 'open resource' pool that each library 
>> > needs to maintain internally to track how applications/other 
>> > libraries are using it; that feels sorta odd IMHO.
>> >
>> > Wouldn't that mean libraries that provide back resource objects, or 
>> > resource-containing objects..., for others to use would now need to 
>> > capture who is using what (weakref pools?) to track which resources 
>> > are being used and by whom (so that they can fix/close them on 
>> > fork); not every library has a pool (like sqlalchemy afaik does) to track 
>> > these kind(s) of things (for better or worse...). And what if those 
>> > libraries use other libraries that use resources (who owns what?); seems 
>> > like this just gets very messy/

Re: [openstack-dev] [all][oslo] Dealing with database connection sharing issues

2015-02-21 Thread Michael Bayer



> On Feb 21, 2015, at 9:49 PM, Joshua Harlow  wrote:
> 
> Some comments/questions inline...
> 
> Mike Bayer wrote:
>> 
>> Yuriy Taraday  wrote:
>> 
 On Fri Feb 20 2015 at 9:14:30 PM Joshua Harlow  
 wrote:
 This feels like something we could do in the service manager base class,
 maybe by adding a "post fork" hook or something.
>>> +1 to that.
>>> 
>>> I think it'd be nice to have the service __init__() maybe be something like:
>>> 
>>>   def __init__(self, threads=1000, prefork_callbacks=None,
>>>postfork_callbacks=None):
>>>  self.postfork_callbacks = postfork_callbacks or []
>>>  self.prefork_callbacks = prefork_callbacks or []
>>>  # always ensure we are closing any left-open fds last...
>>>  self.prefork_callbacks.append(self._close_descriptors)
>>>  ...
>>> 
>>> (you must've meant postfork_callbacks.append)
>>> 
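A runnable rendering of the `__init__()` sketch quoted above may help; this is purely illustrative (the `Service` class, `threads` parameter, and `_close_descriptors` stand-in are placeholders, not real oslo.service code), with the append fixed to `postfork_callbacks` as noted:

```python
class Service:
    """Illustrative service base class with pre/post-fork hook lists.

    Not real oslo.service code; a minimal sketch of the pattern
    discussed above, with the descriptor cleanup registered as a
    *post*-fork callback that runs after library-specific hooks.
    """

    def __init__(self, threads=1000, prefork_callbacks=None,
                 postfork_callbacks=None):
        self.prefork_callbacks = list(prefork_callbacks or [])
        self.postfork_callbacks = list(postfork_callbacks or [])
        # Always close any left-open fds last, after other hooks ran.
        self.postfork_callbacks.append(self._close_descriptors)
        self.descriptors_closed = False

    def _close_descriptors(self):
        # Stand-in for closing inherited file descriptors.
        self.descriptors_closed = True

    def run_postfork(self):
        # What the service would invoke in each newly forked child.
        for callback in self.postfork_callbacks:
            callback()


calls = []
svc = Service(postfork_callbacks=[lambda: calls.append("db-dispose")])
svc.run_postfork()
print(calls, svc.descriptors_closed)   # ['db-dispose'] True
```

Note the ordering guarantee this gives: library hooks (e.g. a database pool disposal) run before the catch-all descriptor cleanup.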
>>> Note that the multiprocessing module already has a 
>>> `multiprocessing.util.register_after_fork` method that allows registering a 
>>> callback that will be called every time a Process object is run. If we 
>>> remove the explicit use of `os.fork` in oslo.service (replacing it with the 
>>> Process class), we'll be able to specify any after-fork callbacks that 
>>> libraries need.
>>> For example, EngineFacade could register `pool.dispose()` callback there 
>>> (it should have some proper finalization logic though).
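As a hedged illustration of the mechanism Yuriy describes: `register_after_fork(obj, func)` arranges for `func(obj)` to run in each child started via `multiprocessing` (with the fork start method). The `FakePool` below is a stand-in for a real connection pool (it is not an oslo.db or SQLAlchemy object):

```python
import multiprocessing
import multiprocessing.util


class FakePool:
    """Stand-in for a DB connection pool such as the one EngineFacade wraps."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        # A real pool would close inherited sockets here, so the child
        # reconnects lazily instead of sharing the parent's descriptors.
        self.disposed = True


def make_pool():
    pool = FakePool()
    # func(obj) is invoked during child bootstrap, before the target runs.
    multiprocessing.util.register_after_fork(pool, FakePool.dispose)
    return pool


def child_body(pool, queue):
    # By this point the after-fork hook has already fired in the child.
    queue.put(pool.disposed)


ctx = multiprocessing.get_context("fork")  # hooks only survive a fork, not spawn
pool = make_pool()
queue = ctx.Queue()
proc = ctx.Process(target=child_body, args=(pool, queue))
proc.start()
proc.join()
disposed_in_child = queue.get()
print("disposed in child:", disposed_in_child)   # True
print("disposed in parent:", pool.disposed)      # False
```

The parent's copy of the pool is untouched; only forked children see the disposal, which is exactly the behavior wanted for inherited DB connections.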
>> 
>> +1 to use Process and the callback system for required initialization steps
>> and so forth, however I don’t know that an oslo lib should silently register
>> global events on the assumption of how its constructs are to be used.
>> 
>> I think whatever Oslo library is responsible for initiating the Process/fork
>> should be where it ensures that resources from other Oslo libraries are set
>> up correctly. So oslo.service might register its own event handler with
> 
> Sounds like some kind of new entrypoint + discovery service that oslo.service 
> (eck can we name it something else, something that makes it usable for 
> others on pypi...) would need to plug in to. It would seem like this is a 
> general python problem (who is to say that only oslo libraries use resources 
> that need to be fixed/closed after forking); are there any recommendations 
> that the python community has in general for this (aka, a common entrypoint 
> *all* libraries export that allows them to do things when a fork is about to 
> occur)?
> 
>> oslo.db such that it gets notified of new database engines so that it can
>> associate a disposal with it; it would do something similar for
>> oslo.messaging and other systems that use file handles.   The end
>> result might be that it uses register_after_fork(), but the point is that
>> oslo.db.sqlalchemy.create_engine doesn’t do this; it lets oslo.service
>> apply a hook so that oslo.service can do it on behalf of oslo.db.
> 
> Sounds sort of like global state/an 'open resource' pool that each library 
> needs to maintain internally to track how applications/other 
> libraries are using it; that feels sorta odd IMHO.
> 
> Wouldn't that mean libraries that provide back resource objects, or 
> resource-containing objects..., for others to use would now need to capture 
> who is using what (weakref pools?) to track which resources are being used 
> and by whom (so that they can fix/close them on fork); not every library has 
> a pool (like sqlalchemy afaik does) to track these kind(s) of things (for 
> better or worse...). And what if those libraries use other libraries that use 
> resources (who owns what?); seems like this just gets very messy/impractical 
> pretty quickly once you start using any kind of 3rd party library that 
> doesn't follow the same pattern... (which brings me back to the question of 
> isn't there a common python way/entrypoint that deals with forks and works 
> better than ^).
> 
>> 
>> So, instead of oslo.service cutting through and closing out the file
>> descriptors from underneath other oslo libraries that opened them, we set up
>> communication channels between oslo libs that maintain a consistent layer of
>> abstraction, and instead of making all libraries responsible for the side
>> effects that might be introduced from other oslo libraries, we make the
>> side-effect-causing library the point at which those effects are
>> ameliorated as a service to other oslo libraries.   This allows us to keep
>> the knowledge of what it means to use “multiprocessing” in one
>> place, rather than spreading out its effects.
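One way to read the proposal above is as a small hook registry owned by the fork-driving library, which other libraries subscribe to. The sketch below is purely illustrative (`ForkHooks`, `create_engine`, and the dict-based "engine" are invented names for this example, not oslo or SQLAlchemy APIs):

```python
class ForkHooks:
    """Owned by the library that forks (the oslo.service role): other
    libraries register handlers and never touch fork logic themselves."""

    def __init__(self):
        self._postfork = []

    def on_postfork(self, func):
        self._postfork.append(func)

    def run_postfork(self):
        # Invoked by the fork-driving library in each child process.
        for func in self._postfork:
            func()


hooks = ForkHooks()


def create_engine(url):
    """The oslo.db role: register disposal for each engine it hands out,
    without knowing when (or whether) the application forks."""
    engine = {"url": url, "pool": ["conn-1", "conn-2"]}
    hooks.on_postfork(lambda: engine["pool"].clear())
    return engine


engine = create_engine("mysql://example")
hooks.run_postfork()        # simulate what would happen in a child
print(engine["pool"])       # []
```

The design point is that knowledge of "multiprocessing" lives in one place: the resource-owning library only declares *what* to clean up, and the fork-driving library decides *when*.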
> 
> If only we didn't have all those other libraries[1] that people use to (that 
> afaik highly likely also have resources they open); so even with getting 
> oslo.db and oslo.messaging into this kind of pattern, we are still left with 
> the other 200+ that aren't/haven't been following this pattern ;-)

I'm only trying to solve well known points like this one between two Oslo 
libraries.   Obviously trying to multiply out this pattern times all libra