Re: [openstack-dev] Online Migrations.

Mike Bayer Mon, 15 Jun 2015 12:26:46 -0700


On 6/15/15 2:21 PM, Dan Smith wrote:

Tying this to the releases is less desirable from my perspective. It
means that landing a thing requires more than six months of developer
and reviewer context. We have that right now, and we get along, but it's
much harder to plan, execute, and clean up those sorts of longer-lived
changes. It also means that CDers have to wait for the contract to be
landed well after they should have been able to clean up their database,
and may imply that people _have_ to do a contract at some point,
depending on how it's exposed.

The goal for this was to separate the three phases. Tying one of them to
the releases kinda hampers the utility of it to some degree, IMHO.
Making it declarative (even when part of what is declared are the
condition(s) upon which a particular contraction can proceed) is much
more desirable to me.

all of these things are true.

but i don't see how this part of things is going to be solved unless you otherwise do something like #1, but maybe not as complicated as that.


Here's the deal.  If I write a program, that says this:


class MyThing(Model):
    __tablename__ = 'thing'
    x = Column()
    y = Column()

then I say:

print(session.query(MyThing))


it's going to run "SELECT x, y FROM thing"
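
For reference, a runnable version of the example above might look like the sketch below. The column types, the primary key on "x" and the use of declarative_base() (imported from sqlalchemy.orm, which assumes SQLAlchemy 1.4 or later) are assumptions added so that the mapper will configure; the original snippet leaves them out. Note that the ORM renders table-qualified, labeled columns rather than the abbreviated SQL quoted above.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MyThing(Base):
    __tablename__ = "thing"
    # Types and primary key are assumptions; the mapper needs a primary key.
    x = Column(Integer, primary_key=True)
    y = Column(String(50))


# No engine is bound; str() on a Query compiles it with the default dialect.
session = Session()
sql = str(session.query(MyThing))
print(sql)
```

Both "thing.x" and "thing.y" appear in the rendered SELECT, which is the behavior the rest of this message is trying to change.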

if you want MyThing to have "y" there, but the program runs in some kind of mode that doesn't include "y" anymore, you can do something like this:



class MyThing(Model):
    __tablename__ = 'thing'
    x = Column()

    if we_have_column('thing', 'y'):
        y = Column()

note that the above is totally pseudocode. If you want it to be like "y = RemovedColumn()", there is probably a way to make it work that way also, e.g. that there's this declared "y = something()" in your model, but the MyThing model does not actually get a "y" in it, and even that "y" is written to some other collection like MyThing.columns_we_have_removed (again, also pseudocode).
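
One way the conditional-column pseudocode could be made concrete is sketched below. Everything beyond the names in the pseudocode is an assumption: DROPPED_COLUMNS is a hypothetical registry that would have to be populated before the model module is imported, we_have_column() is a hypothetical helper that consults it, and the column types and primary key are filler. It relies only on the fact that a class body is ordinary Python, so an "if" inside it decides which attributes declarative sees.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

# Hypothetical registry of (table, column) pairs that have been contracted
# away. How it gets filled in is exactly the open question in this thread.
DROPPED_COLUMNS = {("thing", "y")}


def we_have_column(table, column):
    """Hypothetical helper: True if the column should still be mapped."""
    return (table, column) not in DROPPED_COLUMNS


Base = declarative_base()


class MyThing(Base):
    __tablename__ = "thing"
    x = Column(Integer, primary_key=True)

    # Evaluated once, when the class body runs at import time.
    if we_have_column("thing", "y"):
        y = Column(String(50))


print(MyThing.__table__.c.keys())
```

Because the decision is made at import time, the mapped class never has a "y" attribute in this run, so queries, lazy loads and INSERT/UPDATE statements all leave the column out without further work.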

Alternatively, you can have MyThing with .x and .y and then try to mess around with your Query() objects so that they skip "y" when this condition occurs, which at the basic level looks like:


session.query(MyThing).options(defer('y'))

With this approach, you'd probably want to use a new API I've added in 1.0 that allows for on-query-construction events which can add these deferral rules. Hacking this into model_query() is going to be more difficult / hardcoded and also isn't going to accommodate things like lazy loads, joins, eager loads, etc. In any case, to do this correctly for intercepted queries is doable but might be difficult and error prone in some cases, as it has to search for all entities in the query, aliased, joined, subqueried, etc. that might be referring to "thing.y". Also something has to be worked out for the persistence side; it needs to be excluded from INSERT statements and even UPDATE statements if some logic is setting a value for it. Or you could build up some SQL execution events using the SQLAlchemy event API to just scrub these columns out when the SQL is emitted, but then we have to parse and rewrite SQL.
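
The query-side approach can be sketched roughly as follows, on the assumption that the 1.0 API referred to above is the Query "before_compile" event. REMOVED is a hypothetical registry and skip_removed_columns() a hypothetical listener; the column types and primary key are filler, and the imports assume SQLAlchemy 1.4 or later. The listener only handles the simplest case, a plain mapped class in the query, and does nothing for aliases, joins, eager loads or the persistence side described above.

```python
from sqlalchemy import Column, Integer, String, event
from sqlalchemy.orm import Query, Session, declarative_base, defer

Base = declarative_base()


class MyThing(Base):
    __tablename__ = "thing"
    x = Column(Integer, primary_key=True)
    y = Column(String(50))


# Hypothetical registry: mapped class -> attribute names to leave out.
REMOVED = {MyThing: ["y"]}

session = Session()

# Explicit, per-query deferral: "y" is left out of the SELECT.
explicit_sql = str(session.query(MyThing).options(defer(MyThing.y)))


def skip_removed_columns(query):
    """Add defer() options for removed columns on plain mapped entities."""
    for desc in query.column_descriptions:
        for name in REMOVED.get(desc["type"], ()):
            query = query.options(defer(getattr(desc["type"], name)))
    return query


# retval=True tells SQLAlchemy to use the Query returned by the listener.
event.listen(Query, "before_compile", skip_removed_columns, retval=True)

# The caller no longer has to mention defer() at all.
intercepted_sql = str(session.query(MyThing))
print(intercepted_sql)
```

The listener is registered on the Query class, so it applies to every Query in the process; that is what makes it broader than patching model_query(), and also why getting it right for every kind of entity is the hard part.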

But either way, you can have all of that. But what is not clear here is, when is that decision made, that we no longer have "y"?


Is it made:

1. at runtime? e.g. your nova service is running, it's doing "SELECT x, y FROM thing", then some magic thing happens somewhere and the app suddenly sees, hey "y" is gone! change all queries to "SELECT x FROM thing". What would this magic thing be? Are you going to run a reflection of the table schema on every query (you definitely aren't). So I don't know that this is possible.

2. at application start time? e.g. nova service starts up, something happens before "MyThing" is first declared where MyThing knows that "y" is no longer there for this run (or something that will impact all the queries and persistence operations, less desirable).

#2 is much more possible. But still, how does it run? How do we know that "y" is there on one run, and is not there on another? do we:

2a. When the app starts up, we run reflection queries against the DB (e.g. what autogenerate / OSM does, looking in schema catalogs). This is doable, but can get expensive on startup if we really have lots of columns/tables to worry about; it also means that either the changes to the queries here happen totally at query time (intricate, difficult-ish), or they happen at model definition time (simple, easy), which means the app needs to be connected to the database before it imports the models, and this is the complete opposite of how Nova's api.py is constructed right now. Plus the feature needs to accommodate Cells, where there's a totally different database happening (maybe this has to be query time for that reason alone).

2b. In a config file somewhere? Some kind of directive that says, "hey, we have now dropped 'thing.y'". What would that look like?

2c. Based on some kind of version number in the database? Not too much different from #2a.
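
To make options 2a and 2b a bit more concrete, here is a standard-library-only sketch of how a startup step might work out which columns are gone. All names are hypothetical: the [dropped_columns] section, its "table = col1, col2" layout, and both helper functions are invented for illustration, and the reflection half uses SQLite's PRAGMA table_info purely as a stand-in for whatever schema catalog the real backend exposes.

```python
import configparser
import sqlite3


def dropped_from_config(text):
    """2b: read a hypothetical [dropped_columns] section.

    Each key is a table name, each value a comma-separated column list.
    Note that configparser lowercases keys by default.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    dropped = set()
    if parser.has_section("dropped_columns"):
        for table, columns in parser.items("dropped_columns"):
            for column in columns.split(","):
                if column.strip():
                    dropped.add((table, column.strip()))
    return dropped


def dropped_from_reflection(conn, expected):
    """2a: compare the columns the models expect with the live schema."""
    dropped = set()
    for table, columns in expected.items():
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt, pk).
        # Table names come from the application's own models, not user input.
        rows = conn.execute("PRAGMA table_info(%s)" % table)
        present = {row[1] for row in rows}
        dropped.update((table, c) for c in columns if c not in present)
    return dropped


# A database where "thing.y" has already been contracted away.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (x INTEGER PRIMARY KEY)")

EXPECTED = {"thing": ["x", "y"]}
CONFIG = "[dropped_columns]\nthing = y\n"

print(dropped_from_reflection(conn, EXPECTED))
print(dropped_from_config(CONFIG))
```

Either function produces the same set of (table, column) pairs; the difference is that the config variant needs no database connection before the models are imported, while the reflection variant does.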


That said, I still think we should get the original thing merged. Even
if we did contractions purely with the manual migrations for the
foreseeable future, that'd be something we could deal with.

--Dan

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


