So I've continued thinking about this a bit, mainly because trying to keep 
the database schema and object model in sync is effectively a kind of 
"double-entry" system, and I have been burned *so. many. times.* in the 
past by (N>1)-entry systems where the different "truths" drift out of 
sync, leading to subtle but week-ruining bugs. In the abstract, my goal is 
to have a single source of truth and to have that truth 'flow' through the 
system. (In my case, I'm trying to use the alembic version scripts, but I 
really don't care what form the truth takes as long as there's only one.)

Michael said: 

> However, where it gets thorny is that neither Alembic migrations nor 
> SQLAlchemy metadata are supersets of each other


My approach here has been to "change" this by leveraging the ability to 
specify arbitrary, ancillary data in the `info` property of `SchemaItem` 
to store any additional information necessary to re-create the models 
(i.e. making alembic migrations a superset of the SQLAlchemy metadata 
*that I need, for my specific purposes*). Then, once I've captured that 
metadata, I push it up into the database (in my case, I'm using Postgres's 
COMMENT feature, but in other DBs it could just be an arbitrary table), 
where it can be read by a separate build-time tool that generates my 
models.py file from the database. Rough sketches of both halves follow 
below. Ideally, there would be a way to cut out the database entirely, so 
that you could just run the alembic scripts and get out the appropriate 
metadata, but going through the database is an acceptable detour for me 
(especially now that I've wrapped up the necessary fixtures to spool up an 
ephemeral, local Postgres installation). Considering that you could 
conceivably even ship pickled Python object graphs to this kind of 
"sidecar" storage, I suspect there probably *is* enough flexibility to 
capture all of the SQLAlchemy metadata, if you carried this to its logical 
conclusion.
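
To make that concrete, here's a minimal sketch of the "write" half, i.e. 
roughly the body of one of my version scripts (revision identifiers 
omitted). The table and the payload keys ("py_default", "doc") are 
invented for illustration; the actual APIs are `Column(..., info=...)`, 
`op.create_table()`, and Postgres's COMMENT ON:

```python
import json

import sqlalchemy as sa
from alembic import op


def upgrade():
    # Ancillary, model-level facts that plain DDL can't express; the
    # keys here are hypothetical -- use whatever your generator needs.
    extra = {"py_default": "uuid.uuid4", "doc": "surrogate key"}

    op.create_table(
        "account",
        sa.Column("id", sa.String(36), primary_key=True, info=extra),
        sa.Column("email", sa.String(255), nullable=False),
    )

    # Mirror the payload into a column COMMENT so it lives in the
    # database itself. json.dumps emits no single quotes for this
    # payload; a real tool should escape/quote properly.
    op.execute("COMMENT ON COLUMN account.id IS '%s'" % json.dumps(extra))


def downgrade():
    op.drop_table("account")
```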
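
And a sketch of the build-time half: spool up a throwaway Postgres (here 
via the third-party testing.postgresql package, standing in for my actual 
fixtures), play the migrations forward, and read the payloads back out of 
the column comments. The models.py generation itself is elided:

```python
import json

import sqlalchemy as sa
import testing.postgresql
from alembic import command
from alembic.config import Config

with testing.postgresql.Postgresql() as pg:
    # Point alembic at the ephemeral database and run all migrations.
    cfg = Config("alembic.ini")
    cfg.set_main_option("sqlalchemy.url", pg.url())
    command.upgrade(cfg, "head")

    engine = sa.create_engine(pg.url())
    with engine.connect() as conn:
        # Pull each column's comment back out of the catalog.
        rows = conn.execute(sa.text(
            "SELECT a.attname, col_description(a.attrelid, a.attnum) "
            "FROM pg_attribute a "
            "WHERE a.attrelid = 'account'::regclass "
            "AND a.attnum > 0 AND NOT a.attisdropped"
        ))
        for name, comment in rows:
            extra = json.loads(comment) if comment else {}
            # ...hand (name, extra) to whatever emits models.py
            print(name, extra)
```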

I realize this approach is probably too "restrictive" to be useful in the 
general case, but I figured I'd share my thoughts and hacks anyway. 
Conceptually, I think the best thing, long term, would be for alembic to be 
able to handle both DB schema migration and object model migration, and to 
serve as a single source of truth for systems willing to operate completely 
within alembic's purview. Based on your comments about mine being an 
unusual workflow, I assume many folks won't want to work this way, but for 
those who consider single sources of truth to be critical, I think it could 
be a win.

If anyone is interested in talking more about this, let me know.

Regards,
Ian



On Monday, August 3, 2015 at 9:36:19 AM UTC-4, Michael Bayer wrote:
>
> On 8/1/15 6:59 PM, Ian McCullough wrote:
>
> I've been getting up to speed with SQLAlchemy and alembic and having a 
> great time of it. This is some great stuff! 
>
> One thing that's been confounding me is this: My Alembic schema revisions 
> are 'authoritative' for my metadata (i.e. I've started from scratch using 
> alembic to build the schema from nothing), yet it doesn't appear that the 
> metadata that exists in my alembic scripts can be leveraged by my models in 
> my main app. So far, I've been maintaining effectively two versions of the 
> metadata, one in the form of the "flattened projection" of my alembic 
> schema rev scripts, and another in my application models scripts. I 
> understand that there are some facilities to auto-re-generate the metadata 
> from the RDBMS on the application side, but that seems potentially "lossy", 
> or at least subject to the whims of whatever DBAPI provider I'm using.
>
> Is there a way to pull this flattened projection of metadata out of 
> alembic and into my app's models at runtime? (i.e. import alembic, read the 
> version from the live DB, then build the metadata by playing the upgrade 
> scripts forward, not against the database, but against a metadata 
> instance?) It seems like a fool's errand to try to keep my app models in 
> sync with the flattened projection of the schema revisions by hand. My 
> assumption is that I'm missing something super-obvious here.
>
>
> There's a lot to say on this issue.    The idea of the migrations 
> themselves driving the metadata would be nice, and I think that the recent 
> rewrite of django south does something somewhat analogous to this.
>
> Also, the reorganization of Alembic operations into objects that you can 
> hang any number of operations upon (due for Alembic 0.8) is something 
> that we'd leverage to make this kind of thing happen.
>
> However, where it gets thorny is that neither Alembic migrations nor 
> SQLAlchemy metadata are supersets of each other.     That is, there are 
> many things in SQLAlchemy metadata that currently have no formal 
> representation in Alembic operations; the primary example is Python-side 
> default operations on columns, which have no relevance to emitting ALTER 
> statements.    On the Alembic side, a set of migrations that takes care to 
> only use the official Alembic op.* operations, and also does not use 
> "execute()" for any of them, is the only way to guarantee that each change 
> is potentially representable in SQLAlchemy metadata.    A migration that 
> emits op.execute("ALTER TABLE foo ADD COLUMN xyz") wouldn't work here, and 
> a migration that has lots of conditionals and runtime logic might also not 
> be useful in this way.
>
> SQLAlchemy Table and Column objects also do not support removal from their 
> parents.   This would be necessary in order to represent "drop" mutations 
> as targeted at a SQLAlchemy metadata structure.  This is something that 
> could be implemented but SQLA has always made a point to not get into this 
> because it's very complicated to handle "cascades" of dependent objects, 
> whether that means raising an error or mimicking other functionality of a 
> real "drop" operation.
>
> Finally, Alembic up till now has been organized for the opposite 
> workflow: the MetaData is the authoritative model, and migrations are 
> generated using tools like autogenerate to minimize how much they need 
> to be coded by hand (and there is of course no issue of maintaining the 
> same code in two places, because migration scripts are a fixed point in 
> time once created).    This model is practical for many reasons: all of 
> the above, plus it is compatible with applications that weren't using 
> migrations up to that point or were using some other system, plus it 
> allows easy pruning of old migrations.
>
> Thanks,
> Ian
