Re: Stray changes detected only sometimes with revision --autogenerate

2020-09-30 Thread Daniel Krebs

Hi Mike,

I adapted our setup to the new cookbook recipe and it basically fixed 
both problems! I don't need to maintain my
patches anymore (thanks to setting 
`connection.dialect.default_schema_name`, this did the trick!) and the 
detected stray changes are also gone. I did some investigation with 
increased log level as you suggested and came to the conclusion that it 
must have been a bad idea to manipulate the `conn_table_names` object 
in-place. But since that's gone now, everything is working as expected \o/


Thank you so much for your help!

Cheers,
Daniel

On 29.09.20 19:04, Mike Bayer wrote:



Hi, so I added a quick recipe to the site just now just so that the "set 
search path" idea is documented to some extent, that is at 
https://alembic.sqlalchemy.org/en/latest/cookbook.html#rudimental-schema-level-multi-tenancy-for-postgresql-databases 
.


Re: autogenerate, if you have many schemas that all contain an exact 
copy of the same schema, then autogenerate implies you'd be running this 
exactly once for just one of the schemas, or a "master" schema, such 
that you generate the .py files that will then be applicable to all the 
schemas.  The recipe above discusses this, and in particular it works in 
a completely schema agnostic fashion, you would set 
include_schemas=False in the config and allow the PostgreSQL search_path 
to handle all movement between schemas.


then for the actual migration "upgrade" run, you would run those files 
against each schema in sequence, again making use of search_path in 
order to select each schema.


as far as how alembic is looking at current tables you would need to 
turn on SQL logging, using the "debug" level, such as in alembic.ini (if 
you're using that):


[logger_sqlalchemy]
level = DEBUG
handlers =
qualname = sqlalchemy.engine



this will emit a lot of queries and result sets.  You'd have to capture 
that logging and then grep through for the "nonexistent" schema - it 
will be present in a result set that Alembic is capturing, most 
fundamentally the one it uses to get all the schema names.





On Tue, Sep 29, 2020, at 4:01 PM, Daniel Krebs wrote:

Hi Mike,

thanks a bunch for the quick response!

>> This is supported by SQLAlchemy more robustly using the
>> schema_translate_map feature, see
>> 
https://docs.sqlalchemy.org/en/13/core/connections.html#schema-translating 


>> .   this feature allows schema names to render explicitly without the
>> need to manipulate search_path.
>
> this feature may not work completely with Alembic however, I'm
> evaluating that now.
>
> still doesnt shed any light on your sporadic problem, however.

It's been some time already since I've implemented this, so can't recall
the exact reason why I went down that road but I know for sure that I
was trying to make it work with schema_translate_map but didn't succeed.
I would have very much preferred a solution without patching alembic it
seemed to my last resort at the time. I'd be more than happy to find a
solution here :)


> However, when using that approach you'd normally be running
autogenerate from just one schema, since you are looking to generate
migrations just once and then run them on every tenant.
>
> so it's not clear why autogenerate is being run on every tenant
explicitly - you'd have just one "model" schema that's the one where you
actually run autogenerate upon.   there would be no need to consider
other schemas and include_schemas would be set to False.

Indeed that sounds like a good idea. So you'd suggest to keep one schema
just for autogeneration purposes and then apply the migrations to all
customer schemas? Is that possible from within alembic or would you wrap
some tooling around alembic in order to apply migrations one-by-one to
every schema?


> As far as being able to filter out schemas across many, the current
approach is the include_object hook, which is insufficient for many
schemas as it does not block the reflection of all the tables.   a new
hook include_name is being added in the coming weeks that allows for
pre-emptive inclusion or exclusion of specific schema names.

I tried the include_object hook back then, but I was facing reflection
issues if I remember correctly. We're using TimescaleDB [1] in the same
database which adds some schemas of its own that alembic doesn't seem to
be able to handle.


> I don't have much insight onto this other than looking at concurrent
activities on the database.   "sometimes" usually indicates a race
condition of some sort, and the "autogenerate" process is strictly one
of reading data.

Indeed I'm suspecting the same thing. However, I'm not the most
knowledgable DBA nor do I have an understanding of how alembic actually
compares the current state of the DB with the SA models. Could you maybe
provide a more concrete point what and where to look for?


Than

Re: Stray changes detected only sometimes with revision --autogenerate

2020-09-29 Thread Daniel Krebs

Hi Mike,

thanks a bunch for the quick response!

This is supported by SQLAlchemy more robustly using the 
schema_translate_map feature, see 
https://docs.sqlalchemy.org/en/13/core/connections.html#schema-translating 
.   this feature allows schema names to render explicitly without the 
need to manipulate search_path.


this feature may not work completely with Alembic however, I'm 
evaluating that now.


still doesnt shed any light on your sporadic problem, however.


It's been some time already since I've implemented this, so can't recall 
the exact reason why I went down that road but I know for sure that I 
was trying to make it work with schema_translate_map but didn't succeed. 
I would have very much preferred a solution without patching alembic it 
seemed to my last resort at the time. I'd be more than happy to find a 
solution here :)



> However, when using that approach you'd normally be running 
autogenerate from just one schema, since you are looking to generate 
migrations just once and then run them on every tenant.

>
> so it's not clear why autogenerate is being run on every tenant 
explicitly - you'd have just one "model" schema that's the one where you 
actually run autogenerate upon.   there would be no need to consider 
other schemas and include_schemas would be set to False.


Indeed that sounds like a good idea. So you'd suggest to keep one schema 
just for autogeneration purposes and then apply the migrations to all 
customer schemas? Is that possible from within alembic or would you wrap 
some tooling around alembic in order to apply migrations one-by-one to 
every schema?



> As far as being able to filter out schemas across many, the current 
approach is the include_object hook, which is insufficient for many 
schemas as it does not block the reflection of all the tables.   a new 
hook include_name is being added in the coming weeks that allows for 
pre-emptive inclusion or exclusion of specific schema names.


I tried the include_object hook back then, but I was facing reflection 
issues if I remember correctly. We're using TimescaleDB [1] in the same 
database which adds some schemas of its own that alembic doesn't seem to 
be able to handle.



> I don't have much insight onto this other than looking at concurrent 
activities on the database.   "sometimes" usually indicates a race 
condition of some sort, and the "autogenerate" process is strictly one 
of reading data.


Indeed I'm suspecting the same thing. However, I'm not the most 
knowledgable DBA nor do I have an understanding of how alembic actually 
compares the current state of the DB with the SA models. Could you maybe 
provide a more concrete point what and where to look for?



Thank you for taking the time to help!


Cheers,
Daniel

[1] https://www.timescale.com/

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/fc2bc5e4-274b-a9e7-0566-3b745c4e5fc5%40enlyze.com.


Stray changes detected only sometimes with revision --autogenerate

2020-09-29 Thread Daniel Krebs
Hi,

we're having rather strange problems with Alembic 1.4.2 and Postgres 12, 
detecting stray changes *sometimes* but also sometimes not. I already dug 
through the code but I increasingly get the feel that this is rooted 
somewhere in the interaction between alembic and Postgres.

But let me explain our setup first. We maintain a set of SQLAlchemy model 
definitions that we evolve over time and use alembic to migrate the 
database accordingly. For every customer, we add a new Postgres schema to 
the database identified by a corresponding UUID (e.g. 
a3d74dcc-1634-33a5-ff74-235f3a7c6322). See [1] for our (stripped) env.py.

Since we want all customers to use the same DDL, we want common migrations 
that disregard the schema name altogether. It seems that this is not a 
supported use-case of alembic, so I hacked around it such that alembic only 
sees schema-less objects and I run SET search_path TO "uuid" before each 
migration (see env.py [1]). The patch should be rather straight-forward and 
can be found at [2].

Now the issue that we're facing is, that *sometimes* autogenerate detects 
changes in one or two customer schemas that are not real [3]. Deleting the 
migration and creating a new one often doesn't detect the changes anymore 
or for a different customer/schema. The tables that are incorrectly found 
to have changed also change over time. 

My current workaround is to autogenerate migrations until alembic "does the 
right thing". I know that patching alembic is not the best base upon which 
to ask for support, but to be honest I am running out of theories what is 
going wrong here. Hope someone can help or point me into the right 
direction :)

Cheers,
Daniel

[1] 
https://gist.github.com/daniel-k/114aa2ac846c02e437b8d86ab89d21ac#file-env-py
[2] 
https://gist.github.com/daniel-k/114aa2ac846c02e437b8d86ab89d21ac#file-alembic_patch-diff
[2] 
https://gist.github.com/daniel-k/114aa2ac846c02e437b8d86ab89d21ac#file-log-txt



-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/479fee75-e006-4c07-9ab3-149250205521n%40googlegroups.com.