[sqlalchemy] alembic del sys.modules[module_id] leading to orm mapper error

Will Angenent Sun, 21 Feb 2016 10:49:14 -0800

Hi,

We had this interesting issue recently, and I've been trying to figure out 
if we deserve this, if this is simply unavoidable, or whether it can be 
considered a bug. We're using python 2.7.6, sqlalchemy 1.0.12 and alembic 
0.8.4.


Summary:

This statement in alembic.util.pyfiles.load_python_file():
del sys.modules[module_id]
randomly causes the reference count of the module object to drop to zero, 
triggering cleanup of the object. This effectively sets all variables in 
the migration file to None, which makes sqlalchemy fail while configuring 
the mapper for a many-to-many relationship on a model defined in the 
migration file.

Are we being stupid to be using the ORM in alembic migrations? If not, is 
it worth my spending more time on this? Is there any way to get this to 
behave non-randomly? More details are below.

Thanks,
Will

Long version...

What happened is that someone on my team added an alembic migration. He 
used the sqlalchemy ORM and a declarative_base with a couple of model 
files to get the job done. The migration was fine and everyone was happy. 
Then, about a week later, I added an import statement in a totally 
unrelated area of code, and suddenly running alembic upgrade started 
failing with an ORM mapper error. I didn't spend much time on it, but 
refactored a couple of things and the problem vanished.

Then a couple of days later, our tests started failing with the same error. 
We had a closer look and found the failure to be random. The inclusion of 
the import statement seemed to trigger the random behavior. It wasn't just 
the import statement though; other changes, such as removing a property in 
an ORM class, could make the problem appear or go away. What we were doing 
in this particular failure mode was running py.test, which would, in order:

- import this random 3rd party module
- use the alembic API to upgrade to ensure a postgres database is up to date
- later on, in an unrelated test, do a query, triggering the initialization 
of the mappings and crashing

At first, I thought it might be a problem with sqlalchemy. Spurred on by 
this comment in mapper.py:

            # initialize properties on all mappers
            # note that _mapper_registry is unordered, which
            # may randomly conceal/reveal issues related to
            # the order of mapper compilation

I added a couple of sorted() statements throughout the code, but it made no 
difference. Finally, I found that the problem was a lambda function in a 
relationship with a secondary, something like:

tag_to_resource = Table(
    'tag_to_resource', Base.metadata,
    Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'),
           primary_key=True, index=True),
    Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'),
           primary_key=True, index=True)
)

class Resource(Base):
    __tablename__ = 'resources'
    id = Column(UUIDType(binary=True), primary_key=True, default=uuid.uuid4)

    tags = relationship("Tag", secondary=lambda: tag_to_resource,
                        backref='resources')

The lambda function called in _process_dependent_arguments() was returning 
None instead of tag_to_resource, resulting in:

sqlalchemy.exc.NoForeignKeysError: Could not determine join condition 
between parent/child tables on relationship Resource.tags - there are no 
foreign keys linking these tables.  Ensure that referencing columns are 
associated with a ForeignKey or ForeignKeyConstraint, or specify a 
'primaryjoin' expression.
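
To illustrate the failure mode without sqlalchemy, here is a minimal sketch. 
It assumes a plain string can stand in for the Table, and the names SOURCE, 
load_then_forget and hypothetical_migration are made up for the example. A 
lambda like the one above looks up tag_to_resource in the module globals each 
time it is called, so it returns whatever the globals hold at that moment. On 
Python 2.7 the globals of a freed module are reset to None; Python 3.4 and 
later reportedly leave them intact, so the first printed value depends on the 
interpreter.

```python
import sys
import types

# Stand-in for a migration file; a string replaces the real Table object.
SOURCE = '''
tag_to_resource = "tag_to_resource table"
late_bound = lambda: tag_to_resource
early_bound = lambda table=tag_to_resource: table
'''


def load_then_forget(module_id):
    """Exec SOURCE as a module, then drop every reference to the module."""
    module = types.ModuleType(module_id)
    sys.modules[module_id] = module
    exec(SOURCE, module.__dict__)
    late, early = module.late_bound, module.early_bound
    del sys.modules[module_id]
    del module  # last strong reference to the module object goes away here
    return late, early


late, early = load_then_forget("hypothetical_migration")
# Looked up in the module globals at call time (may be None on Python 2.7).
print("late-bound lookup:  %r" % (late(),))
# Captured as a default argument when the lambda was created.
print("early-bound lookup: %r" % (early(),))
```

In this sketch the early-bound lambda keeps its own reference to the value, so 
it does not depend on the module globals surviving. Passing 
secondary=tag_to_resource directly, when the table is defined before the 
class, would have the same effect.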

Looking deeper I found that __name__ was also None. This kind of thing 
happens when sys.modules is messed with. I looked at the alembic code and 
found this in load_python_file():

del sys.modules[module_id]

If I remove that statement, the problem goes away.

Could it be that the reference count of the module object is becoming zero 
randomly, causing python to delete the data, as explained in this post?
http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name

I've narrowed the problem down to a python test script, but it still 
imports a load of other stuff. I can switch between the good and bad cases 
just by removing an import statement. I've been trying to get this down to 
a simple script in an attempt to prove what's going on, but the problem 
tends to come and go while I'm deleting code, making it difficult to narrow 
down. For example, I was convinced one day that the problem vanished by 
upgrading to sqlalchemy 1.0.12, but the very next day the same code started 
failing again!
