Hi, You’re definitely on to something when you say it’s related to garbage collection and weak references. If I add this:
import gc gc.disable() the problem vanishes. I’ve been spending the last couple of hours trying to reduce the code to the smallest possible case. I’ve reached the point though that I can’t remove any more code without making the problem vanish. There’s still plenty of code cruft left that I can’t copy/paste, so I can’t easily break it down into a tiny example. To answer your question though, I’ve whittled down the contents of the migration file into this: --- from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import Column, String, ForeignKey, Table, Integer from sqlalchemy.orm import relationship revision = '3a81b8b842d3' down_revision = None branch_labels = None depends_on = None Base = declarative_base() x_tag_to_resource = Table( 'tag_to_resource', Base.metadata, Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'), primary_key=True, index=True), Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'), primary_key=True, index=True) ) class XTag(Base): __tablename__ = 'tags' id = Column(Integer, primary_key=True) text = Column(String, nullable=False) def non_lambda_tag_to_resource(): # This needs to be here, otherwise the sys import itself # can get garbage collected in the erroneous case. import sys sys.stderr.write('*** x_tag_to_resource=%s\n' % x_tag_to_resource) sys.stderr.write('*** name=%s\n' % __name__) return x_tag_to_resource class XResource(Base): __tablename__ = 'resources' id = Column(Integer, primary_key=True) tags = relationship("XTag", secondary=non_lambda_tag_to_resource, backref='resources') — I don’t think there’s anything wrong with the above though. There also isn’t mixing between models in the application and models in the migration; so that doesn’t explain references sticking in some cases. By the way, this isn’t an urgent thing. I made the problem go away by deleting the bad migration; we didn’t need it any more. My interest is purely academic. I’m just worried this might happen again. Mostly though, I don’t like _not_ understanding what’s going on. Thanks, Will > On 22 Feb 2016, at 01:59, Mike Bayer <clas...@zzzcomputing.com> wrote: > > Well what makes no sense is why these mappers and classes would be present at > all. These classes are strictly defined within an alembic migration, and > make **no** callouts to any other mapped structures in the application, > correct? If so, when the module is garbage collected, all of the things > inside of it would be as well. The mapper registry is weak referencing so the > _configure_mappers() step shouldn't see it. > > If, OTOH, this mapping has something like a *backref* to some model in the > application, that would totally leave a dangling reference. > > Can I have an example of an exact mapping I can stick into an Alembic > migration to see this happening otherwise? > > > > On 02/21/2016 02:19 PM, Will Angenent wrote: >> Hi Mike, >> >> Thanks for your quick response yet again! Here’s the stack trace. >> >> tests/integration/test_database.py:14: in test_database_is_up_to_date >> create_test_db(session) >> __init__.py:111: in create_test_db >> pd_utils.do_import(dtype='locations', ifile=yaml_file) >> ../utils/provider_data/__init__.py:54: in do_import >> inserted, updated = getattr(self, 'import_%s' % item)(ifile) >> ../utils/provider_data/__init__.py:22: in import_locations >> return import_locations(self.session, ifile) >> ../utils/provider_data/locations.py:190: in import_locations >> Location).filter_by( >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py:1260: >> in query >> return self._query_cls(entities, self, **kwargs) >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:110: >> in __init__ >> self._set_entities(entities) >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:120: >> in _set_entities >> self._set_entity_selectables(self._entities) >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:150: >> in _set_entity_selectables >> ent.setup_entity(*d[entity]) >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:3250: >> in setup_entity >> self._with_polymorphic = ext_info.with_polymorphic_mappers >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py:747: >> in __get__ >> obj.__dict__[self.__name__] = result = self.fget(obj) >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1893: >> in _with_polymorphic_mappers >> configure_mappers() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:2756: >> in configure_mappers >> mapper._post_configure_properties() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1710: >> in _post_configure_properties >> prop.init() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/interfaces.py:183: >> in init >> self.do_init() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1613: >> in do_init >> self._setup_join_conditions() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1688: >> in _setup_join_conditions >> can_be_synced_fn=self._columns_are_mapped >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1956: >> in __init__ >> self._determine_joins() >> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:2060: >> in _determine_joins >> "specify a 'primaryjoin' expression." % self.prop) >> E NoForeignKeysError: Could not determine join condition between >> parent/child tables on relationship Resource.tags - there are no foreign >> keys linking these tables. Ensure that referencing columns are >> associated with a ForeignKey or ForeignKeyConstraint, or specify a >> 'primaryjoin' expression. >> >>> The sys.modules activity is not really the primary cause, it's that >>> alembic makes use of a module object in a temporary way. >> Absolutely agree. What I did to diagnose this was to replace >> secondary=lambda: tag_to_resource with >> secondary=non_lambda_tag_to_resource, using this: >> >> def non_lambda_tag_to_resource(): >> import sys >> sys.stderr.write('*** tag_to_resource=%s\n' % tag_to_resource) >> # >> sys.stderr.write('*** name=%s\n' % __name__) >> # >> return tag_to_resource >> >> What I found is that in the bad case, both tag_to_resource and __name__ >> were None. >> >> Thanks, >> Will >> >>> On 21 Feb 2016, at 19:12, Mike Bayer <clas...@zzzcomputing.com >>> <mailto:clas...@zzzcomputing.com <mailto:clas...@zzzcomputing.com>>> wrote: >>> >>> >>> >>> Hi there - >>> >>> Can you post a stack trace, and also is your test suite making use of >>> clear_mappers() ? >>> >>> The sys.modules activity is not really the primary cause, it's that >>> alembic makes use of a module object in a temporary way. >>> >>> On Feb 21, 2016, at 1:48 PM, Will Angenent <w.angen...@gmail.com >>> <mailto:w.angen...@gmail.com> >>> <mailto:w.angen...@gmail.com <mailto:w.angen...@gmail.com>>> wrote: >>> >>>> Hi, >>>> >>>> We had this interesting issue recently, and I've been trying to >>>> figure out if we deserve this, if this is simply unavoidable, or >>>> whether it can be considered a bug. We're using python 2.7.6, >>>> sqlalchemy 1.0.12 and alembic 0.8.4. >>>> >>>> Summary: >>>> >>>> This statement in alembic.util.pyfiles.load_python_file(): >>>> del sys.modules[module_id] >>>> randomly causes the reference count of the module object to become >>>> zero; triggering cleanup of the object. This effectively causes all >>>> variables in the migration file to become None, leading to an >>>> sqlalchemy mapper problem initializing a mapper configuration for a >>>> many-to-many relationship in a model defined in the migration file. >>>> >>>> Are we being stupid to be using the ORM in alembic migrations? If >>>> not, is it worth for me to spend more time on this? Is there any way >>>> to get this to behave non-randomly? More details are below. >>>> >>>> Thanks, >>>> Will >>>> >>>> Long version... >>>> >>>> What happened is that someone in my team added an alembic migration. >>>> He used the sqlalchemy ORM and used a declarative_base with a couple >>>> of model files to get the job done. The migration was fine and >>>> everyone was happy. Then, about a week later, I added an import >>>> statement in a totally unrelated area of code, and suddenly running >>>> alembic upgrade starting failing with a ORM mapper error. I didn't >>>> spend much time on it, but refactored a couple of things and the >>>> problem vanished. >>>> >>>> Then a couple of days later, our tests started failing with the same >>>> error. We had a closer look and found the failure to be random. The >>>> inclusion of the import statment seemed to trigger the random >>>> behavior. It wasn't just the import statement though, other changes, >>>> such as removing a property in an ORM class could make the problem >>>> appear or go away. What we were doing in this particualr failure >>>> mode, is running py.test which would, in order: >>>> >>>> - import this random 3rd party module >>>> - use the alembic API to upgrade to ensure a postgres database is up >>>> to date >>>> - later on, in an unrelated test, do a query, triggering the >>>> initialization of the mappings and crashing >>>> >>>> At first, I thought it might be a problem with sqlalchemy. Spurred on >>>> by this comment in mapper.py: >>>> >>>> # initialize properties on all mappers >>>> # note that _mapper_registry is unordered, which >>>> # may randomly conceal/reveal issues related to >>>> # the order of mapper compilation >>>> >>>> I added a couple of sorted() statements throughout the code, but it >>>> made no difference. Finally, I found that the problem was a lambda >>>> function in a relationship with a secondary. Something like e.g. >>>> >>>> tag_to_resource = Table( >>>> 'tag_to_resource', Base.metadata, >>>> Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'), >>>> primary_key=True, index=True), >>>> Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'), >>>> primary_key=True, index=True) >>>> ) >>>> >>>> class Resource(Base): >>>> __tablename__ = 'resources' >>>> id = Column(UUIDType(binary=True), primary_key=True, >>>> default=uuid.uuid4) >>>> >>>> tags = relationship("Tag", secondary=lambda: tag_to_resource, >>>> backref='resources') >>>> >>>> The lambda function called in _process_dependent_arguments() was >>>> returning None instead of tag_to_resource. Resulting in a: >>>> >>>> sqlalchemy.exc.NoForeignKeysError: Could not determine join condition >>>> between parent/child tables on relationship Resource.tags - there are >>>> no foreign keys linking these tables. Ensure that referencing >>>> columns are associated with a ForeignKey or ForeignKeyConstraint, or >>>> specify a 'primaryjoin' expression. >>>> >>>> Looking deeper I found that __name__ was also None. This kind of >>>> thing happens when sys.modules is messed with. I looked at the >>>> alembic code and found this in load_python_file(): >>>> >>>> del sys.modules[module_id] >>>> >>>> If I remove that statement, the problem goes away. >>>> >>>> Could it be that the reference count of the module object is becoming >>>> zero randomly, causing python to delete the data, as explained in >>>> this post? >>>> http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name >>>> >>>> <http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name> >>>> >>>> I've narrowed the problem down to a python test script, but it still >>>> imports a load of other stuff. I can trigger the good + bad case by >>>> just removing an import statement. I've been trying to get this down >>>> to a simple script in an attempt to prove what's going on, but the >>>> problem tends to come and go while I'm deleting code; making it >>>> difficult to narrow down. For example, I was convinced one day that >>>> the problem vanished by upgrading to sql alchemy 1.0.12, but the very >>>> next day the same code started failing again! >>>> >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "sqlalchemy" group. >>>> To unsubscribe from this group and stop receiving emails from it, >>>> send an email to sqlalchemy+unsubscr...@googlegroups.com >>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com> >>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com >>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>. >>>> To post to this group, send email to sqlalchemy@googlegroups.com >>>> <mailto:sqlalchemy@googlegroups.com> >>>> <mailto:sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>>. >>>> Visit this group at https://groups.google.com/group/sqlalchemy >>>> <https://groups.google.com/group/sqlalchemy>. >>>> For more options, visit https://groups.google.com/d/optout >>>> <https://groups.google.com/d/optout>. >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlalchemy+unsubscr...@googlegroups.com >>> <mailto:sqlalchemy+unsubscr...@googlegroups.com> >>> <mailto:sqlalchemy+unsubscr...@googlegroups.com >>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>. >>> To post to this group, send email to sqlalchemy@googlegroups.com >>> <mailto:sqlalchemy@googlegroups.com> >>> <mailto:sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>>. >>> Visit this group at https://groups.google.com/group/sqlalchemy >>> <https://groups.google.com/group/sqlalchemy>. >>> For more options, visit https://groups.google.com/d/optout >>> <https://groups.google.com/d/optout>. >> >> -- >> You received this message because you are subscribed to the Google >> Groups "sqlalchemy" group. >> To unsubscribe from this group and stop receiving emails from it, send >> an email to sqlalchemy+unsubscr...@googlegroups.com >> <mailto:sqlalchemy+unsubscr...@googlegroups.com> >> <mailto:sqlalchemy+unsubscr...@googlegroups.com >> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>. >> To post to this group, send email to sqlalchemy@googlegroups.com >> <mailto:sqlalchemy@googlegroups.com> >> <mailto:sqlalchemy@googlegroups.com>. >> Visit this group at https://groups.google.com/group/sqlalchemy. >> For more options, visit https://groups.google.com/d/optout. > > -- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To post to this group, send email to sqlalchemy@googlegroups.com. > Visit this group at https://groups.google.com/group/sqlalchemy. > For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.