Hi,

You’re definitely on to something when you say it’s related to garbage 
collection and weak references. If I add this:

import gc
gc.disable()

the problem vanishes. I’ve been spending the last couple of hours trying to 
reduce the code to the smallest possible case. I’ve reached the point though 
that I can’t remove any more code without making the problem vanish. There’s 
still plenty of code cruft left that I can’t copy/paste, so I can’t easily 
break it down into a tiny example. To answer your question though, I’ve 
whittled down the contents of the migration file into this:

---
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, String, ForeignKey, Table, Integer
from sqlalchemy.orm import relationship

revision = '3a81b8b842d3'
down_revision = None
branch_labels = None
depends_on = None

Base = declarative_base()

x_tag_to_resource = Table(
    'tag_to_resource', Base.metadata,
    Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'),
           primary_key=True, index=True),
    Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'),
           primary_key=True, index=True)
)


class XTag(Base):
    __tablename__ = 'tags'
    id = Column(Integer, primary_key=True)
    text = Column(String, nullable=False)


def non_lambda_tag_to_resource():
    # This needs to be here, otherwise the sys import itself
    # can get garbage collected in the erroneous case.
    import sys
    sys.stderr.write('*** x_tag_to_resource=%s\n' % x_tag_to_resource)
    sys.stderr.write('*** name=%s\n' % __name__)
    return x_tag_to_resource


class XResource(Base):
    __tablename__ = 'resources'
    id = Column(Integer, primary_key=True)
    tags = relationship("XTag", secondary=non_lambda_tag_to_resource,
                        backref='resources')

—
I don’t think there’s anything wrong with the above though. There also isn’t 
mixing between models in the application and models in the migration; so that 
doesn’t explain references sticking in some cases.

By the way, this isn’t an urgent thing. I made the problem go away by deleting 
the bad migration; we didn’t need it any more. My interest is purely academic. 
I’m just worried this might happen again. Mostly though, I don’t like _not_ 
understanding what’s going on.

Thanks,
Will


> On 22 Feb 2016, at 01:59, Mike Bayer <clas...@zzzcomputing.com> wrote:
> 
> Well what makes no sense is why these mappers and classes would be present at 
> all.  These classes are strictly defined within an alembic migration, and 
> make **no** callouts to any other mapped structures in the application, 
> correct?     If so, when the module is garbage collected, all of the things 
> inside of it would be as well. The mapper registry is weak referencing so the 
> _configure_mappers() step shouldn't see it.
> 
> If, OTOH, this mapping has something like a *backref* to some model in the 
> application, that would totally leave a dangling reference.
> 
> Can I have an example of an exact mapping I can stick into an Alembic 
> migration to see this happening otherwise?
> 
> 
> 
> On 02/21/2016 02:19 PM, Will Angenent wrote:
>> Hi Mike,
>> 
>> Thanks for your quick response yet again! Here’s the stack trace.
>> 
>> tests/integration/test_database.py:14: in test_database_is_up_to_date
>>     create_test_db(session)
>> __init__.py:111: in create_test_db
>>     pd_utils.do_import(dtype='locations', ifile=yaml_file)
>> ../utils/provider_data/__init__.py:54: in do_import
>>     inserted, updated = getattr(self, 'import_%s' % item)(ifile)
>> ../utils/provider_data/__init__.py:22: in import_locations
>>     return import_locations(self.session, ifile)
>> ../utils/provider_data/locations.py:190: in import_locations
>>     Location).filter_by(
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py:1260:
>> in query
>>     return self._query_cls(entities, self, **kwargs)
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:110:
>> in __init__
>>     self._set_entities(entities)
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:120:
>> in _set_entities
>>     self._set_entity_selectables(self._entities)
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:150:
>> in _set_entity_selectables
>>     ent.setup_entity(*d[entity])
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:3250:
>> in setup_entity
>>     self._with_polymorphic = ext_info.with_polymorphic_mappers
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py:747:
>> in __get__
>>     obj.__dict__[self.__name__] = result = self.fget(obj)
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1893:
>> in _with_polymorphic_mappers
>>     configure_mappers()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:2756:
>> in configure_mappers
>>     mapper._post_configure_properties()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1710:
>> in _post_configure_properties
>>     prop.init()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/interfaces.py:183:
>> in init
>>     self.do_init()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1613:
>> in do_init
>>     self._setup_join_conditions()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1688:
>> in _setup_join_conditions
>>     can_be_synced_fn=self._columns_are_mapped
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1956:
>> in __init__
>>     self._determine_joins()
>> ../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:2060:
>> in _determine_joins
>>     "specify a 'primaryjoin' expression." % self.prop)
>> E   NoForeignKeysError: Could not determine join condition between
>> parent/child tables on relationship Resource.tags - there are no foreign
>> keys linking these tables.  Ensure that referencing columns are
>> associated with a ForeignKey or ForeignKeyConstraint, or specify a
>> 'primaryjoin' expression.
>> 
>>> The sys.modules activity is not really the primary cause, it's that
>>> alembic makes use of a module object in a temporary way.
>> Absolutely agree. What I did to diagnose this was to replace
>> secondary=lambda: tag_to_resource with
>> secondary=non_lambda_tag_to_resource, using this:
>> 
>> def non_lambda_tag_to_resource():
>>     import sys
>>     sys.stderr.write('*** tag_to_resource=%s\n' % tag_to_resource)
>>                                                     #
>>     sys.stderr.write('*** name=%s\n' % __name__)
>>                                   #
>>     return tag_to_resource
>> 
>> What I found is that in the bad case, both tag_to_resource and __name__
>> were None.
>> 
>> Thanks,
>> Will
>> 
>>> On 21 Feb 2016, at 19:12, Mike Bayer <clas...@zzzcomputing.com
>>> <mailto:clas...@zzzcomputing.com <mailto:clas...@zzzcomputing.com>>> wrote:
>>> 
>>> 
>>> 
>>> Hi there -
>>> 
>>> Can you post a stack trace, and also is your test suite making use of
>>> clear_mappers() ?
>>> 
>>> The sys.modules activity is not really the primary cause, it's that
>>> alembic makes use of a module object in a temporary way.
>>> 
>>> On Feb 21, 2016, at 1:48 PM, Will Angenent <w.angen...@gmail.com 
>>> <mailto:w.angen...@gmail.com>
>>> <mailto:w.angen...@gmail.com <mailto:w.angen...@gmail.com>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> We had this interesting issue recently, and I've been trying to
>>>> figure out if we deserve this, if this is simply unavoidable, or
>>>> whether it can be considered a bug. We're using python 2.7.6,
>>>> sqlalchemy 1.0.12 and alembic 0.8.4.
>>>> 
>>>> Summary:
>>>> 
>>>> This statement in alembic.util.pyfiles.load_python_file():
>>>> del sys.modules[module_id]
>>>> randomly causes the reference count of the module object to become
>>>> zero; triggering cleanup of the object. This effectively causes all
>>>> variables in the migration file to become None, leading to an
>>>> sqlalchemy mapper problem initializing a mapper configuration for a
>>>> many-to-many relationship in a model defined in the migration file.
>>>> 
>>>> Are we being stupid to be using the ORM in alembic migrations? If
>>>> not, is it worth for me to spend more time on this? Is there any way
>>>> to get this to behave non-randomly? More details are below.
>>>> 
>>>> Thanks,
>>>> Will
>>>> 
>>>> Long version...
>>>> 
>>>> What happened is that someone in my team added an alembic migration.
>>>> He used the sqlalchemy ORM and used a declarative_base with a couple
>>>> of model files to get the job done. The migration was fine and
>>>> everyone was happy. Then, about a week later, I added an import
>>>> statement in a totally unrelated area of code, and suddenly running
>>>> alembic upgrade starting failing with a ORM mapper error. I didn't
>>>> spend much time on it, but refactored a couple of things and the
>>>> problem vanished.
>>>> 
>>>> Then a couple of days later, our tests started failing with the same
>>>> error. We had a closer look and found the failure to be random. The
>>>> inclusion of the import statment seemed to trigger the random
>>>> behavior. It wasn't just the import statement though, other changes,
>>>> such as removing a property in an ORM class could make the problem
>>>> appear or go away. What we were doing in this particualr failure
>>>> mode, is running py.test which would, in order:
>>>> 
>>>> - import this random 3rd party module
>>>> - use the alembic API to upgrade to ensure a postgres database is up
>>>> to date
>>>> - later on, in an unrelated test, do a query, triggering the
>>>> initialization of the mappings and crashing
>>>> 
>>>> At first, I thought it might be a problem with sqlalchemy. Spurred on
>>>> by this comment in mapper.py:
>>>> 
>>>>            # initialize properties on all mappers
>>>>            # note that _mapper_registry is unordered, which
>>>>            # may randomly conceal/reveal issues related to
>>>>            # the order of mapper compilation
>>>> 
>>>> I added a couple of sorted() statements throughout the code, but it
>>>> made no difference. Finally, I found that the problem was a lambda
>>>> function in a relationship with a secondary. Something like e.g.
>>>> 
>>>> tag_to_resource = Table(
>>>>    'tag_to_resource', Base.metadata,
>>>>    Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'),
>>>>           primary_key=True, index=True),
>>>>    Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'),
>>>>           primary_key=True, index=True)
>>>> )
>>>> 
>>>> class Resource(Base):
>>>>    __tablename__ = 'resources'
>>>>    id = Column(UUIDType(binary=True), primary_key=True,
>>>> default=uuid.uuid4)
>>>> 
>>>>    tags = relationship("Tag", secondary=lambda: tag_to_resource,
>>>>                        backref='resources')
>>>> 
>>>> The lambda function called in _process_dependent_arguments() was
>>>> returning None instead of tag_to_resource. Resulting in a:
>>>> 
>>>> sqlalchemy.exc.NoForeignKeysError: Could not determine join condition
>>>> between parent/child tables on relationship Resource.tags - there are
>>>> no foreign keys linking these tables.  Ensure that referencing
>>>> columns are associated with a ForeignKey or ForeignKeyConstraint, or
>>>> specify a 'primaryjoin' expression.
>>>> 
>>>> Looking deeper I found that __name__ was also None. This kind of
>>>> thing happens when sys.modules is messed with. I looked at the
>>>> alembic code and found this in load_python_file():
>>>> 
>>>> del sys.modules[module_id]
>>>> 
>>>> If I remove that statement, the problem goes away.
>>>> 
>>>> Could it be that the reference count of the module object is becoming
>>>> zero randomly, causing python to delete the data, as explained in
>>>> this post?
>>>> http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name
>>>>  
>>>> <http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name>
>>>> 
>>>> I've narrowed the problem down to a python test script, but it still
>>>> imports a load of other stuff. I can trigger the good + bad case by
>>>> just removing an import statement. I've been trying to get this down
>>>> to a simple script in an attempt to prove what's going on, but the
>>>> problem tends to come and go while I'm deleting code; making it
>>>> difficult to narrow down. For example, I was convinced one day that
>>>> the problem vanished by upgrading to sql alchemy 1.0.12, but the very
>>>> next day the same code started failing again!
>>>> 
>>>> 
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "sqlalchemy" group.
>>>> To unsubscribe from this group and stop receiving emails from it,
>>>> send an email to sqlalchemy+unsubscr...@googlegroups.com 
>>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>
>>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com 
>>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>.
>>>> To post to this group, send email to sqlalchemy@googlegroups.com 
>>>> <mailto:sqlalchemy@googlegroups.com>
>>>> <mailto:sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>>.
>>>> Visit this group at https://groups.google.com/group/sqlalchemy 
>>>> <https://groups.google.com/group/sqlalchemy>.
>>>> For more options, visit https://groups.google.com/d/optout 
>>>> <https://groups.google.com/d/optout>.
>>> 
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to sqlalchemy+unsubscr...@googlegroups.com 
>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>
>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com 
>>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>.
>>> To post to this group, send email to sqlalchemy@googlegroups.com 
>>> <mailto:sqlalchemy@googlegroups.com>
>>> <mailto:sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>>.
>>> Visit this group at https://groups.google.com/group/sqlalchemy 
>>> <https://groups.google.com/group/sqlalchemy>.
>>> For more options, visit https://groups.google.com/d/optout 
>>> <https://groups.google.com/d/optout>.
>> 
>> --
>> You received this message because you are subscribed to the Google
>> Groups "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send
>> an email to sqlalchemy+unsubscr...@googlegroups.com 
>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>
>> <mailto:sqlalchemy+unsubscr...@googlegroups.com 
>> <mailto:sqlalchemy+unsubscr...@googlegroups.com>>.
>> To post to this group, send email to sqlalchemy@googlegroups.com 
>> <mailto:sqlalchemy@googlegroups.com>
>> <mailto:sqlalchemy@googlegroups.com>.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to