On 02/23/2016 03:38 PM, Will Angenent wrote:
Hi,

I think I’ve finally got to the bottom of it.

This triggers the problem:
- Load a module with some model classes using imp.load_source
- Delete the module from sys.modules
- Add a reference to the model classes using _mapper_registry.keys() so
it doesn't get garbage collected

so that would be where your application is making a reference to objects in the module file. SQLAlchemy itself has no such reference.

- Trigger garbage collection with gc.collect()
- Trigger execution of configure_mappers() with a query

This causes the module to be garbage collected, but the classes to
remain in memory.

Perhaps the original problem was that garbage collection was happening
sometime after the list(_mapper_registry) in
sqlalchemy.mapper.orm.configure_mappers but before the call to the
secondary function. I can demonstrate the problem quite easily with the
code below. The first bit of test.py demonstrates that GC can remove the
module contents without removing classes inside it. The second part
triggers the sqlalchemy mapper error like I had before.

Right, so no bug in SQLA or Alembic is demonstrated here because this relies upon an artificial access of a private ORM variable.




file: test.py
—
#!/usr/bin/env python

from sqlalchemy import Column, Integer
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.mapper import _mapper_registry
import gc
import imp
import sys


# Adapted from alembic.util.pyfiles, using python 2k versions
def load_module_py(module_id, path):
     with open(path, 'rb') as fp:
         return imp.load_source(module_id, path, fp)

mod = load_module_py('a_module', 'a_module.py')
Test = mod.Test  # To demonstrate __name__ becoming None after GC
mod = None  # So mod can be garbage collected

print 'Before garbage collection:',
Test.print_name()

# Create a copy of the model classes so they don't get garbage collected
non_weak_copy_of_mapper_registry_classes = _mapper_registry.keys()

# Garbage collect the module class
del sys.modules['a_module']
gc.collect()

print 'After garbage collection:',
Test.print_name()

# Do a query to trigger the ORM to process the mappers
Base = declarative_base()


class Table3(Base):
     __tablename__ = 'table3'

     id = Column(Integer, primary_key=True)


engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
Session.configure(bind=engine)
session = Session()
Base.metadata.create_all(session.bind)
session.query(Table3).all()

---
file: a_module.py
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()


a_pivot_table = Table(
     'table1_to_table2', Base.metadata,
     Column('table1_id', ForeignKey('table1.id'), primary_key=True),
     Column('table2_id', ForeignKey('table2.id'), primary_key=True)
)


def get_a_pivot_table():
     print 'get_a_pivot_table __name__:', __name__
     print 'get_a_pivot_table a_pivot_table:', a_pivot_table
     return a_pivot_table


class Table1(Base):
     __tablename__ = 'table1'
     id = Column(Integer, primary_key=True)


class Table2(Base):
     __tablename__ = 'table2'
     id = Column(Integer, primary_key=True)
     table1_stuff = relationship("Table1", secondary=get_a_pivot_table)


class Test(object):
     @staticmethod
     def print_name():
         print '__name__:', __name__

---
The output:

$ ./test.py
Before garbage collection: __name__: a_module
After garbage collection: __name__: None
get_a_pivot_table __name__: None
get_a_pivot_table a_pivot_table: None
Traceback (most recent call last):
   File "./test.py", line 50, in <module>
     session.query(Table3).all()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/session.py",
line 1260, in query
     return self._query_cls(entities, self, **kwargs)
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py",
line 110, in __init__
     self._set_entities(entities)
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py",
line 120, in _set_entities
     self._set_entity_selectables(self._entities)
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py",
line 150, in _set_entity_selectables
     ent.setup_entity(*d[entity])
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py",
line 3421, in setup_entity
     self._with_polymorphic = ext_info.with_polymorphic_mappers
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py",
line 754, in __get__
     obj.__dict__[self.__name__] = result = self.fget(obj)
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py",
line 1893, in _with_polymorphic_mappers
     configure_mappers()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py",
line 2768, in configure_mappers
     mapper._post_configure_properties()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py",
line 1710, in _post_configure_properties
     prop.init()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/interfaces.py",
line 183, in init
     self.do_init()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py",
line 1629, in do_init
     self._setup_join_conditions()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py",
line 1704, in _setup_join_conditions
     can_be_synced_fn=self._columns_are_mapped
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py",
line 1972, in __init__
     self._determine_joins()
   File
"/Users/wangenent/code/sqlalchemy/weak-refs/venv/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py",
line 2076, in _determine_joins
     "specify a 'primaryjoin' expression." % self.prop)
sqlalchemy.exc.NoForeignKeysError: Could not determine join condition
between parent/child tables on relationship Table2.table1_stuff - there
are no foreign keys linking these tables.  Ensure that referencing
columns are associated with a ForeignKey or ForeignKeyConstraint, or
specify a 'primaryjoin' expression.

Thanks,
Will


On 23 Feb 2016, at 02:43, Mike Bayer <clas...@zzzcomputing.com
<mailto:clas...@zzzcomputing.com>> wrote:

To be honest I might just say to solve it that way, with the
gc.collect(), at least if you still had this problem.   That would
explain why the ORM still sees those mappings, they just weren't
garbage collected yet.

On Feb 22, 2016, at 3:23 PM, Will Angenent <w.angen...@gmail.com
<mailto:w.angen...@gmail.com>> wrote:

Hi,

You’re definitely on to something when you say it’s related to
garbage collection and weak references. If I add this:

import gc
gc.disable()

the problem vanishes. I’ve been spending the last couple of hours
trying to reduce the code to the smallest possible case. I’ve reached
the point though that I can’t remove any more code without making the
problem vanish. There’s still plenty of code cruft left that I can’t
copy/paste, so I can’t easily break it down into a tiny example. To
answer your question though, I’ve whittled down the contents of the
migration file into this:

---
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, String, ForeignKey, Table, Integer
from sqlalchemy.orm import relationship

revision = '3a81b8b842d3'
down_revision = None
branch_labels = None
depends_on = None

Base = declarative_base()

x_tag_to_resource = Table(
    'tag_to_resource', Base.metadata,
    Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'),
           primary_key=True, index=True),
    Column('resource_id', ForeignKey('resources.id', ondelete='CASCADE'),
           primary_key=True, index=True)
)


class XTag(Base):
    __tablename__ = 'tags'
    id = Column(Integer, primary_key=True)
    text = Column(String, nullable=False)


def non_lambda_tag_to_resource():
    # This needs to be here, otherwise the sys import itself
    # can get garbage collected in the erroneous case.
    import sys
    sys.stderr.write('*** x_tag_to_resource=%s\n' % x_tag_to_resource)
    sys.stderr.write('*** name=%s\n' % __name__)
    return x_tag_to_resource


class XResource(Base):
    __tablename__ = 'resources'
    id = Column(Integer, primary_key=True)
    tags = relationship("XTag", secondary=non_lambda_tag_to_resource,
                        backref='resources')

—
I don’t think there’s anything wrong with the above though. There
also isn’t mixing between models in the application and models in the
migration; so that doesn’t explain references sticking in some cases.

By the way, this isn’t an urgent thing. I made the problem go away by
deleting the bad migration; we didn’t need it any more. My interest
is purely academic. I’m just worried this might happen again. Mostly
though, I don’t like _not_ understanding what’s going on.

Thanks,
Will


On 22 Feb 2016, at 01:59, Mike Bayer <clas...@zzzcomputing.com
<mailto:clas...@zzzcomputing.com>> wrote:

Well what makes no sense is why these mappers and classes would be
present at all.  These classes are strictly defined within an
alembic migration, and make **no** callouts to any other mapped
structures in the application, correct?     If so, when the module
is garbage collected, all of the things inside of it would be as
well. The mapper registry is weak referencing so the
_configure_mappers() step shouldn't see it.

If, OTOH, this mapping has something like a *backref* to some model
in the application, that would totally leave a dangling reference.

Can I have an example of an exact mapping I can stick into an
Alembic migration to see this happening otherwise?



On 02/21/2016 02:19 PM, Will Angenent wrote:
Hi Mike,

Thanks for your quick response yet again! Here’s the stack trace.

tests/integration/test_database.py:14: in test_database_is_up_to_date
    create_test_db(session)
__init__.py:111: in create_test_db
    pd_utils.do_import(dtype='locations', ifile=yaml_file)
../utils/provider_data/__init__.py:54: in do_import
    inserted, updated = getattr(self, 'import_%s' % item)(ifile)
../utils/provider_data/__init__.py:22: in import_locations
    return import_locations(self.session, ifile)
../utils/provider_data/locations.py:190: in import_locations
    Location).filter_by(
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py:1260:
in query
    return self._query_cls(entities, self, **kwargs)
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:110:
in __init__
    self._set_entities(entities)
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:120:
in _set_entities
    self._set_entity_selectables(self._entities)
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:150:
in _set_entity_selectables
    ent.setup_entity(*d[entity])
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py:3250:
in setup_entity
    self._with_polymorphic = ext_info.with_polymorphic_mappers
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py:747:
in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1893:
in _with_polymorphic_mappers
    configure_mappers()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:2756:
in configure_mappers
    mapper._post_configure_properties()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/mapper.py:1710:
in _post_configure_properties
    prop.init()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/interfaces.py:183:
in init
    self.do_init()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1613:
in do_init
    self._setup_join_conditions()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1688:
in _setup_join_conditions
    can_be_synced_fn=self._columns_are_mapped
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:1956:
in __init__
    self._determine_joins()
../../devenv/local/lib/python2.7/site-packages/sqlalchemy/orm/relationships.py:2060:
in _determine_joins
    "specify a 'primaryjoin' expression." % self.prop)
E   NoForeignKeysError: Could not determine join condition between
parent/child tables on relationship Resource.tags - there are no
foreign
keys linking these tables.  Ensure that referencing columns are
associated with a ForeignKey or ForeignKeyConstraint, or specify a
'primaryjoin' expression.

The sys.modules activity is not really the primary cause, it's that
alembic makes use of a module object in a temporary way.
Absolutely agree. What I did to diagnose this was to replace
secondary=lambda: tag_to_resource with
secondary=non_lambda_tag_to_resource, using this:

def non_lambda_tag_to_resource():
    import sys
    sys.stderr.write('*** tag_to_resource=%s\n' % tag_to_resource)
                                                    #
    sys.stderr.write('*** name=%s\n' % __name__)
                                  #
    return tag_to_resource

What I found is that in the bad case, both tag_to_resource and __name__
were None.

Thanks,
Will

On 21 Feb 2016, at 19:12, Mike Bayer <clas...@zzzcomputing.com
<mailto:clas...@zzzcomputing.com>
<mailto:clas...@zzzcomputing.com>> wrote:



Hi there -

Can you post a stack trace, and also is your test suite making use of
clear_mappers() ?

The sys.modules activity is not really the primary cause, it's that
alembic makes use of a module object in a temporary way.

On Feb 21, 2016, at 1:48 PM, Will Angenent <w.angen...@gmail.com
<mailto:w.angen...@gmail.com>
<mailto:w.angen...@gmail.com>> wrote:

Hi,

We had this interesting issue recently, and I've been trying to
figure out if we deserve this, if this is simply unavoidable, or
whether it can be considered a bug. We're using python 2.7.6,
sqlalchemy 1.0.12 and alembic 0.8.4.

Summary:

This statement in alembic.util.pyfiles.load_python_file():
del sys.modules[module_id]
randomly causes the reference count of the module object to become
zero; triggering cleanup of the object. This effectively causes all
variables in the migration file to become None, leading to an
sqlalchemy mapper problem initializing a mapper configuration for a
many-to-many relationship in a model defined in the migration file.

Are we being stupid to be using the ORM in alembic migrations? If
not, is it worth for me to spend more time on this? Is there any way
to get this to behave non-randomly? More details are below.

Thanks,
Will

Long version...

What happened is that someone in my team added an alembic migration.
He used the sqlalchemy ORM and used a declarative_base with a couple
of model files to get the job done. The migration was fine and
everyone was happy. Then, about a week later, I added an import
statement in a totally unrelated area of code, and suddenly running
alembic upgrade starting failing with a ORM mapper error. I didn't
spend much time on it, but refactored a couple of things and the
problem vanished.

Then a couple of days later, our tests started failing with the same
error. We had a closer look and found the failure to be random. The
inclusion of the import statment seemed to trigger the random
behavior. It wasn't just the import statement though, other changes,
such as removing a property in an ORM class could make the problem
appear or go away. What we were doing in this particualr failure
mode, is running py.test which would, in order:

- import this random 3rd party module
- use the alembic API to upgrade to ensure a postgres database is up
to date
- later on, in an unrelated test, do a query, triggering the
initialization of the mappings and crashing

At first, I thought it might be a problem with sqlalchemy. Spurred on
by this comment in mapper.py:

           # initialize properties on all mappers
           # note that _mapper_registry is unordered, which
           # may randomly conceal/reveal issues related to
           # the order of mapper compilation

I added a couple of sorted() statements throughout the code, but it
made no difference. Finally, I found that the problem was a lambda
function in a relationship with a secondary. Something like e.g.

tag_to_resource = Table(
   'tag_to_resource', Base.metadata,
   Column('tag_id', ForeignKey('tags.id', ondelete='CASCADE'),
          primary_key=True, index=True),
   Column('resource_id', ForeignKey('resources.id',
ondelete='CASCADE'),
          primary_key=True, index=True)
)

class Resource(Base):
   __tablename__ = 'resources'
   id = Column(UUIDType(binary=True), primary_key=True,
default=uuid.uuid4)

   tags = relationship("Tag", secondary=lambda: tag_to_resource,
                       backref='resources')

The lambda function called in _process_dependent_arguments() was
returning None instead of tag_to_resource. Resulting in a:

sqlalchemy.exc.NoForeignKeysError: Could not determine join condition
between parent/child tables on relationship Resource.tags - there are
no foreign keys linking these tables.  Ensure that referencing
columns are associated with a ForeignKey or ForeignKeyConstraint, or
specify a 'primaryjoin' expression.

Looking deeper I found that __name__ was also None. This kind of
thing happens when sys.modules is messed with. I looked at the
alembic code and found this in load_python_file():

del sys.modules[module_id]

If I remove that statement, the problem goes away.

Could it be that the reference count of the module object is becoming
zero randomly, causing python to delete the data, as explained in
this post?
http://stackoverflow.com/questions/5365562/why-is-the-value-of-name-changing-after-assignment-to-sys-modules-name

I've narrowed the problem down to a python test script, but it still
imports a load of other stuff. I can trigger the good + bad case by
just removing an import statement. I've been trying to get this down
to a simple script in an attempt to prove what's going on, but the
problem tends to come and go while I'm deleting code; making it
difficult to narrow down. For example, I was convinced one day that
the problem vanished by upgrading to sql alchemy 1.0.12, but the very
next day the same code started failing again!


--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it,
send an email tosqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email tosqlalch...@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>
<mailto:sqlalchemy@googlegroups.com>.
Visit this group athttps://groups.google.com/group/sqlalchemy.
For more options, visithttps://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email tosqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email tosqlalch...@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>
<mailto:sqlalchemy@googlegroups.com>.
Visit this group athttps://groups.google.com/group/sqlalchemy.
For more options, visithttps://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email tosqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email tosqlalch...@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to