[sqlalchemy] How can I set table constraints with DeferredReflection?
I'm using the ORM and one of my tables does not have a primary key defined. I am also using DeferredReflection, and I can't seem to figure out how to defer the PrimaryKeyConstraint until Base.prepare() runs. Any pointers?

    Base = declarative_base(cls=DeferredReflection)

    class Person(Base):
        __tablename__ = 'people'
        __table_args__ = (PrimaryKeyConstraint(u'name'), {})

    # this does not run
    if __name__ == '__main__':
        engine = create_engine('mysql://user:password@localhost/organisms')
        Base.prepare(engine)

When this runs, we get an error constructing the Person class (the __main__ section is not hit). This makes sense given that we haven't reflected the table yet!

    Traceback (most recent call last):
      File "/home/ubuntu/deferred_reflection.py", line 10, in <module>
        class Person(Base):
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/api.py", line 53, in __init__
        _as_declarative(cls, classname, cls.__dict__)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/base.py", line 251, in _as_declarative
        **table_kw)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 350, in __new__
        table._init(name, metadata, *args, **kw)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 427, in _init
        self._init_items(*args)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 70, in _init_items
        item._set_parent_with_dispatch(self)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/base.py", line 283, in _set_parent_with_dispatch
        self._set_parent(parent)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 2646, in _set_parent
        super(PrimaryKeyConstraint, self)._set_parent(table)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 2289, in _set_parent
        ColumnCollectionMixin._set_parent(self, table)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 2257, in _set_parent
        col = table.c[col]
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/_collections.py", line 156, in __getitem__
        return self._data[key]
    KeyError: u'name'

-- 
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
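[Editor's note: the archived thread contains no reply. One approach the declarative reflection docs describe, sketched here as an untested suggestion rather than a confirmed answer from the thread: declare the key column explicitly on the class with primary_key=True instead of using a deferred PrimaryKeyConstraint. DeferredReflection reflects with autoload_replace=False, so declared columns survive reflection and only the remaining columns are filled in. SQLite stands in for the poster's MySQL database, and the imports use the modern (SQLAlchemy 1.4+) locations rather than the 0.9-era ones in the thread.]

```python
# Hedged sketch: mark 'name' as the mapped primary key up front; reflection
# at prepare() time fills in the columns we did not declare.
from sqlalchemy import Column, String, create_engine, text
from sqlalchemy.ext.declarative import DeferredReflection
from sqlalchemy.orm import declarative_base  # 1.4+ import location

Base = declarative_base(cls=DeferredReflection)

class Person(Base):
    __tablename__ = 'people'
    # declared columns take precedence over reflected ones, so this marks
    # 'name' as the ORM primary key even though the DB table defines none
    name = Column(String(100), primary_key=True)

engine = create_engine('sqlite://')
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE people (name VARCHAR(100), age INTEGER)"))

Base.prepare(engine)  # reflection happens here, after the class exists
print([c.name for c in Person.__table__.primary_key])  # ['name']
print('age' in Person.__table__.c)                     # True: reflected
```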
Re: [sqlalchemy] Trouble filtering binary columns with tuple_ and an in_ clause against Postgres
Looks good to me :)

    ...
    ----------------------------------------------------------------------
    Ran 3 tests in 0.811s

    OK

Thanks for your really fast help with this. Michael Bayer for president!!

On Thursday, February 27, 2014 7:58:24 PM UTC-5, Michael Bayer wrote:

> that patch is in for 0.8 and 0.9.
>
> On Feb 27, 2014, at 7:29 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
>
>> On Feb 27, 2014, at 4:38 PM, Rob Crowell <rob.c...@moat.com> wrote:
>>
>>>     # in_ clause with 1 STRING, 1 BINARY
>>>     filter_cols = tuple_(HashTest.hash_val, HashTest.hash_type)
>>>     filter_vals = ((encoded_hash, 'md5'),)
>>>     q = session.query(HashTest)
>>>     q = q.filter(filter_cols.in_(filter_vals))
>>>
>>>     HINT: No operator matches the given name and argument type(s).
>>>     You might need to add explicit type casts.
>>>
>>>     'SELECT hashtest.hash_test.hash_val AS hashtest_hash_test_hash_val, hashtest.hash_test.hash_type AS hashtest_hash_test_hash_type, hashtest.hash_test.region AS hashtest_hash_test_region \nFROM hashtest.hash_test \nWHERE (hashtest.hash_test.hash_val, hashtest.hash_test.hash_type) IN ((%(param_1)s, %(param_2)s))'
>>>     {'param_1': <psycopg2._psycopg.Binary object at 0x2d9d850>, 'param_2': <psycopg2._psycopg.Binary object at 0x2c18940>}
>>>
>>> It's complaining about an improper encoding for the BINARY column.
>>
>> this has nothing to do with encoding and instead is about typing. you can see in the output that SQLAlchemy is turning both elements of the IN tuple into a Binary, which would appear to be inappropriate here - the sqlalchemy.sql.Tuple object isn't yet smart enough to handle heterogeneous types. There's a patch which will resolve this attached to https://bitbucket.org/zzzeek/sqlalchemy/issue/2977/tuple_-needs-to-record-heterogeneous-types.
>> In the meantime you can actually lift and use that Tuple class that's in the patch:
>>
>>     from sqlalchemy.sql.expression import ClauseList, ColumnElement, \
>>         _literal_as_binds, BindParameter
>>     from sqlalchemy import types
>>
>>     class tuple_(ClauseList, ColumnElement):
>>         def __init__(self, *clauses, **kw):
>>             clauses = [_literal_as_binds(c) for c in clauses]
>>             self.type = types.NULLTYPE
>>             self._type_tuple = [arg.type for arg in clauses]
>>             super(tuple_, self).__init__(*clauses, **kw)
>>
>>         @property
>>         def _select_iterable(self):
>>             return (self, )
>>
>>         def _bind_param(self, operator, obj):
>>             return tuple_(*[
>>                 BindParameter(None, o, _compared_to_operator=operator,
>>                               _compared_to_type=type_, unique=True)
>>                 for o, type_ in zip(obj, self._type_tuple)
>>             ]).self_group()
[sqlalchemy] Trouble filtering binary columns with tuple_ and an in_ clause against Postgres
When I pass binary data to a multi-column in_ clause, I seem to be getting inconsistent results and I need some help! I did some testing with MySQL, Postgres, and Vertica (connecting via https://pypi.python.org/pypi/vertica-sqlalchemy/0.1). It appears MySQL works correctly, but with both Postgres and Vertica (which is almost compatible with Postgres) I am having trouble.

I authored a simple test model in a schema named 'hashtest'. The (silly) idea is to store binary hash values in the hash_val column and tag the type (i.e. md5) in hash_type -- I added a region column just so I would have two string columns to play with:

    Base = declarative_base()

    class HashTest(Base):
        __tablename__ = 'hash_test'
        __table_args__ = {'schema': 'hashtest'}

        hash_val = Column(Binary, primary_key=True)
        hash_type = Column(String, primary_key=True)
        region = Column(String)

Insert a single row:

    hashtest=> INSERT INTO hashtest.hash_test (hash_val, hash_type, region)
               VALUES (E'\\xf2666f453b364db65cfdd19756d7e0ad', 'md5', 'US');

For these tests I am using a binary string to store the md5 (rather than 32 characters 0-F, using 16 raw bytes); the binascii module can do this transformation for us:

    encoded_hash = '\xf2foE;6M\xb6\\\xfd\xd1\x97V\xd7\xe0\xad'
    self.assertEqual(binascii.unhexlify('f2666f453b364db65cfdd19756d7e0ad'),
                     encoded_hash)

We can pass this to a single-column IN clause no problem, and get our data back out (the unittest module provides assertEqual):

    # in_ clause with 1 BINARY
    filter_cols = tuple_(HashTest.hash_val)
    filter_vals = ((encoded_hash,),)
    q = session.query(HashTest)
    q = q.filter(filter_cols.in_(filter_vals))
    rows = q.all()
    self.assertEqual(len(rows), 1)
    self.assertEqual(rows[0].hash_val, encoded_hash)

We can also build a 2-column in_ clause, with both of the string hash_type and region columns, and things also work as expected.
However, when I pass in a String and Binary value to a single in_ clause on hash_val and hash_type, things stop working:

    # in_ clause with 1 STRING, 1 BINARY
    filter_cols = tuple_(HashTest.hash_val, HashTest.hash_type)
    filter_vals = ((encoded_hash, 'md5'),)
    q = session.query(HashTest)
    q = q.filter(filter_cols.in_(filter_vals))
    rows = q.all()
    self.assertEqual(len(rows), 1)
    self.assertEqual(rows[0].hash_val, encoded_hash)

MySQL happily responds with the desired results, but Postgres (and Vertica) seem to choke up:

    ======================================================================
    ERROR: testPostgres (md5test.BinaryTupleInTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "md5test.py", line 70, in testPostgres
        self._lookupHash(session)
      File "md5test.py", line 49, in _lookupHash
        rows = q.all()
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2286, in all
        return list(self)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2398, in __iter__
        return self._execute_and_instances(context)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2413, in _execute_and_instances
        result = conn.execute(querycontext.statement, self._params)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 717, in execute
        return meth(self, multiparams, params)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 317, in _execute_on_connection
        return connection._execute_clauseelement(self, multiparams, params)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 814, in _execute_clauseelement
        compiled_sql, distilled_params
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 927, in _execute_context
        context)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1076, in _handle_dbapi_exception
        exc_info
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 185, in raise_from_cause
        reraise(type(exception), exception, tb=exc_tb)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 920, in _execute_context
        context)
      File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 426, in do_execute
        cursor.execute(statement, parameters)
    ProgrammingError: (ProgrammingError) operator does not exist: character varying = bytea
    LINE 3: ...hash_test.hash_val, hashtest.hash_test.hash_type) IN (('\xf2...
                                                                 ^
    HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
    'SELECT hashtest.hash_test.hash_val AS hashtest_hash_test_hash_val, hashtest.hash_test.hash_type AS hashtest_hash_test_hash_type, hashtest.hash_test.region AS hashtest_hash_test_region \nFROM
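[Editor's note: besides the tuple_ patch given in the reply above, a portable workaround (my own sketch, not from the thread) is to avoid the multi-column tuple entirely and express the same predicate as an OR of per-row AND conditions. Each equality comparison then binds against its own column, so the string value keeps a string type and the bytes value keeps a binary type. This runnable version uses SQLite in place of Postgres, drops the 'hashtest' schema, and uses LargeBinary, the modern name for the thread's Binary type.]

```python
from sqlalchemy import Column, LargeBinary, String, and_, create_engine, or_
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class HashTest(Base):
    __tablename__ = 'hash_test'
    hash_val = Column(LargeBinary, primary_key=True)
    hash_type = Column(String(10), primary_key=True)
    region = Column(String(10))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = Session(engine)

encoded_hash = b'\xf2foE;6M\xb6\\\xfd\xd1\x97V\xd7\xe0\xad'  # 16 raw bytes
session.add(HashTest(hash_val=encoded_hash, hash_type='md5', region='US'))
session.commit()

filter_vals = [(encoded_hash, 'md5')]
# one AND per (hash_val, hash_type) pair, OR'ed together; each bind
# parameter inherits the type of the column it is compared against
crit = or_(*[and_(HashTest.hash_val == hv, HashTest.hash_type == ht)
             for hv, ht in filter_vals])
rows = session.query(HashTest).filter(crit).all()
print(len(rows))  # 1
```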
[sqlalchemy] Help with DeferredReflection and setting up an engine at runtime?
I am having a bit of trouble getting DeferredReflection working the way I want; not sure if I am overlooking something obvious or if I just don't really understand how it's supposed to work. I'm trying to define my models before creating my engine (this does not work):

    Base = declarative_base(cls=DeferredReflection)

    class CityStats(Base):
        __tablename__ = 'city_stats'
        __table_args__ = {'schema': 'prod', 'autoload': True}

    if __name__ == '__main__':
        engine = create_engine('...')
        Base.metadata.bind = engine
        Base.prepare(engine)

When I run this I get an error creating the CityStats class:

    sqlalchemy.exc.UnboundExecutionError: No engine is bound to this Table's
    MetaData. Pass an engine to the Table via autoload_with=someengine, or
    associate the MetaData with an engine via metadata.bind=someengine

Of course it does work if I create my engine and set Base.metadata.bind BEFORE I define the CityStats model (this works):

    engine = create_engine('...')
    Base.metadata.bind = engine

    class CityStats(Base):
        __tablename__ = 'city_stats'
        __table_args__ = {'schema': 'prod', 'autoload': True}

    Base.prepare(engine)

I'm trying to avoid some kind of model_init() function that everyone who imports my models.py file will have to remember to call before importing my models. Is this possible?
Re: [sqlalchemy] Help with DeferredReflection and setting up an engine at runtime?
Interesting, thanks Michael. I didn't realize autoload was implied when using DeferredReflection, but that makes sense. Thanks!

On Monday, February 17, 2014 7:17:34 PM UTC-5, Michael Bayer wrote:

> On Feb 17, 2014, at 6:23 PM, Rob Crowell <robcc...@gmail.com> wrote:
>
>> I am having a bit of trouble getting DeferredReflection working the way I want; [...] When I run this I get an error creating the CityStats class:
>>
>>     sqlalchemy.exc.UnboundExecutionError: No engine is bound to this Table's
>>     MetaData. Pass an engine to the Table via autoload_with=someengine, or
>>     associate the MetaData with an engine via metadata.bind=someengine
>
> its all about the stack trace, lets look:
>
>     Traceback (most recent call last):
>       File "test.py", line 8, in <module>
>         class CityStats(Base):
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/ext/declarative/api.py", line 53, in __init__
>         _as_declarative(cls, classname, cls.__dict__)
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/ext/declarative/base.py", line 251, in _as_declarative
>         **table_kw)
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py", line 350, in __new__
>         table._init(name, metadata, *args, **kw)
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py", line 423, in _init
>         self._autoload(metadata, autoload_with, include_columns)
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py", line 439, in _autoload
>         msg="No engine is bound to this Table's MetaData."
>       File "/Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/base.py", line 459, in _bind_or_error
>
> what we see here is that this script doesn't get to create_engine() at all, it's trying to hit the database as soon as you say "class CityStats(Base)". Why is that? Because you have "autoload=True" in your table args, which means, "reflect this table *right now*". That is, you are defeating the purpose of using DeferredReflection. The solution is to just take out that autoload=True. Any class which descends from the Base here is automatically part of the "classes to reflect", since you have DeferredReflection at the base.
>
> Also, if you're on 0.9, take a look at the new "automap" extension; I've been using it and it's pretty keen. There's one fix for it in the not-yet-released 0.9.3, but it's only needed if you're dealing with inheritance structures.
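[Editor's note: Michael's fix, put into a runnable sketch of my own arrangement: drop autoload=True from __table_args__ and let DeferredReflection reflect every subclass when prepare() runs. SQLite stands in for the real database, the 'prod' schema is omitted, and the imports use modern (1.4+) locations rather than the 0.9-era ones in the thread.]

```python
from sqlalchemy import Column, Integer, create_engine, text
from sqlalchemy.ext.declarative import DeferredReflection
from sqlalchemy.orm import declarative_base  # 1.4+ import location

Base = declarative_base(cls=DeferredReflection)

class CityStats(Base):
    __tablename__ = 'city_stats'
    id = Column(Integer, primary_key=True)  # no autoload=True needed

# the class above was created without touching the database; the engine
# can be created later, e.g. under __main__
engine = create_engine('sqlite://')
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE city_stats (id INTEGER PRIMARY KEY, population INTEGER)"))

Base.prepare(engine)  # all deferred classes get reflected here
print('population' in CityStats.__table__.c)  # True: filled in by reflection
```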
Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?
Sure! Here's the query I am attempting to replicate:

    SELECT people.id AS person_id, people.name, towns.id AS town_id, towns.name
    FROM people
    INNER JOIN visited_destinations ON visited_destinations.person_id = people.id
    INNER JOIN towns ON towns.id = visited_destinations.town_id
    WHERE towns.name IN ('Atlanta', 'Memphis')

I realize it's confusing since I labeled 2 people as Sam in my test dataset, but I left it like that for consistency. You can see that one of the Sams has person_id=9 and the other has person_id=10 from the MySQL results below:

    +-----------+------+---------+---------+
    | person_id | name | town_id | name    |
    +-----------+------+---------+---------+
    |         8 | Bob  |       2 | Atlanta |
    |         8 | Bob  |       1 | Memphis |
    |         9 | Sam  |       1 | Memphis |
    |        10 | Sam  |       2 | Atlanta |
    |        10 | Sam  |       2 | Atlanta |
    |        10 | Sam  |       2 | Atlanta |
    +-----------+------+---------+---------+

I'd like to turn this into 3 Person results, like this:

    Person(id=8, name=Bob, visited_towns=[Town(name=Atlanta), Town(name=Memphis)])
    Person(id=9, name=Sam, visited_towns=[Town(Memphis)])
    Person(id=10, name=Sam, visited_towns=[Town(Atlanta), Town(Atlanta), Town(Atlanta)])

On Wednesday, February 27, 2013 12:59:02 PM UTC-5, Michael Bayer wrote:

> I'm not yet digging into your problem, but one remark would be that there's two levels to deal with here. One is figuring out exactly what SQL you want, independent of SQLAlchemy. It's not clear here if you've gotten that part yet. The next part is getting parts of that SQL to route into your contains_eager(). We do that second. So let me know if you know the actual SQL you want to do first; we'd work from there. Don't deal with joinedload or contains_eager or any of that yet.
>
> On Feb 27, 2013, at 2:07 AM, Rob Crowell <robcc...@gmail.com> wrote:
>
>> Example code: https://gist.github.com/rcrowell/5045832
>>
>> I have Person and Town tables, which are joined in a many-to-many fashion through a VisitedDestinations table. I want to write a query which will return People that have visited either Atlanta or Memphis.
I have a working example using contains_eager below, but I'm not sure if there is a better way... I am trying to get a Person object for each person that has visited at least one of these two cities, and I want to get joined Town objects for Atlanta and Memphis. If a person has visited one of these towns more than once, I'd like to get back one Town object for each visit (so 3 visits to Atlanta produces a visited_towns collection of size three):

    class Town(Base):
        __tablename__ = 'towns'

        id = Column('id', Integer, primary_key=True)
        name = Column('name', String(256))

    class Person(Base):
        __tablename__ = 'people'

        id = Column('id', Integer, primary_key=True)
        name = Column('name', String(256))

    class VisitedDestinations(Base):
        __tablename__ = 'visited_destinations'

        id = Column('id', Integer, primary_key=True)
        person_id = Column('person_id', Integer, ForeignKey(Person.id))
        town_id = Column('town_id', Integer, ForeignKey(Town.id))

        person = relationship(Person, backref='visited_destinations')
        town = relationship(Town, backref='visited_destinations')

    # use an association_proxy so client code does not have to deal with
    # the visited_destinations table at all
    Person.visited_towns = association_proxy('visited_destinations', 'town')

This code more or less does what I would like, but it uses an EXISTS query which I don't really want, and it gets back ALL towns that a matching person has visited instead of only the matching towns:

    # gets all Town objects, including those that do not match our filter
    q = session.query(Person)
    q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta', 'Memphis'])))
    # can't do joinedload with association_proxy objects
    q = q.options(joinedload_all(Person.visited_destinations,
                                 VisitedDestinations.town))
    for person in q:
        print person, person.visited_towns

Which produces:

    Person(name='Bob') [Town(name='Atlanta'), Town(name='Memphis')]
    Person(name='Sam') [Town(name='Memphis')]
    Person(name='Sam') [Town(name='Chattanooga'), Town(name='Atlanta'),
                        Town(name='Atlanta'), Town(name='Atlanta')]

In my database it's likely that a person has visited thousands of destinations, and I really don't need to get all of them back here. As you
Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?
Ah okay, so you do recommend the contains_eager approach. I guess this is exactly the use-case it is designed for? I always get a little scared when I try using advanced features of SQLAlchemy :)

One last question. The query here seems to take advantage of the fact that our table joins on Towns exactly once. If we had a second table WishlistDestinations, that tracked the towns that a Person would like to visit instead of ones he had already visited, what would be the syntax for filtering those out? Imagine we also add this model:

    class WishlistDestinations(Base):
        __tablename__ = 'wishlist_destinations'

        id = Column('id', Integer, primary_key=True)
        person_id = Column('person_id', Integer, ForeignKey(Person.id))
        town_id = Column('town_id', Integer, ForeignKey(Town.id))

        person = relationship(Person, backref='wishlist_destinations')
        town = relationship(Town, backref='wishlist_destinations')

    Person.wishlist_towns = association_proxy('wishlist_destinations', 'town')

This query is obviously going to fail, since there are now 2 relationships to the Town model:

    q = session.query(Person)
    q = q.join(Person.visited_destinations, VisitedDestinations.town,
               WishlistDestinations.town)
    q = q.filter(Town.name.in_(['Atlanta', 'Memphis']))
    q = q.options(contains_eager(Person.visited_destinations,
                                 VisitedDestinations.town))

How could I filter by users that have visited Atlanta or Memphis, that also want to visit Boston?
The code below fails and I'm not sure how to write it correctly; here's my first guess:

    q = q.filter(VisitedDestinations.town.name.in_(['Atlanta', 'Memphis']))
    q = q.filter(WishlistDestinations.town.name.in_(['Boston']))

    AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator'
    object associated with VisitedDestinations.town has an attribute 'name'

My second guess also fails (I don't think I want to write an EXISTS query in the first place):

    q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta', 'Memphis'])))
    q = q.filter(Person.wishlist_towns.any(Town.name.in_(['Boston'])))

    sqlalchemy.exc.OperationalError: (OperationalError) (1066, "Not unique table/alias: 'towns'")...

What's the correct syntax in this case?

On Wednesday, February 27, 2013 2:08:47 PM UTC-5, Michael Bayer wrote:

> oh. I saw you talking about "at least one" and exists and thought you had a more complex query. contains_eager() doesn't impact what's queried, only how results are used with the resulting objects, and is usually used with join(), just like this:
>
>     session.query(Person).\
>         join(Person.visited_destinations, VisitedDestinations.town).\
>         options(contains_eager(Person.visited_destinations,
>                                VisitedDestinations.town)).\
>         filter(Town.name.in_(['Atlanta', 'Memphis']))
>
> On Feb 27, 2013, at 1:48 PM, Rob Crowell <rob.c...@moat.com> wrote:
>
>> Sure! Here's the query I am attempting to replicate: [...]
Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?
Oh cool! I was getting incorrect results putting VisitedDestinations.town in my contains_eager() call as you suggested; maybe I am doing something wrong:

    session = Session()
    visited_alias = aliased(Town)
    wishlist_alias = aliased(Town)
    q = session.query(Person)
    q = q.join(Person.visited_destinations)\
         .join(visited_alias, VisitedDestinations.town)\
         .filter(visited_alias.name.in_(['Atlanta', 'Memphis']))
    q = q.join(Person.wishlist_destinations)\
         .join(wishlist_alias, WishlistDestinations.town)\
         .filter(wishlist_alias.name.in_(['Boston']))
    q = q.options(contains_eager(Person.visited_destinations, VisitedDestinations.town),
                  contains_eager(Person.wishlist_destinations, WishlistDestinations.town))

    SAWarning: Multiple rows returned with uselist=False for eagerly-loaded
    attribute 'WishlistDestinations.town'
    SAWarning: Multiple rows returned with uselist=False for eagerly-loaded
    attribute 'VisitedDestinations.town'

However this seems to take care of the warning and fixes the problem with my results:

    session = Session()
    visited_alias = aliased(Town)
    wishlist_alias = aliased(Town)
    q = session.query(Person)
    q = q.join(Person.visited_destinations)\
         .join(visited_alias, VisitedDestinations.town)\
         .filter(visited_alias.name.in_(['Atlanta', 'Memphis']))
    q = q.join(Person.wishlist_destinations)\
         .join(wishlist_alias, WishlistDestinations.town)\
         .filter(wishlist_alias.name.in_(['Boston']))
    q = q.options(contains_eager(Person.visited_destinations),
                  contains_eager(Person.wishlist_destinations))

Any idea what is going wrong in the first case? And THANK YOU for your help Michael. You are really helpful :)

On Wednesday, February 27, 2013 2:48:21 PM UTC-5, Michael Bayer wrote:

> On Feb 27, 2013, at 2:40 PM, Rob Crowell <rob.c...@moat.com> wrote:
>
>> Ah okay, so you do recommend the contains_eager approach. I guess this is exactly the use-case it is designed for? I always get a little scared when I try using advanced features of SQLAlchemy :) One last question.
>> The query here seems to take advantage of the fact that our table joins on Towns exactly once. [...] How could I filter by users that have visited Atlanta or Memphis, that also want to visit Boston? The code below fails and I'm not sure how to write it correctly; here's my first guess:
>>
>>     q = q.filter(VisitedDestinations.town.name.in_(['Atlanta', 'Memphis']))
>>     q = q.filter(WishlistDestinations.town.name.in_(['Boston']))
>>
>>     AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator'
>>     object associated with VisitedDestinations.town has an attribute 'name'
>
> there's no implicit join available when you attempt to say something like SomeClass.relationship1.relationship2; you always have to spell out a join() explicitly, so if you want to join to Wishlist also, that's separate. But here you want to hit Town twice, so you also need to alias it:
>
>     talias = aliased(Town)
>     q = q.join(Person.wishlist_destinations)\
>          .join(talias, WishlistDest.town)\
>          .filter(talias.name == 'Boston')
>
> its just like SQL! all the same rules.
>
>> My second guess also fails (I don't think I want to write an EXISTS query in the first place):
>>
>>     q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta', 'Memphis'])))
>>     q = q.filter(Person.wishlist_towns.any(Town.name.in_(['Boston'])))
>>
>>     sqlalchemy.exc.OperationalError: (OperationalError) (1066, "Not unique table/alias: 'towns'")...
>>
>> What's the correct syntax in this case?
>>
>> On Wednesday, February 27, 2013 2:08:47 PM UTC-5, Michael Bayer wrote:
>>
>>> oh. I saw you talking about "at least one" and exists and thought you had a more complex query. contains_eager() doesn't impact what's queried, only how results are used with the resulting objects, and is usually used with join(), just like [...]
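[Editor's note: the whole thread, condensed into one runnable sketch of my own arrangement: towns appears in the FROM list twice, so each join gets its own alias, and contains_eager() on the association collections populates them with only the matching rows. It uses SQLite in place of MySQL, modern (1.4+) import locations, and of_type() as the modern spelling of joining a relationship to an alias; these are my substitutions, not code from the thread.]

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (Session, aliased, contains_eager,
                            declarative_base, relationship)

Base = declarative_base()

class Town(Base):
    __tablename__ = 'towns'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))

class Person(Base):
    __tablename__ = 'people'
    id = Column(Integer, primary_key=True)
    name = Column(String(256))

class VisitedDestinations(Base):
    __tablename__ = 'visited_destinations'
    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey(Person.id))
    town_id = Column(Integer, ForeignKey(Town.id))
    person = relationship(Person, backref='visited_destinations')
    town = relationship(Town)

class WishlistDestinations(Base):
    __tablename__ = 'wishlist_destinations'
    id = Column(Integer, primary_key=True)
    person_id = Column(Integer, ForeignKey(Person.id))
    town_id = Column(Integer, ForeignKey(Town.id))
    person = relationship(Person, backref='wishlist_destinations')
    town = relationship(Town)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = Session(engine)

atlanta, memphis, boston, paris = (Town(name=n) for n in
                                   ['Atlanta', 'Memphis', 'Boston', 'Paris'])
bob, sam = Person(name='Bob'), Person(name='Sam')
session.add_all([
    VisitedDestinations(person=bob, town=atlanta),
    VisitedDestinations(person=bob, town=memphis),
    WishlistDestinations(person=bob, town=boston),
    VisitedDestinations(person=sam, town=memphis),
    WishlistDestinations(person=sam, town=paris),
])
session.commit()

# one alias per appearance of towns; contains_eager() on the association
# collections avoids the uselist=False warning from the thread
visited_alias = aliased(Town)
wishlist_alias = aliased(Town)
q = (session.query(Person)
     .join(Person.visited_destinations)
     .join(VisitedDestinations.town.of_type(visited_alias))
     .filter(visited_alias.name.in_(['Atlanta', 'Memphis']))
     .join(Person.wishlist_destinations)
     .join(WishlistDestinations.town.of_type(wishlist_alias))
     .filter(wishlist_alias.name == 'Boston')
     .options(contains_eager(Person.visited_destinations),
              contains_eager(Person.wishlist_destinations)))
people = q.all()
print([p.name for p in people])  # only Bob visited Atlanta/Memphis AND wants Boston
```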
[sqlalchemy] Difference between ZopeTransactionExtension(keep_session=True) and sessionmaker(expire_on_commit=False)?
I'm building a pyramid application using pyramid_tm and ZopeTransactionExtension. We've written a little subscriber on NewResponse that writes out some values to a log file about the current user (request.user.id) after each request. For anybody that knows pyramid pretty well, we set the request.user property using config.set_request_property(get_user, 'user', reify=True), but basically we look up the User object once in our db and then cache it as request.user for the lifetime of the request.

For the most part this is working fine, except for in the case that the User object gets modified during the request (change their avatar, name, password, whatever). When this happens, we get a DetachedInstanceError when we try to reference the 'id' field of request.user. It's worth pointing out that pyramid_tm runs and commits our transaction before our NewResponse subscriber fires, which means that request.user has already been flushed and committed to the database and appears to be detached (in fact, we can merge it back into our session and continue using it as normal; see Solution #3 below).

We've found 3 work-arounds that seem to give us the desired behavior, but I'm not really sure which one is better.
Solution 1
----------

    # setting up the session
    DBSession = scoped_session(sessionmaker(bind=engine,
                                            extension=ZopeTransactionExtension(),
                                            expire_on_commit=False))

    # in the subscriber
    print request.user.id  # does 0 additional db queries

Solution 2
----------

    # setting up the session
    DBSession = scoped_session(sessionmaker(bind=engine,
                                            extension=ZopeTransactionExtension(keep_session=True)))

    # in the subscriber
    print request.user.id  # does an additional SELECT query

Solution 3
----------

    # setting up the session
    DBSession = scoped_session(sessionmaker(bind=engine,
                                            extension=ZopeTransactionExtension()))

    # in the subscriber
    session = DBSession()
    user = session.merge(request.user)  # does an additional SELECT query
    print user.id

Without using any of these solutions, sure enough we get the exception:

    DetachedInstanceError: Instance <User at 0x26d5990> is not bound to a Session;
    attribute refresh operation cannot proceed

In this case, it seems Solution 1 is the best as it doesn't do any additional SQL queries (and I'm willing to accept that occasionally I might be writing a stale User object to disk), but is there any other downside to this approach? After my subscriber runs the web request is finished, so theoretically it doesn't matter that the objects are not expired, correct? Is there a better approach here that I am missing? Thanks!
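[Editor's note: Solution 1's behavior can be seen outside Pyramid with a plain sessionmaker (a standalone sketch, SQLite and made-up model in place of the real app): with expire_on_commit=False, attributes loaded before the commit remain readable even after the session is closed and the object is detached, at the cost of possibly serving stale values.]

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

# expire_on_commit=False: commit() leaves attributes un-expired, so a
# detached instance can still serve its already-loaded values
Session = sessionmaker(bind=engine, expire_on_commit=False)
session = Session()
user = User(name='rob')
session.add(user)
session.commit()
session.close()            # user is now detached

print(user.id, user.name)  # no DetachedInstanceError, no extra SELECT
```

With the default expire_on_commit=True, the same attribute access after close() would raise DetachedInstanceError, which is exactly the exception described above.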
Re: [sqlalchemy] Complicated filter clause causes recursion depth exceeded exception
Thanks Michael,

Writing a big list of conditions and combining them with and_(*conditions) worked well. I was indeed querying like this before:

    for condition in conditions:
        q = q.filter(condition)
    print q

On Friday, January 18, 2013 6:00:04 PM UTC-5, Michael Bayer wrote:

> On Jan 18, 2013, at 4:15 PM, rob.c...@moat.com wrote:
>
>> I haven't boiled this down to a short test case yet, but when my WHERE clause gets especially long I start getting the "recursion depth exceeded" exception. Is this a well-known limitation of sqlalchemy? We're running this query in production currently without SQLAlchemy, and it performs fine, but perhaps I need to look for another approach...
>>
>> If I keep the filter condition relatively short, my query looks like this and runs fine (with fake columns start_date, X, Y, and Z on table T):
>>
>>     SELECT X, sum(Z) AS Z
>>     FROM T
>>     WHERE T.start_date >= :start_date_1
>>       AND T.start_date <= :start_date_2
>>       AND NOT (T.X = :X_1 AND T.Y = :Y_1)
>>       AND NOT (T.X = :X_2 AND T.Y = :Y_2)
>>       AND NOT (T.X = :X_3 AND T.Y = :Y_3)
>>     GROUP BY T.X
>>
>> However, if I make the filter() clause very long (over 150 "AND NOT ..." clauses), I start getting exceptions with this stack trace:
>
> Always amazing how many wacky new problems come around. Well, the compilation of these clauses is pretty straightforward, using a recursive traversal scheme. So if you give Python a tree structure of more than 1000 nodes deep and do such a traversal, this is the error you'd get, and I suppose it's sort of well known; depends on what perspective you're coming from. So this indicates you're creating a structure that is nested this deeply. Which is to say, really deep! This could happen if you're doing the ANDs using a nesting pattern of one at a time like this:
>
>     from sqlalchemy.sql import column
>
>     root = column('x') == 5
>     current = root
>     for i in xrange(200):
>         current = current & (column('x') == 5)
>     print current
>
> because that's really and_(expr, and_(expr, and_(expr, and_(... for 200 times ...)))).
But if you flatten out the and_() you can get this:

    from sqlalchemy.sql import column, and_

    expr = [column('x') == 5]
    for i in xrange(200):
        expr.append(column('x') == 5)
    expr = and_(*expr)
    print expr

then you have a flat structure, and you're fine. So we could modify our and_()/or_() constructs to open themselves up this way, that is, as they're built, flatten out the nesting; though maybe for now there's a way you can build up using one big and_() block. In fact flattening out the nesting is something you could enable across the board here, and you can see why I'm hesitant to build this in by default, as it adds lots of isinstance() and other expensive checks; but you can add this to your app as a quick fix (just run this anywhere at import time to redefine how and_() and or_() are rendered):

    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql.expression import BooleanClauseList

    @compiles(BooleanClauseList)
    def flatten_boolean_clause_list(clauselist, compiler, **kw):
        op = clauselist.operator
        flattened = []
        rewrite = False
        stack = list(clauselist.clauses)
        while stack:
            elem = stack.pop(0)
            if isinstance(elem, BooleanClauseList) and elem.operator is op:
                stack[:0] = elem.clauses
                rewrite = True
            else:
                flattened.append(elem)
        if rewrite:
            clauselist = BooleanClauseList(operator=op, *flattened)
        return compiler.visit_clauselist(clauselist, **kw)

then the original test passes, because we've rewritten the nested list as a flat list. Basically the recursion is replaced by the stack-based traversal we do here. Or, even quicker, you could just increase your recursion depth.
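The heart of that @compiles hook is the iterative, stack-based traversal, which can be demonstrated on its own with stdlib Python. As in the sketch earlier in this thread, tuples stand in for SQLAlchemy clause objects: same-operator children are spliced into the work list instead of being recursed into, so even a tree thousands of levels deep is flattened without ever touching the recursion limit.

```python
def make_nested(n):
    # the same deeply nested shape as the chained-AND anti-pattern
    expr = ("EQ", "x", 0)
    for i in range(1, n):
        expr = ("AND", expr, ("EQ", "x", i))
    return expr

def flatten(node, op="AND"):
    # iterative stack-based traversal, mirroring the @compiles hook:
    # nested same-operator nodes are spliced onto the front of the
    # stack; everything else becomes a sibling in the flat result
    flattened, stack = [], [node]
    while stack:
        elem = stack.pop(0)
        if elem[0] == op:
            stack[:0] = elem[1:]
        else:
            flattened.append(elem)
    return (op,) + tuple(flattened)

# a 5000-deep tree would overflow any recursive visitor, but the
# stack-based flattening handles it without recursion
flat = flatten(make_nested(5000))
print(len(flat) - 1)  # 5000 sibling conditions under a single AND
```

Note that the leaf conditions come out in their original left-to-right order, so the rendered SQL is unchanged; only the tree shape differs.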
It defaults to 1000, so raise it to something comfortably above your clause depth before you try to run the SQL:

    import sys
    sys.setrecursionlimit(10000)
[sqlalchemy] Complicated filter clause causes recursion depth exceeded exception
I haven't boiled this down to a short test case yet, but when my WHERE clause gets especially long I start getting the "maximum recursion depth exceeded" exception. Is this a well-known limitation of SQLAlchemy? We're running this query in production currently without SQLAlchemy, and it performs fine, but perhaps I need to look for another approach...

If I keep the filter condition relatively short, my query looks like this and runs fine (with fake columns start_date, X, Y, and Z on table T):

    SELECT X, sum(Z) AS Z FROM T
    WHERE T.start_date >= :start_date_1 AND T.start_date <= :start_date_2
      AND NOT (T.X = :X_1 AND T.Y = :Y_1)
      AND NOT (T.X = :X_2 AND T.Y = :Y_2)
      AND NOT (T.X = :X_3 AND T.Y = :Y_3)
    GROUP BY T.X

However, if I make the filter() clause very long (over 150 AND NOT ... clauses), I start getting exceptions with this stack trace:

    File "test.py", line 350, in do_test
      print q
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 3031, in __str__
      return str(self._compile_context().statement)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1790, in __str__
      return unicode(self.compile()).encode('ascii', 'backslashreplace')
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1778, in compile
      return self._compiler(dialect, bind=bind, **kw)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1784, in _compiler
      return dialect.statement_compiler(dialect, self, **kw)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 277, in __init__
      engine.Compiled.__init__(self, dialect, statement, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 705, in __init__
      self.string = self.process(self.statement)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 724, in process
      return obj._compiler_dispatch(self, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
      return getter(visitor)(self, **kw)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 941, in visit_select
      t = select._whereclause._compiler_dispatch(self, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
      return getter(visitor)(self, **kw)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in visit_clauselist
      for c in clauselist.clauses)
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
      s for s in
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
      for c in clauselist.clauses)
    ... (the _compiler_dispatch / visit_clauselist / <genexpr> frames repeat until the limit is hit) ...
    File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
      return getter(visitor)(self, **kw)
    RuntimeError: maximum recursion depth exceeded
[sqlalchemy] Re: Is it possible to define multiple model objects that refer to the same table?
Thanks so much! Your pointers were exactly what I needed, specifically the bit which led me to discover exclude_properties. I'll leave my working code here in case it ever helps anybody else out:

    from sqlalchemy import Column, Date, Enum, Integer, String, Table, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import ColumnProperty, sessionmaker
    from sqlalchemy.orm.mapper import class_mapper

    engine = create_engine('mysql://user:password@localhost/issue_tracker')
    Session = sessionmaker(bind=engine)
    Base = declarative_base()

    tables = {
        'issue_type': Table('issue_type', Base.metadata,
            Column('id', Integer, primary_key=True),
            Column('type', Enum('Bug', 'Feature', 'Improvement', 'One-Time Task')),
            Column('created', Date),
            Column('num_visits', Integer)),
        'issue_type_label': Table('issue_type_label', Base.metadata,
            Column('id', Integer, primary_key=True),
            Column('type', Enum('Bug', 'Feature', 'Improvement', 'One-Time Task')),
            Column('label_id', String),
            Column('created', Date),
            Column('num_visits', Integer)),
    }

    def get_columns(model):
        return [x.key for x in class_mapper(model).iterate_properties
                if isinstance(x, ColumnProperty)]

    class IssueType(Base):
        __table__ = tables['issue_type']

    class IssueLabel(Base):
        __table__ = tables['issue_type_label']
        __mapper_args__ = {'exclude_properties': ['type']}

    class IssueTypeLabel(Base):
        __table__ = tables['issue_type_label']

    print 'issue type:', get_columns(IssueType)
    print 'issue label:', get_columns(IssueLabel)
    print 'issue type label:', get_columns(IssueTypeLabel)

This code correctly prints the following:

    issue type: ['id', 'type', 'created', 'num_visits']
    issue label: ['id', 'label_id', 'created', 'num_visits']
    issue type label: ['id', 'type', 'label_id', 'created', 'num_visits']

On Nov 16, 8:49 pm, Michael Bayer mike...@zzzcomputing.com wrote:

On Nov 16, 2012, at 2:59 PM, Rob Crowell wrote:

Thanks for the help so far Michael!
I can explain a little more about what I'm trying to do (I'm using a fictional application here, but I think it pretty accurately translates into my actual application).

BACKGROUND -- Let's say I'm writing an issue tracking application. Each issue that we're tracking has a type (an issue must have exactly one type), and each issue may have an unlimited number of user-provided labels. Each day, people browse our issue tracker, and each time they do they generate a page view on the issue. Here's an example of one day's worth of data:

    IssueA: Views: 1,   Type: Bug,           Labels: [login-page (id=1), show-stopper (id=2)]
    IssueB: Views: 20,  Type: One-Time Task, Labels: [disk-full (id=3), show-stopper (id=2)]
    IssueC: Views: 300, Type: Bug,           Labels: [login-page (id=1)]

The BigCo. I'm working for is very interested in knowing which issues are read by the most people, and they need the ability to generate reports sliced by arbitrary date ranges. However, we can tolerate a day's delay, so we are writing summary tables each night.
Two of these summary tables are aggregated by either issue type or label, and we also write a third table that can be used to drill down and see page visits bucketed by both type and label:

    CREATE TABLE `issue_type` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `created` datetime NOT NULL,
      `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
      `num_watchers` int(10) unsigned NOT NULL,
      PRIMARY KEY (`id`)
    )

    CREATE TABLE `issue_label` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `created` datetime NOT NULL,
      `label_id` int(10) unsigned NOT NULL,
      `num_watchers` int(10) unsigned NOT NULL,
      PRIMARY KEY (`id`)
    )

    CREATE TABLE `issue_type_label` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `created` datetime NOT NULL,
      `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
      `label_id` int(10) unsigned NOT NULL,
      `num_visits` int(10) unsigned NOT NULL,
      PRIMARY KEY (`id`)
    )

So we'd run these insert statements at midnight:

    INSERT INTO issue_type (created, type, num_visits) VALUES
      ('2012-11-15', 'Bug', 301),
      ('2012-11-15', 'One-Time Task', 20);

    INSERT INTO issue_labels (created, label_id, num_visits) VALUES
      ('2012-11-15', 1, 301),
      ('2012-11-15', 2, 21),
      ('2012-11-15', 3, 20);

    INSERT INTO issue_type_label (created, type, label_id, num_visits) VALUES
      ('2012-11-15', 'Bug', 1, 301),
      ('2012-11-15', 'Bug', 2, 1),
      ('2012-11-15', 'One-Time Task', 3, 20),
      ('2012-11-15', 'One-Time Task', 2, 20);

Now when we want to generate the summary reports, we query one of the first two tables (if we're generating a report aggregated by issue type we hit issue_type, if we're
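The drill-down table above can be exercised with the stdlib's sqlite3 module (MySQL's enum/unsigned types are simplified to TEXT/INTEGER in this sketch). It loads only issue_type_label with the sample rows and shows a per-type rollup via GROUP BY:

```python
import sqlite3

# in-memory stand-in for the MySQL summary schema
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE issue_type_label (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created TEXT NOT NULL,
        type TEXT NOT NULL,
        label_id INTEGER NOT NULL,
        num_visits INTEGER NOT NULL
    )""")
conn.executemany(
    "INSERT INTO issue_type_label (created, type, label_id, num_visits) "
    "VALUES (?, ?, ?, ?)",
    [("2012-11-15", "Bug", 1, 301),
     ("2012-11-15", "Bug", 2, 1),
     ("2012-11-15", "One-Time Task", 3, 20),
     ("2012-11-15", "One-Time Task", 2, 20)])

# roll the drill-down rows up by type
rows = conn.execute(
    "SELECT type, SUM(num_visits) FROM issue_type_label "
    "GROUP BY type ORDER BY type").fetchall()
print(rows)  # [('Bug', 302), ('One-Time Task', 40)]
```

Note how summing the drill-down table double-counts issues that carry several labels (302 for Bug versus the 301 written to issue_type), which is one reason the dedicated per-type summary table is still written rather than derived from issue_type_label.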
[sqlalchemy] Re: Is it possible to define multiple model objects that refer to the same table?
On Nov 15, 10:48 pm, Michael Bayer mike...@zzzcomputing.com wrote:

On Nov 15, 2012, at 7:33 PM, Rob Crowell wrote:

Sorry, that got cut off at the end.

    class IssueTag(Base):
        __tablename__ = 'issue_user_tag'

    sqlalchemy.exc.InvalidRequestError: Table 'issue_user_tag' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.

On Thursday, November 15, 2012 7:32:29 PM UTC-5, Rob Crowell wrote:

I'm working with a denormalized cache schema, and I've run into a situation where it would be helpful to be able to create multiple classes that extend Base but refer to the same __tablename__. Is this possible to do? I am getting this exception:

    sqlalchemy.exc.InvalidRequestError: Table '[tablename]' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.

For a little more insight, we have some attributes that always have exactly one value (the user who created the issue), and other attributes that can have one or more values (user-defined tags for the issue). If we were being exhaustive, we would create two cache tables for our issues, since sometimes we want to display recent issues by user and sometimes by tag:

    * issue_user
    * issue_tag

However, we can get away with writing just one table and querying it with an appropriate group_by(user_id) to achieve the same end as having two tables. Since my application should behave as if there were two separate cache tables (and I'd like to keep open the option of adding two separate cache tables in the future), I would like to have two different Base classes representing the two ways in which we would query the table. The obvious way of doing this doesn't work:

    class IssueUser(Base):
        __tablename__ = 'issue_user_tag'

    class IssueTag(Base):
        __tablename__ = 'issue_user_tag'

--

two^H^H^H three ways:

1. map to a Table:

    mytable = Table("mytable", Base.metadata, Column(...))

    class A(Base):
        __table__ = mytable

    class B(Base):
        __table__ = mytable

1a. variant of 1: map A as you did, but use __table__ on B:

    class A(Base):
        __tablename__ = 'mytable'
        x = Column(...)

    class B(Base):
        __table__ = A.__table__

2. use single-table inheritance with no discriminator:

    class MyTable(Base):
        __tablename__ = 'mytable'

    class A(MyTable):
        # ...

    class B(MyTable):
        # ...

I don't have an understanding of your querying situation yet; discriminating on group_by() seems a little strange, as group_by() is only intended to be used to group rows for aggregates. But #1, #1a or #2 should fit the bill.
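Option 1 can be sketched end to end, combined with the exclude_properties trick that came up elsewhere in this thread. The column names here (user_id, tag) are hypothetical stand-ins for the real issue_user_tag schema, and the sketch assumes a reasonably recent SQLAlchemy:

```python
from sqlalchemy import Column, Integer, String, Table
try:
    from sqlalchemy.orm import declarative_base   # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # older releases

Base = declarative_base()

# one physical cache table, shared by both mapped classes
issue_user_tag = Table(
    "issue_user_tag", Base.metadata,
    Column("id", Integer, primary_key=True),
    Column("user_id", Integer),
    Column("tag", String(50)),
)

class IssueUser(Base):
    __table__ = issue_user_tag
    # hide the tag column so this class behaves like a per-user cache table
    __mapper_args__ = {"exclude_properties": ["tag"]}

class IssueTag(Base):
    __table__ = issue_user_tag

print(sorted(a.key for a in IssueUser.__mapper__.column_attrs))
print(sorted(a.key for a in IssueTag.__mapper__.column_attrs))
```

Because both classes share one Table object, there is no "already defined for this MetaData instance" error, and each class exposes only the columns its "virtual" table would have.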