[sqlalchemy] How can I set table constraints with DeferredReflection?

2014-04-02 Thread Rob Crowell
I'm using the ORM and one of my tables does not have a primary key defined. 
 I am also using DeferredReflection, and I can't seem to figure out how to 
defer the PrimaryKeyConstraint until Base.prepare() runs.  Any pointers?

Base = declarative_base(cls=DeferredReflection)

class Person(Base):
    __tablename__ = 'people'
    __table_args__ = (PrimaryKeyConstraint(u'name'), {})

# this does not run
if __name__ == '__main__':
    engine = create_engine('mysql://user:password@localhost/organisms')
    Base.prepare(engine)


When this runs, we get an error constructing the Person class (the __main__ 
section is not hit).  This makes sense given that we haven't reflected the 
table yet!

Traceback (most recent call last):
  File /home/ubuntu/deferred_reflection.py, line 10, in <module>
class Person(Base):
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/api.py, 
line 53, in __init__
_as_declarative(cls, classname, cls.__dict__)
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/ext/declarative/base.py, 
line 251, in _as_declarative
**table_kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 350, in __new__
table._init(name, metadata, *args, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 427, in _init
self._init_items(*args)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 70, in _init_items
item._set_parent_with_dispatch(self)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/base.py, 
line 283, in _set_parent_with_dispatch
self._set_parent(parent)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 2646, in _set_parent
super(PrimaryKeyConstraint, self)._set_parent(table)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 2289, in _set_parent
ColumnCollectionMixin._set_parent(self, table)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py, 
line 2257, in _set_parent
col = table.c[col]
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/_collections.py, 
line 156, in __getitem__
return self._data[key]
KeyError: u'name'
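
One workaround (a sketch, not from the thread: it leans on DeferredReflection 
reflecting with autoload_replace=False, so explicitly declared columns take 
precedence over reflected ones) is to declare the key column directly on the 
class, so nothing has to resolve against table.c at class-construction time:

# Sketch: declare the primary-key column explicitly; DeferredReflection
# fills in the remaining columns when Base.prepare() runs.  String(100)
# is a placeholder type, not taken from the thread.
from sqlalchemy import Column, String, create_engine
from sqlalchemy.ext.declarative import declarative_base, DeferredReflection

Base = declarative_base(cls=DeferredReflection)

class Person(Base):
    __tablename__ = 'people'
    name = Column(String(100), primary_key=True)

if __name__ == '__main__':
    engine = create_engine('mysql://user:password@localhost/organisms')
    Base.prepare(engine)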



Re: [sqlalchemy] Trouble filtering binary columns with tuple_ and an in_ clause against Postgres

2014-02-28 Thread Rob Crowell
Looks good to me :)

...
----------------------------------------------------------------------
Ran 3 tests in 0.811s

OK


Thanks for your really fast help with this.  Michael Bayer for president!!


On Thursday, February 27, 2014 7:58:24 PM UTC-5, Michael Bayer wrote:

 that patch is in for 0.8 and 0.9.

 On Feb 27, 2014, at 7:29 PM, Michael Bayer mik...@zzzcomputing.com wrote:


 On Feb 27, 2014, at 4:38 PM, Rob Crowell rob.c...@moat.com wrote:


 # in_ clause with 1 STRING, 1 BINARY
 filter_cols = tuple_(HashTest.hash_val, HashTest.hash_type)
 filter_vals = ((encoded_hash, 'md5'),)

 q = session.query(HashTest)
 q = q.filter(filter_cols.in_((filter_vals)))
  ^
 HINT:  No operator matches the given name and argument type(s). You might 
 need to add explicit type casts.
  'SELECT hashtest.hash_test.hash_val AS hashtest_hash_test_hash_val, 
 hashtest.hash_test.hash_type AS hashtest_hash_test_hash_type, 
 hashtest.hash_test.region AS hashtest_hash_test_region \nFROM 
 hashtest.hash_test \nWHERE (hashtest.hash_test.hash_val, 
 hashtest.hash_test.hash_type) IN ((%(param_1)s, %(param_2)s))' {'param_1': 
 <psycopg2._psycopg.Binary object at 0x2d9d850>, 'param_2': 
 <psycopg2._psycopg.Binary object at 0x2c18940>}


 It's complaining about an improper encoding for the BINARY column. 


 this has nothing to do with encoding and instead is about typing.   you 
 can see in the output that SQLAlchemy is turning both elements of the IN 
 tuple into a Binary which would appear to be inappropriate here - the 
 sqlalchemy.sql.Tuple object isn’t yet smart enough to handle heterogeneous 
 types.There’s a patch which will resolve this attached to 
 https://bitbucket.org/zzzeek/sqlalchemy/issue/2977/tuple_-needs-to-record-heterogeneous-types.


 In the meantime you can actually lift and use that Tuple class that’s in 
 the patch:

 from sqlalchemy.sql.expression import ClauseList, ColumnElement, \
     _literal_as_binds, BindParameter
 from sqlalchemy import types

 class tuple_(ClauseList, ColumnElement):
     def __init__(self, *clauses, **kw):
         clauses = [_literal_as_binds(c) for c in clauses]
         self.type = types.NULLTYPE
         self._type_tuple = [arg.type for arg in clauses]

         super(tuple_, self).__init__(*clauses, **kw)

     @property
     def _select_iterable(self):
         return (self, )

     def _bind_param(self, operator, obj):
         return tuple_(*[
             BindParameter(None, o, _compared_to_operator=operator,
                           _compared_to_type=type_, unique=True)
             for o, type_ in zip(obj, self._type_tuple)
         ]).self_group()
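
 A minimal usage sketch of the lifted class (assuming the HashTest model 
 and session from the original post below; with the per-column types 
 recorded, the binary and string binds each render correctly):

 filter_cols = tuple_(HashTest.hash_val, HashTest.hash_type)
 q = session.query(HashTest)
 q = q.filter(filter_cols.in_([(encoded_hash, 'md5')]))
 rows = q.all()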








[sqlalchemy] Trouble filtering binary columns with tuple_ and an in_ clause against Postgres

2014-02-27 Thread Rob Crowell
When I pass binary data to a multi-column in_ clause, I seem to be getting 
inconsistent results and I need some help!  I did some testing with MySQL, 
Postgres, and Vertica (connecting via 
https://pypi.python.org/pypi/vertica-sqlalchemy/0.1).  It appears MySQL 
works correctly, but I am having trouble with both Postgres and Vertica 
(which is almost compatible with Postgres).

I authored a simple test model in a schema named 'hashtest'.  The (silly) 
idea is to store binary hash values in the hash_val column and tag the type 
(e.g. md5) in hash_type -- I added a region column just so I would have 2 
string columns to play with:

Base = declarative_base()

class HashTest(Base):
    __tablename__ = 'hash_test'
    __table_args__ = {'schema': 'hashtest'}

    hash_val = Column(Binary, primary_key=True)
    hash_type = Column(String, primary_key=True)
    region = Column(String)


Insert a single row:
hashtest=> INSERT INTO hashtest.hash_test (hash_val, hash_type, region) 
VALUES (E'\\xf2666f453b364db65cfdd19756d7e0ad', 'md5', 'US');


For these tests I am using a binary string to store the md5 (16 raw bytes 
rather than 32 hex characters 0-F); the binascii module can do this 
transformation for us:

encoded_hash = '\xf2foE;6M\xb6\\\xfd\xd1\x97V\xd7\xe0\xad'

self.assertEquals(binascii.unhexlify('f2666f453b364db65cfdd19756d7e0ad'),
                  encoded_hash)
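
(For concreteness, the hex/bytes round trip binascii performs here:)

import binascii

hex_digest = 'f2666f453b364db65cfdd19756d7e0ad'   # 32 hex characters
raw = binascii.unhexlify(hex_digest)              # 16 raw bytes
assert len(raw) == 16
assert binascii.hexlify(raw) == hex_digest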


We can pass this to a single-column IN clause no problem, and get our data 
back out (the unittest module provides assertEqual):

# in_ clause with 1 BINARY
filter_cols = tuple_(HashTest.hash_val)
filter_vals = ((encoded_hash,),)

q = session.query(HashTest)
q = q.filter(filter_cols.in_((filter_vals)))
rows = q.all()
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0].hash_val, encoded_hash)

We can also build a 2-column in_ clause, with both of the string hash_type 
and region columns, and things also work as expected.  However, when I pass 
in a String and Binary value to a single in_ clause on hash_val and 
hash_type, things stop working:

# in_ clause with 1 STRING, 1 BINARY
filter_cols = tuple_(HashTest.hash_val, HashTest.hash_type)
filter_vals = ((encoded_hash, 'md5'),)

q = session.query(HashTest)
q = q.filter(filter_cols.in_((filter_vals)))
rows = q.all()
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0].hash_val, encoded_hash)


MySQL happily responds with the desired results, but Postgres (and Vertica) 
seem to choke up:

======================================================================
ERROR: testPostgres (md5test.BinaryTupleInTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File md5test.py, line 70, in testPostgres
self._lookupHash(session)
  File md5test.py, line 49, in _lookupHash
rows = q.all()
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py, 
line 2286, in all
return list(self)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py, 
line 2398, in __iter__
return self._execute_and_instances(context)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py, 
line 2413, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 717, in execute
return meth(self, multiparams, params)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py, 
line 317, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 814, in _execute_clauseelement
compiled_sql, distilled_params
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 927, in _execute_context
context)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 1076, in _handle_dbapi_exception
exc_info
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py, 
line 185, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 920, in _execute_context
context)
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py, line 
426, in do_execute
cursor.execute(statement, parameters)
ProgrammingError: (ProgrammingError) operator does not exist: character 
varying = bytea
LINE 3: ...hash_test.hash_val, hashtest.hash_test.hash_type) IN (('\xf2...
 ^
HINT:  No operator matches the given name and argument type(s). You might 
need to add explicit type casts.
 'SELECT hashtest.hash_test.hash_val AS hashtest_hash_test_hash_val, 
hashtest.hash_test.hash_type AS hashtest_hash_test_hash_type, 
hashtest.hash_test.region AS hashtest_hash_test_region \nFROM 
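
(The message is cut off in the archive; the failure is the one addressed by 
the patched tuple_ in the reply above.  A hedged workaround that avoids 
tuple_ entirely, expanding the tuple IN into OR'd pairs so each column keeps 
its own bind type -- it assumes the HashTest model and filter_vals defined 
above:)

from sqlalchemy import and_, or_

q = session.query(HashTest)
q = q.filter(or_(*[
    and_(HashTest.hash_val == h, HashTest.hash_type == t)
    for h, t in filter_vals
]))
rows = q.all()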

[sqlalchemy] Help with DeferredReflection and setting up an engine at runtime?

2014-02-17 Thread Rob Crowell
I am having a bit of trouble getting DeferredReflection working the way I 
want; not sure if I am overlooking something obvious or if I just don't 
really understand how it's supposed to work.  

I'm trying to define my models before creating my engine (this does not 
work):

Base = declarative_base(cls=DeferredReflection)

class CityStats(Base):
    __tablename__ = 'city_stats'
    __table_args__ = {'schema': 'prod', 'autoload': True}

if __name__ == '__main__':
    engine = create_engine('...')
    Base.metadata.bind = engine
    Base.prepare(engine)


When I run this I get an error creating the CityStats class: 
sqlalchemy.exc.UnboundExecutionError: 
No engine is bound to this Table's MetaData. Pass an engine to the Table 
via autoload_with=<someengine>, or associate the MetaData with an engine 
via metadata.bind=<someengine>

Of course it does work if I create my engine and set Base.metadata.bind 
BEFORE I define the CityStats model (this works):

engine = create_engine('...')
Base.metadata.bind = engine

class CityStats(Base):
    __tablename__ = 'city_stats'
    __table_args__ = {'schema': 'prod', 'autoload': True}

Base.prepare(engine)


I'm trying to avoid some kind of model_init() function that everyone who 
imports my models.py file will have to remember to call before importing my 
models.  Is this possible?



Re: [sqlalchemy] Help with DeferredReflection and setting up an engine at runtime?

2014-02-17 Thread Rob Crowell
Interesting, thanks Michael.  I didn't realize autoload was implied when 
using DeferredReflection but that makes sense.

Thanks!

On Monday, February 17, 2014 7:17:34 PM UTC-5, Michael Bayer wrote:


 On Feb 17, 2014, at 6:23 PM, Rob Crowell robcc...@gmail.com wrote:

 I am having a bit of trouble getting DeferredReflection working the way I 
 want; not sure if I am overlooking something obvious or if I just don't 
 really understand how it's supposed to work.  

 I'm trying to define my models before creating my engine (this does not 
 work):

 Base = declarative_base(cls=DeferredReflection)

 class CityStats(Base):
     __tablename__ = 'city_stats'
     __table_args__ = {'schema': 'prod', 'autoload': True}

 if __name__ == '__main__':
     engine = create_engine('...')
     Base.metadata.bind = engine
     Base.prepare(engine)


 When I run this I get an error creating the CityStats class: 
 sqlalchemy.exc.UnboundExecutionError: 
 No engine is bound to this Table's MetaData. Pass an engine to the Table 
 via autoload_with=<someengine>, or associate the MetaData with an engine 
 via metadata.bind=<someengine>


 it's all about the stack trace, let's look:

 Traceback (most recent call last):
   File test.py, line 8, in <module>
 class CityStats(Base):
   File 
 /Users/classic/dev/sqlalchemy/lib/sqlalchemy/ext/declarative/api.py, line 
 53, in __init__
 _as_declarative(cls, classname, cls.__dict__)
   File 
 /Users/classic/dev/sqlalchemy/lib/sqlalchemy/ext/declarative/base.py, 
 line 251, in _as_declarative
 **table_kw)
   File /Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py, line 
 350, in __new__
 table._init(name, metadata, *args, **kw)
   File /Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py, line 
 423, in _init
 self._autoload(metadata, autoload_with, include_columns)
   File /Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/schema.py, line 
 439, in _autoload
  msg="No engine is bound to this Table's MetaData. "
   File /Users/classic/dev/sqlalchemy/lib/sqlalchemy/sql/base.py, line 
 459, in _bind_or_error

 what we see here is that this script doesn’t get to create_engine() at 
 all, it’s trying to hit the database as soon as you say “CityStats(Base)”. 
  Why is that?  Because you have “autoload=True” in your table args, which 
 means, “reflect this table *right now*”.  That is, you are defeating the 
 purpose of using DeferredReflection.

 The solution is just take out that autoload=True.  Any class which 
 descends from the Base here is automatically part of the “classes to 
 reflect” since you have DeferredReflection at the base.
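
 A minimal sketch of the corrected model, per that advice (same script 
 otherwise, with autoload removed):

 Base = declarative_base(cls=DeferredReflection)

 class CityStats(Base):
     __tablename__ = 'city_stats'
     # no autoload=True; DeferredReflection reflects at prepare() time
     __table_args__ = {'schema': 'prod'}

 if __name__ == '__main__':
     engine = create_engine('...')
     Base.prepare(engine)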

 Also, if you’re on 0.9 take a look at the new “automap” extension, I’ve 
 been using it and it’s pretty keen.  There’s one fix for it in 
 not-yet-released 0.9.3 but it’s only needed if you’re dealing with 
 inheritance structures.
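
 For comparison, a minimal automap sketch (an assumption about the 0.9 
 automap API as documented, not code from this thread; generated classes 
 are named after their tables):

 from sqlalchemy import create_engine
 from sqlalchemy.ext.automap import automap_base

 Base = automap_base()
 engine = create_engine('...')
 # reflect every table and generate a mapped class per table
 Base.prepare(engine, reflect=True)

 CityStats = Base.classes.city_stats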






Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?

2013-02-27 Thread Rob Crowell
Sure!  Here's the query I am attempting to replicate:

SELECT people.id AS person_id, people.name, towns.id AS town_id, towns.name
FROM people
INNER JOIN visited_destinations ON visited_destinations.person_id = 
people.id
INNER JOIN towns ON towns.id = visited_destinations.town_id
WHERE towns.name IN ('Atlanta', 'Memphis')

I realize it's confusing since I labeled 2 people as Sam in my test 
dataset, but I left it like that for consistency.  You can see that one of 
the Sams has person_id=9 and the other has person_id=10 from the MySQL 
results below:

+---+--+-+-+
| person_id | name | town_id | name|
+---+--+-+-+
| 8 | Bob  |   2 | Atlanta |
| 8 | Bob  |   1 | Memphis |
| 9 | Sam  |   1 | Memphis |
|10 | Sam  |   2 | Atlanta |
|10 | Sam  |   2 | Atlanta |
|10 | Sam  |   2 | Atlanta |
+---+--+-+-+

I'd like to turn this into 3 Person results, like this:
Person(id=8, name="Bob", visited_towns=[Town(name="Atlanta"), 
Town(name="Memphis")])
Person(id=9, name="Sam", visited_towns=[Town("Memphis")])
Person(id=10, name="Sam", visited_towns=[Town("Atlanta"), 
Town("Atlanta"), Town("Atlanta")])
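
(This is the shape that the join + contains_eager pattern produces; restating 
Michael's suggestion from later in the thread as a sketch against the models 
quoted below:)

from sqlalchemy.orm import contains_eager

q = (session.query(Person)
     .join(Person.visited_destinations, VisitedDestinations.town)
     .options(contains_eager(Person.visited_destinations,
                             VisitedDestinations.town))
     .filter(Town.name.in_(['Atlanta', 'Memphis'])))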

On Wednesday, February 27, 2013 12:59:02 PM UTC-5, Michael Bayer wrote:

 I'm not yet digging into your problem, but one remark would be that 
 there's two levels to deal with here. One is figuring out exactly what SQL 
 you want, independent of SQLAlchemy.  It's not clear here if you've gotten 
 that part yet.  The next part is getting parts of that SQL to route into 
 your contains_eager().   We do that second.

 So let me know if you know the actual SQL you want to do first; we'd work 
 from there.   Don't deal with joinedload or contains_eager or any of that 
 yet.


 On Feb 27, 2013, at 2:07 AM, Rob Crowell robcc...@gmail.com wrote:

 Example code: https://gist.github.com/rcrowell/5045832

 I have Person and Town tables, which are joined in a many-to-many fashion 
 through a VisitedDestinations table.  I want to write a query which will 
 return People that have visited either Atlanta or Memphis.  I have a 
 working example using contains_eager below, but I'm not sure if there is a 
 better way...

 I am trying to get a Person object for each person that has visited at 
 least one of these two cities, and I want to get joined Town objects for 
 Atlanta and Memphis.  If a person has visited one of these towns more than 
 once, I'd like to get back one Town object for each visit (so 3 visits to 
 Atlanta produces a visited_towns collection of size three):

 class Town(Base):
     __tablename__ = 'towns'
     id = Column('id', Integer, primary_key=True)
     name = Column('name', String(256))

 class Person(Base):
     __tablename__ = 'people'
     id = Column('id', Integer, primary_key=True)
     name = Column('name', String(256))

 class VisitedDestinations(Base):
     __tablename__ = 'visited_destinations'
     id = Column('id', Integer, primary_key=True)
     person_id = Column('person_id', Integer, ForeignKey(Person.id))
     town_id = Column('town_id', Integer, ForeignKey(Town.id))

     person = relationship(Person, backref='visited_destinations')
     town = relationship(Town, backref='visited_destinations')

 # use an association_proxy so client code does not have to deal with
 # the visited_destinations table at all
 Person.visited_towns = association_proxy('visited_destinations', 'town')

 This code more or less does what I would like, but it uses an EXISTS query 
 which I don't really want and it gets back ALL towns that a matching person 
 has visited instead of only the matching towns:

 # gets all Town objects, including those that do not match our filter
 q = session.query(Person)
 q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta',
                                                      'Memphis'])))
 # can't do joinedload with association_proxy objects
 q = q.options(joinedload_all(Person.visited_destinations,
                              VisitedDestinations.town))
 for person in q:
     print person, person.visited_towns

 Which produces:
 Person(name='Bob') [Town(name='Atlanta'), Town(name='Memphis')]
 Person(name='Sam') [Town(name='Memphis')]
 Person(name='Sam') [Town(name='Chattanooga'), Town(name='Atlanta'), 
 Town(name='Atlanta'), Town(name='Atlanta')]


 In my database it's likely that a person has visited thousands of 
 destinations, and I really don't need to get all of them back here.  As you

Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?

2013-02-27 Thread Rob Crowell
Ah okay, so you do recommend the contains_eager approach.  I guess this is 
exactly the use-case it is designed for?  I always get a little scared when 
I try using advanced features of SQLAlchemy :)

One last question.  The query here seems to take advantage of the fact that 
our table joins on Towns exactly once.  If we had a second table 
WishlistDestinations, that tracked the towns that a Person would like to 
visit instead of ones he had already visited, what would be the syntax for 
filtering those out?

Imagine we also add this model:

class WishlistDestinations(Base):
    __tablename__ = 'wishlist_destinations'
    id = Column('id', Integer, primary_key=True)
    person_id = Column('person_id', Integer, ForeignKey(Person.id))
    town_id = Column('town_id', Integer, ForeignKey(Town.id))

    person = relationship(Person, backref='wishlist_destinations')
    town = relationship(Town, backref='wishlist_destinations')

Person.wishlist_towns = association_proxy('wishlist_destinations', 'town')


This query is obviously going to fail, since there are now 2 relationships 
to the Town model:

q = session.query(Person)
q = q.join(Person.visited_destinations, VisitedDestinations.town,
           WishlistDestinations.town)
q = q.filter(Town.name.in_(['Atlanta', 'Memphis']))
q = q.options(contains_eager(Person.visited_destinations,
                             VisitedDestinations.town))

How could I filter by users that have visited Atlanta or Memphis, that also 
want to visit Boston?  The code below fails and I'm not sure how to write 
it correctly, here's my first guess:

q = q.filter(VisitedDestinations.town.name.in_(['Atlanta', 'Memphis']))
q = q.filter(WishlistDestinations.town.name.in_(['Boston']))

AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' 
object associated with VisitedDestinations.town has an attribute 'name'


My second guess also fails (I don't think I want to write an EXISTS query 
in the first place):
q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta',
                                                     'Memphis'])))
q = q.filter(Person.wishlist_towns.any(Town.name.in_(['Boston'])))

sqlalchemy.exc.OperationalError: (OperationalError) (1066, "Not unique 
table/alias: 'towns'")...


What's the correct syntax in this case?


On Wednesday, February 27, 2013 2:08:47 PM UTC-5, Michael Bayer wrote:

 oh.  I saw you talking about "at least one" and "exists" and thought you 
 had a more complex query.  contains_eager() doesn't impact what's 
 queried, only how results are used with the resulting objects, and is 
 usually used with join(), just like this:

 session.query(Person).\
     join(Person.visited_destinations, VisitedDestinations.town).\
     options(contains_eager(Person.visited_destinations,
                            VisitedDestinations.town)).\
     filter(Town.name.in_(['Atlanta', 'Memphis']))


 On Feb 27, 2013, at 1:48 PM, Rob Crowell rob.c...@moat.com javascript: 
 wrote:

 Sure!  Here's the query I am attempting to replicate:

 SELECT people.id AS person_id, people.name, towns.id AS town_id, 
 towns.name
 FROM people
 INNER JOIN visited_destinations ON visited_destinations.person_id = 
 people.id
 INNER JOIN towns ON towns.id = visited_destinations.town_id
 WHERE towns.name IN ('Atlanta', 'Memphis')

 I realize it's confusing since I labeled 2 people as Sam in my test 
 dataset, but I left it like that for consistency.  You can see that one of 
 the Sams has person_id=9 and the other has person_id=10 from the MySQL 
 results below:

 +---+--+-+-+
 | person_id | name | town_id | name|
 +---+--+-+-+
 | 8 | Bob  |   2 | Atlanta |
 | 8 | Bob  |   1 | Memphis |
 | 9 | Sam  |   1 | Memphis |
 |10 | Sam  |   2 | Atlanta |
 |10 | Sam  |   2 | Atlanta |
 |10 | Sam  |   2 | Atlanta |
 +---+--+-+-+

 I'd like to turn this into 3 Person results, like this:
 Person(id=8, name="Bob", visited_towns=[Town(name="Atlanta"), 
 Town(name="Memphis")])
 Person(id=9, name="Sam", visited_towns=[Town("Memphis")])
 Person(id=10, name="Sam", visited_towns=[Town("Atlanta"), 
 Town("Atlanta"), Town("Atlanta")])

 On Wednesday, February 27, 2013 12:59:02 PM UTC-5, Michael Bayer wrote:

 I'm not yet digging into your problem, but one remark would be that 
 there's two levels to deal with here. One is figuring out exactly what SQL 
 you want, independent of SQLAlchemy.  It's not clear here if you've gotten 
 that part yet.  The next part is getting parts of that SQL to route into 
 your contains_eager().   We do that second.

 So let me know if you know the actual SQL you want to do first; we'd work 
 from there.   Don't deal with joinedload or contains_eager or any of that 
 yet.


 On Feb 27, 2013, at 2:07 AM, Rob Crowell robcc...@gmail.com wrote:

 Example code: https://gist.github.com/rcrowell/5045832

 I have Person and Town tables, which are joined in a many-to-many

Re: [sqlalchemy] Better approach to filtering a relationship collection than using contains_eager()?

2013-02-27 Thread Rob Crowell
Oh cool!  I was getting incorrect results putting VisitedDestinations.town 
in my contains_eager() call as you suggested; maybe I am doing something 
wrong:

session = Session()
visited_alias = aliased(Town)
wishlist_alias = aliased(Town)
q = session.query(Person)
q = q.join(Person.visited_destinations).join(visited_alias,
        VisitedDestinations.town).filter(visited_alias.name.in_(['Atlanta',
                                                                 'Memphis']))
q = q.join(Person.wishlist_destinations).join(wishlist_alias,
        WishlistDestinations.town).filter(wishlist_alias.name.in_(['Boston']))
q = q.options(contains_eager(Person.visited_destinations,
                             VisitedDestinations.town),
              contains_eager(Person.wishlist_destinations,
                             WishlistDestinations.town))

SAWarning: Multiple rows returned with uselist=False for eagerly-loaded 
attribute 'WishlistDestinations.town' 
SAWarning: Multiple rows returned with uselist=False for eagerly-loaded 
attribute 'VisitedDestinations.town' 

However this seems to take care of the warning and fixes the problem with 
my results:

session = Session()
visited_alias = aliased(Town)
wishlist_alias = aliased(Town)
q = session.query(Person)
q = q.join(Person.visited_destinations).join(visited_alias,
        VisitedDestinations.town).filter(visited_alias.name.in_(['Atlanta',
                                                                 'Memphis']))
q = q.join(Person.wishlist_destinations).join(wishlist_alias,
        WishlistDestinations.town).filter(wishlist_alias.name.in_(['Boston']))
q = q.options(contains_eager(Person.visited_destinations),
              contains_eager(Person.wishlist_destinations))

Any idea what is going wrong in the first case?
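
(One hedged guess, since the digest doesn't include an answer: the 
two-argument form presumably loads Town rows from the un-aliased towns table 
while the joins target the aliases.  Pointing each option at its alias via 
contains_eager's alias parameter may be the intended spelling -- an 
assumption about the 0.8-era API, not confirmed in the thread:)

q = q.options(
    contains_eager(Person.visited_destinations, VisitedDestinations.town,
                   alias=visited_alias),
    contains_eager(Person.wishlist_destinations, WishlistDestinations.town,
                   alias=wishlist_alias))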

And THANK YOU for your help Michael.  You are really helpful :)



On Wednesday, February 27, 2013 2:48:21 PM UTC-5, Michael Bayer wrote:


 On Feb 27, 2013, at 2:40 PM, Rob Crowell rob.c...@moat.com wrote:

 Ah okay, so you do recommend the contains_eager approach.  I guess this is 
 exactly the use-case it is designed for?  I always get a little scared when 
 I try using advanced features of SQLAlchemy :)

 One last question.  The query here seems to take advantage of the fact 
 that our table joins on Towns exactly once.  If we had a second table 
 WishlistDestinations, that tracked the towns that a Person would like to 
 visit instead of ones he had already visited, what would be the syntax for 
 filtering those out?

 Imagine we also add this model:

 class WishlistDestinations(Base):
     __tablename__ = 'wishlist_destinations'
     id = Column('id', Integer, primary_key=True)
     person_id = Column('person_id', Integer, ForeignKey(Person.id))
     town_id = Column('town_id', Integer, ForeignKey(Town.id))

     person = relationship(Person, backref='wishlist_destinations')
     town = relationship(Town, backref='wishlist_destinations')

 Person.wishlist_towns = association_proxy('wishlist_destinations', 'town')


 This query is obviously going to fail, since there are now 2 relationships 
 to the Town model:

 q = session.query(Person)
 q = q.join(Person.visited_destinations, VisitedDestinations.town, 
 WishlistDestinations.town)
 q = q.filter(Town.name.in_(['Atlanta', 'Memphis']))
 q = q.options(contains_eager(Person.visited_destinations, 
 VisitedDestinations.town))

 How could I filter by users that have visited Atlanta or Memphis, that 
 also want to visit Boston?  The code below fails and I'm not sure how to 
 write it correctly, here's my first guess:

 q = q.filter(VisitedDestinations.town.name.in_(['Atlanta', 'Memphis']))
 q = q.filter(WishlistDestinations.town.name.in_(['Boston']))

 AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' 
 object associated with VisitedDestinations.town has an attribute 'name'


 there's no "implicit join" available when you attempt to say something 
 like SomeClass.relationship1.relationship2; you always have to spell out a 
 join() explicitly, so if you want to join to Wishlist also, that's separate. 
  But here you want to hit Town twice, so you also need to alias it:

 talias = aliased(Town)

 q = q.join(Person.wishlist_destinations).join(talias,
            WishlistDest.town).filter(talias.name == 'Boston')

 it's just like SQL!  all the same rules.
  





 My second guess also fails (I don't think I want to write an EXISTS query 
 in the first place):
 q = q.filter(Person.visited_towns.any(Town.name.in_(['Atlanta', 
 'Memphis'])))
 q = q.filter(Person.wishlist_towns.any(Town.name.in_(['Boston'])))

 sqlalchemy.exc.OperationalError: (OperationalError) (1066, "Not unique 
 table/alias: 'towns'")...


 What's the correct syntax in this case?


 On Wednesday, February 27, 2013 2:08:47 PM UTC-5, Michael Bayer wrote:

 oh.  I saw you talking about "at least one" and "exists" and thought you 
 had a more complex query.  contains_eager() doesn't impact what's 
 queried, only how results are used with the resulting objects, and is 
 usually used with join(), just like

[sqlalchemy] Difference between ZopeTransactionExtension(keep_session=True) and sessionmaker(expire_on_commit=False)?

2013-02-19 Thread rob.crowell
I'm building a pyramid application using pyramid_tm and 
ZopeTransactionExtension.  We've written a little subscriber on NewResponse 
that writes out some values to a log file about the current user 
(request.user.id) after each request.  For anybody that knows pyramid 
pretty well, we set the request.user property 
using config.set_request_property(get_user, 'user', reify=True), but 
basically we look up the User object once in our db and then cache it as 
request.user for the lifetime of the request.

For the most part this is working fine, except in the case where the 
User object gets modified during the request (change their avatar, name, 
password, whatever).  When this happens, we get a DetachedInstanceError 
when we try to reference the 'id' field of request.user.  It's 
worth pointing out that pyramid_tm runs and commits our transaction before 
our NewResponse subscriber fires, which means that request.user has already 
been flushed and committed to the database and appears to be detached (in 
fact, we can merge it back into our session and continue using it as 
normal, see Solution #3 below).

We've found 3 work-arounds that seem to give us the desired behavior, but 
I'm not really sure which one is better.

Solution 1
--
# setting up the session
DBSession = scoped_session(sessionmaker(bind=engine, 
extension=ZopeTransactionExtension(), expire_on_commit=False))

# in the subscriber
print request.user.id   # does 0 additional db queries

Solution 2
--
# setting up the session
DBSession = scoped_session(sessionmaker(bind=engine, 
extension=ZopeTransactionExtension(keep_session=True)))

# in the subscriber
print request.user.id   # does an additional SELECT query


Solution 3
--
# setting up the session
DBSession = scoped_session(sessionmaker(bind=engine, 
extension=ZopeTransactionExtension()))

# in the subscriber
session = DBSession()
user = session.merge(request.user)   # does an additional SELECT query
print user.id


Without using any of these solutions, we sure enough get the Exception:
DetachedInstanceError: Instance <User at 0x26d5990> is not bound to a 
Session; attribute refresh operation cannot proceed


In this case, it seems Solution 1 is the best as it doesn't do any 
additional SQL queries (and I'm willing to accept that occasionally I might 
be writing a stale User object to disk), but is there any other downside to 
this approach?  After my subscriber runs the web request is finished, so 
theoretically it doesn't matter that the objects are not expired, correct? 
 Is there a better approach here that I am missing?

Thanks!
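
(For reference, a minimal sketch of what expire_on_commit=False changes, 
assuming a mapped User model:)

# With the default expire_on_commit=True, commit() marks every instance
# stale, so the next attribute access triggers a refresh SELECT -- which
# raises DetachedInstanceError once the session is closed.  With False,
# attribute values loaded before the commit stay readable afterwards.
Session = sessionmaker(bind=engine, expire_on_commit=False)
session = Session()
user = session.query(User).get(1)
session.commit()
session.close()
print user.id   # no refresh needed, so no DetachedInstanceError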





Re: [sqlalchemy] Complicated filter clause causes recursion depth exceeded exception

2013-01-22 Thread rob.crowell
Thanks Michael,

Writing a big list of conditions and combining them with and_(*conditions) 
worked well.  I was indeed querying like this before:

for condition in conditions:
  q = q.filter(condition)
print q  
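
(Spelled out, the flat form is a single and_() over the whole list rather 
than one nested filter() per condition:)

from sqlalchemy import and_

# one flat BooleanClauseList with N children, not N nested two-element ones
q = q.filter(and_(*conditions))
print q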

On Friday, January 18, 2013 6:00:04 PM UTC-5, Michael Bayer wrote:


 On Jan 18, 2013, at 4:15 PM, rob.c...@moat.com wrote:

 I haven't boiled this down to a short test case yet, but when my WHERE 
 clause gets especially long I start getting the recursion depth exceeded 
 exception.  Is this a well-known limitation of sqlalchemy?  We're running 
 this query in production currently without SQLAlchemy, and it performs 
 fine, but perhaps I need to look for another approach...

 If I keep the filter condition relatively short, my query looks like this 
 and runs fine (with fake columns start_date, X, Y, and Z on table T):

 SELECT X, sum(Z) AS Z
 FROM T
 WHERE T.start_date >= :start_date_1 
   AND T.start_date <= :start_date_2 
   AND NOT (T.X = :X_1 AND T.Y = :Y_1) 
   AND NOT (T.X = :X_2 AND T.Y = :Y_2)
   AND NOT (T.X = :X_3 AND T.Y = :Y_3)
 GROUP BY T.X

 However, if I make the filter() clause very long (over 150 AND NOT... 
 clauses), I start getting exceptions with this stack trace:


 Always amazing how many wacky new problems come around.   Well, the 
 compilation of these clauses is pretty straightforward, using a recursive 
 traversal scheme.  So if you give Python a tree structure of more than 1000 
 nodes deep and do such a traversal, this is the error you'd get, and I 
 suppose it's sort of well known, depends on what perspective you're 
 coming from.

 So this indicates you're creating a structure that is nested this deeply. 
  Which is to say, really deep !   

 This could happen if you're doing the AND's using a nesting pattern of one 
 at a time like this:

 from sqlalchemy.sql import column

 root = column('x') == 5
 current = root

 for i in xrange(200):
     current = current & (column('x') == 5)

 print current


 because that's really and_(expr, and_(expr, and_(expr, and_( for 200 
 times... ))).

 But if you flatten out the and_() you can get this:

 from sqlalchemy.sql import column, and_

 expr = [column('x') == 5]
 for i in xrange(200):
     expr.append(column('x') == 5)

 expr = and_(*expr)

 print expr

 then you have a flat structure, and you're fine.

 So we could modify our and_()/or_ construct to open itself up this way, 
 that is, as it's built, it flattens out the nesting, though maybe for now 
 there's a way you can build up using one big and_() block.

 In fact to flatten out the nesting is something you could enable across 
 the board here, and you can see why I'm hesitant to build this in by 
 default as it adds lots of isinstance() and other expensive checks, but you 
 can add this to your app as a quick fix (just run this anywhere at import 
 time to redefine how and_() and or_() are rendered):

 from sqlalchemy.ext.compiler import compiles
 from sqlalchemy.sql.expression import BooleanClauseList

 @compiles(BooleanClauseList)
 def flatten_boolean_clause_list(clauselist, compiler, **kw):
     op = clauselist.operator
     flattened = []
     rewrite = False
     stack = list(clauselist.clauses)
     while stack:
         elem = stack.pop(0)
         if isinstance(elem, BooleanClauseList) and elem.operator is op:
             stack[:0] = elem.clauses
             rewrite = True
         else:
             flattened.append(elem)
     if rewrite:
         clauselist = BooleanClauseList(operator=op, *flattened)
     return compiler.visit_clauselist(clauselist, **kw)

 then the original test passes because we've rewritten the nested list as a 
 flat list.   Basically the recursion is replaced by the stack based 
 traversal we do here.

 or even quicker, you could just increase your recursion depth.  It 
 defaults to 1000, so here's 10000; do this before you try to run the SQL:

 import sys
 sys.setrecursionlimit(10000)











   File test.py, line 350, in do_test
 print q
   File /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py, 
 line 3031, in __str__
 return str(self._compile_context().statement)
   File 
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
 1790, in __str__
 return unicode(self.compile()).encode('ascii', 'backslashreplace')
   File 
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
 1778, in compile
 return self._compiler(dialect, bind=bind, **kw)
   File 
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
 1784, in _compiler
 return dialect.statement_compiler(dialect, self, **kw)
   File 
 /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, line 
 277, in __init__
 engine.Compiled.__init__(self, dialect, statement, **kwargs)
   File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
 line 705, in __init__
 self.string = self.process(self.statement)
   File 

[sqlalchemy] Complicated filter clause causes recursion depth exceeded exception

2013-01-18 Thread rob.crowell
I haven't boiled this down to a short test case yet, but when my WHERE 
clause gets especially long I start getting the recursion depth exceeded 
exception.  Is this a well-known limitation of sqlalchemy?  We're running 
this query in production currently without SQLAlchemy, and it performs 
fine, but perhaps I need to look for another approach...

If I keep the filter condition relatively short, my query looks like this 
and runs fine (with fake columns start_date, X, Y, and Z on table T):

SELECT X, sum(Z) AS Z
FROM T
WHERE T.start_date >= :start_date_1 
  AND T.start_date <= :start_date_2 
  AND NOT (T.X = :X_1 AND T.Y = :Y_1) 
  AND NOT (T.X = :X_2 AND T.Y = :Y_2)
  AND NOT (T.X = :X_3 AND T.Y = :Y_3)
GROUP BY T.X

However, if I make the filter() clause very long (over 150 AND NOT... 
clauses), I start getting exceptions with this stack trace:

  File test.py, line 350, in do_test
print q
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py, 
line 3031, in __str__
return str(self._compile_context().statement)
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
1790, in __str__
return unicode(self.compile()).encode('ascii', 'backslashreplace')
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
1778, in compile
return self._compiler(dialect, bind=bind, **kw)
  File 
/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py, line 
1784, in _compiler
return dialect.statement_compiler(dialect, self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 277, in __init__
engine.Compiled.__init__(self, dialect, statement, **kwargs)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 705, in __init__
self.string = self.process(self.statement)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py, 
line 724, in process
return obj._compiler_dispatch(self, **kwargs)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 941, in visit_select
t = select._whereclause._compiler_dispatch(self, **kwargs)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in visit_clauselist
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 475, in <genexpr>
s for s in 
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in <genexpr>
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in visit_clauselist
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 475, in <genexpr>
s for s in 
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in <genexpr>
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in visit_clauselist
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 475, in <genexpr>
s for s in 
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in <genexpr>
for c in clauselist.clauses)
...
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 475, in <genexpr>
s for s in 
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in <genexpr>
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in visit_clauselist
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 475, in <genexpr>
s for s in 
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py, 
line 477, in <genexpr>
for c in clauselist.clauses)
  File /usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py, 
line 72, in _compiler_dispatch
return getter(visitor)(self, **kw)
RuntimeError: maximum recursion depth exceeded


[sqlalchemy] Re: Is it possible to define multiple model objects that refer to the same table?

2012-11-21 Thread Rob Crowell
Thanks so much!  Your pointers were exactly what I needed,
specifically the bit which led me to discover exclude_properties.
I'll leave my working code here in case it ever helps anybody else
out:

from sqlalchemy import Column, Date, Enum, Integer, String, Table, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import ColumnProperty, sessionmaker
from sqlalchemy.orm.mapper import class_mapper

engine = create_engine('mysql://user:password@localhost/issue_tracker')
Session = sessionmaker(bind=engine)

Base = declarative_base()

tables = {'issue_type':
          Table('issue_type', Base.metadata,
                Column('id', Integer, primary_key=True),
                Column('type', Enum('Bug', 'Feature',
                                    'Improvement', 'One-Time Task')),
                Column('created', Date),
                Column('num_visits', Integer)),

          'issue_type_label':
          Table('issue_type_label', Base.metadata,
                Column('id', Integer, primary_key=True),
                Column('type', Enum('Bug', 'Feature',
                                    'Improvement', 'One-Time Task')),
                Column('label_id', String),
                Column('created', Date),
                Column('num_visits', Integer))}

def get_columns(model):
    return [x.key for x in class_mapper(model).iterate_properties if
            isinstance(x, ColumnProperty)]

class IssueType(Base):
    __table__ = tables['issue_type']


class IssueLabel(Base):
    __table__ = tables['issue_type_label']
    __mapper_args__ = {'exclude_properties': ['type']}


class IssueTypeLabel(Base):
    __table__ = tables['issue_type_label']


print "issue type:", get_columns(IssueType)
print "issue label:", get_columns(IssueLabel)
print "issue type label:", get_columns(IssueTypeLabel)


This code correctly prints the following:
issue type: ['id', 'type', 'created', 'num_visits']
issue label: ['id', 'label_id', 'created', 'num_visits']
issue type label: ['id', 'type', 'label_id', 'created',
'num_visits']
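
(A related knob, noted as an assumption about the mapper API rather than 
something from the thread: include_properties is the allow-list counterpart 
of exclude_properties:)

class IssueLabel(Base):
    __table__ = tables['issue_type_label']
    # map only the listed columns; equivalent to excluding 'type' above
    __mapper_args__ = {'include_properties':
                       ['id', 'label_id', 'created', 'num_visits']}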

On Nov 16, 8:49 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 On Nov 16, 2012, at 2:59 PM, Rob Crowell wrote:


  Thanks for the help so far Michael!  I can explain a little more about
  what I'm trying to do (I'm using a fictional application here but I
  think it pretty accurately translates into my actual application).

  BACKGROUND
  --

  Let's say I'm writing an issue tracking application.  Each issue that
  we're tracking has a type (an issue must have exactly one type), and
  each issue may have an unlimited number of user-provided labels.

  Each day, people browse our issue tracker and each time they do they
  generate a page view on the issue.  Here's an example of one day's
  worth of data:

     IssueA: Views: 1, Type: Bug, Labels: [login-page (id=1),
  show-stopper (id=2)]
     IssueB: Views: 20, Type: One-Time Task, Labels: [disk-full
  (id=3), show-stopper (id=2)]
     IssueC: Views: 300, Type: Bug, Labels: [login-page (id=1)]

  The BigCo. I'm working for is very interested in knowing which issues
  are read by the most people, and they need the ability to generate
  reports sliced by arbitrary date ranges.  However, we can tolerate a
  day delay, so we are writing summary tables each night.  Two of these
  summary tables are aggregated by either issue type or label, and we
  also write a third table that can be used to drill-down and see page
  visits bucketed by both type and label:

  CREATE TABLE `issue_type` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `created` datetime NOT NULL,
   `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
   `num_watchers` int(10) unsigned NOT NULL,
   PRIMARY KEY (`id`)
  )

  CREATE TABLE `issue_label` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `created` datetime NOT NULL,
   `label_id` int(10) unsigned NOT NULL,
   `num_watchers` int(10) unsigned NOT NULL,
   PRIMARY KEY (`id`)
  )

  CREATE TABLE `issue_type_label` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `created` datetime NOT NULL,
   `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
   `label_id` int(10) unsigned NOT NULL,
   `num_visits` int(10) unsigned NOT NULL,
   PRIMARY KEY (`id`)
  )

  So we'd run these insert statements at midnight:

  INSERT INTO issue_type (created, type, num_visits) VALUES
     ("2012-11-15", "Bug", 301),
     ("2012-11-15", "One-Time Task", 20);

  INSERT INTO issue_labels (created, label_id, num_visits) VALUES
     ("2012-11-15", 1, 301),
     ("2012-11-15", 2, 21),
     ("2012-11-15", 3, 20);

  INSERT INTO issue_type_label (created, type, label_id, num_visits)
  VALUES
     ("2012-11-15", "Bug", 1, 301),
     ("2012-11-15", "Bug", 2, 1),
     ("2012-11-15", "One-Time Task", 3, 20),
     ("2012-11-15", "One-Time Task", 2, 20);

  Now when we want to generate the summary reports, we query one of the
  first two tables (if we're generating a report aggregated by issue
  type we hit issue_type, if we're

[sqlalchemy] Re: Is it possible to define multiple model objects that refer to the same table?

2012-11-16 Thread Rob Crowell
On Nov 15, 10:48 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 On Nov 15, 2012, at 7:33 PM, Rob Crowell wrote:

  Sorry, that got cut off at the end.

  class IssueTag(Base):
      __tablename__ = 'issue_user_tag'

  sqlalchemy.exc.InvalidRequestError: Table 'issue_user_tag' is already 
  defined for this MetaData instance.  Specify 'extend_existing=True' to 
  redefine options and columns on an existing Table object.

  On Thursday, November 15, 2012 7:32:29 PM UTC-5, Rob Crowell wrote:
  I'm working with a denormalized cache schema, and I've run into a situation 
  where it would be helpful to be able to create multiple classes that extend 
  Base but refer to the same __tablename__.  Is this possible to do?  I am 
  getting this Exception:
      sqlalchemy.exc.InvalidRequestError: Table '[tablename]' is already 
  defined for this MetaData instance.  Specify 'extend_existing=True' to 
  redefine options and columns on an existing Table object.

  For a little more insight, we have some attributes that always have exactly 
  one value (user who created the issue), and other attributes that can have 
  1 or more values (user-defined tags for the issue).  If we were being 
  exhaustive, we would create two cached tables for our issues since 
  sometimes we want to display recent issues sometimes by user and sometimes 
  by tag:
      * issue_user
      * issue_tag

  However, we can get away with writing just one table and querying it with 
  an appropriate group_by(user_id) to achieve the same end as having 2 
  tables.  Since my application should behave as if there were 2 separate 
  cache tables (and I'd like to keep open the option of adding two separate 
  cache tables in the future), I would like to have 2 different Base classes 
  representing the two ways in which we would query the table.  The obvious 
  way of doing this doesn't work:

  class IssueUser(Base):
      __tablename__ = 'issue_user_tag'

  class IssueTag(Base):

  --

 two^H^H^H three ways:

 1. map to a Table:

 mytable = Table("mytable", Base.metadata, Column(...))

 class A(Base):
     __table__ = mytable

 class B(Base):
     __table__ = mytable

 1a: variant of 1, map A as you did but use __table__ on B

 class A(Base):
     __tablename__ = 'mytable'

     x = Column(...)

 class B(Base):
     __table__ = A.__table__

 2. use single table inheritance with no discriminator

 class MyTable(Base):
     __tablename__ = 'mytable'

 class A(MyTable):
     # ...

 class B(MyTable):
     # ...

 I don't have an understanding of your querying situation yet; discriminating 
 on group_by() seems a little strange, as group_by() is only intended to be 
 used to group for aggregates, but #1, #1a or #2 should fit the bill.


Thanks for the help so far Michael!  I can explain a little more about
what I'm trying to do (I'm using a fictional application here but I
think it pretty accurately translates into my actual application).

BACKGROUND
--

Let's say I'm writing an issue tracking application.  Each issue that
we're tracking has a type (an issue must have exactly one type), and
each issue may have an unlimited number of user-provided labels.

Each day, people browse our issue tracker and each time they do they
generate a page view on the issue.  Here's an example of one day's
worth of data:

IssueA: Views: 1, Type: Bug, Labels: [login-page (id=1),
show-stopper (id=2)]
IssueB: Views: 20, Type: One-Time Task, Labels: [disk-full
(id=3), show-stopper (id=2)]
IssueC: Views: 300, Type: Bug, Labels: [login-page (id=1)]

The BigCo. I'm working for is very interested in knowing which issues
are read by the most people, and they need the ability to generate
reports sliced by arbitrary date ranges.  However, we can tolerate a
day delay, so we are writing summary tables each night.  Two of these
summary tables are aggregated by either issue type or label, and we
also write a third table that can be used to drill-down and see page
visits bucketed by both type and label:

CREATE TABLE `issue_type` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created` datetime NOT NULL,
  `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
  `num_watchers` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`)
)

CREATE TABLE `issue_label` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created` datetime NOT NULL,
  `label_id` int(10) unsigned NOT NULL,
  `num_watchers` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`)
)

CREATE TABLE `issue_type_label` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created` datetime NOT NULL,
  `type` enum('Bug','Feature','Improvement','One-Time Task') NOT NULL,
  `label_id` int(10) unsigned NOT NULL,
  `num_visits` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`)
)

So we'd run these insert statements at midnight:

INSERT INTO issue_type (created, type, num_visits) VALUES
    ("2012-11-15", "Bug", 301),
    ("2012-11-15", "One-Time Task", 20);

INSERT INTO issue_labels (created, label_id