[sqlalchemy] Appending to an expired relationship list
Dear list, In our application we've run across what seems to be at least a serious gotcha, if not an actual bug, in SQLAlchemy. Please let me know if you agree, or if there is something I'm doing wrong. Attached is a minimal reproducer (as minimal as I could manage anyway). The assertion at the end of the script fails with SQLAlchemy 0.6.8 and 0.7.6. I'm using MySQL, although I don't think the database matters. (SQLite's nested transaction support is broken, so it won't work.) The most important part is the lazy_create method, which is how we are avoiding race conditions when inserting into tables with a unique constraint. It will either insert a new row with the given unique column value(s), or select the existing row. (The IntegrityError it catches will be a unique constraint violation.) @classmethod def lazy_create(cls, **kwargs): session.begin_nested() try: item = cls(**kwargs) session.add(item) session.commit() except IntegrityError: session.rollback() item = session.query(cls).filter_by(**kwargs).one() return item The problem comes when we later do something like this: task.packages.append(Package.lazy_create(name=u'asdf')) If the Package with name 'asdf' already exists, this .append() call will have no effect. The Package instance is silently discarded from the task.packages list. As we eventually discovered, the reason is that the nested rollback inside lazy_create causes task.packages to be expired. But the .append() method is still called on the expired list because of the way our statement is written. The expired list is discarded, and so the .append() has no effect. In this particular case, we can rewrite the statement as: package = Package.lazy_create(name=u'asdf') task.packages.append(package) which fixes the problem, but it worries me. Shouldn't calling .append() or any other method on an expired relationship list at least raise an exception, given that the list has been discarded? Could SQLAlchemy expire the relationship list without actually replacing the list instance? Then we wouldn't need to worry about whether we are accidentally holding onto references to expired lists which have been replaced out from underneath us. Is there some other way we could avoid this kind of problem? -- Dan Callaghan d...@djc.id.au -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. bz816879.py Description: application/python
[sqlalchemy] turn filtered query with subquery into relationship
Dear List, Hoping for help with following unanswered stackoverflow question. If the approach is not the best, that feedback is good as well. In the case below is it better to just not use relationships and just use the method with the query? Thanks In the code below I want to replace *all_holdings* in Account with a property called *holdings* that returns the *desired_holdings* (which are the holdings representing the latest known quantity which can change over time). I'm having trouble figuring out how to construct the call to relationship. In addition any comments on the appropriateness of the pattern (keeping historic data in a single table and using a max date subquery to get most recent), better alternatives, improvements on the query appreciated. from sqlalchemy import Column, Integer, String, Date, DateTime, REAL, ForeignKey, func from sqlalchemy.orm import relationship, aliased from sqlalchemy.sql.operators import and_, eq from sqlalchemy.ext.declarative import declarative_base from db import session import datetime import string Base = declarative_base() class MySQLSettings(object): __table_args__ = {'mysql_engine':'InnoDB'} class Account(MySQLSettings, Base): __tablename__ = 'account' id = Column(Integer, primary_key=True) name = Column(String(64)) all_holdings = relationship('Holding', backref='account') def desired_holdings(self): max_date_subq = session.query(Holding.account_id.label('account_id'), Holding.stock_id.label('stock_id'), func.max(Holding.as_of).label('max_as_of')). \ group_by(Holding.account_id, Holding.stock_id).subquery() desired_query = session.query(Holding).join(Account, Account.id==account.id).join(max_date_subq).\ filter(max_date_subq.c.account_id==account.id).\ filter(Holding.as_of==max_date_subq.c.max_as_of).\ filter(Holding.account_id==max_date_subq.c.account_id).\ filter(Holding.stock_id==max_date_subq.c.stock_id) return desired_query.all() def __init__(self, name): self.name = name class Stock(MySQLSettings, Base): __tablename__ = 'stock' id = Column(Integer, primary_key=True) name = Column(String(64)) def __init__(self, name): self.name = name class Holding(MySQLSettings, Base): __tablename__ = 'holding' id = Column(Integer, primary_key=True) account_id = Column(Integer, ForeignKey('account.id'), nullable=False) stock_id = Column(Integer, ForeignKey('stock.id'), nullable=False) quantity = Column(REAL) as_of = Column(Date) stock = relationship('Stock') def __str__(self): return Holding(%f, '%s' '%s')%(self.quantity, self.stock.name, str(self.as_of)) def __init__(self, account, stock, quantity, as_of): self.account_id = account.id self.stock_id = stock.id self.quantity = quantity self.as_of = as_of if __name__ == __main__: ibm = Stock('ibm') session.add(ibm) account = Account('a') session.add(account) session.flush() session.add_all([ Holding(account, ibm, 100, datetime.date(2001, 1, 1)), Holding(account, ibm, 200, datetime.date(2001, 1, 3)), Holding(account, ibm, 300, datetime.date(2001, 1, 5)) ]) session.commit() print All holdings by relation:\n\t, \ string.join([ str(h) for h in account.all_holdings ], \n\t) print Desired holdings query:\n\t, \ string.join([ str(h) for h in account.desired_holdings() ], \n\t) The results when run are: All holdings by relation: Holding(100.00, 'ibm' '2001-01-01') Holding(200.00, 'ibm' '2001-01-03') Holding(300.00, 'ibm' '2001-01-05') Desired holdings query: Holding(300.00, 'ibm' '2001-01-05') -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/KKB3-3r5kSAJ. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] Re: PostgreSQL hstore custom type?
On Sun, 2012-04-15 at 05:32 -0700, Kyle Schaffrick wrote: Wow, a blast from the past! :) Yes. Which version of SQLAlchemy does your new version work with? It's great that you are now able to get rid of the HStoreComparator, that really makes it easier to use. The version of SQLA I was using 2 years ago when I wrote it required the custom comparator to make the operators work on the ORM InstrumentedAttribute as well as on the Core Column. I suppose the new version doesn't require this duplication? I've gone back through the thread and I can't find a link to your work [regarding the hstore implementation]. Can you post one? On your question, I think that PostgreSQL's hstore supports unicode just fine; the keys and values are of type text. It is just a bug in my code that it does not work. I think the main issue would be to replace .encode('string_escape') in _serialize_hstore() and .decode('string_escape') in _parse_hstore() with something that will do the same job but work for both unicode and string, and to make sure the regexes are unicode safe (or replace them with a proper parser). signature.asc Description: This is a digitally signed message part
Re: [sqlalchemy] Appending to an expired relationship list
On Apr 30, 2012, at 1:08 AM, Dan Callaghan wrote: Shouldn't calling .append() or any other method on an expired relationship list at least raise an exception, given that the list has been discarded? Could SQLAlchemy expire the relationship list without actually replacing the list instance? This is certainly a case I've never seen before. The collection internals currently don't try to prevent collections from being used once they are no longer associated with a parent. This wouldn't really work generically, as it's entirely valid to replace a collection with another, then continue appending to the detached collection.Most of the avenues that were at first apparent here fall under the category of a post-detach append being disallowed or otherwise warning. There's an inconsistency regarding when a collection is replaced with another (it's explicitly unlinked from the internal CollectionAdapter which maintains ORM events, so that it no longer generates events), versus when the collection is expired and just discarded (nothing is done, the CollectionAdapter remains and essentially passes invalid events if the collection continues to be used). But really here we're talking about a third state - the collection is still associated with its CollectionAdapter, and the CollectionAdapter is in some kind of illegal to use state.Meaning the above inconsistency is really not inconsistent at all, it's just a three-state system - linked, unlinked, and invalid. The empty the list case is probably not possible at all without major backwards-incompatible changes. SQLAlchemy indicates an attribute as expired by the fact that the attribute's key is no longer in obj.__dict__. With no value present, this expiration case indicates that a new collection should be loaded on next access from the database. Another chief concern when attempting to change this behavior is one of latency.Expiration and append events both occur in large numbers, so any function calls added to these adds palpable latency. The invalidate-on-expire use case is at ticket http://www.sqlalchemy.org/trac/ticket/2476 and is organized for the three-state idea, and shouldn't add latency in most cases. This is for 0.8 only. Is there some other way we could avoid this kind of problem? Personally I never use the begin_nested()/check/rollback pattern, it's messy and does a lot of work. commit() and rollback() have network, database, and in-python overhead, Python raising exceptions has lots of overhead, running a flush() and generating an INSERT that's thrown away has lots of overhead. The way this method is structured, it's guaranteed to be quite slow whether or not the object exists. The expiration that commit/rollback does is very expensive to subsequent operations too; in 0.8 we have improved this so that only objects that were actually made dirty within the begin_nested() are expired, but until then if you have expire_on_commit enabled you'll be seeing everything else in the session reloading itself on next access after each of these append() operations. I always use recipes that involve loading the object first, either individually or by populating a collection up front before a large sequence of operations. A simple recipe that I often use (or a variant) is UniqueObject: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] turn filtered query with subquery into relationship
On Apr 30, 2012, at 10:54 AM, dan wrote: Dear List, Hoping for help with following unanswered stackoverflow question. If the approach is not the best, that feedback is good as well. In the case below is it better to just not use relationships and just use the method with the query? Thanks In the code below I want to replace *all_holdings* in Account with a property called holdings that returns the *desired_holdings* (which are the holdings representing the latest known quantity which can change over time). I'm having trouble figuring out how to construct the call to relationship. The desired_holdings() query is pretty complicated and I'm not seeing a win by trying to get relationship() to do it. relationship() is oriented towards maintaining the persistence between two classes, not as much a reporting technique (and anything with max()/group_by in it is referring to reporting). I would stick @property on top of desired_holdings, use object_session(self) to get at session, and be done. This use case is described at http://docs.sqlalchemy.org/en/rel_0_7/orm/relationships.html#building-query-enabled-properties . In addition any comments on the appropriateness of the pattern (keeping historic data in a single table and using a max date subquery to get most recent), better alternatives, improvements on the query appreciated. from sqlalchemy import Column, Integer, String, Date, DateTime, REAL, ForeignKey, func from sqlalchemy.orm import relationship, aliased from sqlalchemy.sql.operators import and_, eq from sqlalchemy.ext.declarative import declarative_base from db import session import datetime import string Base = declarative_base() class MySQLSettings(object): __table_args__ = {'mysql_engine':'InnoDB'} class Account(MySQLSettings, Base): __tablename__ = 'account' id = Column(Integer, primary_key=True) name = Column(String(64)) all_holdings = relationship('Holding', backref='account') def desired_holdings(self): max_date_subq = session.query(Holding.account_id.label('account_id'), Holding.stock_id.label('stock_id'), func.max(Holding.as_of).label('max_as_of')). \ group_by(Holding.account_id, Holding.stock_id).subquery() desired_query = session.query(Holding).join(Account, Account.id==account.id).join(max_date_subq).\ filter(max_date_subq.c.account_id==account.id).\ filter(Holding.as_of==max_date_subq.c.max_as_of).\ filter(Holding.account_id==max_date_subq.c.account_id).\ filter(Holding.stock_id==max_date_subq.c.stock_id) return desired_query.all() def __init__(self, name): self.name = name class Stock(MySQLSettings, Base): __tablename__ = 'stock' id = Column(Integer, primary_key=True) name = Column(String(64)) def __init__(self, name): self.name = name class Holding(MySQLSettings, Base): __tablename__ = 'holding' id = Column(Integer, primary_key=True) account_id = Column(Integer, ForeignKey('account.id'), nullable=False) stock_id = Column(Integer, ForeignKey('stock.id'), nullable=False) quantity = Column(REAL) as_of = Column(Date) stock = relationship('Stock') def __str__(self): return Holding(%f, '%s' '%s')%(self.quantity, self.stock.name, str(self.as_of)) def __init__(self, account, stock, quantity, as_of): self.account_id = account.id self.stock_id = stock.id self.quantity = quantity self.as_of = as_of if __name__ == __main__: ibm = Stock('ibm') session.add(ibm) account = Account('a') session.add(account) session.flush() session.add_all([ Holding(account, ibm, 100, datetime.date(2001, 1, 1)), Holding(account, ibm, 200, datetime.date(2001, 1, 3)), Holding(account, ibm, 300, datetime.date(2001, 1, 5)) ]) session.commit() print All holdings by relation:\n\t, \ string.join([ str(h) for h in account.all_holdings ], \n\t) print Desired holdings query:\n\t, \ string.join([ str(h) for h in account.desired_holdings() ], \n\t) The results when run are: All holdings by relation: Holding(100.00, 'ibm' '2001-01-01') Holding(200.00, 'ibm' '2001-01-03') Holding(300.00, 'ibm' '2001-01-05') Desired holdings query: Holding(300.00, 'ibm' '2001-01-05') -- You received this message because you are subscribed to the Google Groups
[sqlalchemy] alembic 0.3.2 released
Hey lists - I've just put out Alembic 0.3.2. This version features initial support for Oracle, and some bug fixes. There's plenty more to do with Alembic so keep those pull requests coming in ! Thanks all for the help on this project. Alembic 0.3.2: http://pypi.python.org/pypi/alembic/ - [feature] Basic support for Oracle added, courtesy shgoh. #40 - [feature] Added support for UniqueConstraint in autogenerate, courtesy Atsushi Odagiri - [bug] Fixed support of schema-qualified ForeignKey target in column alter operations, courtesy Alexander Kolov. - [bug] Fixed bug whereby create_unique_constraint() would include in the constraint columns that are added to all Table objects using events, externally to the generation of the constraint. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
[sqlalchemy] how to filter a timestamp field by date ?
I this class/table class Posting: id INT timestamp_created TIMESTAMP i'm trying to figure out how to execute this sql SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ; i can't seem to find a date operator in the api docs. i swear i've seen one before. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] how to filter a timestamp field by date ?
You're looking for func which generates SQL function expressions. See: http://docs.sqlalchemy.org/en/latest/core/expression_api.html For your example: from sqlalchemy.sql.expression import func session.query(Posting).filter(func.date(Posting.timestamp_created)==func.date(my_var)).all() On Apr 30, 2012, at 5:20 PM, Jonathan Vanasco jonat...@findmeon.com wrote: I this class/table class Posting: id INT timestamp_created TIMESTAMP i'm trying to figure out how to execute this sql SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ; i can't seem to find a date operator in the api docs. i swear i've seen one before. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
Re: [sqlalchemy] how to filter a timestamp field by date ?
I'd also add that if your TIMESTAMP is indexed, you'll want to compare it using BETWEEN date 00:00:00 / date 23:59:59 (pseudocode) for better efficiency. On Apr 30, 2012, at 5:44 PM, David Bowser wrote: You're looking for func which generates SQL function expressions. See: http://docs.sqlalchemy.org/en/latest/core/expression_api.html For your example: from sqlalchemy.sql.expression import func session.query(Posting).filter(func.date(Posting.timestamp_created)==func.date(my_var)).all() On Apr 30, 2012, at 5:20 PM, Jonathan Vanasco jonat...@findmeon.com wrote: I this class/table class Posting: id INT timestamp_created TIMESTAMP i'm trying to figure out how to execute this sql SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ; i can't seem to find a date operator in the api docs. i swear i've seen one before. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
[sqlalchemy] Re: how to filter a timestamp field by date ?
thanks to you both! i managed to just now figure out how to do it with sql.expression.cast as well : filter( sqlalchemy.sql.expression.cast(model.core.Posting.timestamp_created,sqlalchemy.types.Date) == self.request.app_meta.datetime.date() ) the docs on the .func were confusing at first. now i understand how it does that. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.