[sqlalchemy] Appending to an expired relationship list

2012-04-30 Thread Dan Callaghan
Dear list,

In our application we've run across what seems to be at least a serious 
gotcha, if not an actual bug, in SQLAlchemy. Please let me know if you 
agree, or if there is something I'm doing wrong.

Attached is a minimal reproducer (as minimal as I could manage anyway). 
The assertion at the end of the script fails with SQLAlchemy 0.6.8 and 
0.7.6. I'm using MySQL, although I don't think the database matters. 
(SQLite's nested transaction support is broken, so it won't work.)

The most important part is the lazy_create method, which is how we are 
avoiding race conditions when inserting into tables with a unique 
constraint. It will either insert a new row with the given unique column 
value(s), or select the existing row. (The IntegrityError it catches 
will be a unique constraint violation.)

@classmethod
def lazy_create(cls, **kwargs):
session.begin_nested()
try:
item = cls(**kwargs)
session.add(item)
session.commit()
except IntegrityError:
session.rollback()
item = session.query(cls).filter_by(**kwargs).one()
return item

The problem comes when we later do something like this:

task.packages.append(Package.lazy_create(name=u'asdf'))

If the Package with name 'asdf' already exists, this .append() call will 
have no effect. The Package instance is silently discarded from the 
task.packages list.

As we eventually discovered, the reason is that the nested rollback 
inside lazy_create causes task.packages to be expired. But the .append() 
method is still called on the expired list because of the way our 
statement is written. The expired list is discarded, and so the 
.append() has no effect.

In this particular case, we can rewrite the statement as:

package = Package.lazy_create(name=u'asdf')
task.packages.append(package)

which fixes the problem, but it worries me.

Shouldn't calling .append() or any other method on an expired 
relationship list at least raise an exception, given that the list has 
been discarded?

Could SQLAlchemy expire the relationship list without actually replacing 
the list instance? Then we wouldn't need to worry about whether we are 
accidentally holding onto references to expired lists which have been 
replaced out from underneath us.

Is there some other way we could avoid this kind of problem?

-- 
Dan Callaghan d...@djc.id.au

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



bz816879.py
Description: application/python


[sqlalchemy] turn filtered query with subquery into relationship

2012-04-30 Thread dan
Dear List,

Hoping for help with following unanswered stackoverflow question. If the 
approach is not the best, that feedback is good as well.
In the case below is it better to just not use relationships and just use 
the method with the query?

Thanks


In the code below I want to replace *all_holdings* in Account with a 
property called *holdings* that returns the *desired_holdings* (which are 
the holdings representing the latest known quantity which can change over 
time). I'm having trouble figuring out how to construct the call to 
relationship.

In addition any comments on the appropriateness of the pattern (keeping 
historic data in a single table and using a max date subquery to get most 
recent), better alternatives, improvements on the query appreciated.

from sqlalchemy import Column, Integer, String, Date, DateTime, REAL, 
ForeignKey, func
from sqlalchemy.orm import relationship, aliased
from sqlalchemy.sql.operators import and_, eq
from sqlalchemy.ext.declarative import declarative_base
from db import session
import datetime
import string

Base = declarative_base()

class MySQLSettings(object):
__table_args__ = {'mysql_engine':'InnoDB'}

class Account(MySQLSettings, Base):
__tablename__ = 'account'
id = Column(Integer, primary_key=True)
name = Column(String(64))
all_holdings = relationship('Holding', backref='account')

def desired_holdings(self):
max_date_subq = session.query(Holding.account_id.label('account_id'),
  Holding.stock_id.label('stock_id'),
  
func.max(Holding.as_of).label('max_as_of')). \
  group_by(Holding.account_id, 
Holding.stock_id).subquery()

desired_query = session.query(Holding).join(Account,

Account.id==account.id).join(max_date_subq).\

filter(max_date_subq.c.account_id==account.id).\

filter(Holding.as_of==max_date_subq.c.max_as_of).\

filter(Holding.account_id==max_date_subq.c.account_id).\

filter(Holding.stock_id==max_date_subq.c.stock_id)

return desired_query.all()

def __init__(self, name):
self.name = name

class Stock(MySQLSettings, Base):
__tablename__ = 'stock'
id = Column(Integer, primary_key=True)
name = Column(String(64))

def __init__(self, name):
self.name = name

class Holding(MySQLSettings, Base):
__tablename__ = 'holding'
id = Column(Integer, primary_key=True)
account_id = Column(Integer, ForeignKey('account.id'), nullable=False)
stock_id = Column(Integer, ForeignKey('stock.id'), nullable=False)
quantity = Column(REAL)
as_of = Column(Date)

stock = relationship('Stock')

def __str__(self):
return Holding(%f, '%s' '%s')%(self.quantity, self.stock.name, 
str(self.as_of))

def __init__(self, account, stock, quantity, as_of):
self.account_id = account.id
self.stock_id = stock.id
self.quantity = quantity
self.as_of = as_of

if __name__ == __main__:
ibm = Stock('ibm')
session.add(ibm)
account = Account('a')
session.add(account)
session.flush()
session.add_all([ Holding(account, ibm, 100, datetime.date(2001, 1, 1)),
  Holding(account, ibm, 200, datetime.date(2001, 1, 3)),
  Holding(account, ibm, 300, datetime.date(2001, 1, 5)) ])
session.commit()

print All holdings by relation:\n\t, \
string.join([ str(h) for h in account.all_holdings ], \n\t)

print Desired holdings query:\n\t, \
string.join([ str(h) for h in account.desired_holdings() ], \n\t)

The results when run are:

All holdings by relation:
Holding(100.00, 'ibm' '2001-01-01')
Holding(200.00, 'ibm' '2001-01-03')
Holding(300.00, 'ibm' '2001-01-05')
Desired holdings query:
Holding(300.00, 'ibm' '2001-01-05')

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sqlalchemy/-/KKB3-3r5kSAJ.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Re: PostgreSQL hstore custom type?

2012-04-30 Thread Adam Tauno Williams
On Sun, 2012-04-15 at 05:32 -0700, Kyle Schaffrick wrote:
 Wow, a blast from the past! :)  

Yes.

 Which version of SQLAlchemy does your new version work with? It's
 great that you are now able to get rid of the HStoreComparator, that
 really makes it easier to use. The version of SQLA I was using 2 years
 ago when I wrote it required the custom comparator to make the
 operators work on the ORM InstrumentedAttribute as well as on the Core
 Column. I suppose the new version doesn't require this duplication?

I've gone back through the thread and I can't find a link to your work
[regarding the hstore implementation].  Can you post one?

 On your question, I think that PostgreSQL's hstore supports unicode
 just fine; the keys and values are of type text. It is just a bug in
 my code that it does not work. I think the main issue would be to
 replace .encode('string_escape') in _serialize_hstore()
 and .decode('string_escape') in _parse_hstore() with something that
 will do the same job but work for both unicode and string, and to make
 sure the regexes are unicode safe (or replace them with a proper
 parser).



signature.asc
Description: This is a digitally signed message part


Re: [sqlalchemy] Appending to an expired relationship list

2012-04-30 Thread Michael Bayer

On Apr 30, 2012, at 1:08 AM, Dan Callaghan wrote:

 
 Shouldn't calling .append() or any other method on an expired 
 relationship list at least raise an exception, given that the list has 
 been discarded?


 Could SQLAlchemy expire the relationship list without actually replacing 
 the list instance?


This is certainly a case I've never seen before.

The collection internals currently don't try to prevent collections from being 
used once they are no longer associated with a parent.   This wouldn't really 
work generically, as it's entirely valid to replace a collection with another, 
then continue appending to the detached collection.Most of the avenues that 
were at first apparent here fall under the category of a post-detach append 
being disallowed or otherwise warning. There's an inconsistency regarding 
when a collection is replaced with another (it's explicitly unlinked from the 
internal CollectionAdapter which maintains ORM events, so that it no longer 
generates events), versus when the collection is expired and just discarded 
(nothing is done, the CollectionAdapter remains and essentially passes invalid 
events if the collection continues to be used).

But really here we're talking about a third state - the collection is still 
associated with its CollectionAdapter, and the CollectionAdapter is in some 
kind of illegal to use state.Meaning the above inconsistency is really 
not inconsistent at all, it's just a three-state system - linked, unlinked, and 
invalid.

The empty the list case is probably not possible at all without major 
backwards-incompatible changes.  SQLAlchemy indicates an attribute as expired 
by the fact that the attribute's key is no longer in obj.__dict__.   With no 
value present, this expiration case indicates that a new collection should be 
loaded on next access from the database.

Another chief concern when attempting to change this behavior is one of 
latency.Expiration and append events both occur in large numbers, so any 
function calls added to these adds palpable latency.

The invalidate-on-expire use case is at ticket 
http://www.sqlalchemy.org/trac/ticket/2476 and is organized for the 
three-state idea, and shouldn't add latency in most cases.   This is for 0.8 
only.

 
 Is there some other way we could avoid this kind of problem?

Personally I never use the begin_nested()/check/rollback pattern, it's messy 
and does a lot of work.  commit() and rollback() have network, database, and 
in-python overhead, Python raising exceptions has lots of overhead, running a 
flush() and generating an INSERT that's thrown away has lots of overhead.  The 
way this method is structured, it's guaranteed to be quite slow whether or not 
the object exists. The expiration that commit/rollback does is very 
expensive to subsequent operations too; in 0.8 we have improved this so that 
only objects that were actually made dirty within the begin_nested() are 
expired, but until then if you have expire_on_commit enabled you'll be seeing 
everything else in the session reloading itself on next access after each of 
these append() operations.

I always use recipes that involve loading the object first, either individually 
or by populating a collection up front before a large sequence of operations.   
  A simple recipe that I often use (or a variant) is UniqueObject:  
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] turn filtered query with subquery into relationship

2012-04-30 Thread Michael Bayer

On Apr 30, 2012, at 10:54 AM, dan wrote:

 Dear List,
 
 Hoping for help with following unanswered stackoverflow question. If the 
 approach is not the best, that feedback is good as well.
 In the case below is it better to just not use relationships and just use the 
 method with the query?
 
 Thanks
 
 
 In the code below I want to replace *all_holdings* in Account with a property 
 called holdings that returns the *desired_holdings* (which are the holdings 
 representing the latest known quantity which can change over time). I'm 
 having trouble figuring out how to construct the call to relationship.
 
 

The desired_holdings() query is pretty complicated and I'm not seeing a win by 
trying to get relationship() to do it.  relationship() is oriented towards 
maintaining the persistence between two classes, not as much a reporting 
technique (and anything with max()/group_by in it is referring to reporting).

I would stick @property on top of desired_holdings, use object_session(self) to 
get at session, and be done.

This use case is described at 
http://docs.sqlalchemy.org/en/rel_0_7/orm/relationships.html#building-query-enabled-properties
 .





 In addition any comments on the appropriateness of the pattern (keeping 
 historic data in a single table and using a max date subquery to get most 
 recent), better alternatives, improvements on the query appreciated.
 
 from sqlalchemy import Column, Integer, String, Date, DateTime, REAL, 
 ForeignKey, func
 from sqlalchemy.orm import relationship, aliased
 from sqlalchemy.sql.operators import and_, eq
 from sqlalchemy.ext.declarative import declarative_base
 from db import session
 import datetime
 import string
 
 Base = declarative_base()
 
 class MySQLSettings(object):
 __table_args__ = {'mysql_engine':'InnoDB'}
 
 class Account(MySQLSettings, Base):
 __tablename__ = 'account'
 id = Column(Integer, primary_key=True)
 name = Column(String(64))
 all_holdings = relationship('Holding', backref='account')
 
 def desired_holdings(self):
 max_date_subq = session.query(Holding.account_id.label('account_id'),
   Holding.stock_id.label('stock_id'),
   
 func.max(Holding.as_of).label('max_as_of')). \
   group_by(Holding.account_id, 
 Holding.stock_id).subquery()
 
 desired_query = session.query(Holding).join(Account,
 
 Account.id==account.id).join(max_date_subq).\
 
 filter(max_date_subq.c.account_id==account.id).\
 
 filter(Holding.as_of==max_date_subq.c.max_as_of).\
 
 filter(Holding.account_id==max_date_subq.c.account_id).\
 
 filter(Holding.stock_id==max_date_subq.c.stock_id)
 
 return desired_query.all()
 
 def __init__(self, name):
 self.name = name
 
 class Stock(MySQLSettings, Base):
 __tablename__ = 'stock'
 id = Column(Integer, primary_key=True)
 name = Column(String(64))
 
 def __init__(self, name):
 self.name = name
 
 class Holding(MySQLSettings, Base):
 __tablename__ = 'holding'
 id = Column(Integer, primary_key=True)
 account_id = Column(Integer, ForeignKey('account.id'), nullable=False)
 stock_id = Column(Integer, ForeignKey('stock.id'), nullable=False)
 quantity = Column(REAL)
 as_of = Column(Date)
 
 stock = relationship('Stock')
 
 def __str__(self):
 return Holding(%f, '%s' '%s')%(self.quantity, self.stock.name, 
 str(self.as_of))
 
 def __init__(self, account, stock, quantity, as_of):
 self.account_id = account.id
 self.stock_id = stock.id
 self.quantity = quantity
 self.as_of = as_of
 
 if __name__ == __main__:
 ibm = Stock('ibm')
 session.add(ibm)
 account = Account('a')
 session.add(account)
 session.flush()
 session.add_all([ Holding(account, ibm, 100, datetime.date(2001, 1, 1)),
   Holding(account, ibm, 200, datetime.date(2001, 1, 3)),
   Holding(account, ibm, 300, datetime.date(2001, 1, 5)) ])
 session.commit()
 
 print All holdings by relation:\n\t, \
 string.join([ str(h) for h in account.all_holdings ], \n\t)
 
 print Desired holdings query:\n\t, \
 string.join([ str(h) for h in account.desired_holdings() ], \n\t)
 The results when run are:
 
 All holdings by relation:
 Holding(100.00, 'ibm' '2001-01-01')
 Holding(200.00, 'ibm' '2001-01-03')
 Holding(300.00, 'ibm' '2001-01-05')
 Desired holdings query:
 Holding(300.00, 'ibm' '2001-01-05')
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 

[sqlalchemy] alembic 0.3.2 released

2012-04-30 Thread Michael Bayer
Hey lists -

I've just put out Alembic 0.3.2.

This version features initial support for Oracle, and some bug fixes.

There's plenty more to do with Alembic so keep those pull requests coming in !  
Thanks all for the help on this project.

Alembic 0.3.2:

http://pypi.python.org/pypi/alembic/


- [feature] Basic support for Oracle added, 
  courtesy shgoh. #40

- [feature] Added support for UniqueConstraint
  in autogenerate, courtesy Atsushi Odagiri

- [bug] Fixed support of schema-qualified 
  ForeignKey target in column alter operations,
  courtesy Alexander Kolov.

- [bug] Fixed bug whereby create_unique_constraint()
  would include in the constraint columns that
  are added to all Table objects using events,
  externally to the generation of the constraint.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



[sqlalchemy] how to filter a timestamp field by date ?

2012-04-30 Thread Jonathan Vanasco

I this class/table

class Posting:
id INT
timestamp_created TIMESTAMP

i'm trying to figure out how to execute this sql

SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ;

i can't seem to find a date operator in the api docs.  i swear i've
seen one before.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] how to filter a timestamp field by date ?

2012-04-30 Thread David Bowser
You're looking for func which generates SQL function expressions. 
See: http://docs.sqlalchemy.org/en/latest/core/expression_api.html



For your example:

from sqlalchemy.sql.expression import func

session.query(Posting).filter(func.date(Posting.timestamp_created)==func.date(my_var)).all()


On Apr 30, 2012, at 5:20 PM, Jonathan Vanasco jonat...@findmeon.com wrote:

 
 I this class/table
 
 class Posting:
id INT
timestamp_created TIMESTAMP
 
 i'm trying to figure out how to execute this sql
 
 SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ;
 
 i can't seem to find a date operator in the api docs.  i swear i've
 seen one before.
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 To unsubscribe from this group, send email to 
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/sqlalchemy?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] how to filter a timestamp field by date ?

2012-04-30 Thread Michael Bayer
I'd also add that if your TIMESTAMP is indexed, you'll want to compare it using 
BETWEEN date 00:00:00  / date 23:59:59 (pseudocode) for better efficiency.


On Apr 30, 2012, at 5:44 PM, David Bowser wrote:

 You're looking for func which generates SQL function expressions. 
 See: http://docs.sqlalchemy.org/en/latest/core/expression_api.html
 
 
 
 For your example:
 
 from sqlalchemy.sql.expression import func
 
 session.query(Posting).filter(func.date(Posting.timestamp_created)==func.date(my_var)).all()
 
 
 On Apr 30, 2012, at 5:20 PM, Jonathan Vanasco jonat...@findmeon.com wrote:
 
 
 I this class/table
 
 class Posting:
   id INT
   timestamp_created TIMESTAMP
 
 i'm trying to figure out how to execute this sql
 
 SELECT * FROM posting WHERE date(timestamp_created) = date(%s) ;
 
 i can't seem to find a date operator in the api docs.  i swear i've
 seen one before.
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 To unsubscribe from this group, send email to 
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/sqlalchemy?hl=en.
 
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To post to this group, send email to sqlalchemy@googlegroups.com.
 To unsubscribe from this group, send email to 
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/sqlalchemy?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



[sqlalchemy] Re: how to filter a timestamp field by date ?

2012-04-30 Thread Jonathan Vanasco

thanks to you both!

i managed to just now figure out how to do it with sql.expression.cast
as well :

filter( 
sqlalchemy.sql.expression.cast(model.core.Posting.timestamp_created,sqlalchemy.types.Date)
==  self.request.app_meta.datetime.date() )

the docs on the .func were confusing at first.  now i understand how
it does that.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.