[sqlalchemy] Appending to an expired relationship list

2012-04-30 Thread Dan Callaghan
Dear list,

In our application we've run across what seems to be at least a serious 
gotcha, if not an actual bug, in SQLAlchemy. Please let me know if you 
agree, or if there is something I'm doing wrong.

Attached is a minimal reproducer (as minimal as I could manage anyway). 
The assertion at the end of the script fails with SQLAlchemy 0.6.8 and 
0.7.6. I'm using MySQL, although I don't think the database matters. 
(SQLite's nested transaction support is broken, so it won't work.)

The most important part is the lazy_create method, which is how we are 
avoiding race conditions when inserting into tables with a unique 
constraint. It will either insert a new row with the given unique column 
value(s), or select the existing row. (The IntegrityError it catches 
will be a unique constraint violation.)

@classmethod
def lazy_create(cls, **kwargs):
session.begin_nested()
try:
item = cls(**kwargs)
session.add(item)
session.commit()
except IntegrityError:
session.rollback()
item = session.query(cls).filter_by(**kwargs).one()
return item

The problem comes when we later do something like this:

task.packages.append(Package.lazy_create(name=u'asdf'))

If the Package with name 'asdf' already exists, this .append() call will 
have no effect. The Package instance is silently discarded from the 
task.packages list.

As we eventually discovered, the reason is that the nested rollback 
inside lazy_create causes task.packages to be expired. But the .append() 
method is still called on the expired list because of the way our 
statement is written. The expired list is discarded, and so the 
.append() has no effect.

In this particular case, we can rewrite the statement as:

package = Package.lazy_create(name=u'asdf')
task.packages.append(package)

which fixes the problem, but it worries me.

Shouldn't calling .append() or any other method on an expired 
relationship list at least raise an exception, given that the list has 
been discarded?

Could SQLAlchemy expire the relationship list without actually replacing 
the list instance? Then we wouldn't need to worry about whether we are 
accidentally holding onto references to expired lists which have been 
replaced out from underneath us.

Is there some other way we could avoid this kind of problem?

-- 
Dan Callaghan d...@djc.id.au

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



bz816879.py
Description: application/python


Re: [sqlalchemy] Appending to an expired relationship list

2012-04-30 Thread Michael Bayer

On Apr 30, 2012, at 1:08 AM, Dan Callaghan wrote:

 
 Shouldn't calling .append() or any other method on an expired 
 relationship list at least raise an exception, given that the list has 
 been discarded?


 Could SQLAlchemy expire the relationship list without actually replacing 
 the list instance?


This is certainly a case I've never seen before.

The collection internals currently don't try to prevent collections from being 
used once they are no longer associated with a parent.   This wouldn't really 
work generically, as it's entirely valid to replace a collection with another, 
then continue appending to the detached collection.Most of the avenues that 
were at first apparent here fall under the category of a post-detach append 
being disallowed or otherwise warning. There's an inconsistency regarding 
when a collection is replaced with another (it's explicitly unlinked from the 
internal CollectionAdapter which maintains ORM events, so that it no longer 
generates events), versus when the collection is expired and just discarded 
(nothing is done, the CollectionAdapter remains and essentially passes invalid 
events if the collection continues to be used).

But really here we're talking about a third state - the collection is still 
associated with its CollectionAdapter, and the CollectionAdapter is in some 
kind of illegal to use state.Meaning the above inconsistency is really 
not inconsistent at all, it's just a three-state system - linked, unlinked, and 
invalid.

The empty the list case is probably not possible at all without major 
backwards-incompatible changes.  SQLAlchemy indicates an attribute as expired 
by the fact that the attribute's key is no longer in obj.__dict__.   With no 
value present, this expiration case indicates that a new collection should be 
loaded on next access from the database.

Another chief concern when attempting to change this behavior is one of 
latency.Expiration and append events both occur in large numbers, so any 
function calls added to these adds palpable latency.

The invalidate-on-expire use case is at ticket 
http://www.sqlalchemy.org/trac/ticket/2476 and is organized for the 
three-state idea, and shouldn't add latency in most cases.   This is for 0.8 
only.

 
 Is there some other way we could avoid this kind of problem?

Personally I never use the begin_nested()/check/rollback pattern, it's messy 
and does a lot of work.  commit() and rollback() have network, database, and 
in-python overhead, Python raising exceptions has lots of overhead, running a 
flush() and generating an INSERT that's thrown away has lots of overhead.  The 
way this method is structured, it's guaranteed to be quite slow whether or not 
the object exists. The expiration that commit/rollback does is very 
expensive to subsequent operations too; in 0.8 we have improved this so that 
only objects that were actually made dirty within the begin_nested() are 
expired, but until then if you have expire_on_commit enabled you'll be seeing 
everything else in the session reloading itself on next access after each of 
these append() operations.

I always use recipes that involve loading the object first, either individually 
or by populating a collection up front before a large sequence of operations.   
  A simple recipe that I often use (or a variant) is UniqueObject:  
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.