On Oct 3, 2010, at 11:09 PM, Graham Lowe wrote:

> >>> [(t.i, t.s) for t in session.query(T)]
> [(1, u'a'), (2, u'b')]
> >>> session.identity_map
> {(<class '__main__.T'>, (2,)): <sqlalchemy.orm.state.InstanceState
> object at 0x1c47a90>}

As your list comprehension completes, only the most recent value remains strongly referenced. Python garbage collection has kicked in and cleaned out the first T instance; only the second remains.

> The identity_map is holding onto the tail of this list comprehension.
> This leads some really weird results:
>
> In psql:
> update t set s='c';
>
> >>> [(t.i, t.s) for t in session.query(T)]
> [(1, u'c'), (2, u'b')]
>
> As an end-user, I would either expect both items to be updated or
> neither of them to be updated.

You should expect only that which is strongly referenced outside of the scope of the session, or which has pending changes established upon it before being dereferenced, to remain in the identity map. Objects that lose their reference are garbage collected, and the Session cleans out its accounting for otherwise unchanged items when weakref callbacks alert it to this activity. This is how a single session can watch hundreds of thousands of persistent objects go by without the memory size of the application growing any more than the size of collections maintained outside of the session, assuming the session is regularly flushed (hence autoflush=True being the default in that regard).

If you weren't using autocommit=True, a transaction would remain open throughout the scope of your working with T objects in the session. If you were to then use a SERIALIZABLE transaction with PG, this would prevent the inconsistent read from occurring. Allowing transaction isolation to be useful is why autocommit=False is the default and we don't talk much about autocommit=True. Once you call session.rollback() or session.commit(), the state of each T still in memory is expired, and you'd see the 'c' value for both T objects after you re-access them in the subsequent transaction, which starts up the moment you query again.
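
The effect of keeping a transaction open can be tried without a PostgreSQL server. What follows is a rough sketch, assuming Python's built-in sqlite3 module and a temporary SQLite file in WAL mode as a stand-in. SQLite's snapshot reads are not the same thing as PostgreSQL's SERIALIZABLE, and no SQLAlchemy Session is involved, so it only illustrates the general idea: a stable view inside a transaction, fresh data once it ends. The table and values mirror the example above; the `writer` and `reader` names are made up for illustration.

```python
import os
import sqlite3
import tempfile

QUERY = "SELECT i, s FROM t ORDER BY i"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.db")

    # isolation_level=None: the sqlite3 module issues no implicit BEGIN,
    # so transactions are controlled explicitly below.
    writer = sqlite3.connect(path, isolation_level=None)
    # Assumption: WAL mode, so a reader's open transaction does not block
    # the writer and keeps seeing its own snapshot.
    writer.execute("PRAGMA journal_mode=WAL").fetchone()
    writer.execute("CREATE TABLE t (i INTEGER PRIMARY KEY, s TEXT)")
    writer.execute("INSERT INTO t (i, s) VALUES (1, 'a'), (2, 'b')")

    reader = sqlite3.connect(path, isolation_level=None)

    reader.execute("BEGIN")                  # transaction stays open
    before = reader.execute(QUERY).fetchall()

    # A concurrent change, like the "update t set s='c'" issued from psql.
    writer.execute("UPDATE t SET s = 'c'")

    during = reader.execute(QUERY).fetchall()  # same transaction, same view
    reader.execute("COMMIT")                   # transaction ends

    after = reader.execute(QUERY).fetchall()   # new read sees the update

    reader.close()
    writer.close()

print(before, during, after)
```

Here `before` and `during` both come back as the 'a'/'b' rows, and only the read made after the COMMIT picks up 'c' for both rows.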

In this way, the Session doesn't have to reinvent transactional concurrency behavior, or make continuous expensive and complicated guesses as to when it should expire which attributes. It very simply assumes the greatest data consistency with a SERIALIZABLE transaction, and has only simple decisions to make regarding attribute state.

That doesn't mean you have to use SERIALIZABLE. You'd typically use it if you expect to deal with individual rows in a concurrent fashion. Otherwise the default of READ COMMITTED is usually fine for most needs.
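
The garbage-collection behavior described earlier in this reply can be imitated with the standard library alone. This is a minimal sketch, assuming a `weakref.WeakValueDictionary` as a stand-in for the identity map; `Row` and `load_rows` are hypothetical names, and this is not how the Session is actually implemented. In the sketch, the loop variable `r` is what keeps the last object strongly referenced.

```python
import gc
import weakref


class Row:
    """Hypothetical stand-in for a mapped class such as T."""

    def __init__(self, i, s):
        self.i = i
        self.s = s


# Assumption: a WeakValueDictionary plays the role of the identity map.
# An entry vanishes once nothing else references its value.
identity_map = weakref.WeakValueDictionary()


def load_rows(data):
    """Yield one Row per (i, s) pair, registering it in identity_map."""
    for i, s in data:
        row = identity_map.get(i)
        if row is None:
            row = Row(i, s)
            identity_map[i] = row
        yield row


values = []
for r in load_rows([(1, "a"), (2, "b")]):
    values.append((r.i, r.s))

# Only the Row still bound to the loop variable r is strongly referenced.
gc.collect()  # redundant on CPython, kept for other interpreters
tail_only = sorted(identity_map.keys())

# Dropping the last strong reference empties the map.
del r
gc.collect()
after_del = sorted(identity_map.keys())

print(values, tail_only, after_del)
```

Keeping the rows in a list, for example `rows = list(load_rows(...))`, would hold every entry in the map for as long as that list is alive.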