Re: [sqlalchemy] Curious Session.identity_map behavior (using psycopg2)

Michael Bayer Sun, 03 Oct 2010 21:02:51 -0700

On Oct 3, 2010, at 11:09 PM, Graham Lowe wrote:

> 
> >>> [(t.i, t.s) for t in session.query(T)]
> [(1, u'a'), (2, u'b')]
> >>> session.identity_map
> {(<class '__main__.T'>, (2,)): <sqlalchemy.orm.state.InstanceState
> object at 0x1c47a90>}


As your list comprehension completes, only the most recent value remains 
strongly referenced.  Python's garbage collection has kicked in and cleaned 
out the first T instance, so only the second remains.
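
A minimal sketch of that effect using only the standard library.  The names 
T, identity_map and load_rows here are stand-ins invented for illustration, 
not SQLAlchemy's internals, and the timing relies on CPython's reference 
counting.  An explicit for loop is used so that the loop variable keeps the 
last object alive, which is presumably what the Python 2 list comprehension 
above does:

```python
import gc
import weakref


class T:
    """Stand-in for a mapped class; not SQLAlchemy's actual machinery."""

    def __init__(self, i, s):
        self.i = i
        self.s = s


# Hypothetical stand-in for Session.identity_map: values are weakly referenced.
identity_map = weakref.WeakValueDictionary()


def load_rows():
    """Yield one T per row, registering each in the weak identity map."""
    for i, s in [(1, "a"), (2, "b")]:
        obj = T(i, s)
        identity_map[(T, (i,))] = obj
        yield obj


pairs = []
for t in load_rows():
    pairs.append((t.i, t.s))

gc.collect()  # not needed on CPython (refcounting), harmless elsewhere
# Only the object still bound to the loop variable `t` is left in the map.
keys_after_loop = sorted(pk for _cls, pk in identity_map.keys())

del t  # drop the last strong reference
gc.collect()
keys_after_del = sorted(pk for _cls, pk in identity_map.keys())

print(pairs, keys_after_loop, keys_after_del)
```

Once the name t is deleted, nothing references the second object either, and 
the map empties out.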

> The identity_map is holding onto the tail of this list comprehension.
> This leads to some really weird results:
> 
> In psql:
> update t set s='c';
> 
> >>> [(t.i, t.s) for t in session.query(T)]
> [(1, u'c'), (2, u'b')]
> 
> As an end-user, I would either expect both items to be updated or
> neither of them to be updated.

You should expect only objects that are strongly referenced outside the scope 
of the session, or that have pending changes established on them before being 
dereferenced, to remain in the identity map.   Objects that lose their last 
reference are garbage collected, and the Session cleans out its accounting for 
otherwise unchanged items when weakref callbacks alert it to this activity.  
This is how a single session can watch hundreds of thousands of persistent 
objects go by without the application's memory growing beyond the size of the 
collections maintained outside of the session, assuming the session is 
regularly flushed (which is why autoflush=True is the default).
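
The rule that pending changes keep an object in the map can be modeled 
roughly as below.  This is a toy under stated assumptions: MiniSession, Row, 
add_loaded, set_value and the _modified dict are hypothetical names, and the 
real Session tracks this differently in detail.  The sketch only shows weak 
references for clean objects and strong references for changed ones until a 
flush:

```python
import gc
import weakref


class Row:
    """Hypothetical persistent object with plain attributes."""

    def __init__(self, pk, value):
        self.pk = pk
        self.value = value


class MiniSession:
    """Toy model: weak identity map plus strong refs for unflushed changes."""

    def __init__(self):
        self.identity_map = weakref.WeakValueDictionary()
        self._modified = {}  # strong references, held until flush()

    def add_loaded(self, obj):
        self.identity_map[obj.pk] = obj

    def set_value(self, obj, value):
        obj.value = value
        self._modified[obj.pk] = obj  # a pending change pins the object

    def flush(self):
        # A real flush would emit UPDATE statements at this point.
        flushed = sorted(self._modified)
        self._modified.clear()
        return flushed


session = MiniSession()

clean = Row(1, "a")
dirty = Row(2, "b")
session.add_loaded(clean)
session.add_loaded(dirty)
session.set_value(dirty, "changed")

del clean, dirty  # the application drops both of its references
gc.collect()
before_flush = sorted(session.identity_map.keys())  # unchanged row 1 is gone

session.flush()
gc.collect()
after_flush = sorted(session.identity_map.keys())  # nothing pins row 2 now

print(before_flush, after_flush)
```

Flushing regularly is what lets the changed objects be released as well, so 
memory follows only what the application itself holds on to.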

If you weren't using autocommit=True, a transaction would remain open for as 
long as you worked with T objects in the session.  If you then used a 
SERIALIZABLE transaction with PG, that would prevent the inconsistent read 
from occurring.   Allowing transaction isolation to be useful is why 
autocommit=False is the default and why we don't talk much about 
autocommit=True.

Once you call session.rollback() or session.commit(), the state of each T 
still in memory is expired, and you'd see the 'c' value for both T objects 
when you re-access them in the subsequent transaction, which starts the moment 
you query again.   In this way, the Session doesn't have to reinvent 
transactional concurrency behavior, or make continuous, expensive, and 
complicated guesses as to when it should expire which attributes.  It simply 
assumes the greatest data consistency, that of a SERIALIZABLE transaction, and 
has only simple decisions to make regarding attribute state.

That doesn't mean you have to use SERIALIZABLE.  You'd typically use it if you 
expect to deal with individual rows in a concurrent fashion.   Otherwise the 
default of READ COMMITTED is fine for most needs.
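
To illustrate expiration on commit(), here is a rough standard-library model 
built on sqlite3.  Record, MiniSession, load and query_all are hypothetical 
names made up for this sketch, and it holds every object strongly, so unlike 
the session quoted above both rows stay stale until the commit.  It does not 
model isolation levels at all:

```python
import sqlite3


class Record:
    """Hypothetical mapped object whose loaded state can be expired."""

    def __init__(self, session, pk):
        self._session = session
        self.pk = pk
        self._state = None  # None means "expired": reload on next access

    @property
    def s(self):
        if self._state is None:
            self._state = self._session.load(self.pk)
        return self._state


class MiniSession:
    def __init__(self, conn):
        self.conn = conn
        self.identity_map = {}  # strong refs, to keep the sketch short

    def load(self, pk):
        row = self.conn.execute("SELECT s FROM t WHERE i = ?", (pk,)).fetchone()
        return row[0]

    def query_all(self):
        result = []
        rows = self.conn.execute("SELECT i, s FROM t ORDER BY i").fetchall()
        for pk, s in rows:
            obj = self.identity_map.get(pk)
            if obj is None:
                obj = self.identity_map[pk] = Record(self, pk)
            if obj._state is None:
                obj._state = s  # populate only unloaded or expired objects
            result.append(obj)
        return result

    def commit(self):
        self.conn.commit()
        for obj in self.identity_map.values():
            obj._state = None  # expire everything still in memory


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (i INTEGER PRIMARY KEY, s TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

session = MiniSession(conn)
first = [(r.pk, r.s) for r in session.query_all()]

conn.execute("UPDATE t SET s = 'c'")  # stands in for the UPDATE run in psql
stale = [(r.pk, r.s) for r in session.query_all()]  # loaded state is kept

session.commit()
fresh = [(r.pk, r.s) for r in session.query_all()]  # expired, so reloaded

print(first, stale, fresh)
```

The point of the design is that the only decision this model makes about 
attribute state is at the transaction boundary; it never guesses in between.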