Re: [sqlalchemy] Unexpected behavior of is_modified()

2010-09-27 Thread Michael Bayer

On Sep 27, 2010, at 6:15 PM, Ted Turocy wrote:

 Hello,
 
 I am puzzled by some behavior of is_modified() I am seeing.
 
 Here is a sample program which illustrates the situation.  I'm running
 this on 0.6.4:

This is a limitation of is_modified(), and it is definitely a bug here that 
this limitation is not documented (ticket 1928).

Its a serious performance hit for us to load the old value for a simple 
scalar attribute when a set event occurs.   In this test, all attributes are 
expired after each commit().  In the second case, the test does not access 
instance.cans ahead of time, so when 42 is assigned, the set event is 
registered, the old value is not loaded, and history-wise we see a value of 
42 with a constant of NO_VALUE to compare it against, so it counts as history.

If the is_modified() method and in turn the attributes.get_history() facility 
were to be enhanced such that the attribute history function could reach 
upwards to the ORM's un-expiration functionality and load the old value of 
the attribute as though it were being accessed, then the value would be present 
for a true compare - however this is not functionality the ORM internals use 
right now.  Scalar attributes just get bundled into the UPDATE statement if we 
aren't sure what the previous value was, rather than going back and issuing 
another SELECT.   This again is much more performant in the vast majority of 
use cases, though certainly not all.

A workaround is to enable full history tracking for the attribute, so that the 
old value is loaded upon a set event.   This is not public API at the moment 
but it would look like:

Spam.cans.impl.active_history = True

or, set expire_on_commit=False on your session.

The active_history feature should become public API at some point.   The 
alternative notion of modifying attributes.get_history() to get the old value 
to load from a SELECT at the point of history, rather than upon attribute set, 
would be doable but would require more intensive changes, as we'd have to break 
out the load the old value behavior into a separate function that calls upon 
loader callables in such a way that they know not to populate the current 
instance dictionary.   The feature would still not be ideal as a default within 
the ORM since you'd see many more SELECTs being emitted within a typical 
flush() against a set of persistent, modified objects, most of which are 
unnecessary.



 
 -
 import sqlalchemy
 import sqlalchemy.orm
 from sqlalchemy import Table, Column, Integer
 
 class Spam(object):
def __init__(self, cans):
self.cans = cans
 
 engine = sqlalchemy.create_engine('sqlite:///:memory:')
 Session = sqlalchemy.orm.sessionmaker(autocommit=False, bind=engine)
 session = Session()
 meta = sqlalchemy.MetaData()
 meta.bind = engine
 
 spam_table = Table('spam', meta,
   Column('id', Integer, primary_key=True),
   Column('cans', Integer)
   )
 
 meta.create_all(engine)
 
 sqlalchemy.orm.mapper(Spam, spam_table)
 
 # Setup finished.  Now test things out
 instance = Spam(42)
 session.add(instance)
 session.commit()
 
 print Cans of spam = , instance.cans
 instance.cans = 42
 print Has instance been modified? (Should be False): ,
 session.is_modified(instance)
 session.commit()
 
 instance.cans = 42
 print We have made no net change to the instance...
 print Has instance been modified? (Should be False): ,
 session.is_modified(instance)
 -
 
 When I run this, I get:
 
 -
 Cans of spam =  42
 Has instance been modified? (Should be False):  False
 We have made no net change to the instance...
 Has instance been modified? (Should be False):  True
 -
 
 That's surprising to me. In both cases, no net change is being made to
 the instance.
 
 Noodling around further, it appears that this is triggered by re-using
 the same instance object, and doing another __setattr__ on the column
 without having done a __getattr__ on it earlier.  I suspect, without
 having dug around, that after the commit, the instance's attribute
 values are marked as needing to be re-loaded, but they are not being
 re-loaded first in order to check whether the new value is in fact
 different than the existing one.
 
 I've also noticed if I re-obtain the instance via a query using .get()
 on the primary key, the behavior is the expected one, again presumably
 because the query operation is loading in all the column values.
 
 Am I missing something here?
 
 Ted
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 sqlalchemy group.
 To post to this group, send email to sqlalch...@googlegroups.com.
 To unsubscribe from this group, send email to 
 sqlalchemy+unsubscr...@googlegroups.com.
 For more options, visit this group at 
 http://groups.google.com/group/sqlalchemy?hl=en.
 

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to 

Re: [sqlalchemy] Unexpected behavior of is_modified()

2010-09-27 Thread Michael Bayer

On Sep 27, 2010, at 6:53 PM, Michael Bayer wrote:

 
 On Sep 27, 2010, at 6:15 PM, Ted Turocy wrote:
 
 Hello,
 
 I am puzzled by some behavior of is_modified() I am seeing.
 
 Here is a sample program which illustrates the situation.  I'm running
 this on 0.6.4:
 
 This is a limitation of is_modified(), and it is definitely a bug here that 
 this limitation is not documented (ticket 1928).
 
 Its a serious performance hit for us to load the old value for a simple 
 scalar attribute when a set event occurs.   In this test, all attributes are 
 expired after each commit().  In the second case, the test does not access 
 instance.cans ahead of time, so when 42 is assigned, the set event is 
 registered, the old value is not loaded, and history-wise we see a value of 
 42 with a constant of NO_VALUE to compare it against, so it counts as 
 history.
 
 If the is_modified() method and in turn the attributes.get_history() facility 
 were to be enhanced such that the attribute history function could reach 
 upwards to the ORM's un-expiration functionality and load the old value of 
 the attribute as though it were being accessed, then the value would be 
 present for a true compare - however this is not functionality the ORM 
 internals use right now.  Scalar attributes just get bundled into the UPDATE 
 statement if we aren't sure what the previous value was, rather than going 
 back and issuing another SELECT.   This again is much more performant in the 
 vast majority of use cases, though certainly not all.
 
 A workaround is to enable full history tracking for the attribute, so that 
 the old value is loaded upon a set event.   This is not public API at the 
 moment but it would look like:
 
 Spam.cans.impl.active_history = True
 
 or, set expire_on_commit=False on your session.
 
 The active_history feature should become public API at some point.   The 
 alternative notion of modifying attributes.get_history() to get the old 
 value to load from a SELECT at the point of history, rather than upon 
 attribute set, would be doable but would require more intensive changes, as 
 we'd have to break out the load the old value behavior into a separate 
 function that calls upon loader callables in such a way that they know not to 
 populate the current instance dictionary.   The feature would still not be 
 ideal as a default within the ORM since you'd see many more SELECTs being 
 emitted within a typical flush() against a set of persistent, modified 
 objects, most of which are unnecessary.

just another thought, thinking about this, if we wanted to get into hey let's 
refactor attributes yet again, I suppose we could hold onto the old value 
when we expire attributes, probably in the committed_state dictionary or 
something else.   Not a change I'm planning on anytime soon, it would 
destablize everything and it still wouldn't work in all cases, but something to 
think about.   A long time ago we tried making attributes hold onto their 
history and support rollback in place...it had lots of impossible edge cases.


 
 
 
 
 -
 import sqlalchemy
 import sqlalchemy.orm
 from sqlalchemy import Table, Column, Integer
 
 class Spam(object):
   def __init__(self, cans):
   self.cans = cans
 
 engine = sqlalchemy.create_engine('sqlite:///:memory:')
 Session = sqlalchemy.orm.sessionmaker(autocommit=False, bind=engine)
 session = Session()
 meta = sqlalchemy.MetaData()
 meta.bind = engine
 
 spam_table = Table('spam', meta,
  Column('id', Integer, primary_key=True),
  Column('cans', Integer)
  )
 
 meta.create_all(engine)
 
 sqlalchemy.orm.mapper(Spam, spam_table)
 
 # Setup finished.  Now test things out
 instance = Spam(42)
 session.add(instance)
 session.commit()
 
 print Cans of spam = , instance.cans
 instance.cans = 42
 print Has instance been modified? (Should be False): ,
 session.is_modified(instance)
 session.commit()
 
 instance.cans = 42
 print We have made no net change to the instance...
 print Has instance been modified? (Should be False): ,
 session.is_modified(instance)
 -
 
 When I run this, I get:
 
 -
 Cans of spam =  42
 Has instance been modified? (Should be False):  False
 We have made no net change to the instance...
 Has instance been modified? (Should be False):  True
 -
 
 That's surprising to me. In both cases, no net change is being made to
 the instance.
 
 Noodling around further, it appears that this is triggered by re-using
 the same instance object, and doing another __setattr__ on the column
 without having done a __getattr__ on it earlier.  I suspect, without
 having dug around, that after the commit, the instance's attribute
 values are marked as needing to be re-loaded, but they are not being
 re-loaded first in order to check whether the new value is in fact
 different than the existing one.
 
 I've also noticed if I re-obtain the instance via a query using .get()
 on the primary key, the behavior is the expected