Re: [sqlalchemy] Unexpected behavior of is_modified()
On Sep 27, 2010, at 6:15 PM, Ted Turocy wrote: Hello, I am puzzled by some behavior of is_modified() I am seeing. Here is a sample program which illustrates the situation. I'm running this on 0.6.4: This is a limitation of is_modified(), and it is definitely a bug here that this limitation is not documented (ticket 1928). Its a serious performance hit for us to load the old value for a simple scalar attribute when a set event occurs. In this test, all attributes are expired after each commit(). In the second case, the test does not access instance.cans ahead of time, so when 42 is assigned, the set event is registered, the old value is not loaded, and history-wise we see a value of 42 with a constant of NO_VALUE to compare it against, so it counts as history. If the is_modified() method and in turn the attributes.get_history() facility were to be enhanced such that the attribute history function could reach upwards to the ORM's un-expiration functionality and load the old value of the attribute as though it were being accessed, then the value would be present for a true compare - however this is not functionality the ORM internals use right now. Scalar attributes just get bundled into the UPDATE statement if we aren't sure what the previous value was, rather than going back and issuing another SELECT. This again is much more performant in the vast majority of use cases, though certainly not all. A workaround is to enable full history tracking for the attribute, so that the old value is loaded upon a set event. This is not public API at the moment but it would look like: Spam.cans.impl.active_history = True or, set expire_on_commit=False on your session. The active_history feature should become public API at some point. The alternative notion of modifying attributes.get_history() to get the old value to load from a SELECT at the point of history, rather than upon attribute set, would be doable but would require more intensive changes, as we'd have to break out the load the old value behavior into a separate function that calls upon loader callables in such a way that they know not to populate the current instance dictionary. The feature would still not be ideal as a default within the ORM since you'd see many more SELECTs being emitted within a typical flush() against a set of persistent, modified objects, most of which are unnecessary. - import sqlalchemy import sqlalchemy.orm from sqlalchemy import Table, Column, Integer class Spam(object): def __init__(self, cans): self.cans = cans engine = sqlalchemy.create_engine('sqlite:///:memory:') Session = sqlalchemy.orm.sessionmaker(autocommit=False, bind=engine) session = Session() meta = sqlalchemy.MetaData() meta.bind = engine spam_table = Table('spam', meta, Column('id', Integer, primary_key=True), Column('cans', Integer) ) meta.create_all(engine) sqlalchemy.orm.mapper(Spam, spam_table) # Setup finished. Now test things out instance = Spam(42) session.add(instance) session.commit() print Cans of spam = , instance.cans instance.cans = 42 print Has instance been modified? (Should be False): , session.is_modified(instance) session.commit() instance.cans = 42 print We have made no net change to the instance... print Has instance been modified? (Should be False): , session.is_modified(instance) - When I run this, I get: - Cans of spam = 42 Has instance been modified? (Should be False): False We have made no net change to the instance... Has instance been modified? (Should be False): True - That's surprising to me. In both cases, no net change is being made to the instance. Noodling around further, it appears that this is triggered by re-using the same instance object, and doing another __setattr__ on the column without having done a __getattr__ on it earlier. I suspect, without having dug around, that after the commit, the instance's attribute values are marked as needing to be re-loaded, but they are not being re-loaded first in order to check whether the new value is in fact different than the existing one. I've also noticed if I re-obtain the instance via a query using .get() on the primary key, the behavior is the expected one, again presumably because the query operation is loading in all the column values. Am I missing something here? Ted -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to
Re: [sqlalchemy] Unexpected behavior of is_modified()
On Sep 27, 2010, at 6:53 PM, Michael Bayer wrote: On Sep 27, 2010, at 6:15 PM, Ted Turocy wrote: Hello, I am puzzled by some behavior of is_modified() I am seeing. Here is a sample program which illustrates the situation. I'm running this on 0.6.4: This is a limitation of is_modified(), and it is definitely a bug here that this limitation is not documented (ticket 1928). Its a serious performance hit for us to load the old value for a simple scalar attribute when a set event occurs. In this test, all attributes are expired after each commit(). In the second case, the test does not access instance.cans ahead of time, so when 42 is assigned, the set event is registered, the old value is not loaded, and history-wise we see a value of 42 with a constant of NO_VALUE to compare it against, so it counts as history. If the is_modified() method and in turn the attributes.get_history() facility were to be enhanced such that the attribute history function could reach upwards to the ORM's un-expiration functionality and load the old value of the attribute as though it were being accessed, then the value would be present for a true compare - however this is not functionality the ORM internals use right now. Scalar attributes just get bundled into the UPDATE statement if we aren't sure what the previous value was, rather than going back and issuing another SELECT. This again is much more performant in the vast majority of use cases, though certainly not all. A workaround is to enable full history tracking for the attribute, so that the old value is loaded upon a set event. This is not public API at the moment but it would look like: Spam.cans.impl.active_history = True or, set expire_on_commit=False on your session. The active_history feature should become public API at some point. The alternative notion of modifying attributes.get_history() to get the old value to load from a SELECT at the point of history, rather than upon attribute set, would be doable but would require more intensive changes, as we'd have to break out the load the old value behavior into a separate function that calls upon loader callables in such a way that they know not to populate the current instance dictionary. The feature would still not be ideal as a default within the ORM since you'd see many more SELECTs being emitted within a typical flush() against a set of persistent, modified objects, most of which are unnecessary. just another thought, thinking about this, if we wanted to get into hey let's refactor attributes yet again, I suppose we could hold onto the old value when we expire attributes, probably in the committed_state dictionary or something else. Not a change I'm planning on anytime soon, it would destablize everything and it still wouldn't work in all cases, but something to think about. A long time ago we tried making attributes hold onto their history and support rollback in place...it had lots of impossible edge cases. - import sqlalchemy import sqlalchemy.orm from sqlalchemy import Table, Column, Integer class Spam(object): def __init__(self, cans): self.cans = cans engine = sqlalchemy.create_engine('sqlite:///:memory:') Session = sqlalchemy.orm.sessionmaker(autocommit=False, bind=engine) session = Session() meta = sqlalchemy.MetaData() meta.bind = engine spam_table = Table('spam', meta, Column('id', Integer, primary_key=True), Column('cans', Integer) ) meta.create_all(engine) sqlalchemy.orm.mapper(Spam, spam_table) # Setup finished. Now test things out instance = Spam(42) session.add(instance) session.commit() print Cans of spam = , instance.cans instance.cans = 42 print Has instance been modified? (Should be False): , session.is_modified(instance) session.commit() instance.cans = 42 print We have made no net change to the instance... print Has instance been modified? (Should be False): , session.is_modified(instance) - When I run this, I get: - Cans of spam = 42 Has instance been modified? (Should be False): False We have made no net change to the instance... Has instance been modified? (Should be False): True - That's surprising to me. In both cases, no net change is being made to the instance. Noodling around further, it appears that this is triggered by re-using the same instance object, and doing another __setattr__ on the column without having done a __getattr__ on it earlier. I suspect, without having dug around, that after the commit, the instance's attribute values are marked as needing to be re-loaded, but they are not being re-loaded first in order to check whether the new value is in fact different than the existing one. I've also noticed if I re-obtain the instance via a query using .get() on the primary key, the behavior is the expected