Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 6:54 PM, Tim Kersten  wrote:

>  
> 
> This is an unusual use case because it seems like you'd like to outright 
> ignore the row if it doesn't match?  or are you throwing an exception if you 
> don't get the expected count?
> 
> Yes, I'm ignoring the row if it doesn't match. One use case for this: A 
> misbehaving piece of code set a "bad" value in a column in many rows. The 
> code was fixed, and thus when customers used that code in future the values 
> would be updated to "good" values. A data migration to automatically adjust 
> the "bad" values would have the potential to overwrite one that's just been 
> altered by a customer, and this is what I'm trying to avoid, with as minimal 
> an impact on the running system. I'd only like to avoid doing so while still 
> using the ORM and was wondering if there's a way to do this without having a 
> version column?

I can think of some potentially very exotic ways of injecting this behavior 
with cursor_execute() events, but it would be quite awkward.  The ORM still 
might complain when it sees zero rows updated.

But the real problem is if you want the ORM to update a row, then have it 
ignored, now you've totally got the wrong data in memory.  I'm not sure how 
your application is constructed that A. you want to use the ORM but B. you 
don't care if it has the wrong data loaded after it thinks it's synchronized.   
 If you just need some simplistic CRUD interface (e.g. object.save()) you can 
build those on top of Core pretty easily.   The ORM has a specific usage model 
and this goes counter to that, hence there's a Core.

Keep in mind you can use a timestamp as a versioning column.   Seems like your 
use case is straightforward, only update rows that haven't been changed before 
timestamp X.  if your rows have a trigger or similar that does an "updated_at" 
column then you'd be able to target those rows.




>  
> 
> The version_id feature will basically throw an exception if the row count is 
> not what's expected.   Other than using repeatable read which is the best 
> approach, it's the best way to prevent writing a stale record with the ORM.
> 
> Indeed. If I'd have it on the table in question I wouldn't have the problem 
> I'm facing now :)
>  
> 
>> Yes. Above I combined an extra attribute 'name' with the primary key in the 
>> WHERE clause of the update statement to ensure that 'name's value isn't 
>> overwritten if it's changed since by another transaction since I've first 
>> read it. If I use the .all() method to get one or many instances that I can 
>> update, even if I lock for update, the where clause of the update contains 
>> _only_ the primary key, meaning that it will end up overwriting the other 
>> transactions value.
>> 
>> If in one shell I do something like this:
>> 
>> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
>> name='foo').all()
>> for instance in instances:
>> instance.name = 'bar'
>> Session.add(instance)
>> 
>> And in another shell I do the same but update the instance.name to 
>> 'overwrite' instead of 'bar', and now commit the first shell, the second one 
>> will indeed set the 'name' to 'overwrite' instead of not updating any rows.
> 
> OK again this might behave differently with a different isolation level, not 
> sure, haven't used MySQL's transactional features too much.
> 
> No doubt. I'm stuck on my current isolation level though, with no chance of 
> changing it.
>  
> 
>> This is certainly a nice solution and I do use it for several tables, but 
>> it's also a little more course than my original update statement above. In 
>> my update statement I check the 'id' and 'name' columns, any other column 
>> that's changed I don't have to care about, since I'm not writing now values 
>> to them so other transactions can update those without effecting mine, where 
>> as the version id feature would force me to reread the row.
> 
> Not sure how version id forces you to reread the row.   The UPDATE statement 
> is emitted as UPDATE table ... WHERE pk= AND version_id=.   It 
> doesn't use any more reads than a regular ORM operation that is updating an 
> in-memory row.   The approach here is pretty standard and is copied from that 
> of Hibernate.
> 
> Sorry, I didn't communicate that very well. If another transaction updates 
> the row, but not the column I'm interested in, then the version_id gets 
> updated and the update from my own transaction fails. I then need to reread 
> the row to update my stale view of it's contents and try to commit again. 
> This is what I mean by needing to do an additional read (compared to my 
> "update where" approach) if the row had other columns changed in another 
> transaction. It's not much of an issue though - I'd need a system that has an 
> insane amount of updates to a single row for this to become a major issue, at 
> which point I'll likely have very different concerns :)
> 
> -- 
> You received this message because you 

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten


On Friday, April 25, 2014 11:19:40 PM UTC+1, Michael Bayer wrote:
>
>
> On Apr 25, 2014, at 5:40 PM, Tim Kersten > 
> wrote:
>
> On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
>>
>>
>> On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:
>>
>> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
>> new_foo_value})
>>
>> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
>> mymodel.foo=%s
>>
>> If I read that correctly it means that the update won't set any rows if 
>> the value foo has changed in some other transaction since I last read it, 
>> so I won't end up overwriting anything that's been changed by someone else.
>>
>>
>> well that depends highly on the transactional capabilities/settings of 
>> your database, but if you have read committed isolation those rows will be 
>> locked, which means your UPDATE statement would then wait until the other 
>> transaction commits.  At that point, your UPDATE will proceed and overwrite 
>> whatever the other transaction did.So it wouldn’t really work for the 
>> purpose of “preventing overwriting anything that’s been changed by someone 
>> else”.  If OTOH you have repeatable read set up, it should actually raise 
>> an exception when a conflict is detected.  Which also might not be what you 
>> want, that is, your operation will fail.
>>
>
> I use read committed isolation and you're right, it does lock the row. 
> However, once the other transaction commits it will not overwrite what the 
> other one did as that's what the where clause protects against. I began 2 
> transactions in 2 separate shells (mysql/InnoDB) to demonstrate:
>
>
> Oh OK the query that was in the most recent email didn’t make this clear, 
> I saw the “bar” part of it and such which threw me off from what you’re 
> doing.
>

Sorry :)
 

>
> This is an unusual use case because it seems like you’d like to outright 
> ignore the row if it doesn’t match?  or are you throwing an exception if 
> you don’t get the expected count?
>

Yes, I'm ignoring the row if it doesn't match. One use case for this: A 
misbehaving piece of code set a "bad" value in a column in many rows. The 
code was fixed, and thus when customers used that code in future the values 
would be updated to "good" values. A data migration to automatically adjust 
the "bad" values would have the potential to overwrite one that's just been 
altered by a customer, and this is what I'm trying to avoid, with as 
minimal an impact on the running system. I'd only like to avoid doing so 
while still using the ORM and was wondering if there's a way to do this 
without having a version column?
 

>
> The version_id feature will basically throw an exception if the row count 
> is not what’s expected.   Other than using repeatable read which is the 
> best approach, it’s the best way to prevent writing a stale record with the 
> ORM.
>

Indeed. If I'd have it on the table in question I wouldn't have the problem 
I'm facing now :)
 

Yes. Above I combined an extra attribute 'name' with the primary key in the 
> WHERE clause of the update statement to ensure that 'name's value isn't 
> overwritten if it's changed since by another transaction since I've first 
> read it. If I use the .all() method to get one or many instances that I can 
> update, even if I lock for update, the where clause of the update contains 
> _only_ the primary key, meaning that it will end up overwriting the other 
> transactions value.
>
>
> If in one shell I do something like this:
>
> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
> name='foo').all()
> for instance in instances:
> instance.name = 'bar'
> Session.add(instance)
>
> And in another shell I do the same but update the instance.name to 
> 'overwrite' instead of 'bar', and now commit the first shell, the second 
> one will indeed set the 'name' to 'overwrite' instead of not updating any 
> rows.
>
>
> OK again this might behave differently with a different isolation level, 
> not sure, haven’t used MySQL’s transactional features too much.
>

No doubt. I'm stuck on my current isolation level though, with no chance of 
changing it.
 

This is certainly a nice solution and I do use it for several tables, but 
> it's also a little more course than my original update statement above. In 
> my update statement I check the 'id' and 'name' columns, any other column 
> that's changed I don't have to care about, since I'm not writing now values 
> to them so other transactions can update those without effecting mine, 
> where as the version id feature would force me to reread the row.
>
>
> Not sure how version id forces you to reread the row.   The UPDATE 
> statement is emitted as UPDATE table … WHERE pk= AND 
> version_id=.   It doesn’t use any more reads than a regular ORM 
> operation that is updating an in-memory row.   The approach here is pretty 
> standard and is copied from that of Hibernate.
>

Sorry, I didn't communicate 

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 5:40 PM, Tim Kersten  wrote:

> 
> 
> On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
> 
> On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:
> 
>> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
>> new_foo_value})
>> 
>> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
>> mymodel.foo=%s
>> 
>> If I read that correctly it means that the update won't set any rows if the 
>> value foo has changed in some other transaction since I last read it, so I 
>> won't end up overwriting anything that's been changed by someone else.
> 
> well that depends highly on the transactional capabilities/settings of your 
> database, but if you have read committed isolation those rows will be locked, 
> which means your UPDATE statement would then wait until the other transaction 
> commits.  At that point, your UPDATE will proceed and overwrite whatever the 
> other transaction did.So it wouldn't really work for the purpose of 
> "preventing overwriting anything that's been changed by someone else".  If 
> OTOH you have repeatable read set up, it should actually raise an exception 
> when a conflict is detected.  Which also might not be what you want, that is, 
> your operation will fail.
> 
> I use read committed isolation and you're right, it does lock the row. 
> However, once the other transaction commits it will not overwrite what the 
> other one did as that's what the where clause protects against. I began 2 
> transactions in 2 separate shells (mysql/InnoDB) to demonstrate:

Oh OK the query that was in the most recent email didn't make this clear, I saw 
the "bar" part of it and such which threw me off from what you're doing.

This is an unusual use case because it seems like you'd like to outright ignore 
the row if it doesn't match?  or are you throwing an exception if you don't get 
the expected count?

The version_id feature will basically throw an exception if the row count is 
not what's expected.   Other than using repeatable read which is the best 
approach, it's the best way to prevent writing a stale record with the ORM.

> Yes. Above I combined an extra attribute 'name' with the primary key in the 
> WHERE clause of the update statement to ensure that 'name's value isn't 
> overwritten if it's changed since by another transaction since I've first 
> read it. If I use the .all() method to get one or many instances that I can 
> update, even if I lock for update, the where clause of the update contains 
> _only_ the primary key, meaning that it will end up overwriting the other 
> transactions value.
> 
> If in one shell I do something like this:
> 
> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
> name='foo').all()
> for instance in instances:
> instance.name = 'bar'
> Session.add(instance)
> 
> And in another shell I do the same but update the instance.name to 
> 'overwrite' instead of 'bar', and now commit the first shell, the second one 
> will indeed set the 'name' to 'overwrite' instead of not updating any rows.

OK again this might behave differently with a different isolation level, not 
sure, haven't used MySQL's transactional features too much.

> 
> 
> 
> This is certainly a nice solution and I do use it for several tables, but 
> it's also a little more course than my original update statement above. In my 
> update statement I check the 'id' and 'name' columns, any other column that's 
> changed I don't have to care about, since I'm not writing now values to them 
> so other transactions can update those without effecting mine, where as the 
> version id feature would force me to reread the row.

Not sure how version id forces you to reread the row.   The UPDATE statement is 
emitted as UPDATE table ... WHERE pk= AND version_id=.   It 
doesn't use any more reads than a regular ORM operation that is updating an 
in-memory row.   The approach here is pretty standard and is copied from that 
of Hibernate.


-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten


On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
>
>
> On Apr 25, 2014, at 4:22 AM, Tim Kersten > 
> wrote:
>
> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
> new_foo_value})
>
> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
> mymodel.foo=%s
>
> If I read that correctly it means that the update won't set any rows if 
> the value foo has changed in some other transaction since I last read it, 
> so I won't end up overwriting anything that's been changed by someone else.
>
>
> well that depends highly on the transactional capabilities/settings of 
> your database, but if you have read committed isolation those rows will be 
> locked, which means your UPDATE statement would then wait until the other 
> transaction commits.  At that point, your UPDATE will proceed and overwrite 
> whatever the other transaction did.So it wouldn’t really work for the 
> purpose of “preventing overwriting anything that’s been changed by someone 
> else”.  If OTOH you have repeatable read set up, it should actually raise 
> an exception when a conflict is detected.  Which also might not be what you 
> want, that is, your operation will fail.
>

I use read committed isolation and you're right, it does lock the row. 
However, once the other transaction commits it will not overwrite what the 
other one did as that's what the where clause protects against. I began 2 
transactions in 2 separate shells (mysql/InnoDB) to demonstrate:

1mysql> create table `test` ( `id` int(11) not null auto_increment primary 
key, `name` varchar(10)) engine=InnoDB;
1mysql> begin;
1mysql> insert into test set name='foo';
1mysql> commit;
1mysql> begin;
Query OK, 0 rows affected (0.00 sec)
1mysql> select * from test;
++--+
| id | name |
++--+
|  1 | foo  |
++--+
1 row in set (0.00 sec)

2mysql> begin;
2mysql> select * from test;
++--+
| id | name |
++--+
|  1 | foo  |
++--+
1 row in set (0.00 sec)

1mysql> update test set name='bar' where id=1 and name='foo';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

2mysql> update test set name='overwrite' where id=1 and name='foo';

1mysql> commit;
2mysql> commit;

1mysql> select * from test;
++--+
| id | name |
++--+
|  1 | bar  |
++--+

If I use .all() and make my changes on the instances, the generated sql 
> issues will update where the primary key matches, rather than 'foo': UPDATE 
> mymodel SET foo=%s WHERE mymodel.uuid=%s
>
>
> the UPDATE statement is the only way a row in the relational database gets 
> updated.  There is no difference in transaction isolation behavior between 
> using an UPDATE for many rows versus an UPDATE for a single row.
>

Yes. Above I combined an extra attribute 'name' with the primary key in the 
WHERE clause of the update statement to ensure that 'name's value isn't 
overwritten if it's changed since by another transaction since I've first 
read it. If I use the .all() method to get one or many instances that I can 
update, even if I lock for update, the where clause of the update contains 
_only_ the primary key, meaning that it will end up overwriting the other 
transactions value.

If in one shell I do something like this:

instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
name='foo').all()
for instance in instances:
instance.name = 'bar'
Session.add(instance)

And in another shell I do the same but update the instance.name to 
'overwrite' instead of 'bar', and now commit the first shell, the second 
one will indeed set the 'name' to 'overwrite' instead of not updating any 
rows.


 

> To prevent stale write without using repeatable read you can also use the 
> version id feature: 
> http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#configuring-a-version-counter.
>however this feature only takes effect for individual objects, not the 
> query.update() feature as it works based on rows that have been loaded into 
> memory.
>

This is certainly a nice solution and I do use it for several tables, but 
it's also a little more course than my original update statement above. In 
my update statement I check the 'id' and 'name' columns, any other column 
that's changed I don't have to care about, since I'm not writing now values 
to them so other transactions can update those without effecting mine, 
where as the version id feature would force me to reread the row.
 
 

> On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
>>
>> perhaps I’m missing something but wouldn’t you just change the update() 
>> here to all(), so that you SELECT only those rows you care about into 
>> memory, then change each “foo” as needed and flush?   I’m not seeing what 
>> the issue is.  The row isn’t locked if you aren’t using SELECT..FOR UPDATE.
>>
>>
>> On Apr 23, 2014, at 8:27 PM, Tim Kersten  wrote:
>>
>> I'd like to run a data migration on a live server, but only updat

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:

> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
> new_foo_value})
> 
> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
> mymodel.foo=%s
> 
> If I read that correctly it means that the update won't set any rows if the 
> value foo has changed in some other transaction since I last read it, so I 
> won't end up overwriting anything that's been changed by someone else.

well that depends highly on the transactional capabilities/settings of your 
database, but if you have read committed isolation those rows will be locked, 
which means your UPDATE statement would then wait until the other transaction 
commits.  At that point, your UPDATE will proceed and overwrite whatever the 
other transaction did.So it wouldn't really work for the purpose of 
"preventing overwriting anything that's been changed by someone else".  If OTOH 
you have repeatable read set up, it should actually raise an exception when a 
conflict is detected.  Which also might not be what you want, that is, your 
operation will fail.

> 
> If I use .all() and make my changes on the instances, the generated sql 
> issues will update where the primary key matches, rather than 'foo': UPDATE 
> mymodel SET foo=%s WHERE mymodel.uuid=%s

the UPDATE statement is the only way a row in the relational database gets 
updated.  There is no difference in transaction isolation behavior between 
using an UPDATE for many rows versus an UPDATE for a single row.


To prevent stale write without using repeatable read you can also use the 
version id feature: 
http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#configuring-a-version-counter
 .   however this feature only takes effect for individual objects, not the 
query.update() feature as it works based on rows that have been loaded into 
memory.



> 
> 
> On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
> perhaps I'm missing something but wouldn't you just change the update() here 
> to all(), so that you SELECT only those rows you care about into memory, then 
> change each "foo" as needed and flush?   I'm not seeing what the issue is.  
> The row isn't locked if you aren't using SELECT..FOR UPDATE.
> 
> 
> On Apr 23, 2014, at 8:27 PM, Tim Kersten  wrote:
> 
>> I'd like to run a data migration on a live server, but only update rows if 
>> the data hasn't changed since I've read it, and would like to do so 
>> optimistically, so without locking the row. Doing so like below works and 
>> prevents me updating rows that have changed since I last read the row.
>> 
>> Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
>>  new_foo_value})
>> 
>> While the approach works, it doesn't use the ORM. (I use an after_flush hook 
>> to inspect & log changes from dirty instances in the session).  
>> 
>> Is there a way to update an ORM instance conditionally like above?
>> 
>> Kindest Regards,
>> Tim
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to sqlalchemy+...@googlegroups.com.
>> To post to this group, send email to sqlal...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten
Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
new_foo_value})

This generates something like this: UPDATE mymodel SET foo=%s WHERE 
mymodel.foo=%s

If I read that correctly it means that the update won't set any rows if the 
value foo has changed in some other transaction since I last read it, so I 
won't end up overwriting anything that's been changed by someone else.

If I use .all() and make my changes on the instances, the generated sql 
issues will update where the primary key matches, rather than 'foo': UPDATE 
mymodel SET foo=%s WHERE mymodel.uuid=%s


On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
>
> perhaps I’m missing something but wouldn’t you just change the update() 
> here to all(), so that you SELECT only those rows you care about into 
> memory, then change each “foo” as needed and flush?   I’m not seeing what 
> the issue is.  The row isn’t locked if you aren’t using SELECT..FOR UPDATE.
>
>
> On Apr 23, 2014, at 8:27 PM, Tim Kersten > 
> wrote:
>
> I'd like to run a data migration on a live server, but only update rows if 
> the data hasn't changed since I've read it, and would like to do so 
> optimistically, so without locking the row. Doing so like below works and 
> prevents me updating rows that have changed since I last read the row.
>
> Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
>  
> new_foo_value})
>
> While the approach works, it doesn't use the ORM. (I use an after_flush 
> hook to inspect & log changes from dirty instances in the session).  
>
> Is there a way to update an ORM instance conditionally like above?
>
> Kindest Regards,
> Tim
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com .
> To post to this group, send email to sqlal...@googlegroups.com
> .
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.