Re: [sqlalchemy] "update where" using the ORM

2014-04-28 Thread Tim Kersten
On Saturday, April 26, 2014 9:12:22 PM UTC+1, Michael Bayer wrote:

>
> On Apr 26, 2014, at 3:26 PM, Tim Kersten > 
> wrote: 
>
> > The resulting behaviour would be identical to using a version col id, 
> but only for this transaction and the instance passed to the update_where() 
> method, and instead of "UPDATE ... WHERE  = %s AND version = %s" you'd 
> have "UPDATE ... WHERE  = %s AND name = %s”. 
>
> This is where it would have to go: 
>
>
> https://bitbucket.org/zzzeek/sqlalchemy/src/146fbf6d26a8c4140a47aeb03131fdf81007b9a2/lib/sqlalchemy/orm/persistence.py?at=master#cl-308


Cool, thanks.
 

>
>
> where you can see that logic is wired to a single “expression”, which 
> could be a SQL expression that gathers up lots of columns, but the 
> expression is fixed.  It isn’t derivable from all the attributes that have 
> “changed”, and the logic here would need to be expanded into a much more 
> elaborate, complicated, and non-performant system to support this case.   
> For a feature to be added, it must attain a certain ratio of “impact on 
> complexity” to “how many people will actually use it”.   If the feature is 
> very simple and non-intrusive, we can often add it even if only one person 
> needs it.  If the feature is very complex, we can add it only if this is an 
> obvious need by a significant percentage of users.   
>
> in many cases we add event hooks in areas that are to allow expansion of 
> capabilities, but in the case of “persistence”, we already have 
> before_update() and after_update(), adding more hooks into the construction 
> of the actual SQL would be very complex and extremely specific to the 
> mechanics; it would be brittle, unstable and difficult to use. 
>
> IMHO the two existing approaches have no downsides: 
>
> 1. repeatable read isolation (which can be set on a per-session or 
> per-transaction basis.  Why not just use it?) 
>

Are you saying there's a way to use the ORM with repeatable read isolation 
without potentially overwriting another user's changes? I'm not sure how 
(other than using version col id, in which case I don't need to use 
repeatable read isolation).

 

>
> 2. version columns, including version columns that can be *timestamps*.   
> There is no need to go through an expensive (think TEXT/BLOB columns) and 
> error prone (think floating points) comparison of every column if the 
> UPDATE of a stale row is to be avoided - “stale” just means “our timestamp 
> of the row is earlier than the timestamp that’s present”.
>

Timestamps are only so granular though (depending on the database used) - 
two updates in very close succession may have the same timestamp set, so 
while it's unlikely to lead to data loss, it cannot guarantee it like a 
plain counter version column would.
 

>
> > Advantages of using the above approach instead of version col id: 
> >  - Much finer grain changes possible without raising an exception, 
>
> The behavior where version misses are to be ignored is also quite unusual 
> and I’ve never known anyone to want silent failure of an UPDATE statement 
> like that.  An entity update has a specific intent which is to target that 
> entity; this is at the core of what an ORM is trying to do.
>

You're right, an ORM is meant to target a specific entity. Whether the 
failure is silent or not doesn't play much role, in fact as you pointed out 
earlier it makes a lot of sense in an ORM for a failed update to raise an 
exception.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-26 Thread Michael Bayer

On Apr 26, 2014, at 3:26 PM, Tim Kersten  wrote:

> The resulting behaviour would be identical to using a version col id, but 
> only for this transaction and the instance passed to the update_where() 
> method, and instead of "UPDATE ... WHERE  = %s AND version = %s" you'd 
> have "UPDATE ... WHERE  = %s AND name = %s".

This is where it would have to go:

https://bitbucket.org/zzzeek/sqlalchemy/src/146fbf6d26a8c4140a47aeb03131fdf81007b9a2/lib/sqlalchemy/orm/persistence.py?at=master#cl-308

where you can see that logic is wired to a single "expression", which could be 
a SQL expression that gathers up lots of columns, but the expression is fixed.  
It isn't derivable from all the attributes that have "changed", and the logic 
here would need to be expanded into a much more elaborate, complicated, and 
non-performant system to support this case.   For a feature to be added, it 
must attain a certain ratio of "impact on complexity" to "how many people will 
actually use it".   If the feature is very simple and non-intrusive, we can 
often add it even if only one person needs it.  If the feature is very complex, 
we can add it only if this is an obvious need by a significant percentage of 
users.   

in many cases we add event hooks in areas that are to allow expansion of 
capabilities, but in the case of "persistence", we already have before_update() 
and after_update(), adding more hooks into the construction of the actual SQL 
would be very complex and extremely specific to the mechanics; it would be 
brittle, unstable and difficult to use.

IMHO the two existing approaches have no downsides:

1. repeatable read isolation (which can be set on a per-session or 
per-transaction basis.  Why not just use it?)

2. version columns, including version columns that can be *timestamps*.   There 
is no need to go through an expensive (think TEXT/BLOB columns) and error prone 
(think floating points) comparison of every column if the UPDATE of a stale row 
is to be avoided - "stale" just means "our timestamp of the row is earlier than 
the timestamp that's present".   

> Advantages of using the above approach instead of version col id:
>  - Much finer grain changes possible without raising an exception,

The behavior where version misses are to be ignored is also quite unusual and 
I've never known anyone to want silent failure of an UPDATE statement like 
that.  An entity update has a specific intent which is to target that entity; 
this is at the core of what an ORM is trying to do.


-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-26 Thread Tim Kersten


On Saturday, April 26, 2014 12:29:50 AM UTC+1, Michael Bayer wrote:
>
>
> On Apr 25, 2014, at 6:54 PM, Tim Kersten > 
> wrote:
>
> This is an unusual use case because it seems like you’d like to outright 
>> ignore the row if it doesn’t match?  or are you throwing an exception if 
>> you don’t get the expected count?
>>
>
> Yes, I'm ignoring the row if it doesn't match. One use case for this: A 
> misbehaving piece of code set a "bad" value in a column in many rows. The 
> code was fixed, and thus when customers used that code in future the values 
> would be updated to "good" values. A data migration to automatically adjust 
> the "bad" values would have the potential to overwrite one that's just been 
> altered by a customer, and this is what I'm trying to avoid, with as 
> minimal an impact on the running system. I'd only like to avoid doing so 
> while still using the ORM and was wondering if there's a way to do this 
> without having a version column?
>
>
> I can think of some potentially very exotic ways of injecting this 
> behavior with cursor_execute() events, but it would be quite awkward.  The 
> ORM still might complain when it sees zero rows updated.
>

> But the real problem is if you want the ORM to update a row, then have it 
> ignored, now you’ve totally got the wrong data in memory.  I’m not sure how 
> your application is constructed that A. you want to use the ORM but B. you 
> don’t care if it has the wrong data loaded after it thinks it’s 
> synchronized.If you just need some simplistic CRUD interface (e.g. 
> object.save()) you can build those on top of Core pretty easily.   The ORM 
> has a specific usage model and this goes counter to that, hence there’s a 
> Core.
>

In my initial the sql expression example no updates happen because no 
matching rows are found and thus they're ignored, however it's not 
important that they're ignored, just that no update can happen. I'd be 
happy to have a way to do this via the ORM if that raises an exception 
instead. I guess what I've been looking for is here is the exact same 
behaviour as the version col id behaviour, except there is no version id 
column. Instead there are one or more columns used in a similar manor 
instead, basically a mechanism to add additional expressions. If zero rows 
are matched, an exception is raised, similar to if a version column was 
updated by another transaction.

Session.set_update_where(my_instance, MyModel.name == 'old_value')

The resulting behaviour would be identical to using a version col id, but 
only for this transaction and the instance passed to the update_where() 
method, and instead of "UPDATE ... WHERE  = %s AND version = %s" you'd 
have "UPDATE ... WHERE  = %s AND name = %s".

An alternative approach might be to do infer all changed column values and 
add their old values to the WHERE part before a flush. This behaviour would 
need to be turned on for a table or perhaps just for an instance.

The above approaches provide the same behaviour that version col id does, 
but has different trade offs.

Advantages of using the version col id approach instead of the above:
 - Smaller SQL queries as the WHERE clause contains only the PK and the 
version column, vs all lots of columns and their old values.

Advantages of using the above approach instead of version col id:
 - Much finer grain changes possible without raising an exception, provided 
that the various transactions all modify different columns. (i.e. txn A 
changes the 'name' col, and txn B changes the 'description' col, both would 
be accepted, where as with the version col id one of the transactions would 
have failed.)

I don't know half as much as I'd like about how queries are run in the DB, 
so my suggested alternative approaches above may have serious performance 
implications other than increasing SQL traffic.

Thank you very much for your feedback. For now I'll focus on adding version 
columns to most of my tables and that should solve not only my current 
issues but will also protect the system from the lost update issues in 
general.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 6:54 PM, Tim Kersten  wrote:

>  
> 
> This is an unusual use case because it seems like you'd like to outright 
> ignore the row if it doesn't match?  or are you throwing an exception if you 
> don't get the expected count?
> 
> Yes, I'm ignoring the row if it doesn't match. One use case for this: A 
> misbehaving piece of code set a "bad" value in a column in many rows. The 
> code was fixed, and thus when customers used that code in future the values 
> would be updated to "good" values. A data migration to automatically adjust 
> the "bad" values would have the potential to overwrite one that's just been 
> altered by a customer, and this is what I'm trying to avoid, with as minimal 
> an impact on the running system. I'd only like to avoid doing so while still 
> using the ORM and was wondering if there's a way to do this without having a 
> version column?

I can think of some potentially very exotic ways of injecting this behavior 
with cursor_execute() events, but it would be quite awkward.  The ORM still 
might complain when it sees zero rows updated.

But the real problem is if you want the ORM to update a row, then have it 
ignored, now you've totally got the wrong data in memory.  I'm not sure how 
your application is constructed that A. you want to use the ORM but B. you 
don't care if it has the wrong data loaded after it thinks it's synchronized.   
 If you just need some simplistic CRUD interface (e.g. object.save()) you can 
build those on top of Core pretty easily.   The ORM has a specific usage model 
and this goes counter to that, hence there's a Core.

Keep in mind you can use a timestamp as a versioning column.   Seems like your 
use case is straightforward, only update rows that haven't been changed before 
timestamp X.  if your rows have a trigger or similar that does an "updated_at" 
column then you'd be able to target those rows.




>  
> 
> The version_id feature will basically throw an exception if the row count is 
> not what's expected.   Other than using repeatable read which is the best 
> approach, it's the best way to prevent writing a stale record with the ORM.
> 
> Indeed. If I'd have it on the table in question I wouldn't have the problem 
> I'm facing now :)
>  
> 
>> Yes. Above I combined an extra attribute 'name' with the primary key in the 
>> WHERE clause of the update statement to ensure that 'name's value isn't 
>> overwritten if it's changed since by another transaction since I've first 
>> read it. If I use the .all() method to get one or many instances that I can 
>> update, even if I lock for update, the where clause of the update contains 
>> _only_ the primary key, meaning that it will end up overwriting the other 
>> transactions value.
>> 
>> If in one shell I do something like this:
>> 
>> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
>> name='foo').all()
>> for instance in instances:
>> instance.name = 'bar'
>> Session.add(instance)
>> 
>> And in another shell I do the same but update the instance.name to 
>> 'overwrite' instead of 'bar', and now commit the first shell, the second one 
>> will indeed set the 'name' to 'overwrite' instead of not updating any rows.
> 
> OK again this might behave differently with a different isolation level, not 
> sure, haven't used MySQL's transactional features too much.
> 
> No doubt. I'm stuck on my current isolation level though, with no chance of 
> changing it.
>  
> 
>> This is certainly a nice solution and I do use it for several tables, but 
>> it's also a little more course than my original update statement above. In 
>> my update statement I check the 'id' and 'name' columns, any other column 
>> that's changed I don't have to care about, since I'm not writing now values 
>> to them so other transactions can update those without effecting mine, where 
>> as the version id feature would force me to reread the row.
> 
> Not sure how version id forces you to reread the row.   The UPDATE statement 
> is emitted as UPDATE table ... WHERE pk= AND version_id=.   It 
> doesn't use any more reads than a regular ORM operation that is updating an 
> in-memory row.   The approach here is pretty standard and is copied from that 
> of Hibernate.
> 
> Sorry, I didn't communicate that very well. If another transaction updates 
> the row, but not the column I'm interested in, then the version_id gets 
> updated and the update from my own transaction fails. I then need to reread 
> the row to update my stale view of it's contents and try to commit again. 
> This is what I mean by needing to do an additional read (compared to my 
> "update where" approach) if the row had other columns changed in another 
> transaction. It's not much of an issue though - I'd need a system that has an 
> insane amount of updates to a single row for this to become a major issue, at 
> which point I'll likely have very different concerns :)
> 
> -- 
> You received this message because you 

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten


On Friday, April 25, 2014 11:19:40 PM UTC+1, Michael Bayer wrote:
>
>
> On Apr 25, 2014, at 5:40 PM, Tim Kersten > 
> wrote:
>
> On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
>>
>>
>> On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:
>>
>> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
>> new_foo_value})
>>
>> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
>> mymodel.foo=%s
>>
>> If I read that correctly it means that the update won't set any rows if 
>> the value foo has changed in some other transaction since I last read it, 
>> so I won't end up overwriting anything that's been changed by someone else.
>>
>>
>> well that depends highly on the transactional capabilities/settings of 
>> your database, but if you have read committed isolation those rows will be 
>> locked, which means your UPDATE statement would then wait until the other 
>> transaction commits.  At that point, your UPDATE will proceed and overwrite 
>> whatever the other transaction did.So it wouldn’t really work for the 
>> purpose of “preventing overwriting anything that’s been changed by someone 
>> else”.  If OTOH you have repeatable read set up, it should actually raise 
>> an exception when a conflict is detected.  Which also might not be what you 
>> want, that is, your operation will fail.
>>
>
> I use read committed isolation and you're right, it does lock the row. 
> However, once the other transaction commits it will not overwrite what the 
> other one did as that's what the where clause protects against. I began 2 
> transactions in 2 separate shells (mysql/InnoDB) to demonstrate:
>
>
> Oh OK the query that was in the most recent email didn’t make this clear, 
> I saw the “bar” part of it and such which threw me off from what you’re 
> doing.
>

Sorry :)
 

>
> This is an unusual use case because it seems like you’d like to outright 
> ignore the row if it doesn’t match?  or are you throwing an exception if 
> you don’t get the expected count?
>

Yes, I'm ignoring the row if it doesn't match. One use case for this: A 
misbehaving piece of code set a "bad" value in a column in many rows. The 
code was fixed, and thus when customers used that code in future the values 
would be updated to "good" values. A data migration to automatically adjust 
the "bad" values would have the potential to overwrite one that's just been 
altered by a customer, and this is what I'm trying to avoid, with as 
minimal an impact on the running system. I'd only like to avoid doing so 
while still using the ORM and was wondering if there's a way to do this 
without having a version column?
 

>
> The version_id feature will basically throw an exception if the row count 
> is not what’s expected.   Other than using repeatable read which is the 
> best approach, it’s the best way to prevent writing a stale record with the 
> ORM.
>

Indeed. If I'd have it on the table in question I wouldn't have the problem 
I'm facing now :)
 

Yes. Above I combined an extra attribute 'name' with the primary key in the 
> WHERE clause of the update statement to ensure that 'name's value isn't 
> overwritten if it's changed since by another transaction since I've first 
> read it. If I use the .all() method to get one or many instances that I can 
> update, even if I lock for update, the where clause of the update contains 
> _only_ the primary key, meaning that it will end up overwriting the other 
> transactions value.
>
>
> If in one shell I do something like this:
>
> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
> name='foo').all()
> for instance in instances:
> instance.name = 'bar'
> Session.add(instance)
>
> And in another shell I do the same but update the instance.name to 
> 'overwrite' instead of 'bar', and now commit the first shell, the second 
> one will indeed set the 'name' to 'overwrite' instead of not updating any 
> rows.
>
>
> OK again this might behave differently with a different isolation level, 
> not sure, haven’t used MySQL’s transactional features too much.
>

No doubt. I'm stuck on my current isolation level though, with no chance of 
changing it.
 

This is certainly a nice solution and I do use it for several tables, but 
> it's also a little more course than my original update statement above. In 
> my update statement I check the 'id' and 'name' columns, any other column 
> that's changed I don't have to care about, since I'm not writing now values 
> to them so other transactions can update those without effecting mine, 
> where as the version id feature would force me to reread the row.
>
>
> Not sure how version id forces you to reread the row.   The UPDATE 
> statement is emitted as UPDATE table … WHERE pk= AND 
> version_id=.   It doesn’t use any more reads than a regular ORM 
> operation that is updating an in-memory row.   The approach here is pretty 
> standard and is copied from that of Hibernate.
>

Sorry, I didn't communicate 

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 5:40 PM, Tim Kersten  wrote:

> 
> 
> On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
> 
> On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:
> 
>> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
>> new_foo_value})
>> 
>> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
>> mymodel.foo=%s
>> 
>> If I read that correctly it means that the update won't set any rows if the 
>> value foo has changed in some other transaction since I last read it, so I 
>> won't end up overwriting anything that's been changed by someone else.
> 
> well that depends highly on the transactional capabilities/settings of your 
> database, but if you have read committed isolation those rows will be locked, 
> which means your UPDATE statement would then wait until the other transaction 
> commits.  At that point, your UPDATE will proceed and overwrite whatever the 
> other transaction did.So it wouldn't really work for the purpose of 
> "preventing overwriting anything that's been changed by someone else".  If 
> OTOH you have repeatable read set up, it should actually raise an exception 
> when a conflict is detected.  Which also might not be what you want, that is, 
> your operation will fail.
> 
> I use read committed isolation and you're right, it does lock the row. 
> However, once the other transaction commits it will not overwrite what the 
> other one did as that's what the where clause protects against. I began 2 
> transactions in 2 separate shells (mysql/InnoDB) to demonstrate:

Oh OK the query that was in the most recent email didn't make this clear, I saw 
the "bar" part of it and such which threw me off from what you're doing.

This is an unusual use case because it seems like you'd like to outright ignore 
the row if it doesn't match?  or are you throwing an exception if you don't get 
the expected count?

The version_id feature will basically throw an exception if the row count is 
not what's expected.   Other than using repeatable read which is the best 
approach, it's the best way to prevent writing a stale record with the ORM.

> Yes. Above I combined an extra attribute 'name' with the primary key in the 
> WHERE clause of the update statement to ensure that 'name's value isn't 
> overwritten if it's changed since by another transaction since I've first 
> read it. If I use the .all() method to get one or many instances that I can 
> update, even if I lock for update, the where clause of the update contains 
> _only_ the primary key, meaning that it will end up overwriting the other 
> transactions value.
> 
> If in one shell I do something like this:
> 
> instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
> name='foo').all()
> for instance in instances:
> instance.name = 'bar'
> Session.add(instance)
> 
> And in another shell I do the same but update the instance.name to 
> 'overwrite' instead of 'bar', and now commit the first shell, the second one 
> will indeed set the 'name' to 'overwrite' instead of not updating any rows.

OK again this might behave differently with a different isolation level, not 
sure, haven't used MySQL's transactional features too much.

> 
> 
> 
> This is certainly a nice solution and I do use it for several tables, but 
> it's also a little more course than my original update statement above. In my 
> update statement I check the 'id' and 'name' columns, any other column that's 
> changed I don't have to care about, since I'm not writing now values to them 
> so other transactions can update those without effecting mine, where as the 
> version id feature would force me to reread the row.

Not sure how version id forces you to reread the row.   The UPDATE statement is 
emitted as UPDATE table ... WHERE pk= AND version_id=.   It 
doesn't use any more reads than a regular ORM operation that is updating an 
in-memory row.   The approach here is pretty standard and is copied from that 
of Hibernate.


-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten


On Friday, April 25, 2014 5:05:21 PM UTC+1, Michael Bayer wrote:
>
>
> On Apr 25, 2014, at 4:22 AM, Tim Kersten > 
> wrote:
>
> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
> new_foo_value})
>
> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
> mymodel.foo=%s
>
> If I read that correctly it means that the update won't set any rows if 
> the value foo has changed in some other transaction since I last read it, 
> so I won't end up overwriting anything that's been changed by someone else.
>
>
> well that depends highly on the transactional capabilities/settings of 
> your database, but if you have read committed isolation those rows will be 
> locked, which means your UPDATE statement would then wait until the other 
> transaction commits.  At that point, your UPDATE will proceed and overwrite 
> whatever the other transaction did.So it wouldn’t really work for the 
> purpose of “preventing overwriting anything that’s been changed by someone 
> else”.  If OTOH you have repeatable read set up, it should actually raise 
> an exception when a conflict is detected.  Which also might not be what you 
> want, that is, your operation will fail.
>

I use read committed isolation and you're right, it does lock the row. 
However, once the other transaction commits it will not overwrite what the 
other one did as that's what the where clause protects against. I began 2 
transactions in 2 separate shells (mysql/InnoDB) to demonstrate:

1mysql> create table `test` ( `id` int(11) not null auto_increment primary 
key, `name` varchar(10)) engine=InnoDB;
1mysql> begin;
1mysql> insert into test set name='foo';
1mysql> commit;
1mysql> begin;
Query OK, 0 rows affected (0.00 sec)
1mysql> select * from test;
++--+
| id | name |
++--+
|  1 | foo  |
++--+
1 row in set (0.00 sec)

2mysql> begin;
2mysql> select * from test;
++--+
| id | name |
++--+
|  1 | foo  |
++--+
1 row in set (0.00 sec)

1mysql> update test set name='bar' where id=1 and name='foo';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

2mysql> update test set name='overwrite' where id=1 and name='foo';

1mysql> commit;
2mysql> commit;

1mysql> select * from test;
++--+
| id | name |
++--+
|  1 | bar  |
++--+

If I use .all() and make my changes on the instances, the generated sql 
> issues will update where the primary key matches, rather than 'foo': UPDATE 
> mymodel SET foo=%s WHERE mymodel.uuid=%s
>
>
> the UPDATE statement is the only way a row in the relational database gets 
> updated.  There is no difference in transaction isolation behavior between 
> using an UPDATE for many rows versus an UPDATE for a single row.
>

Yes. Above I combined an extra attribute 'name' with the primary key in the 
WHERE clause of the update statement to ensure that 'name's value isn't 
overwritten if it's changed since by another transaction since I've first 
read it. If I use the .all() method to get one or many instances that I can 
update, even if I lock for update, the where clause of the update contains 
_only_ the primary key, meaning that it will end up overwriting the other 
transactions value.

If in one shell I do something like this:

instances = Session.query(MyModel).with_lockmode('update').filter_by(id=1, 
name='foo').all()
for instance in instances:
instance.name = 'bar'
Session.add(instance)

And in another shell I do the same but update the instance.name to 
'overwrite' instead of 'bar', and now commit the first shell, the second 
one will indeed set the 'name' to 'overwrite' instead of not updating any 
rows.


 

> To prevent stale write without using repeatable read you can also use the 
> version id feature: 
> http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#configuring-a-version-counter.
>however this feature only takes effect for individual objects, not the 
> query.update() feature as it works based on rows that have been loaded into 
> memory.
>

This is certainly a nice solution and I do use it for several tables, but 
it's also a little more course than my original update statement above. In 
my update statement I check the 'id' and 'name' columns, any other column 
that's changed I don't have to care about, since I'm not writing now values 
to them so other transactions can update those without effecting mine, 
where as the version id feature would force me to reread the row.
 
 

> On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
>>
>> perhaps I’m missing something but wouldn’t you just change the update() 
>> here to all(), so that you SELECT only those rows you care about into 
>> memory, then change each “foo” as needed and flush?   I’m not seeing what 
>> the issue is.  The row isn’t locked if you aren’t using SELECT..FOR UPDATE.
>>
>>
>> On Apr 23, 2014, at 8:27 PM, Tim Kersten  wrote:
>>
>> I'd like to run a data migration on a live server, but only updat

Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Michael Bayer

On Apr 25, 2014, at 4:22 AM, Tim Kersten  wrote:

> Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
> new_foo_value})
> 
> This generates something like this: UPDATE mymodel SET foo=%s WHERE 
> mymodel.foo=%s
> 
> If I read that correctly it means that the update won't set any rows if the 
> value foo has changed in some other transaction since I last read it, so I 
> won't end up overwriting anything that's been changed by someone else.

well that depends highly on the transactional capabilities/settings of your 
database, but if you have read committed isolation those rows will be locked, 
which means your UPDATE statement would then wait until the other transaction 
commits.  At that point, your UPDATE will proceed and overwrite whatever the 
other transaction did.So it wouldn't really work for the purpose of 
"preventing overwriting anything that's been changed by someone else".  If OTOH 
you have repeatable read set up, it should actually raise an exception when a 
conflict is detected.  Which also might not be what you want, that is, your 
operation will fail.

> 
> If I use .all() and make my changes on the instances, the generated sql 
> issues will update where the primary key matches, rather than 'foo': UPDATE 
> mymodel SET foo=%s WHERE mymodel.uuid=%s

the UPDATE statement is the only way a row in the relational database gets 
updated.  There is no difference in transaction isolation behavior between 
using an UPDATE for many rows versus an UPDATE for a single row.


To prevent stale write without using repeatable read you can also use the 
version id feature: 
http://docs.sqlalchemy.org/en/rel_0_9/orm/mapper_config.html#configuring-a-version-counter
 .   however this feature only takes effect for individual objects, not the 
query.update() feature as it works based on rows that have been loaded into 
memory.



> 
> 
> On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
> perhaps I'm missing something but wouldn't you just change the update() here 
> to all(), so that you SELECT only those rows you care about into memory, then 
> change each "foo" as needed and flush?   I'm not seeing what the issue is.  
> The row isn't locked if you aren't using SELECT..FOR UPDATE.
> 
> 
> On Apr 23, 2014, at 8:27 PM, Tim Kersten  wrote:
> 
>> I'd like to run a data migration on a live server, but only update rows if 
>> the data hasn't changed since I've read it, and would like to do so 
>> optimistically, so without locking the row. Doing so like below works and 
>> prevents me updating rows that have changed since I last read the row.
>> 
>> Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
>>  new_foo_value})
>> 
>> While the approach works, it doesn't use the ORM. (I use an after_flush hook 
>> to inspect & log changes from dirty instances in the session).  
>> 
>> Is there a way to update an ORM instance conditionally like above?
>> 
>> Kindest Regards,
>> Tim
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to sqlalchemy+...@googlegroups.com.
>> To post to this group, send email to sqlal...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-25 Thread Tim Kersten
Session.query(MyModel).filter_by(foo=old_foo_value).update({'foo': 
new_foo_value})

This generates something like this: UPDATE mymodel SET foo=%s WHERE 
mymodel.foo=%s

If I read that correctly it means that the update won't set any rows if the 
value foo has changed in some other transaction since I last read it, so I 
won't end up overwriting anything that's been changed by someone else.

If I use .all() and make my changes on the instances, the generated sql 
issues will update where the primary key matches, rather than 'foo': UPDATE 
mymodel SET foo=%s WHERE mymodel.uuid=%s


On Thursday, April 24, 2014 2:03:19 AM UTC+1, Michael Bayer wrote:
>
> perhaps I’m missing something but wouldn’t you just change the update() 
> here to all(), so that you SELECT only those rows you care about into 
> memory, then change each “foo” as needed and flush?   I’m not seeing what 
> the issue is.  The row isn’t locked if you aren’t using SELECT..FOR UPDATE.
>
>
> On Apr 23, 2014, at 8:27 PM, Tim Kersten > 
> wrote:
>
> I'd like to run a data migration on a live server, but only update rows if 
> the data hasn't changed since I've read it, and would like to do so 
> optimistically, so without locking the row. Doing so like below works and 
> prevents me updating rows that have changed since I last read the row.
>
> Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
>  
> new_foo_value})
>
> While the approach works, it doesn't use the ORM. (I use an after_flush 
> hook to inspect & log changes from dirty instances in the session).  
>
> Is there a way to update an ORM instance conditionally like above?
>
> Kindest Regards,
> Tim
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+...@googlegroups.com .
> To post to this group, send email to sqlal...@googlegroups.com
> .
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] "update where" using the ORM

2014-04-23 Thread Michael Bayer
perhaps I'm missing something but wouldn't you just change the update() here to 
all(), so that you SELECT only those rows you care about into memory, then 
change each "foo" as needed and flush?   I'm not seeing what the issue is.  The 
row isn't locked if you aren't using SELECT..FOR UPDATE.


On Apr 23, 2014, at 8:27 PM, Tim Kersten  wrote:

> I'd like to run a data migration on a live server, but only update rows if 
> the data hasn't changed since I've read it, and would like to do so 
> optimistically, so without locking the row. Doing so like below works and 
> prevents me updating rows that have changed since I last read the row.
> 
> Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
>  new_foo_value})
> 
> While the approach works, it doesn't use the ORM. (I use an after_flush hook 
> to inspect & log changes from dirty instances in the session).  
> 
> Is there a way to update an ORM instance conditionally like above?
> 
> Kindest Regards,
> Tim
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] "update where" using the ORM

2014-04-23 Thread Tim Kersten
I'd like to run a data migration on a live server, but only update rows if 
the data hasn't changed since I've read it, and would like to do so 
optimistically, so without locking the row. Doing so like below works and 
prevents me updating rows that have changed since I last read the row.

Session.query(MyModel).filter_by(foo=old_foo_value).filter_by(bar=old_bar_value).update({'foo':
 
new_foo_value})

While the approach works, it doesn't use the ORM. (I use an after_flush 
hook to inspect & log changes from dirty instances in the session).  

Is there a way to update an ORM instance conditionally like above?

Kindest Regards,
Tim

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.