The main utility of multiple_update is performance. In my tests it's 2-4x
faster than separate updates, and this is the main reason I'm using it in
our application.

And putting this logic in a function means there's a nice Pythonic API to
call, rather than hand-building SQL whenever you want to do this.

In terms of the approach you've given with the temporary table, that's
interesting. I haven't used the UPDATE...FROM syntax or temporary tables
much at all, so I didn't think of this approach.

It's not quite as simple as the separate updates in one query. However, I'd
be happy to use that approach if it's significantly faster, and if it's
factored out into the multiple_update() function. The nice thing about
having a function is that you can change how it works under the covers, but
the API can remain the same.

Are you interested in benchmarking my multiple_update (second version)
against your approach and posting results?

-Ben



On Fri, Mar 29, 2013 at 10:00 PM, zbouboutchi
<zboubout...@zbouboutchi.net>wrote:

> Hi Ben,
>
> I'm confused about the utility of this function.. Since it's possible to
> use external table with update statements, you certainly can do the whole
> update using a single request.
> I prepared a little example that reproduces yours in pure SQL.
>
> -- let's create a similar dataset:
> create temporary table test (id integer, age integer, name text);
> insert into test values (1, 10, 'Rose'), (42, 45, 'Bobby'), (23, 38,
> 'Sarrah');
>
> -- Here are the values I want to update, in this example, null won't
> update the test table.
> create temporary table source (id integer, age integer, name text);
> insert into source values (42, null, 'Bob'), (23, 30, 'Sarah');
>
> -- The "Bazinga" part:
> update test t
>   set name = coalesce(s.name, t.name), -- a small workaround to keep the
> original value if the source is null
>       age = coalesce(s.age, t.age) -- feel free to use any function you
> want, that works exactly the same way as usual.
>   from source s
>   where s.id = t.id;
>
> -- Here it is:
> select * from test order by id;
>
> 1;10;Rose
> 23;30;Sarah
> 42;50;Bob
>
> -- Bazinga --
>
> This method is the fastest I know, even on largest datasets where you
> might waste a lot of time and energy to execute millions and millions of
> requests, here you just update your whole dataset with one request and take
> profit of all the database optimisations (indexes etc.). With a modern
> dbms, you can even check the execution plan and get usefull tips on needed
> optimisations before you execute the nightmare.
> The only limitation I know is that you can't update more than 1 table at a
> time, but transactions give the ability to delay constraints checks at
> their end, so, 1 update per table seems legit.
>
> I think not every DBMS can handle this type of update, but it's part of
> SQL standards.
> At least, the famous ones (postgres, mssql, oracle, mysql etc.) have these
> functionnalities, with more or less syntax variations.
> I hope this will help.
>
> Regards,
>
> Christophe NINUCCI
>
> Le 27/03/2013 21:41, Ben Hoyt a écrit :
>
>> Hi folks,
>>
>> In our own utility file, I've added a function called multiple_update()
>> that executes multiple separate update statements in one query. Note that
>> this is not the same as a single db.update() that updates multiple rows --
>> with this function, the columns and vars for each row can be different.
>>
>> In my testing (Windows 64-bit, db is PostgreSQL running on another
>> machine) I've found this to be about 3-4x as fast as individual
>> db.update()s wrapped in a transaction, so a worthwhile speed increase. The
>> speed increase factor seems to be about constant regardless of the number
>> of rows (initially I was hoping it go up with N, but it didn't in my tests.)
>>
>> Any thoughts on the API signature, and whether this would be a good
>> addition to web.DB? Any comments on this approach in general? Is the way
>> I'm doing the cast to unicode and then '\n'.join(updates) the correct
>> approach?
>>
>> Couple of notes:
>>
>> * If this were part of web.py, obviously it'd be a method on DB instead
>> of a stand-alone function.
>> * dbconn.db_connection is our "current database object". Again, this
>> would go away if it were a DB method.
>>
>> Code below:
>>
>> -----
>> def multiple_update(table, where, values_vars, database=None,
>> _test=False):
>>     """Execute multiple separate update statements in one query.
>>
>>     >>> database = web.DB(None, {})
>>     >>> row1 = ({'name': 'Bob'}, {'id': 42})
>>     >>> row2 = ({'name': 'Sarah', 'age': 30}, {'id': 23})
>>     >>> print multiple_update('foo', 'id = $id', [row1, row2],
>> _test=True, database=database)
>>     UPDATE foo SET name = 'Bob' WHERE id = 42;
>>     UPDATE foo SET age = 30, name = 'Sarah' WHERE id = 23;
>>     >>> print multiple_update('foo', 'id = $id', [], _test=True,
>> database=database)
>>     None
>>     """
>>     if not values_vars:
>>         return
>>     if database is None:
>>         database = dbconn.db_connection
>>
>>     updates = []
>>     for values, vars in values_vars:
>>         update = u'UPDATE {table} SET {values} WHERE {where};'.format(
>>                 table=table,
>>                 values=web.db.sqlwhere(values, ', '),
>>                 where=database._where(where, vars))
>>         updates.append(unicode(update)**)
>>
>>     query = '\n'.join(updates)
>>     if _test:
>>         return query
>>     database.query(query)
>> -----
>>
>> Thanks,
>> Ben.
>> --
>> You received this message because you are subscribed to the Google Groups
>> "web.py" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to 
>> webpy+unsubscribe@**googlegroups.com<webpy%2bunsubscr...@googlegroups.com>
>> .
>>
>> To post to this group, send email to webpy@googlegroups.com.
>> Visit this group at 
>> http://groups.google.com/**group/webpy?hl=en<http://groups.google.com/group/webpy?hl=en>
>> .
>> For more options, visit 
>> https://groups.google.com/**groups/opt_out<https://groups.google.com/groups/opt_out>
>> .
>>
>>
>>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "web.py" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/**
> topic/webpy/dTeS_5Vm7BM/**unsubscribe?hl=en<https://groups.google.com/d/topic/webpy/dTeS_5Vm7BM/unsubscribe?hl=en>
> .
> To unsubscribe from this group and all its topics, send an email to
> webpy+unsubscribe@**googlegroups.com<webpy%2bunsubscr...@googlegroups.com>
> .
> To post to this group, send email to webpy@googlegroups.com.
> Visit this group at 
> http://groups.google.com/**group/webpy?hl=en<http://groups.google.com/group/webpy?hl=en>
> .
> For more options, visit 
> https://groups.google.com/**groups/opt_out<https://groups.google.com/groups/opt_out>
> .
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"web.py" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to webpy+unsubscr...@googlegroups.com.
To post to this group, send email to webpy@googlegroups.com.
Visit this group at http://groups.google.com/group/webpy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to