Re: [sqlalchemy] Extending mapper to use PostgreSQL's INSERT/UPDATE...RETURNING

Michael Bayer Tue, 22 Dec 2009 17:39:45 -0800

On Dec 22, 2009, at 5:20 PM, Igor Katson wrote:

>> 0.5 supports this using the "postgres_returning" argument to update() and 
>> insert(), but the ORM doesn't make use of it.  in 0.6, the feature has been 
>> generalized using the returning() method on insert() and update(), and the 
>> ORM will use it to fetch new primary key identifiers.
>> 
>> It does not take effect within the ORM for inline SQL other than primary key 
>> identifiers, but you can use an insert() with returning() manually to 
>> achieve that effect.
>> 
> Why not, Michael? Is it hard to implement this? This seems to be pretty 
> useful and performance-improving, reducing the amount of selects needed 
> when coding unaware of object expiry.


because a. I dont have the time to implement conditional returning() awareness 
within the ORM and b. I don't think it will have as dramatic a performance 
impact as you expect.   When I added default returning() support for insert() 
statements by default, the number of function call counts in our performance 
tests went up, due to the additional complexity.  I have made the 
automatic-returning feature optional for all dialects since its not really 
clear that performance is actually enhanced.   So its a race between Python 
method performance, which is abysmal, and overhead of SQL round trips.    
Future enhancements like turning ResultProxy into C (which some work has been 
done on) could then be game changers again.

Other reasons performance might not increase much at all are:

1. a typical ORM application may create/modify a bunch of objects, then call 
commit().  By default everything is expired unconditionally in that case - 
RETURNING for non-primary key columns is wasted overhead.
2. the "set an attribute to a SQL expression" use case is very rare.  in most 
cases the additional complexity of deciding about returning() at the ORM level 
just adds complexity for no reason.
3. The use case of SQL-level defaults/onupdates on columns is there, but again 
the usual use here is for timestamps and such that usually aren't fetched in 
the same transaction as the one in which they are flushed - fetching them back 
just so they can be thrown away is again wasted overhead.
4. its not clear overall if returning() is faster for all DBAPIs - in 
particular the mechanics of using RETURNING with oracle and MSSQL are pretty 
complex.


> 
> P.S. And in SQLAlchemy 0.5, what's the most orm-like way to do solve 
> this? In my code, it looks like this now, cause I didn't find an easier 
> (less lines + less sql-like code) way to do it.
>        columns = ','.join([str(col) for col in obj_table.c])
>        query = '''UPDATE %(table)s
>        SET col = col + %(value)s
>        WHERE id = %(id)s
>        RETURNING %(columns)s''' % {
>            'id': id,
>            'columns': columns,
>            'table': obj_table,
>            'value': value
>        }
>        obj = meta.Session.query(Obj).from_statement(query).first()

As far as the rendering above, you should use the postgres_returning keyword in 
conjunction with an update() construct, instead of piecing together a string.  
It should be in the API documentation.    The usage of query(Obj) to call it 
and get an object back from RETURNING is novel and I hadn't thought of that.    
 You need to ensure that the rows you're matching are not already present in 
the session, or have been expired, else the new state will not be reflected in 
the objects.

You also should take a measurement of performance enhancement with the above 
technique, vs. using query.update() or Session.execute(update()) and then 
re-selecting the rows back, or just allowing the natural expiration to result 
in individual SELECTs as needed.   

If you truly have some very performance critical section where thousands of 
rows are being inserted/updated, the ORM is going to give you fairly mediocre 
performance with or without RETURNING, as it is designed to do lots of the 
thinking and effort for you in a very generic (read: unoptimized) way and hence 
is an expensive route to go no matter what.  You'd be better off dropping into 
the usage of insert()/update()/delete() constructs and not attempting to 
reinstate the new rows back into ORM objects, since that is all an expensive 
process designed to make the production of source code easier.   In this case 
your source code is being made more difficult to produce anyway so the ORM 
might not be worth it.


--

You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Extending mapper to use PostgreSQL's INSERT/UPDATE...RETURNING

Reply via email to