Michael Bayer wrote:
> On Dec 22, 2009, at 5:20 PM, Igor Katson wrote:
>
>>> 0.5 supports this using the "postgres_returning" argument to update() and
>>> insert(), but the ORM doesn't make use of it. In 0.6, the feature has been
>>> generalized using the returning() method on insert() and update(), and the
>>> ORM will use it to fetch new primary key identifiers.
>>>
>>> It does not take effect within the ORM for inline SQL other than primary
>>> key identifiers, but you can use an insert() with returning() manually to
>>> achieve that effect.
>>>
>> Why not, Michael? Is it hard to implement this? This seems to be pretty
>> useful and performance-improving, reducing the number of selects needed
>> when coding unaware of object expiry.
>
> Because a. I don't have the time to implement conditional returning()
> awareness within the ORM, and b. I don't think it will have as dramatic a
> performance impact as you expect. When I added default returning() support
> for insert() statements, the number of function calls in our performance
> tests went up, due to the additional complexity. I have made the
> automatic-returning feature optional for all dialects, since it's not really
> clear that performance is actually enhanced. So it's a race between Python
> method-call overhead, which is abysmal, and the overhead of SQL round trips.
> Future enhancements like rewriting ResultProxy in C (on which some work has
> been done) could then be game changers again.
>
> Other reasons performance might not increase much at all:
>
> 1. A typical ORM application may create/modify a bunch of objects, then call
> commit(). By default everything is expired unconditionally in that case, so
> RETURNING for non-primary-key columns is wasted overhead.
> 2. The "set an attribute to a SQL expression" use case is very rare. In most
> cases, deciding about returning() at the ORM level just adds complexity for
> no reason.
> 3. The use case of SQL-level defaults/onupdates on columns is there, but
> again the usual use here is for timestamps and such that usually aren't
> fetched in the same transaction as the one in which they are flushed;
> fetching them back just so they can be thrown away is again wasted overhead.
> 4. It's not clear overall that returning() is faster for all DBAPIs; in
> particular, the mechanics of using RETURNING with Oracle and MSSQL are
> pretty complex.
>
>> P.S. And in SQLAlchemy 0.5, what's the most ORM-like way to solve this? In
>> my code it looks like this now, because I didn't find an easier (fewer
>> lines + less SQL-like code) way to do it:
>>
>>     columns = ','.join([str(col) for col in obj_table.c])
>>     query = '''UPDATE %(table)s
>>         SET col = col + %(value)s
>>         WHERE id = %(id)s
>>         RETURNING %(columns)s''' % {
>>             'id': id,
>>             'columns': columns,
>>             'table': obj_table,
>>             'value': value,
>>     }
>>     obj = meta.Session.query(Obj).from_statement(query).first()
>
> As far as the rendering above, you should use the postgres_returning keyword
> in conjunction with an update() construct, instead of piecing together a
> string. It should be in the API documentation. The usage of query(Obj) to
> call it and get an object back from RETURNING is novel and I hadn't thought
> of that. You need to ensure that the rows you're matching are not already
> present in the session, or have been expired, else the new state will not
> be reflected in the objects.
>
> You also should take a measurement of the performance enhancement from the
> above technique, versus using query.update() or Session.execute(update())
> and then re-selecting the rows back, or just allowing the natural
> expiration to result in individual SELECTs as needed.
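[For reference, here is a minimal sketch of the construct-based approach Michael suggests, written against the 0.6-style returning() method he mentions (the 0.5 spelling passes a postgres_returning list to update() instead). The table and column names mirror the snippet above and are assumptions; the statement is only compiled to SQL here, not executed.]

```python
from sqlalchemy import Table, Column, Integer, MetaData
from sqlalchemy.dialects import postgresql

metadata = MetaData()
# hypothetical table matching the obj_table in the snippet above
obj_table = Table(
    "obj", metadata,
    Column("id", Integer, primary_key=True),
    Column("col", Integer),
)

# UPDATE obj SET col = col + :value WHERE id = :id RETURNING obj.id, obj.col
stmt = (
    obj_table.update()
    .where(obj_table.c.id == 5)
    .values(col=obj_table.c.col + 3)
    .returning(*obj_table.c)
)

# render against the PostgreSQL dialect instead of piecing a string together
print(stmt.compile(dialect=postgresql.dialect()))
```

[The compiled statement can then be handed to Session.query(Obj).from_statement(...) as in the snippet above, with the same session-state caveats Michael describes.]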
>
> If you truly have some very performance-critical section where thousands of
> rows are being inserted/updated, the ORM is going to give you fairly
> mediocre performance with or without RETURNING, as it is designed to do
> lots of the thinking and effort for you in a very generic (read:
> unoptimized) way and hence is an expensive route no matter what. You'd be
> better off dropping into the insert()/update()/delete() constructs and not
> attempting to reinstate the new rows back into ORM objects, since that is
> all an expensive process designed to make the production of source code
> easier. In this case your source code is being made more difficult to
> produce anyway, so the ORM might not be worth it.

Thanks, Michael, it's clear now. I was not thinking that Python's overhead might be larger than the RDBMS's. After profiling my application with repoze.profile and apache ab, I see that SQL only takes around 10% of the request-serving time (being extremely fast and serving 1000 requests in 1 second!); everything else is Python.
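[As an illustration of "dropping into the insert()/update() constructs" for a bulk path, here is a small sketch that bypasses ORM object bookkeeping entirely: one executemany-style INSERT and one UPDATE statement covering every row. The in-memory SQLite database and the table/column names are assumptions for the example.]

```python
from sqlalchemy import (
    Table, Column, Integer, MetaData, create_engine, select, func,
)

engine = create_engine("sqlite://")  # throwaway in-memory database
metadata = MetaData()
obj_table = Table(
    "obj", metadata,
    Column("id", Integer, primary_key=True),
    Column("col", Integer),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # executemany-style bulk insert: no per-object ORM state to maintain
    conn.execute(
        obj_table.insert(),
        [{"id": i, "col": 0} for i in range(1000)],
    )
    # a single UPDATE touches all 1000 rows; the ORM would instead flush
    # each object's changes and then expire/re-select its attributes
    conn.execute(obj_table.update().values(col=obj_table.c.col + 1))
    total = conn.execute(select(func.sum(obj_table.c.col))).scalar()
    print(total)  # 1000
```

[The trade-off is exactly the one described above: the rows never become ORM objects, so there is nothing to expire or refresh, but also no identity map or unit of work working on your behalf.]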
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.