[sqlalchemy] new LIMIT/OFFSET Support for oracle - huge speed penalty

2010-06-23 Thread Ralph Heinkel

Hi,

we are about to upgrade our SQLAlchemy library from 0.4.8 to something 
newer, and during this process we have noticed that the LIMIT/OFFSET 
support for Oracle has changed from using "ROW_NUMBER() OVER ..." to a 
wrapped subquery approach in conjunction with ROWNUM, as described in 
http://www.sqlalchemy.org/docs/reference/dialects/oracle.html#limit-offset-support



Unfortunately this approach is about 10 times slower for large tables, 
mainly because the innermost subquery has to sort the entire table with 
a plain ORDER BY.

Interestingly,
    ROW_NUMBER() OVER (ORDER BY some db fields)
is much more efficient than the plain ORDER BY approach.
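
For illustration, the two query shapes being compared look roughly like 
this (a simplified sketch only: the table, columns, and bind parameter 
names below are made up, and the SQL SQLAlchemy actually emits differs in 
aliasing and column lists). For what it's worth, the ROW_NUMBER() form can 
still be issued by hand through text() as a stopgap:

    from sqlalchemy import create_engine, text

    # Current output, simplified: wrapped subquery plus ROWNUM.  The
    # innermost SELECT has to ORDER BY the full result set before ROWNUM
    # can be applied, which is where the slowdown shows up.
    rownum_style = text("""
        SELECT * FROM (
            SELECT a.*, ROWNUM AS ora_rn FROM (
                SELECT t.id, t.name FROM some_table t ORDER BY t.name
            ) a
            WHERE ROWNUM <= :max_row
        )
        WHERE ora_rn > :min_row
    """)

    # Old 0.4-style output, simplified: a window function does the numbering.
    row_number_style = text("""
        SELECT id, name FROM (
            SELECT t.id, t.name,
                   ROW_NUMBER() OVER (ORDER BY t.name) AS rn
            FROM some_table t
        )
        WHERE rn BETWEEN :min_row + 1 AND :max_row
    """)

    engine = create_engine("oracle://user:password@dsn")  # placeholder URL
    conn = engine.connect()
    rows = conn.execute(row_number_style, min_row=10, max_row=20).fetchall()
    conn.close()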

I don't know the reasons behind the decision to drop the ROW_NUMBER() 
OVER approach, but would it make sense to reintroduce it into the 
current SA as an optional feature?


Any help/feedback is much appreciated.

Thanks,

Ralph




Re: [sqlalchemy] new LIMIT/OFFSET Support for oracle - huge speed penalty

2010-06-23 Thread Ian Kelly
On Wed, Jun 23, 2010 at 1:50 AM, Ralph Heinkel ralph.hein...@web.de wrote:
 Hi,

 we are about to upgrade our SQLAlchemy library from 0.4.8 to something newer,
 and during this process we have noticed that the LIMIT/OFFSET support for
 Oracle has changed from using "ROW_NUMBER() OVER ..." to a wrapped
 subquery approach in conjunction with ROWNUM, as described in
 http://www.sqlalchemy.org/docs/reference/dialects/oracle.html#limit-offset-support


 Unfortunately this approach is about 10 times slower for large tables,
 mainly because the innermost subquery has to sort the entire table with a
 plain ORDER BY.
 Interestingly,
        ROW_NUMBER() OVER (ORDER BY some db fields)
 is much more efficient than the plain ORDER BY approach.

Do you have benchmarks to back that up?  In Django, we switched from
using row_number to rownum after a contributor convinced me that
rownum was faster.  See:

http://code.djangoproject.com/ticket/9136

Thanks,
Ian




Re: [sqlalchemy] new LIMIT/OFFSET Support for oracle - huge speed penalty

2010-06-23 Thread Michael Bayer

On Jun 23, 2010, at 3:50 AM, Ralph Heinkel wrote:

 Hi,
 
 we are about to upgrade our SQLAlchemy library from 0.4.8 to something newer, 
 and during this process we have noticed that the LIMIT/OFFSET support for 
 Oracle has changed from using "ROW_NUMBER() OVER ..." to a wrapped 
 subquery approach in conjunction with ROWNUM, as described in 
 http://www.sqlalchemy.org/docs/reference/dialects/oracle.html#limit-offset-support
 
 
 Unfortunately this approach is about 10 times slower for large tables, 
 mainly because the innermost subquery has to sort the entire table with a 
 plain ORDER BY.
 Interestingly,
   ROW_NUMBER() OVER (ORDER BY some db fields)
 is much more efficient than the plain ORDER BY approach.
 
 I don't know the reasons behind the decision to drop the ROW_NUMBER() OVER 
 approach, but would it make sense to reintroduce it into the current SA as 
 an optional feature?

A full history of this feature is here:

http://www.sqlalchemy.org/trac/ticket/536

The rationale is based on the bug described in that ticket, as well as the 
fact that we preferred to go with an approach that was recommended by a lead 
engineer at Oracle.

The dialect includes an option to add the /*+ FIRST_ROWS(N) */ directive, by 
specifying the optimize_limits keyword to create_engine() - we originally had 
that hint in the query in all cases, until some folks chimed in that we 
shouldn't make that decision by default.  I don't know if that helps your 
use case.
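
For reference, turning that on looks something like this (the connection 
URL is a placeholder; optimize_limits is the create_engine() keyword 
referred to above):

    from sqlalchemy import create_engine

    # Placeholder DSN.  With optimize_limits=True the Oracle dialect adds
    # the /*+ FIRST_ROWS(N) */ hint to the ROWNUM-wrapped LIMIT/OFFSET
    # queries it generates.
    engine = create_engine(
        "oracle://scott:tiger@dsn",
        optimize_limits=True,
    )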

The previous system can be restored using a @compiles directive.  I have 
documented that recipe at 
http://www.sqlalchemy.org/trac/wiki/UsageRecipes/OracleRowNumberOver .
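
For anyone who hasn't used the compiler extension, the general shape of a 
@compiles hook is roughly as follows.  This is only a minimal sketch of the 
mechanism built around a hypothetical construct, not the recipe from the 
wiki page, which rewrites the whole SELECT:

    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql.expression import ColumnClause

    class row_number_over(ColumnClause):
        # Hypothetical construct rendering ROW_NUMBER() OVER (...) on Oracle.
        def __init__(self, order_by):
            ColumnClause.__init__(self, "rn")
            self.order_by = order_by

    @compiles(row_number_over, "oracle")
    def _compile_row_number_over(element, compiler, **kw):
        # Takes effect only when compiling on the Oracle dialect; other
        # dialects would need their own handler (or a plain @compiles default).
        return "ROW_NUMBER() OVER (ORDER BY %s)" % compiler.process(
            element.order_by, **kw)

You would select such a construct in an inner query and filter on it in an 
outer query; the wiki recipe performs that rewriting automatically at 
compile time.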

