[jira] Commented: (DERBY-581) Modify SQL to skip N rows of the result and return the next M rows

Rick Hillegas (JIRA) Mon, 06 Apr 2009 07:07:36 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696097#action_12696097
 ]


Rick Hillegas commented on DERBY-581:
-------------------------------------

Hi Knut,

I think there's more work which could be done to improve our FETCH/OFFSET 
support (such as adding ORDER BY to subselects), but I agree that this 
particular issue can be closed. Thanks.

> Modify SQL to skip N rows of the result and return the next M rows
> ------------------------------------------------------------------
>
>                 Key: DERBY-581
>                 URL: https://issues.apache.org/jira/browse/DERBY-581
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>         Environment: All
>            Reporter: Craig Russell
>            Assignee: Bryan Pendleton
>            Priority: Minor
>
> I agree that the information should be expressed in SQL so that the query 
> optimized and execution strategy can know what the user needs in terms of 
> cardinality.
> I'd also like to ask that when we consider extending the SQL in this manner 
> we consider skipping the first N rows and returning the next M rows.
> Craig
> On Sep 20, 2005, at 10:19 AM, Suavi Ali Demir wrote:
> Another little detail about optimization is that Statement.setMaxRows() kind 
> of functions on the JDBC side may not be sufficient since it is called after 
> SQL statement is prepared and returned as an object (after query plan is 
> built). Therefore, it may be necessary to have language syntax to indicate 
> the intention to fetch first 1000 rows only, so that when the query is 
> prepared, this intention can be taken into account.
> Regards,
> Ali
> Mike Matrigali <[email protected]> wrote:
> As craig points out it is important in performance testing to say
> exactly what you are measuring. In general Derby will try to
> stream rows to the user before it has finished looking at all rows.
> So often looking at the first row will and stopping will mean that
> many rows have not been processed. BUT when an order by is involved
> and the query plan either has no appropriate matching index, or decides
> to use a different index then all the rows are processed, then they are
> sent to the sorter and finally after all rows are processed they are
> streamed to the client.
> So as you have seen reading the first 1000 rows of a much larger data
> set can happen very quickly.
> As subsequent mail threads have pointed out, returning the top 1000
> sorted rows is an interesting problem which could be costed and executed
> differently if that information was pushed into the optimizer and the
> sorter (and medium level projects were done in those areas).
> > On Sep 16, 2005, at 4:42 PM, Scott Ogden wrote:
> > 
> > 
> > 
> > I have observed some interesting query performance behavior and am
> > hoping someone here can explain. 
> > 
> > In my scenario, it appears that an existing index is not being used for
> > the 'order by' part of the operation and as a result the perfo rmance of
> > certain queries is suffering. Can someone explain if this is supposed
> > to be what is happening and why? Please see below for the specific
> > queries and their performance characteristics. 
> > 
> > Here are the particulars:
> > 
> > ---------------------------------
> > 
> > create table orders(
> > 
> > order_id varchar(50) NOT NULL
> > 
> > CONSTRAINT ORDERS_PK PRIMARY KEY,
> > 
> > amount numeric(31,2),
> > 
> > time date,
> > 
> > inv_num varchar(50),
> > 
> > line_num varchar(50),
> > 
> > phone varchar(50),
> > 
> > prod_num varchar(50));
> > > --Load a large amount of data (720,000 records) into the 'orders' table
> > > 
> > --Create an index on the time column as that will be used i n the 'where'
> > clause.
> > 
> > create index IX_ORDERS_TIME on orders(time);
> > > 
> > --When I run a query against this table returning top 1,000 records,
> > this query returns very quickly, consistently less than .010 seconds.
> >> 
> >>
> >> select * from orders
> >>
> >> where time > '10/01/2002' and time < '11/30/2002'
> >>
> >> order by time;
> >>
> >> --Now run a similarly query against same table, returning the top
> >> 1,000 records.
> >>
> >> --The difference is that the results are now sorted by the primary key
> >> ('order_id') rather than 'time'. 
> >>
> >> --This query returns slowly, approximately 15 seconds. Why??
> >>
> >> select * from orders
> >>
> >> where time > '10/01/2002' and time < '11/30/2002'
> >>
> >> order by order_id;
> >>
> >> --Now run a third query against the same 'orders' table, removing the
> >> where clause
> >>
> >> --This query returns quickly, around .010 seconds. 
> >>
> >> 
> >>
> >> select * from orders
> >>
> >> order by order_id;
> >>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-581) Modify SQL to skip N rows of the result and return the next M rows

Reply via email to