[
https://issues.apache.org/jira/browse/DERBY-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696097#action_12696097
]
Rick Hillegas commented on DERBY-581:
-------------------------------------
Hi Knut,
I think there's more work which could be done to improve our FETCH/OFFSET
support (such as adding ORDER BY to subselects), but I agree that this
particular issue can be closed. Thanks.
> Modify SQL to skip N rows of the result and return the next M rows
> ------------------------------------------------------------------
>
> Key: DERBY-581
> URL: https://issues.apache.org/jira/browse/DERBY-581
> Project: Derby
> Issue Type: New Feature
> Components: SQL
> Environment: All
> Reporter: Craig Russell
> Assignee: Bryan Pendleton
> Priority: Minor
>
> I agree that the information should be expressed in SQL so that the query
> optimized and execution strategy can know what the user needs in terms of
> cardinality.
> I'd also like to ask that when we consider extending the SQL in this manner
> we consider skipping the first N rows and returning the next M rows.
> Craig
> On Sep 20, 2005, at 10:19 AM, Suavi Ali Demir wrote:
> Another little detail about optimization is that Statement.setMaxRows() kind
> of functions on the JDBC side may not be sufficient since it is called after
> SQL statement is prepared and returned as an object (after query plan is
> built). Therefore, it may be necessary to have language syntax to indicate
> the intention to fetch first 1000 rows only, so that when the query is
> prepared, this intention can be taken into account.
> Regards,
> Ali
> Mike Matrigali <[email protected]> wrote:
> As craig points out it is important in performance testing to say
> exactly what you are measuring. In general Derby will try to
> stream rows to the user before it has finished looking at all rows.
> So often looking at the first row will and stopping will mean that
> many rows have not been processed. BUT when an order by is involved
> and the query plan either has no appropriate matching index, or decides
> to use a different index then all the rows are processed, then they are
> sent to the sorter and finally after all rows are processed they are
> streamed to the client.
> So as you have seen reading the first 1000 rows of a much larger data
> set can happen very quickly.
> As subsequent mail threads have pointed out, returning the top 1000
> sorted rows is an interesting problem which could be costed and executed
> differently if that information was pushed into the optimizer and the
> sorter (and medium level projects were done in those areas).
> > On Sep 16, 2005, at 4:42 PM, Scott Ogden wrote:
> >
> >
> >
> > I have observed some interesting query performance behavior and am
> > hoping someone here can explain.
> >
> > In my scenario, it appears that an existing index is not being used for
> > the 'order by' part of the operation and as a result the perfo rmance of
> > certain queries is suffering. Can someone explain if this is supposed
> > to be what is happening and why? Please see below for the specific
> > queries and their performance characteristics.
> >
> > Here are the particulars:
> >
> > ---------------------------------
> >
> > create table orders(
> >
> > order_id varchar(50) NOT NULL
> >
> > CONSTRAINT ORDERS_PK PRIMARY KEY,
> >
> > amount numeric(31,2),
> >
> > time date,
> >
> > inv_num varchar(50),
> >
> > line_num varchar(50),
> >
> > phone varchar(50),
> >
> > prod_num varchar(50));
> > > --Load a large amount of data (720,000 records) into the 'orders' table
> > >
> > --Create an index on the time column as that will be used i n the 'where'
> > clause.
> >
> > create index IX_ORDERS_TIME on orders(time);
> > >
> > --When I run a query against this table returning top 1,000 records,
> > this query returns very quickly, consistently less than .010 seconds.
> >>
> >>
> >> select * from orders
> >>
> >> where time > '10/01/2002' and time < '11/30/2002'
> >>
> >> order by time;
> >>
> >> --Now run a similarly query against same table, returning the top
> >> 1,000 records.
> >>
> >> --The difference is that the results are now sorted by the primary key
> >> ('order_id') rather than 'time'.
> >>
> >> --This query returns slowly, approximately 15 seconds. Why??
> >>
> >> select * from orders
> >>
> >> where time > '10/01/2002' and time < '11/30/2002'
> >>
> >> order by order_id;
> >>
> >> --Now run a third query against the same 'orders' table, removing the
> >> where clause
> >>
> >> --This query returns quickly, around .010 seconds.
> >>
> >>
> >>
> >> select * from orders
> >>
> >> order by order_id;
> >>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.