Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread drh
Eduardo <[EMAIL PROTECTED]> wrote:
> At 19:42 14/06/2006, you wrote:
> >On Jun 14, 2006, at 16:42 UTC, [EMAIL PROTECTED] wrote:
> >
> > > Might go a lot faster if you put a "+" in fron of
> > > recID.  Like this:
> > >
> > >SELECT * FROM table WHERE +recID IN (...) ORDER BY dateFld
> >
> >What magic is this?  I checked both lang_expr.html and 
> >lang_select.html but can't find any such syntax -- perhaps I missed 
> >it.  What does it do?
> 
> The '+' mads the optimizer, so it will not use any optimization. In 
> these case, the optimization engine (at where.c) takes a bad 
> decission. It's not a feature, is a side effect of the optimizer design. 
> 

Eduardo is essentially correct (even if his English is a tad off :-)).
Putting "+" in front of the name of a column in the WHERE clause
(or in the ORDER BY clause) disqualifies that column from use by 
the optimizer.  So instead of using the index on recID to satisfy
the WHERE clause as it normally would, the extra "+" causes SQLite
to chooses the index on dateFld to satisfy the ORDER BY clause.
Whether or not that is a win or a loss depends on the content of
your table and is best determined by experimentation.

--
D. Richard Hipp   <[EMAIL PROTECTED]>



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread Eduardo

At 19:42 14/06/2006, you wrote:

On Jun 14, 2006, at 16:42 UTC, [EMAIL PROTECTED] wrote:

> Might go a lot faster if you put a "+" in fron of
> recID.  Like this:
>
>SELECT * FROM table WHERE +recID IN (...) ORDER BY dateFld

What magic is this?  I checked both lang_expr.html and 
lang_select.html but can't find any such syntax -- perhaps I missed 
it.  What does it do?


The '+' mads the optimizer, so it will not use any optimization. In 
these case, the optimization engine (at where.c) takes a bad 
decission. It's not a feature, is a side effect of the optimizer design. 



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread joe
On Jun 14, 2006, at 16:42 UTC, [EMAIL PROTECTED] wrote:

> Might go a lot faster if you put a "+" in fron of
> recID.  Like this:
> 
>SELECT * FROM table WHERE +recID IN (...) ORDER BY dateFld

What magic is this?  I checked both lang_expr.html and lang_select.html but 
can't find any such syntax -- perhaps I missed it.  What does it do?

Thanks,
- Joe

--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread joe
On Jun 14, 2006, at 16:34 UTC, Eduardo wrote:

> >I need to select a large set of records out of a table, sort them by 
> >one column, and then get just a subset of the sorted list.  (For 
> >example, I might want records 40-60 ordered by date, which is a 
> >completely different set than records 40-60 ordered by user ID.)  I 
> >start with the full list of record IDs I want, and a query something like 
> >this:
> 
> So you want the records from position 40 to position 60 ?

Yes, if by "position" you are referring to the sorted list.

> In DB2 was faster select the 0-59 (1 to 60) records from the date 
> ordered table and from that selection in inverse order, select the 
> 0-19 (1-20). It was faster because selecting records from 0 (1) to 
> N-1 (N) was a lot faster than using the select records M-1(M) to 
> N-1(N), even when used two times.

In my tests, selecting the 20 records I want is no problem (iterating 
througheven 13000 records once the SELECT returns is lickety-split).  It's 
mainly the sorting that was the problem.

(Though the SELECT itself takes several seconds, but I'm assuming that that's 
already as fast as it can get.)

Best,
- Joe

--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread drh
[EMAIL PROTECTED] wrote:
> 
>  SELECT  FROM  WHERE recID IN () ORDER BY dateFld
> 

Might go a lot faster if you put a "+" in fron of
recID.  Like this:

   SELECT * FROM table WHERE +recID IN (...) ORDER BY dateFld

Whether or not this is faster depends on what fraction of
redIDs actually match the (...). 
--
D. Richard Hipp   <[EMAIL PROTECTED]>



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread Eduardo

At 17:56 14/06/2006, you wrote:
I'm finding that ORDER BY is surprisingly slow, and it makes me 
wonder if I'm doing something wrong.  Here's the situation:


I need to select a large set of records out of a table, sort them by 
one column, and then get just a subset of the sorted list.  (For 
example, I might want records 40-60 ordered by date, which is a 
completely different set than records 40-60 ordered by user ID.)  I 
start with the full list of record IDs I want, and a query something like this:


So you want the records from position 40 to position 60 ?


 SELECT  FROM  WHERE recID IN ( ORDER BY dateFld

I have a unique index on recID, and an index on dateFld.

When my record IDs list is about 13000 items, the ORDER BY takes 
about 10 seconds (i.e., the query takes 10 seconds longer than the 
same query without the ORDER BY clause).  Yet if I remove the ORDER 
BY, grab all the dateFld values into my own array, and sort it 
myself, the sort takes about 2 seconds.


In DB2 was faster select the 0-59 (1 to 60) records from the date 
ordered table and from that selection in inverse order, select the 
0-19 (1-20). It was faster because selecting records from 0 (1) to 
N-1 (N) was a lot faster than using the select records M-1(M) to 
N-1(N), even when used two times.


HTH



Alien.org contacted...waiting for Ripley 



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread joe
On Jun 14, 2006, at 16:16 UTC, Paul Smith wrote:

> >  SELECT  FROM  WHERE recID IN ( ORDER BY dateFld
> >
> >I have a unique index on recID, and an index on dateFld.
> 
> Try making another index on both fields at once. SQLite can only use one 
> index at a time for each query.

Aha, that's probably it.  If I were the query engine, faced with this problem, 
I would certainly use the recID index to satisfy the WHERE clause.  I'll see if 
a combination index makes any difference.

Thanks for the pointer to the EXPLAIN command, too.

Best,
- Joe

--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread joe
On Jun 14, 2006, at 16:06 UTC, Marco Bambini wrote:

> Have you tried to create an indexed?

Yes, I mentioned that both recID and dateFld are indexed.

> Have you tried to analyze your query with SQLiteManager in order to  
> see which indexes are used?

No, I didn't know about that.  Thanks for the suggestion.

Best,
- Joe


--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/



Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread Paul Smith

At 16:56 14/06/2006, [EMAIL PROTECTED] wrote:
I'm finding that ORDER BY is surprisingly slow, and it makes me wonder if 
I'm doing something wrong.  Here's the situation:


I need to select a large set of records out of a table, sort them by one 
column, and then get just a subset of the sorted list.  (For example, I 
might want records 40-60 ordered by date, which is a completely different 
set than records 40-60 ordered by user ID.)  I start with the full list of 
record IDs I want, and a query something like this:


 SELECT  FROM  WHERE recID IN ( ORDER BY dateFld

I have a unique index on recID, and an index on dateFld.


Try making another index on both fields at once. SQLite can only use one 
index at a time for each query.


So,
CREATE INDEX table_recdate ON table (recID, dateFld);

See if that makes any difference.

When my record IDs list is about 13000 items, the ORDER BY takes about 10 
seconds (i.e., the query takes 10 seconds longer than the same query 
without the ORDER BY clause).  Yet if I remove the ORDER BY, grab all the 
dateFld values into my own array, and sort it myself, the sort takes about 
2 seconds.


This has left me with the weird result that it's actually *faster* for me 
to query the database twice: first to get the unordered list of all 
records and their dates, which I then sort myself, and then query again to 
get just the subset of records I really want.


(That's what we do in some cases, eg if we have to do some sorts of joins 
it's quicker to get all the data and merge it in memory, rather than use 
the DB)


Am I missing something here?  If my own code can sort these dates in 2 
seconds, why does sqlite take 10?  And why did indexing the dateFld not 
make any difference (i.e., it took about 10 seconds before I added the 
index too)?


SQLite hasn't got as powerful an optimiser as some DBs such as MySQL etc 
(but then it is a tiny fraction of the size, so what do you expect). Some 
DBs also automatically create indices as they decide they're necessary - 
SQLite doesn't, you need to do it yourself.


This means you need to think about things a bit more yourself. The 
'EXPLAIN' command is your friend - learn how to use it at least a bit if 
performance is an issue - you can usually see where there are plain loops 
(which go around all records) or indexed loops (which are much quicker). 
Usually careful creation of the suitable indices helps a lot.


Read http://www.sqlite.org/optoverview.html - this gives some details of 
the limited optimisations that SQLite can do, so you can try to take 
advantage of them




PaulVPOP3 - Internet Email Server/Gateway
[EMAIL PROTECTED]  http://www.pscs.co.uk/




Re: [sqlite] speed of ORDER BY clause?

2006-06-14 Thread Marco Bambini

Have you tried to create an indexed?
Have you tried to analyze your query with SQLiteManager in order to  
see which indexes are used?


---
Marco Bambini
http://www.sqlabs.net
http://www.sqlabs.net/blog/



On Jun 14, 2006, at 5:56 PM, [EMAIL PROTECTED] wrote:

I'm finding that ORDER BY is surprisingly slow, and it makes me  
wonder if I'm doing something wrong.  Here's the situation:


I need to select a large set of records out of a table, sort them  
by one column, and then get just a subset of the sorted list.  (For  
example, I might want records 40-60 ordered by date, which is a  
completely different set than records 40-60 ordered by user ID.)  I  
start with the full list of record IDs I want, and a query  
something like this:


 SELECT  FROM  WHERE recID IN ( ORDER BY  
dateFld


I have a unique index on recID, and an index on dateFld.

When my record IDs list is about 13000 items, the ORDER BY takes  
about 10 seconds (i.e., the query takes 10 seconds longer than the  
same query without the ORDER BY clause).  Yet if I remove the ORDER  
BY, grab all the dateFld values into my own array, and sort it  
myself, the sort takes about 2 seconds.


This has left me with the weird result that it's actually *faster*  
for me to query the database twice: first to get the unordered list  
of all records and their dates, which I then sort myself, and then  
query again to get just the subset of records I really want.


Am I missing something here?  If my own code can sort these dates  
in 2 seconds, why does sqlite take 10?  And why did indexing the  
dateFld not make any difference (i.e., it took about 10 seconds  
before I added the index too)?


Finally, can anyone see a more efficient solution to this problem?

Many thanks,
- Joe


--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/





[sqlite] speed of ORDER BY clause?

2006-06-14 Thread joe
I'm finding that ORDER BY is surprisingly slow, and it makes me wonder if I'm 
doing something wrong.  Here's the situation:

I need to select a large set of records out of a table, sort them by one 
column, and then get just a subset of the sorted list.  (For example, I might 
want records 40-60 ordered by date, which is a completely different set than 
records 40-60 ordered by user ID.)  I start with the full list of record IDs I 
want, and a query something like this:

 SELECT  FROM  WHERE recID IN ( ORDER BY dateFld

I have a unique index on recID, and an index on dateFld.

When my record IDs list is about 13000 items, the ORDER BY takes about 10 
seconds (i.e., the query takes 10 seconds longer than the same query without 
the ORDER BY clause).  Yet if I remove the ORDER BY, grab all the dateFld 
values into my own array, and sort it myself, the sort takes about 2 seconds.

This has left me with the weird result that it's actually *faster* for me to 
query the database twice: first to get the unordered list of all records and 
their dates, which I then sort myself, and then query again to get just the 
subset of records I really want.

Am I missing something here?  If my own code can sort these dates in 2 seconds, 
why does sqlite take 10?  And why did indexing the dateFld not make any 
difference (i.e., it took about 10 seconds before I added the index too)?

Finally, can anyone see a more efficient solution to this problem?

Many thanks,
- Joe


--
Joe Strout -- [EMAIL PROTECTED]
Verified Express, LLC "Making the Internet a Better Place"
http://www.verex.com/