Re: [sqlite] 3 million rows, query speeds, and returning zero for rows that don't exist

Simon Slavin Wed, 01 Jul 2009 18:36:01 -0700

On 2 Jul 2009, at 2:00am, yaconsult wrote:

> For testing purposes, I created indexes
> for all the columns used in the selects below.


This doesn't do what you want.  For instance, suppose I had a huge  
amount of data in this table:

   description   |  date   |  time
----------------+---------+---------

and suppose I wanted to list all the events in date/time order.  I do  
something like

SELECT description FROM events ORDER BY date,time

Creating one index on the date column and one on the time column does  
not help much: it lets the SELECT sort by date quickly, but then it  
has to sort all items on each day manually.  So if you have a hundred  
events one day and a hundred on the next it has to perform two sorts  
of 100 items each time I execute the SELECT command.  With 3 million  
lines a day that's a huge amount of sorting when you do the SELECT  
command.

What this actually needs is one index on both columns:

CREATE INDEX timestamp ON events (date,time)

Then the SELECT command just uses that index and magically gets every  
single line in the order it wants.  Even for something like

select
date,
hour,
min,
sec,
count(port)
from log
where
uid != "-"
and (response = 207
     or response = 200)
and port = 8444
group by
timestamp
order by
timestamp;

indexes on all the separate fields would help only in selecting on one  
field.  A good index for this might be something like

CREATE INDEX i_207_and_200 ON log (uid,port,response,timestamp)

or

CREATE INDEX i_207_and_200 ON log (uid,port,timestamp,response)

Imagine you were doing the SELECT yourself, by hand.  Work out what  
kind of index would help /you/ most.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] 3 million rows, query speeds, and returning zero for rows that don't exist

Reply via email to