Lloyd Thomas wrote:
> There are no indexes in my tables. Please find the following schemas
> for my tables. Would it make more sense to convert my datetime
> columns to microtime? What other recommendations would you make for
> these tables?
>
> CREATE TABLE users (
>   user_id INTEGER PRIMARY KEY,
>   extn_no varchar(16) default NULL,
>   username varchar(255) default NULL,
>   password varchar(255) default NULL,
>   admin NOT NULL default 'No',
>   depthead NOT NULL default 'No',
>   user NOT NULL default 'Yes',
>   firstname varchar(255) default NULL,
>   surname varchar(255) default NULL,
>   job_title varchar(255) default NULL,
>   user_email varchar(255) default NULL,
>   deleted NOT NULL default 'No'
> );
>
> CREATE TABLE call_data (
>   call_id INTEGER PRIMARY KEY,
>   direction NOT NULL default 'Out',
>   group_no varchar(16) default NULL,
>   start_no varchar(16) default NULL,
>   extn_no varchar(16) NOT NULL default '',
>   trunk_no varchar(16) NOT NULL default '',
>   trans NOT NULL default 'No',
>   ddi varchar(16) default NULL,
>   dest varchar(32) NOT NULL default '',
>   dest_name varchar(255) default NULL,
>   duration varchar(8) NOT NULL default '',
>   ring_time varchar(5) default NULL,
>   call_time datetime default NULL,
>   cost decimal(10,2) default NULL,
>   band varchar(10) default NULL,
>   site_id varchar(10) default NULL
> );
>
>
>
>> Hi there,
>>
>>> I am having a problem with a query which may well involve over
>>> 200,000 records. I have been building a website using PHP, and PHP
>>> is timing out after 30 secs due to the size of the call_data table
>>> (I think). Is there any way I can improve the following query so
>>> that it is faster? I think I am using sqlite 2.8.14 (not sure).
>>>
>>> SELECT call_id, C.extn_no AS extn_no, dest, dest_name, call_time,
>>> duration,
>>> cost, U.firstname AS firstname, U.surname AS surname FROM call_data
>>> as C LEFT JOIN users as U on C.extn_no = U.extn_no WHERE 1 = '1' AND
>>> julianday(call_time) >= julianday('2004-10-16 09:00:00') AND
>>> julianday(call_time) <= julianday('2004-11-16 17:29:59') AND
>>> direction = 'Out' ORDER BY cost desc LIMIT 0,16;
>>
>> Two things:
>>
>> 1) Why do you have the "1 = '1'" expression?  It's just wasting
>> processor time.
>>
>> 2) You can do preprocessing of the julian dates, converting the
>> expressions (like '2004-11-16 17:29:59') into whatever format is
>> stored in the database, then comparing that directly.  This will
>> allow you to use an index on call_time.
>>
>> Be aware, however, that maintaining an index can be time-consuming
>> if you have many records going in and out.
>>
>> Right now, it is my guess that every record in the table will have
>> to be read, in order to apply the julianday() function.  Thus no
>> index is used, and every record is read.  This is just an uneducated
>> guess, though -- I haven't checked the EXPLAIN output.
>>
>> Ulrik P.

Lloyd,

To clarify Ulrik's suggestion, you will need to create an index on your
call_time field like this:

CREATE INDEX call_time_idx on call_data(call_time);

Then, assuming you are storing complete datetime strings in the call_time
field, you can use the following query to return your results.

SELECT call_id, C.extn_no AS extn_no, dest, dest_name, call_time,
duration, cost, U.firstname AS firstname, U.surname AS surname
FROM call_data as C
LEFT JOIN users as U on C.extn_no = U.extn_no
WHERE call_time >= '2004-10-16 09:00:00'
AND call_time <= '2004-11-16 17:29:59'
AND direction = 'Out'
ORDER BY cost desc
LIMIT 0,16;

This should be substantially faster since it will use the new index to find
the first and last records that match the call_time range tests. Note that
you shouldn't use the BETWEEN operator for the call_time range test, since
it will not use the index to locate the matching records.
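
For example, of these two ways of writing the same range test, only the
first form lets SQLite 2.8 use call_time_idx:

  -- two comparisons: the index can be used
  WHERE call_time >= '2004-10-16 09:00:00'
  AND call_time <= '2004-11-16 17:29:59'

  -- BETWEEN: returns the same rows, but the index is not used to find them
  WHERE call_time BETWEEN '2004-10-16 09:00:00' AND '2004-11-16 17:29:59'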

You also shouldn't apply the julianday() function to the call_time values or
the range limits. If you do, you will disable the use of the index. Datetime
strings in this format sort correctly as text, so they can be compared
directly.
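
If you want to double-check which plan you are getting, you can prefix the
query with EXPLAIN. On SQLite 2.8 this dumps the low-level VDBE program
rather than a readable query plan, but it is still a rough guide: if
call_time_idx shows up in the output the index is being used, and if you
wrap call_time in julianday() it should disappear again.

  EXPLAIN SELECT call_id FROM call_data
  WHERE call_time >= '2004-10-16 09:00:00'
  AND call_time <= '2004-11-16 17:29:59';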

This may still take a while to execute since it must locate all the records
in the given range and sort them by cost before it can select the 16 records
to return. If 10,000 records fall in that range, you will copy and sort all
10,000 records even though you only return 16.
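
To make that concrete, asking for the second page of results with

  ... ORDER BY cost desc LIMIT 16,16;

still has to find and sort the whole matching range before it can skip the
first 16 rows, so paging deeper into the results does not make the query any
cheaper.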

One other suggestion is to remove the NOT NULL constraints wherever they are
not needed. In particular, you have a few fields declared NOT NULL with a
DEFAULT value clause. The default value is inserted whenever an insert
statement doesn't supply a value for that field, so those columns won't
normally end up NULL and the NOT NULL constraint is redundant. Each NOT NULL
clause generates test opcodes that are executed for every insert into the
table.
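
For example, in your users table the flag columns could simply be declared
as

  admin default 'No',
  depthead default 'No',
  user default 'Yes',
  deleted default 'No'

and inserts that omit those columns will still get the default values.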
