Thanks for many excellent suggestions on how to speed up the query.  I
can now make sqlite3 essentially tie mysql.  (Curiously, this is only
the case on the first run of the query; if I run the query again, the
MySQL time drops to close to zero while sqlite appears to take the
same time.  Maybe MySQL leaves the database sorted or somehow caches
that particular run?)

> You also need to increase your cache size to match the mysql performance
> pragma cache 100000;

This helped.  Thanks.

The other suggestions involved indexing on more than one field.  These
are excerpted below, and these all helped.


Thanks again.
Best,
Michael


>>> sqlite> explain query plan
>>> ? ...> SELECT name,score
>>> ? ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
>>> ? ...> ? ? ? ? FROM release_cl
>>> ? ...> ? ? ? ? WHERE media<3
>>> ? ...> ? ? ? ? AND year=2006
>>> ? ...> ? ? ? ? GROUP BY facilitynumber) r
>>>
>> Michael, from what I see, if your release_cl table is not properly indexed
>> to be quickly aggregated (media and year field), this will lead to full
>> table reading (so all the data of your 4M records). If it's properly
>> indexed, and the result number of records of this select is big, consider
>> adding non-indexed fields to this (or brand-new) index since otherwise
>> sqlite quickly finds records with this index, but have to look up main data
>> tree to retrieve other fields.



> How many distinct media-types are there?
> How many distinct facilities do you have?
> How many rows are typically returned by your FacilityScore subquery?

There are  about 30 distinct media and about 18 years.
There are 50,000 distinct facilities.
The FacilityScore subquery returns about 16,000 rows.

> Per suggestions, I indexed year and media on the big table.  So I now
> have separate indexes for the key variable (releasenumber) and for
> year and for media.    Would  it make more sense to have a single
> index for all three, thus:
>
> CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber);

Correct.  Each SELECT command uses zero or one index.  It's pointless
to create individual indexes for each column in your table.  Look at
the columns, ORDER BY and WHERE of your SELECT query and make up one
index which is ideal for that query.  For the ideal index in SQLite,
you make the index up of

first, the columns mentioned in the WHERE clause
then, the columns mentioned in the ORDER BY clause
finally any other columns you want retrieved.

For each purpose you have to choose your preferred cutoff: it's
usually not worth doing the last section of the above, and sometimes
not even worth doing the ORDER BY section.  The more indexes, and the
longer they are, the longer each WRITE operation takes for that table.
 You could end up with a table which might be fast to search but need
a very long time for each INSERT.  Also, increasing the length of each
index will increase the database filesize, and therefore cause
operations which change the database file to take longer.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to