Thanks for many excellent suggestions on how to speed up the query. I can now make sqlite3 essentially tie mysql. (Curiously, this is only the case on the first run of the query; if I run the query again, the MySQL time drops to close to zero while sqlite appears to take the same time. Maybe MySQL leaves the database sorted or somehow caches that particular run?)
> You also need to increase your cache size to match the mysql performance > pragma cache 100000; This helped. Thanks. The other suggestions involved indexing on more than one field. These are excerpted below, and these all helped. Thanks again. Best, Michael >>> sqlite> explain query plan >>> ? ...> SELECT name,score >>> ? ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score >>> ? ...> ? ? ? ? FROM release_cl >>> ? ...> ? ? ? ? WHERE media<3 >>> ? ...> ? ? ? ? AND year=2006 >>> ? ...> ? ? ? ? GROUP BY facilitynumber) r >>> >> Michael, from what I see, if your release_cl table is not properly indexed >> to be quickly aggregated (media and year field), this will lead to full >> table reading (so all the data of your 4M records). If it's properly >> indexed, and the result number of records of this select is big, consider >> adding non-indexed fields to this (or brand-new) index since otherwise >> sqlite quickly finds records with this index, but have to look up main data >> tree to retrieve other fields. > How many distinct media-types are there? > How many distinct facilities do you have? > How many rows are typically returned by your FacilityScore subquery? There are about 30 distinct media and about 18 years. There are 50,000 distinct facilities. The FacilityScore subquery returns about 16,000 rows. > Per suggestions, I indexed year and media on the big table. So I now > have separate indexes for the key variable (releasenumber) and for > year and for media. Would it make more sense to have a single > index for all three, thus: > > CREATE UNIQUE INDEX r ON release_cl (year,media,releasenumber); Correct. Each SELECT command uses zero or one index. It's pointless to create individual indexes for each column in your table. Look at the columns, ORDER BY and WHERE of your SELECT query and make up one index which is ideal for that query. For the ideal index in SQLite, you make the index up of first, the columns mentioned in the WHERE clause then, the columns mentioned in the ORDER BY clause finally any other columns you want retrieved. For each purpose you have to choose your preferred cutoff: it's usually not worth doing the last section of the above, and sometimes not even worth doing the ORDER BY section. The more indexes, and the longer they are, the longer each WRITE operation takes for that table. You could end up with a table which might be fast to search but need a very long time for each INSERT. Also, increasing the length of each index will increase the database filesize, and therefore cause operations which change the database file to take longer. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users