If you test against 3.2.1, just use your original where clause and original index pk_data:

  where utime >= 1146441600 and utime < 1149120000

as I don't know what effect the unary +'s would have on it.

--- Joe Wilson <[EMAIL PROTECTED]> wrote:
> For my databases, GROUP BY is slower in recent SQLite releases
> because my queries are not able to use indexed columns for
> GROUP BY items by design:
>
> http://www.mail-archive.com/sqlite-users%40sqlite.org/msg15563.html
>
> I'm curious what sort of timings you'd get with SQLite 3.2.1.
> http://sqlite.org/sqlite-3.2.1.tar.gz
>
> To convert to the older database format:
>
>   sqlite334 334.db .dump | sqlite321 321.db
>
> And then perform your timings with both SQLite versions against
> the 321.db file to level the playing field.
> Just having an index on utime should be optimal for SQLite 3.2.1.
>
> --- Steve Green <[EMAIL PROTECTED]> wrote:
> > Using Richard's suggestion of changing the where clause of my query to
> >
> >   where +utime >= 1146441600 and +utime < 1149120000
> >
> > did force sqlite to use the index that gave better performance.
> >
> > However, I'm seeing some strange behavior that I'm hoping someone can
> > shed some light on.
> >
> > With the time period mentioned below, the data set is about 2.5 million
> > rows and about 86K distinct u_id values. Using the index on (u_id, utime),
> > the query time was reduced from 13.5 minutes to 26 seconds (not great,
> > but at least the browser won't time out waiting for a response).
> >
> > However, with a different time period, I have a much smaller data set of
> > about 150K rows and 20K distinct u_id values. Using the index on (u_id,
> > utime), the query still takes about 20 seconds, but if the primary
> > index on (utime, r_id, u_id) is used, the query takes only 0.5 seconds.
> > Unfortunately, at query time I have no way of knowing how much data
> > will have to be traversed, so modifying the query to force the use of
> > different indexes is not possible.
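[Editorial note: the effect of the unary +'s discussed above can be seen directly with EXPLAIN QUERY PLAN. This is a small sketch using the sqlite3 module bundled with Python, which is far newer than the 3.3.4/3.2.1 releases in this thread, so the plan text differs from the old `0|0|TABLE ...` format; the table and index names match the schema quoted below.]

```python
# Sketch: a "+" prefix hides a column from the query planner, so the
# constraint can no longer be satisfied by an index on that column.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data(utime int4, r_id int2, u_id int4, data1 int8, data2 int8);
CREATE UNIQUE INDEX pk_data ON data(utime, r_id, u_id);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string
    return " ".join(row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

indexed = plan("SELECT * FROM data WHERE utime >= 1146441600 AND utime < 1149120000")
forced  = plan("SELECT * FROM data WHERE +utime >= 1146441600 AND +utime < 1149120000")

print(indexed)  # mentions pk_data: a range SEARCH on the index
print(forced)   # the +'s disqualify the index, leaving a full table SCAN
```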
> > Can anyone explain why the performance is so poor with the smaller
> > data set and the "optimal" index? Any suggestions on a workaround so
> > that optimal performance can be achieved with all data set sizes?
> >
> > Thanks for your help,
> >
> > Steve
> >
> > Steve Green wrote:
> >
> > > Sorry, I forgot to mention that I'm using sqlite v3.3.4 on redhat linux
> > > v7.3
> > >
> > > Steve
> > >
> > > Steve Green wrote:
> > >
> > >> Hi,
> > >>
> > >> I have the following schema
> > >>
> > >>   CREATE TABLE data(
> > >>     utime int4,
> > >>     r_id  int2,
> > >>     u_id  int4,
> > >>     data1 int8,
> > >>     data2 int8
> > >>   );
> > >>
> > >> Each row is uniquely defined by utime, r_id, and u_id, so I have the
> > >> following index
> > >>
> > >>   CREATE UNIQUE INDEX pk_data on data( utime, r_id, u_id );
> > >>
> > >> This index is also needed because at update time I use "insert or
> > >> replace" to update the data table.
> > >>
> > >> The type of query that I'm attempting to perform is similar to
> > >>
> > >>   select u_id, sum( data1 ), sum( data2 )
> > >>   from data
> > >>   where utime >= 1146441600 and utime < 1149120000
> > >>   group by u_id
> > >>   order by sum( data1 ) desc
> > >>   limit 10
> > >>
> > >> My current table has about 2.5 million rows and about 86,000 distinct
> > >> u_id values in the time period selected, and the query takes about
> > >> 13.5 minutes.
> > >>
> > >> Performing an explain query plan reveals
> > >>
> > >>   0|0|TABLE data WITH INDEX pk_data
> > >>
> > >> so the primary index is being used...
> > >>
> > >> Based on some past experiences, I added the following index to the table
> > >>
> > >>   CREATE INDEX ix_data_ut on data( u_id, utime );
> > >>
> > >> Note that utime is the last column in the index. With the primary
> > >> index in place, I was not able to convince sqlite to use this index.
> > >> To test the index, I was forced to drop the primary index (which I
> > >> can't do in my production environment).
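[Editorial note: the schema, unique index, and top-N GROUP BY query above can be exercised end-to-end with Python's bundled sqlite3. The sample rows are invented purely for illustration; everything else matches the thread.]

```python
# Sketch: Steve's schema and query on a handful of made-up rows.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data(utime int4, r_id int2, u_id int4, data1 int8, data2 int8);
CREATE UNIQUE INDEX pk_data ON data(utime, r_id, u_id);
""")

# Invented sample data; INSERT OR REPLACE relies on the unique index,
# which is why pk_data can't simply be dropped in production.
rows = [
    (1146441600, 1, 10, 5, 50),
    (1146441700, 1, 10, 7, 70),
    (1146441800, 2, 20, 3, 30),
    (1200000000, 1, 10, 99, 99),  # outside the queried time range
]
con.executemany("INSERT OR REPLACE INTO data VALUES (?,?,?,?,?)", rows)

top = con.execute("""
    SELECT u_id, sum(data1), sum(data2)
    FROM data
    WHERE utime >= 1146441600 AND utime < 1149120000
    GROUP BY u_id
    ORDER BY sum(data1) DESC
    LIMIT 10
""").fetchall()
print(top)  # u_id 10 sums to (12, 120); u_id 20 to (3, 30)
```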
> > >> After dropping the primary index, an explain query plan revealed
> > >>
> > >>   0|0|TABLE data WITH INDEX ix_data_ut
> > >>
> > >> and the query ran in 26 seconds...
> > >>
> > >> Subsequent tests using the following indexes provided no performance
> > >> improvement over the unique index, although sqlite's query planner
> > >> chose these indexes over the unique index
> > >>
> > >>   CREATE INDEX ix_data_tu on data( utime, u_id );
> > >>   CREATE INDEX ix_data_t on data( utime );
> > >>
> > >> So, is there a way that I can get sqlite to use the optimal index
> > >> without having to remove my unique index? Perhaps this would involve
> > >> rewriting the query, but I'm at a loss as to how that could be done.
> > >>
> > >> Thanks for your time,
> > >>
> > >> Steve
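[Editorial note: Steve's closing question, steering the planner toward a named index without dropping the unique index, had no clean answer in 3.3.4 beyond the unary-+ trick, but SQLite 3.6.4 and later added INDEXED BY for exactly this. A sketch, again via Python's bundled sqlite3:]

```python
# Sketch: INDEXED BY (SQLite 3.6.4+) pins a query to a named index,
# so pk_data can stay in place for INSERT OR REPLACE.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data(utime int4, r_id int2, u_id int4, data1 int8, data2 int8);
CREATE UNIQUE INDEX pk_data ON data(utime, r_id, u_id);
CREATE INDEX ix_data_ut ON data(u_id, utime);
""")

plan = " ".join(row[-1] for row in con.execute("""
    EXPLAIN QUERY PLAN
    SELECT u_id, sum(data1), sum(data2)
    FROM data INDEXED BY ix_data_ut
    WHERE utime >= 1146441600 AND utime < 1149120000
    GROUP BY u_id
    ORDER BY sum(data1) DESC
    LIMIT 10
"""))
print(plan)  # names ix_data_ut rather than pk_data
```

Note that INDEXED BY makes statement preparation fail if the named index cannot be used at all, so it trades the planner's flexibility for predictability; it does not solve the underlying problem that the best index depends on the size of the selected time range.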