Re: [sqlite] Query performance issues - index selection
Kurt, Thanks, the single primary index you suggested does give us the desired results. Steve Kurt Welgehausen wrote: Steve Green <[EMAIL PROTECTED]> wrote: CREATE UNIQUE INDEX pk_data on data( utime, r_id, u_id ); ... CREATE INDEX ix_data_ut on data( u_id, utime ); Wouldn't a unique index on (u_id, utime, r_id) get you the pk constraint and better performance on your query with no other index to confuse things? Regards -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have received the message in error, please advise the sender by reply e-mail at [EMAIL PROTECTED] and delete the message and any attachment(s) thereto without retaining any copies.
Re: [sqlite] Query performance issues - index selection
Steve Green <[EMAIL PROTECTED]> wrote: > the unary + operator in front of the two utime terms did cause the optimal > index to be used... Can you explain why this works? A unary + in front of a column name disqualifies the term from consideration by the optimizer, and hence from use by indices. That left the other index as the only choice. > > What is the best way to send you sample data (and what is the maximum > file size that you'd like)? > Actually, if you could just send me your schema and the query that is causing problems by private email, I can probably use that to figure out why the optimizer is failing so badly in this case. -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] Query performance issues - index selection
Steve Green <[EMAIL PROTECTED]> wrote: > CREATE UNIQUE INDEX pk_data on data( utime, r_id, u_id ); > ... > CREATE INDEX ix_data_ut on data( u_id, utime ); Wouldn't a unique index on (u_id, utime, r_id) get you the pk constraint and better performance on your query with no other index to confuse things? Regards
Re: [sqlite] Query performance issues - index selection
Richard, Thanks for the reply. I did run ANALYZE and that didn't help. However, the unary + operator in front of the two utime terms did cause the optimal index to be used... Can you explain why this works? I also realized that if I change the "where" clause to something like where u_id >= 0 and utime >= ..., it will cuase the optimal index to be used (however, if we every started using negative u_id values, it would break). What is the best way to send you sample data (and what is the maximum file size that you'd like)? Steve [EMAIL PROTECTED] wrote: Steve Green <[EMAIL PROTECTED]> wrote: [...] select u_id, sum( data1 ), sum( data2 ) from data where utime >= 1146441600 and utime < 114912 group by u_id order by sum( data1 ) desc limit 10 [...] So, is there a way that I can get sqlite to use the optimal index without having to remove my unique index? Have you run ANALYZE. That might fix it. If not, try adding a unary + in front of the two utime terms in the WHERE clause: WHERE +utime >= 1145441600 AND +utime < 114912 If you can send me some sample data and queries, that would be great. It will help me to improve the optimizer so that these kinds of things come up less often in the future. -- D. Richard Hipp <[EMAIL PROTECTED]> -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have received the message in error, please advise the sender by reply e-mail at [EMAIL PROTECTED] and delete the message and any attachment(s) thereto without retaining any copies.
Re: [sqlite] Query performance issues - index selection
Steve Green <[EMAIL PROTECTED]> wrote: [...] > > select u_id, sum( data1 ), sum( data2 ) > from data where utime >= 1146441600 and utime < 114912 > group by u_id > order by sum( data1 ) desc > limit 10 > [...] > > So, is there a way that I can get sqlite to use the optimal index > without having to remove my unique index? Have you run ANALYZE. That might fix it. If not, try adding a unary + in front of the two utime terms in the WHERE clause: WHERE +utime >= 1145441600 AND +utime < 114912 If you can send me some sample data and queries, that would be great. It will help me to improve the optimizer so that these kinds of things come up less often in the future. -- D. Richard Hipp <[EMAIL PROTECTED]>
[sqlite] Query performance issues - index selection
Hi, I have the following schema CREATE TABLE data( utime int4, r_id int2, u_id int4, data1 int8, data2 int8 ); Each row is uniquely defined by utime, r_id, and u_id, so I have the following index CREATE UNIQUE INDEX pk_data on data( utime, r_id, u_id ); This index is also needed because at update time, I use "insert or replace" to update the data table. The type of query that I'm attempting to perform is similar to select u_id, sum( data1 ), sum( data2 ) from data where utime >= 1146441600 and utime < 114912 group by u_id order by sum( data1 ) desc limit 10 My current table has about 2.5 million rows and about 86,000 distinct u_id values in the time period selected, and the query takes about 13.5 minutes. Performing an explain query plan reveals 0|0|TABLE data WITH INDEX pk_data so the primary index is being used... Based on some past experiences, I added the following index to the table CREATE INDEX ix_data_ut on data( u_id, utime ); Note that the utime is the last parameter in the index. With the primary index in place, I was not able to convince sqlite to use this index. To test the index, I was forced to drop the primary index (which I can't do in my production environment). After dropping the primary index, an explain query plan revealed 0|0|TABLE data WITH INDEX ix_data_ut and the query ran in 26 seconds... Subsequent tests using the following indexes provided no performance improvement over the unique index, although sqlite's query planner chose these indexes over the unique index CREATE INDEX ix_data_tu on data( utime, u_id ); CREATE INDEX ix_data_t on data( utime ); So, is there a way that I can get sqlite to use the optimal index without having to remove my unique index? Perhaps this would involve rewritting the query, but I'm at a loss as to how that could be done. Thanks for your time, Steve -- Steve Green SAVVIS Transforming Information Technology SM This message contains information which may be confidential and/or privileged. Unless you are the intended recipient (or authorized to receive for the intended recipient), you may not read, use, copy or disclose to anyone the message or any information contained in the message. If you have received the message in error, please advise the sender by reply e-mail at [EMAIL PROTECTED] and delete the message and any attachment(s) thereto without retaining any copies.