Re: [sqlite] Multi-column index is not used with IN operator
On 01/17/2013 09:17 PM, Selen Schabenberger wrote: I was trying to avoid the INDEXED BY or the "+" operator as I cannot make sure that the search with the index will always (for every database created by my application) be faster than with the PK. I prefer the optimizer does the cost calculations and makes the choice. Do you want me to write another ticket for the enhancement you mentioned? Don't worry about it for now. Changes to the optimizer are things that need to be considered really carefully. It's too easy to improve one type of query to the detriment of others. Dan. -Selen From: Dan Kennedy To: General Discussion of SQLite Database Sent: Thursday, January 17, 2013 2:38 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/17/2013 06:32 PM, Selen Schabenberger wrote: Dan, I have just realised that with the sqlite_stat3 table, the query optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used as you said. But the concatenated index is still not used with the stat3. Looks like with stat3 enabled the estimate of the number of rows matched by the WHERE clause is better. So it uses INDEX_IDX_TAG. But not the composite index as the stat3 estimation is disabled if the query uses more than the first column of the index (in this case it uses two - 'Tag' and 'Flag'). That's probably something that could be enhanced at some point. Knowing that the query could be 60 times faster, is there anything I can do to speed up this query or you think it is a bug in the optimizer? You could figure out why the stat1 data is deceptive in this case and whether there is anything that can be done about it. Or add an INDEXED BY clause to the query. Adding a unary '+' operator in front of 'Id' might work too. Dan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
I was trying to avoid the INDEXED BY or the "+" operator as I cannot make sure that the search with the index will always (for every database created by my application) be faster than with the PK. I prefer the optimizer does the cost calculations and makes the choice. Do you want me to write another ticket for the enhancement you mentioned? -Selen > > From: Dan Kennedy >To: General Discussion of SQLite Database >Sent: Thursday, January 17, 2013 2:38 PM >Subject: Re: [sqlite] Multi-column index is not used with IN operator > >On 01/17/2013 06:32 PM, Selen Schabenberger wrote: >> >> Dan, I have just realised that with the sqlite_stat3 table, the query >> optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used >> as you said. But the concatenated index is still not used with the stat3. > >Looks like with stat3 enabled the estimate of the number of rows matched >by the WHERE clause is better. So it uses INDEX_IDX_TAG. But not the >composite index as the stat3 estimation is disabled if the query uses >more than the first column of the index (in this case it uses two - >'Tag' and 'Flag'). That's probably something that could be enhanced at >some point. > >> Knowing that the query could be 60 times faster, is there anything I can >> do to speed up this query or you think it is a bug in the optimizer? > >You could figure out why the stat1 data is deceptive in this case and >whether there is anything that can be done about it. > >Or add an INDEXED BY clause to the query. > >Adding a unary '+' operator in front of 'Id' might work too. > >Dan. > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On 01/17/2013 06:32 PM, Selen Schabenberger wrote: Dan, I have just realised that with the sqlite_stat3 table, the query optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used as you said. But the concatenated index is still not used with the stat3. Looks like with stat3 enabled the estimate of the number of rows matched by the WHERE clause is better. So it uses INDEX_IDX_TAG. But not the composite index as the stat3 estimation is disabled if the query uses more than the first column of the index (in this case it uses two - 'Tag' and 'Flag'). That's probably something that could be enhanced at some point. Knowing that the query could be 60 times faster, is there anything I can do to speed up this query or you think it is a bug in the optimizer? You could figure out why the stat1 data is deceptive in this case and whether there is anything that can be done about it. Or add an INDEXED BY clause to the query. Adding a unary '+' operator in front of 'Id' might work too. Dan. - Selen *From:* Selen Schabenberger *To:* General Discussion of SQLite Database ; Dan Kennedy *Sent:* Wednesday, January 16, 2013 3:46 PM *Subject:* Re: [sqlite] Multi-column index is not used with IN operator Sorry, wrong query wrong result. But still when I add the order by, the index is used: Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1) order by Id limit 200 selectId order from detail 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~33 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 0 0 0 &n bsp; USE TEMP B-TREE FOR ORDER BY > > From: Dan Kennedy mailto:danielk1...@gmail.com>> >To: General Discussion of SQLite Database mailto:sqlite-users@sqlite.org>> >Sent: Wednesday, January 16, 2013 3:40 PM >Subject: Re: [sqlite] Multi-column index is not used with IN operator > >On 01/16/2013 09:22 PM, Selen Schabenberger wrote: >> PRAGMA foreign_keys=OFF; >> BEGIN TRANSACTION; >> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER >> NOT NULL, 'Flag' INTEGER NOT NULL ); >> ANALYZE sqlite_master; >> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289'); >> INSERT INTO "sqlite_stat1" >> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); >> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); >> CREATE INDEX IDX_TAG on Message (Tag); >> COMMIT; >> >> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, >> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, >> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, >> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, >> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, >> 1167671, 1167675 ) ) and Flag=1) limit 200 >> >> selectId order from detail >> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows) >> 0 0 0 EXECUTE LIST SUBQUERY 1 >> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY > >The query and the result above don't match. > >If I add the ORDER BY clause to the query it uses the IPK index. > >Dan. >___ >sqlite-users mailing list >sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ sqlite-users mailing list sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
Dan, I have just realised that with the sqlite_stat3 table, the query optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used as you said. But the concatenated index is still not used with the stat3. Knowing that the query could be 60 times faster, is there anything I can do to speed up this query or you think it is a bug in the optimizer? - Selen > > From: Selen Schabenberger >To: General Discussion of SQLite Database ; Dan >Kennedy >Sent: Wednesday, January 16, 2013 3:46 PM >Subject: Re: [sqlite] Multi-column index is not used with IN operator > >Sorry, wrong query wrong result. > >But still when I add the order by, the index is used: > >Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, >1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, >1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, >1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, >1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, >1167675 ) ) and Flag=1) order by Id limit 200 > > >selectId order from detail >0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG >(Tag=?) (~33 rows) >0 0 0 EXECUTE LIST SUBQUERY 1 >0 0 0 USE TEMP B-TREE FOR ORDER BY > > > > > >> >> From: Dan Kennedy >>To: General Discussion of SQLite Database >>Sent: Wednesday, January 16, 2013 3:40 PM >>Subject: Re: [sqlite] Multi-column index is not used with IN operator >> >>On 01/16/2013 09:22 PM, Selen Schabenberger wrote: >>> PRAGMA foreign_keys=OFF; >>> BEGIN TRANSACTION; >>> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER >>> NOT NULL, 'Flag' INTEGER NOT NULL ); >>> ANALYZE sqlite_master; >>> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289'); >>> INSERT INTO "sqlite_stat1" >>> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); >>> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); >>> CREATE INDEX IDX_TAG on Message (Tag); >>> COMMIT; >>> >>> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, >>> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, >>> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, >>> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, >>> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, >>> 1167671, 1167675 ) ) and Flag=1) limit 200 >>> >>> selectId order from detail >>> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows) >>> 0 0 0 EXECUTE LIST SUBQUERY 1 >>> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY >> >>The query and the result above don't match. >> >>If I add the ORDER BY clause to the query it uses the IPK index. >> >>Dan. >>___ >>sqlite-users mailing list >>sqlite-users@sqlite.org >>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> >> >> >___ >sqlite-users mailing list >sqlite-users@sqlite.org >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
Sorry, wrong query wrong result. But still when I add the order by, the index is used: Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1) order by Id limit 200 selectId order from detail 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~33 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 0 0 0 USE TEMP B-TREE FOR ORDER BY > > From: Dan Kennedy >To: General Discussion of SQLite Database >Sent: Wednesday, January 16, 2013 3:40 PM >Subject: Re: [sqlite] Multi-column index is not used with IN operator > >On 01/16/2013 09:22 PM, Selen Schabenberger wrote: >> PRAGMA foreign_keys=OFF; >> BEGIN TRANSACTION; >> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER >> NOT NULL, 'Flag' INTEGER NOT NULL ); >> ANALYZE sqlite_master; >> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289'); >> INSERT INTO "sqlite_stat1" >> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); >> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); >> CREATE INDEX IDX_TAG on Message (Tag); >> COMMIT; >> >> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, >> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, >> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, >> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, >> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, >> 1167671, 1167675 ) ) and Flag=1) limit 200 >> >> selectId order from detail >> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows) >> 0 0 0 EXECUTE LIST SUBQUERY 1 >> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY > >The query and the result above don't match. > >If I add the ORDER BY clause to the query it uses the IPK index. > >Dan. >___ >sqlite-users mailing list >sqlite-users@sqlite.org >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On 01/16/2013 09:22 PM, Selen Schabenberger wrote: PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289'); INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); CREATE INDEX IDX_TAG on Message (Tag); COMMIT; Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1) limit 200 selectId order from detail 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY The query and the result above don't match. If I add the ORDER BY clause to the query it uses the IPK index. Dan. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289'); INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); CREATE INDEX IDX_TAG on Message (Tag); COMMIT; Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1) limit 200 selectId order from detail 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 0 0 0 USE TEMP B-TREE FOR ORDER BY - Selen >________ > From: Dan Kennedy >To: sqlite-users@sqlite.org >Sent: Wednesday, January 16, 2013 3:14 PM >Subject: Re: [sqlite] Multi-column index is not used with IN operator > >On 01/16/2013 09:04 PM, Selen Schabenberger wrote: >> >> >> The entry for the Tag index in sqlite_stat1: >> >> 460132 1289 >> >> The results are returned in 163 ms on the network share. > >I can't reproduce that. Once I add the entry to sqlite_stat1 >it uses the IPK index. Can you post the new .dump of the schema >and query again? > >Dan. > > > >> >> - Selen > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On 01/16/2013 09:04 PM, Selen Schabenberger wrote: The entry for the Tag index in sqlite_stat1: 460132 1289 The results are returned in 163 ms on the network share. I can't reproduce that. Once I add the entry to sqlite_stat1 it uses the IPK index. Can you post the new .dump of the schema and query again? Dan. - Selen From: Dan Kennedy To: General Discussion of SQLite Database Sent: Wednesday, January 16, 2013 2:56 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 08:48 PM, Selen Schabenberger wrote: However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. If the index consists of only the Tag, then this index is used. When you create the index on just the Tag column, what does the corresponding sqlite_stat1 entry look like? Dan. You said, there would be a penalty if the index was not a covering index. But seems like it is a discount for 3.7.15.2? - Selen *From:* Dan Kennedy *To:* sqlite-users@sqlite.org *Sent:* Wednesday, January 16, 2013 1:12 PM *Subject:* Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > Below is the output of the dump. If it does not help reproduce the error, then I can try to share the original database file itself. > > PRAGMA foreign_keys=OFF; > BEGIN TRANSACTION; > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); > ANALYZE sqlite_master; > INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > COMMIT; > Got it this time. Considering this one: SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) AND flag=1 ORDER BY id LIMIT 200; Looks like the query planner now assigns a higher cost to external sorts (ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1. In both cases, SQLite assumes that scanning the full table implies visiting 460132 rows. In both versions, this plan is assigned a cost of 460132. If there are 30 elements in the IN(...) set, SQLite assumes that scanning the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus some insignificant amount for the 30 seek operations required. In version 3.6.23.1, the penalty for the external sort is (nRow*log10(nRow)), where nRow is the number of rows to sort (in this case 38250). SQLite rounds up the log10() expression to 5, so the penalty is roughly 191250. Total cost of 229500. It then gets a 50% discount for using a covering index, so the overall cost is roughly 115000. Making it preferable to use the index. However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. None of this jumps out as obviously incorrect. In practice, how much slower is 3.7.16 at running the query above? What does: SELECT count(*) FROM message WHERE tag IN () AND flag=1; return? Is it close to the 38250 that SQLite is using as an estimate when planning the query? Thanks, Dan. > > Thanks! > Selen > > > From: Dan Kennedymailto:danielk1...@gmail.com>> > To: sqlite-users@sqlite.org<mailto:sqlite-users@sqlite.org> > Sent: Wednesday, January 16, 2013 12:05 PM > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: >> I attach a small database where it is possible to reproduce the >> issue. I deleted all irrelevant tables and all the tuples in the >> Message table to keep the file size small but had run ANALYZE before >> doing that. > > Mailing list does not allow attachments. Can you either upload the > db somewhere, or include the output of ".dump" in the body of the > message if it is small enough? Thanks. > > > > >> >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN >> >> SELECT * FROM mes
Re: [sqlite] Multi-column index is not used with IN operator
The entry for the Tag index in sqlite_stat1: 460132 1289 The results are returned in 163 ms on the network share. - Selen From: Dan Kennedy To: General Discussion of SQLite Database Sent: Wednesday, January 16, 2013 2:56 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 08:48 PM, Selen Schabenberger wrote: >>However, in 3.7.16, the penalty for the external sort is >>(3*nRow*log10(nRow)) and there is no discount for using a >>covering index (instead, there would be another penalty if the >>index were not a covering index). For a total cost of roughly >>612000. So this version of SQLite does a full table scan. > > If the index consists of only the Tag, then this index is used. When you create the index on just the Tag column, what does the corresponding sqlite_stat1 entry look like? Dan. You > said, there would be a penalty if the index was not a covering index. > But seems like it is a discount for 3.7.15.2? > > - Selen > > > *From:* Dan Kennedy > *To:* sqlite-users@sqlite.org > *Sent:* Wednesday, January 16, 2013 1:12 PM > *Subject:* Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > > Below is the output of the dump. If it does not help reproduce the > error, then I can try to share the original database file itself. > > > > PRAGMA foreign_keys=OFF; > > BEGIN TRANSACTION; > > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' > INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); > > ANALYZE sqlite_master; > > INSERT INTO "sqlite_stat1" > VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); > > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > > COMMIT; > > > > Got it this time. Considering this one: > > SELECT * FROM message > WHERE tag IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > > AND flag=1 > ORDER BY id LIMIT 200; > > Looks like the query planner now assigns a higher cost to > external sorts (ORDER BY clauses that cannot use indexes) than it > did in 3.6.23.1. In both cases, SQLite assumes that scanning the > full table implies visiting 460132 rows. In both versions, this > plan is assigned a cost of 460132. > > If there are 30 elements in the IN(...) set, SQLite assumes that > scanning the index requires visiting (30*1275)=38250 rows. > Basic cost of 38250, plus some insignificant amount for the 30 > seek operations required. > > In version 3.6.23.1, the penalty for the external sort is > (nRow*log10(nRow)), where nRow is the number of rows to sort (in > this case 38250). SQLite rounds up the log10() expression to 5, > so the penalty is roughly 191250. Total cost of 229500. It then > gets a 50% discount for using a covering index, so the overall > cost is roughly 115000. Making it preferable to use the index. > > However, in 3.7.16, the penalty for the external sort is > (3*nRow*log10(nRow)) and there is no discount for using a > covering index (instead, there would be another penalty if the > index were not a covering index). For a total cost of roughly > 612000. So this version of SQLite does a full table scan. > > None of this jumps out as obviously incorrect. In practice, how > much slower is 3.7.16 at running the query above? > > What does: > > SELECT count(*) FROM message WHERE tag IN () AND flag=1; > > return? Is it close to the 38250 that SQLite is using as an > estimate when planning the query? > > Thanks, > Dan. > > > > > > > > Thanks! > > Selen > > > > > > From: Dan Kennedymailto:danielk1...@gmail.com>> > > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > > Sent: Wednesday, January 16, 2013 12:05 PM > > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: > >> I attach a small database where it is possible to reproduce the > >> issue. I deleted all irrelevant tables and all the tuples in the > >> Message table to keep the file size small but had run ANALYZE before > >> doing that. > > > > Mailing list does not allow attachments. Can you either upload the > > db somewhere, or include the output of ".dump" in the body of the > > m
Re: [sqlite] Multi-column index is not used with IN operator
On 01/16/2013 08:48 PM, Selen Schabenberger wrote: However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. If the index consists of only the Tag, then this index is used. When you create the index on just the Tag column, what does the corresponding sqlite_stat1 entry look like? Dan. You said, there would be a penalty if the index was not a covering index. But seems like it is a discount for 3.7.15.2? - Selen *From:* Dan Kennedy *To:* sqlite-users@sqlite.org *Sent:* Wednesday, January 16, 2013 1:12 PM *Subject:* Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > Below is the output of the dump. If it does not help reproduce the error, then I can try to share the original database file itself. > > PRAGMA foreign_keys=OFF; > BEGIN TRANSACTION; > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); > ANALYZE sqlite_master; > INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > COMMIT; > Got it this time. Considering this one: SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) AND flag=1 ORDER BY id LIMIT 200; Looks like the query planner now assigns a higher cost to external sorts (ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1. In both cases, SQLite assumes that scanning the full table implies visiting 460132 rows. In both versions, this plan is assigned a cost of 460132. If there are 30 elements in the IN(...) set, SQLite assumes that scanning the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus some insignificant amount for the 30 seek operations required. In version 3.6.23.1, the penalty for the external sort is (nRow*log10(nRow)), where nRow is the number of rows to sort (in this case 38250). SQLite rounds up the log10() expression to 5, so the penalty is roughly 191250. Total cost of 229500. It then gets a 50% discount for using a covering index, so the overall cost is roughly 115000. Making it preferable to use the index. However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. None of this jumps out as obviously incorrect. In practice, how much slower is 3.7.16 at running the query above? What does: SELECT count(*) FROM message WHERE tag IN () AND flag=1; return? Is it close to the 38250 that SQLite is using as an estimate when planning the query? Thanks, Dan. > > Thanks! > Selen > > ____ > From: Dan Kennedymailto:danielk1...@gmail.com>> > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > Sent: Wednesday, January 16, 2013 12:05 PM > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: >> I attach a small database where it is possible to reproduce the >> issue. I deleted all irrelevant tables and all the tuples in the >> Message table to keep the file size small but had run ANALYZE before >> doing that. > > Mailing list does not allow attachments. Can you either upload the > db somewhere, or include the output of ".dump" in the body of the > message if it is small enough? Thanks. > > > > >> >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN >> >> SELECT * FROM message WHERE tag IN >> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >&g
Re: [sqlite] Multi-column index is not used with IN operator
>However, in 3.7.16, the penalty for the external sort is >(3*nRow*log10(nRow)) and there is no discount for using a >covering index (instead, there would be another penalty if the >index were not a covering index). For a total cost of roughly >612000. So this version of SQLite does a full table scan. If the index consists of only the Tag, then this index is used. You said, there would be a penalty if the index was not a covering index. But seems like it is a discount for 3.7.15.2? - Selen From: Dan Kennedy To: sqlite-users@sqlite.org Sent: Wednesday, January 16, 2013 1:12 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > Below is the output of the dump. If it does not help reproduce the error, > then I can try to share the original database file itself. > > PRAGMA foreign_keys=OFF; > BEGIN TRANSACTION; > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT > NULL, 'Flag' INTEGER NOT NULL ); > ANALYZE sqlite_master; > INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 > 1289 1275 1'); > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > COMMIT; > Got it this time. Considering this one: SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) AND flag=1 ORDER BY id LIMIT 200; Looks like the query planner now assigns a higher cost to external sorts (ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1. In both cases, SQLite assumes that scanning the full table implies visiting 460132 rows. In both versions, this plan is assigned a cost of 460132. If there are 30 elements in the IN(...) set, SQLite assumes that scanning the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus some insignificant amount for the 30 seek operations required. In version 3.6.23.1, the penalty for the external sort is (nRow*log10(nRow)), where nRow is the number of rows to sort (in this case 38250). SQLite rounds up the log10() expression to 5, so the penalty is roughly 191250. Total cost of 229500. It then gets a 50% discount for using a covering index, so the overall cost is roughly 115000. Making it preferable to use the index. However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. None of this jumps out as obviously incorrect. In practice, how much slower is 3.7.16 at running the query above? What does: SELECT count(*) FROM message WHERE tag IN () AND flag=1; return? Is it close to the 38250 that SQLite is using as an estimate when planning the query? Thanks, Dan. > > Thanks! > Selen > > ____ > From: Dan Kennedy > To: sqlite-users@sqlite.org > Sent: Wednesday, January 16, 2013 12:05 PM > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: >> I attach a small database where it is possible to reproduce the >> issue. I deleted all irrelevant tables and all the tuples in the >> Message table to keep the file size small but had run ANALYZE before >> doing that. > > Mailing list does not allow attachments. Can you either upload the > db somewhere, or include the output of ".dump" in the body of the > message if it is small enough? Thanks. > > > > >> >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN >> >> SELECT * FROM message WHERE tag IN >> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
Re: [sqlite] Multi-column index is not used with IN operator
My original query looks like this one and the Message table has in total 10 columns in it (there is one BLOB). Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1) limit 200 If the database file is on the network share (which is most likely the case), then it takes ~22000 ms (with the index 300ms!) to return the results. On the local drive it is ~300 ms (with the index 10 ms). There are in total 101 rows that match the where clause. Thanks! Selen From: Dan Kennedy To: sqlite-users@sqlite.org Sent: Wednesday, January 16, 2013 1:12 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > Below is the output of the dump. If it does not help reproduce the error, > then I can try to share the original database file itself. > > PRAGMA foreign_keys=OFF; > BEGIN TRANSACTION; > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT > NULL, 'Flag' INTEGER NOT NULL ); > ANALYZE sqlite_master; > INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 > 1289 1275 1'); > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > COMMIT; > Got it this time. Considering this one: SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) AND flag=1 ORDER BY id LIMIT 200; Looks like the query planner now assigns a higher cost to external sorts (ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1. In both cases, SQLite assumes that scanning the full table implies visiting 460132 rows. In both versions, this plan is assigned a cost of 460132. If there are 30 elements in the IN(...) set, SQLite assumes that scanning the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus some insignificant amount for the 30 seek operations required. In version 3.6.23.1, the penalty for the external sort is (nRow*log10(nRow)), where nRow is the number of rows to sort (in this case 38250). SQLite rounds up the log10() expression to 5, so the penalty is roughly 191250. Total cost of 229500. It then gets a 50% discount for using a covering index, so the overall cost is roughly 115000. Making it preferable to use the index. However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. None of this jumps out as obviously incorrect. In practice, how much slower is 3.7.16 at running the query above? What does: SELECT count(*) FROM message WHERE tag IN () AND flag=1; return? Is it close to the 38250 that SQLite is using as an estimate when planning the query? Thanks, Dan. > > Thanks! > Selen > > ________ > From: Dan Kennedy > To: sqlite-users@sqlite.org > Sent: Wednesday, January 16, 2013 12:05 PM > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: >> I attach a small database where it is possible to reproduce the >> issue. I deleted all irrelevant tables and all the tuples in the >> Message table to keep the file size small but had run ANALYZE before >> doing that. > > Mailing list does not allow attachments. Can you either upload the > db somewhere, or include the output of ".dump" in the body of the > message if it is small enough? Thanks. > > > > >> >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN >> >> SELECT * FROM message WHERE tag IN >> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 >> >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 &g
Re: [sqlite] Multi-column index is not used with IN operator
I attach a small database where it is possible to reproduce the issue. I deleted all irrelevant tables and all the tuples in the Message table to keep the file size small but had run ANALYZE before doing that. This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ) AND flag=1 ORDER BY id LIMIT 200; I get this result: selectId order from detail 0 0 0 SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 Hope someone can help. - Selen From: Selen Schabenberger To: General Discussion of SQLite Database ; Richard Hipp Sent: Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator Hi Richard, I tested the whole scenario one more time with the new SQLite version. As you suggested I put a plus sign in front of the Flag column and that really made the query much faster by using the multi column index (Tag, Flag, Id) instead of the primary index on the Id column. However what I don't get is, I actually had removed that single column index on the Flag before and run ANALZE. How come the query optimizer makes another decision when I put a + in front of a column which is not indexed alone? Is there another way to improve this query, other than using the + sign? I would really appreciate any suggestions. Happy new year! Regards,Selen --- On Fri, 12/14/12, Richard Hipp wrote: From: Richard Hipp Subject: Re: [sqlite] Multi-column index is not used with IN operator To: "Selen Schabenberger" , "General Discussion of SQLite Database" Date: Friday, December 14, 2012, 3:09 PM On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On 01/16/2013 06:25 PM, Selen Schabenberger wrote: Below is the output of the dump. If it does not help reproduce the error, then I can try to share the original database file itself. PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); COMMIT; Got it this time. Considering this one: SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) AND flag=1 ORDER BY id LIMIT 200; Looks like the query planner now assigns a higher cost to external sorts (ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1. In both cases, SQLite assumes that scanning the full table implies visiting 460132 rows. In both versions, this plan is assigned a cost of 460132. If there are 30 elements in the IN(...) set, SQLite assumes that scanning the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus some insignificant amount for the 30 seek operations required. In version 3.6.23.1, the penalty for the external sort is (nRow*log10(nRow)), where nRow is the number of rows to sort (in this case 38250). SQLite rounds up the log10() expression to 5, so the penalty is roughly 191250. Total cost of 229500. It then gets a 50% discount for using a covering index, so the overall cost is roughly 115000. Making it preferable to use the index. However, in 3.7.16, the penalty for the external sort is (3*nRow*log10(nRow)) and there is no discount for using a covering index (instead, there would be another penalty if the index were not a covering index). For a total cost of roughly 612000. So this version of SQLite does a full table scan. None of this jumps out as obviously incorrect. In practice, how much slower is 3.7.16 at running the query above? What does: SELECT count(*) FROM message WHERE tag IN () AND flag=1; return? Is it close to the 38250 that SQLite is using as an estimate when planning the query? Thanks, Dan. Thanks! Selen ____________________ From: Dan Kennedy To: sqlite-users@sqlite.org Sent: Wednesday, January 16, 2013 12:05 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 05:13 PM, Selen Schabenberger wrote: I attach a small database where it is possible to reproduce the issue. I deleted all irrelevant tables and all the tuples in the Message table to keep the file size small but had run ANALYZE before doing that. Mailing list does not allow attachments. Can you either upload the db somewhere, or include the output of ".dump" in the body of the message if it is small enough? Thanks. This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ) AND flag=1 ORDER BY id LIMIT 200; I get this result: selectId order from detail 0 0 0 SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 Hope someone can help. - Selen ____ From: Selen Schabenberger To: General Discussion of SQLite Database; Richard Hipp Sent: Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator Hi Richard, I tested the whole scenario one more time with the new SQLite version. As you suggested I put a plus sign in front of the Flag column and that really made the query much faster by using the multi column index (Tag, Flag, Id) instead of the primary inde
Re: [sqlite] Multi-column index is not used with IN operator
Below is the output of the dump. If it does not help reproduce the error, then I can try to share the original database file itself. PRAGMA foreign_keys=OFF; BEGIN TRANSACTION; CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); ANALYZE sqlite_master; INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); COMMIT; Thanks! Selen ________________ From: Dan Kennedy To: sqlite-users@sqlite.org Sent: Wednesday, January 16, 2013 12:05 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 05:13 PM, Selen Schabenberger wrote: > I attach a small database where it is possible to reproduce the > issue. I deleted all irrelevant tables and all the tuples in the > Message table to keep the file size small but had run ANALYZE before > doing that. Mailing list does not allow attachments. Can you either upload the db somewhere, or include the output of ".dump" in the body of the message if it is small enough? Thanks. > > This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN > > SELECT * FROM message WHERE tag IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > > ) AND flag=1 > ORDER BY id LIMIT 200; > > I get this result: selectId order from detail > > > 0 0 0 SCAN TABLE Message USING INTEGER > PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE > LIST SUBQUERY 1 > > > Hope someone can help. > > > - Selen > > > > From: Selen > Schabenberger To: General Discussion of SQLite > Database; Richard Hipp Sent: > Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] > Multi-column index is not used with IN operator > > Hi Richard, I tested the whole scenario one more time with the new > SQLite version. As you suggested I put a plus sign in front of the > Flag column and that really made the query much faster by using the > multi column index (Tag, Flag, Id) instead of the primary index on > the Id column. However what I don't get is, I actually had removed > that single column index on the Flag before and run ANALZE. How come > the query optimizer makes another decision when I put a + in front of > a column which is not indexed alone? Is there another way to improve > this query, other than using the + sign? I would really appreciate > any suggestions. Happy new year! Regards,Selen > > --- On Fri, 12/14/12, Richard Hipp wrote: > > From: Richard Hipp Subject: Re: [sqlite] Multi-column > index is not used with IN operator To: "Selen > Schabenberger", "General Discussion of SQLite > Database" Date: Friday, December 14, 2012, > 3:09 PM > > > > On Thu, Dec 13, 2012 at 10:06 AM, Selen > Schabenberger wrote: > > > Hi All, > > > > I am observing some strange behaviour on my database when I execute a > query with an IN operator having more than "22" expressions. My table > structure looks basically as follows: > > > > CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" > INTEGER NOT NULL, "Flag" INTEGER ) > > > > > > I have a multi-column index on (Tag, Flag, Id) as well as a single > column index on the Flag column. > > My guess is that the single-column index on Flag is misleading the > query optimizer. You c
Re: [sqlite] Multi-column index is not used with IN operator
On 01/16/2013 05:13 PM, Selen Schabenberger wrote: I attach a small database where it is possible to reproduce the issue. I deleted all irrelevant tables and all the tuples in the Message table to keep the file size small but had run ANALYZE before doing that. Mailing list does not allow attachments. Can you either upload the db somewhere, or include the output of ".dump" in the body of the message if it is small enough? Thanks. This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ) AND flag=1 ORDER BY id LIMIT 200; I get this result: selectId order from detail 0 0 0 SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 Hope someone can help. - Selen From: Selen Schabenberger To: General Discussion of SQLite Database; Richard Hipp Sent: Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator Hi Richard, I tested the whole scenario one more time with the new SQLite version. As you suggested I put a plus sign in front of the Flag column and that really made the query much faster by using the multi column index (Tag, Flag, Id) instead of the primary index on the Id column. However what I don't get is, I actually had removed that single column index on the Flag before and run ANALZE. How come the query optimizer makes another decision when I put a + in front of a column which is not indexed alone? Is there another way to improve this query, other than using the + sign? I would really appreciate any suggestions. Happy new year! Regards,Selen --- On Fri, 12/14/12, Richard Hipp wrote: From: Richard Hipp Subject: Re: [sqlite] Multi-column index is not used with IN operator To: "Selen Schabenberger", "General Discussion of SQLite Database" Date: Friday, December 14, 2012, 3:09 PM On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
I attach a small database where it is possible to reproduce the issue. I deleted all irrelevant tables and all the tuples in the Message table to keep the file size small but had run ANALYZE before doing that. This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN SELECT * FROM message WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 ) AND flag=1 ORDER BY id LIMIT 200; I get this result: selectId order from detail 0 0 0 SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 Hope someone can help. - Selen From: Selen Schabenberger To: General Discussion of SQLite Database ; Richard Hipp Sent: Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator Hi Richard, I tested the whole scenario one more time with the new SQLite version. As you suggested I put a plus sign in front of the Flag column and that really made the query much faster by using the multi column index (Tag, Flag, Id) instead of the primary index on the Id column. However what I don't get is, I actually had removed that single column index on the Flag before and run ANALZE. How come the query optimizer makes another decision when I put a + in front of a column which is not indexed alone? Is there another way to improve this query, other than using the + sign? I would really appreciate any suggestions. Happy new year! Regards,Selen --- On Fri, 12/14/12, Richard Hipp wrote: From: Richard Hipp Subject: Re: [sqlite] Multi-column index is not used with IN operator To: "Selen Schabenberger" , "General Discussion of SQLite Database" Date: Friday, December 14, 2012, 3:09 PM On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
Hi Richard, I tested the whole scenario one more time with the new SQLite version. As you suggested I put a plus sign in front of the Flag column and that really made the query much faster by using the multi column index (Tag, Flag, Id) instead of the primary index on the Id column. However what I don't get is, I actually had removed that single column index on the Flag before and run ANALZE. How come the query optimizer makes another decision when I put a + in front of a column which is not indexed alone? Is there another way to improve this query, other than using the + sign? I would really appreciate any suggestions. Happy new year! Regards,Selen --- On Fri, 12/14/12, Richard Hipp wrote: From: Richard Hipp Subject: Re: [sqlite] Multi-column index is not used with IN operator To: "Selen Schabenberger" , "General Discussion of SQLite Database" Date: Friday, December 14, 2012, 3:09 PM On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
Actually I'm already running ANALYZE with the SQLITE_ENABLE_STAT3 enabled. By the way I observe different behaviour with different SQLite versions. Version 3.6.19 (testing with Sqliteman 1.2.2) is selecting the multi-column index, whereas the v3.7 primary index. - Selen From: Hick Gunter To: 'Selen Schabenberger' ; 'General Discussion of SQLite Database' Sent: Friday, December 14, 2012 3:26 PM Subject: AW: [sqlite] Multi-column index is not used with IN operator From what I understand, SQLite performs query planning based on cost estimates. A full table scan costs less per record than an index lookup. 1) (scan cost) * (IN Lookup) is compared with 2) (IN size) * (key lookup) * (sort factor) IN Lookup cost is logarithmic; IN size is linear. The number 22 appears to be where a plot of the costs cross over. ANALYZE will improve SQLite's cost estimates. Gunter -Ursprüngliche Nachricht- Von: Selen Schabenberger [mailto:selen_oz...@yahoo.com] Gesendet: Donnerstag, 13. Dezember 2012 16:07 An: sqlite-users@sqlite.org Betreff: [sqlite] Multi-column index is not used with IN operator Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. If I execute the following query on this table, the Messages table is scanned using the primary key and the results are returned in 20 seconds. > explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, > 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, > 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, > 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, > 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, > 1167663, 1167667, 1167671, 1167675 ) and Flag=1) order by Messages.Id limit > 0, 100 >> Scan table Messages using integer primary key >> Execute list subquery1 However if I reduce the number of expressions in the IN operator, the index on the (Tag, Flag, Id) is used. > explain query plan Select Messages.MessageId from Messages where (AspTag in ( > 1146883, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, > 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, > 1148016, 1148018, 1148020, 1148022, 1148040, 1148042) and PduFlag=1) order > by MessageId limit 0, 100 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Execute list subquery1 >> Use temp b-tree for order by If I rewrite the first query in multiple IN operators connecting with OR then I can again use my multi-column index and it is much more efficient than the full table scan. > explain query plan Select Messages.Id from Messages where (Tag in ( > 1148138,1148191, 1148232, 1148234, 1167643, 1167659,1146883, 1146884, > 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, > 1147912, 1147914, 1147968, 1147970, 1147976, 1147978 ) or Tag in (1148012, > 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, > 1148136, 1167660, 1167663, 1167667, 1167671, 1167675) ) and Flag=1 order by > Messages.Id limit 0, 100 >> Execute list subquery1 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Execute list subquery1 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Use temp b-tree for order by Does anybody have any idea why it behaves so? Are there any restrictions with the IN operator? Regards, -Selen ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Gunter Hick Software Engineer Scientific Games International GmbH Klitschgasse 2 – 4, A - 1130 Vienna, Austria FN 157284 a, HG Wien Tel: +43 1 80100 0 E-Mail: h...@scigames.at This e-mail is confidential and may well also be legally privileged. If you have received it in error, you are on notice as to its status and accordingly please notify us immediately by reply e-mail and then delete this message from your system. Please do not copy it or use it for any purposes, or disclose its contents to any person as to do so could be a breach of confidence. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
>From what I understand, SQLite performs query planning based on cost estimates. A full table scan costs less per record than an index lookup. 1) (scan cost) * (IN Lookup) is compared with 2) (IN size) * (key lookup) * (sort factor) IN Lookup cost is logarithmic; IN size is linear. The number 22 appears to be where a plot of the costs cross over. ANALYZE will improve SQLite's cost estimates. Gunter -Ursprüngliche Nachricht- Von: Selen Schabenberger [mailto:selen_oz...@yahoo.com] Gesendet: Donnerstag, 13. Dezember 2012 16:07 An: sqlite-users@sqlite.org Betreff: [sqlite] Multi-column index is not used with IN operator Hi All, I am observing some strange behaviour on my database when I execute a query with an IN operator having more than "22" expressions. My table structure looks basically as follows: CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT NULL, "Flag" INTEGER ) I have a multi-column index on (Tag, Flag, Id) as well as a single column index on the Flag column. If I execute the following query on this table, the Messages table is scanned using the primary key and the results are returned in 20 seconds. > explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, > 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, > 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, > 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, > 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, > 1167663, 1167667, 1167671, 1167675 ) and Flag=1) order by Messages.Id limit > 0, 100 >> Scan table Messages using integer primary key >> Execute list subquery1 However if I reduce the number of expressions in the IN operator, the index on the (Tag, Flag, Id) is used. > explain query plan Select Messages.MessageId from Messages where (AspTag in ( > 1146883, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, > 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, > 1148016, 1148018, 1148020, 1148022, 1148040, 1148042) and PduFlag=1) order > by MessageId limit 0, 100 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Execute list subquery1 >> Use temp b-tree for order by If I rewrite the first query in multiple IN operators connecting with OR then I can again use my multi-column index and it is much more efficient than the full table scan. > explain query plan Select Messages.Id from Messages where (Tag in ( > 1148138,1148191, 1148232, 1148234, 1167643, 1167659,1146883, 1146884, > 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, > 1147912, 1147914, 1147968, 1147970, 1147976, 1147978 ) or Tag in (1148012, > 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, > 1148136, 1167660, 1167663, 1167667, 1167671, 1167675) ) and Flag=1 order by > Messages.Id limit 0, 100 >> Execute list subquery1 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Execute list subquery1 >> Search table Messages using covering index IDX_TAG_FLAG_ID >> Use temp b-tree for order by Does anybody have any idea why it behaves so? Are there any restrictions with the IN operator? Regards, -Selen ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- Gunter Hick Software Engineer Scientific Games International GmbH Klitschgasse 2 – 4, A - 1130 Vienna, Austria FN 157284 a, HG Wien Tel: +43 1 80100 0 E-Mail: h...@scigames.at This e-mail is confidential and may well also be legally privileged. If you have received it in error, you are on notice as to its status and accordingly please notify us immediately by reply e-mail and then delete this message from your system. Please do not copy it or use it for any purposes, or disclose its contents to any person as to do so could be a breach of confidence. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
Reordering the index worked but actually I need the index in (Tag, Flag, Id) order. The Flag column has a very low cardinality, and my queries are mostly filtering Tag & Flag or the Tag alone. When I drop the index on the Flag, and leave the index on (Tag, Flag, Id), the query planner still chooses the primary key on the Id column. -Selen From: Richard Hipp To: Selen Schabenberger ; General Discussion of SQLite Database Sent: Friday, December 14, 2012 3:09 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: Hi All, > >I am observing some strange behaviour on my database when I execute a query >with an IN operator having more than "22" expressions. My table structure >looks basically as follows: > >CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT >NULL, "Flag" INTEGER ) > > >I have a multi-column index on (Tag, Flag, Id) as well as a single column >index on the Flag column. > My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." -- D. Richard Hipp d...@sqlite.org From: Simon Slavin To: Selen Schabenberger ; General Discussion of SQLite Database Sent: Friday, December 14, 2012 2:11 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 13 Dec 2012, at 3:06pm, Selen Schabenberger wrote: >> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, >> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, >> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, >> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, >> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, >> 1167663, 1167667, 1167671, 1167675 ) and Flag=1) order by Messages.Id limit >> 0, 100 Just out of interest, that form where you go ... WHERE (Tag in (a,b,c) AND Flag=1) ... Can you try a version where it uses instead ... WHERE Tag in (a,b,c) AND Flag=1 ... or ... WHERE (Tag in (a,b,c)) AND (Flag=1) ... I'm trying to predict how the parsing works here but I don't know whether I figured it out. Another thing to try is to reverse the order of your index > I have a multi-column index on (Tag, Flag, Id) as well as a single column > index on the Flag column. Can you add a new one, or change your existing one, to (Flag, Tag, Id) ? > Are there any restrictions with the IN operator? Nothing that you got anywhere close to. It's meant to be able to handle text and lists far longer than that. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger wrote: > Hi All, > > I am observing some strange behaviour on my database when I execute a > query with an IN operator having more than "22" expressions. My table > structure looks basically as follows: > > CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER > NOT NULL, "Flag" INTEGER ) > > > I have a multi-column index on (Tag, Flag, Id) as well as a single column > index on the Flag column. > My guess is that the single-column index on Flag is misleading the query optimizer. You can probably fix this by either (1) running ANALYZE or (2) adding a "+" in front of the "Flag" column name in the WHERE clause of your query, like this: "... +Flag=1 ..." -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Multi-column index is not used with IN operator
On 13 Dec 2012, at 3:06pm, Selen Schabenberger wrote: >> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, >> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, >> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, >> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, >> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, >> 1167663, 1167667, 1167671, 1167675 ) and Flag=1) order by Messages.Id limit >> 0, 100 Just out of interest, that form where you go ... WHERE (Tag in (a,b,c) AND Flag=1) ... Can you try a version where it uses instead ... WHERE Tag in (a,b,c) AND Flag=1 ... or ... WHERE (Tag in (a,b,c)) AND (Flag=1) ... I'm trying to predict how the parsing works here but I don't know whether I figured it out. Another thing to try is to reverse the order of your index > I have a multi-column index on (Tag, Flag, Id) as well as a single column > index on the Flag column. Can you add a new one, or change your existing one, to (Flag, Tag, Id) ? > Are there any restrictions with the IN operator? Nothing that you got anywhere close to. It's meant to be able to handle text and lists far longer than that. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users