My original query looks like the one below. The Message table has 10 columns in total, one of which is a BLOB.

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888,
1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914,
1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018,
1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191,
1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671,
1167675 ) ) and Flag=1) limit 200

If the database file is on the network share (which is most likely the
case), then it takes ~22000 ms (with the index 300ms!) to return the
results. On the local drive it is ~300 ms (with the index 10 ms). There are
in total 101 rows that match the where clause.

Thanks!
Selen

________________________________
From: Dan Kennedy <danielk1...@gmail.com>
To: sqlite-users@sqlite.org
Sent: Wednesday, January 16, 2013 1:12 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
> Below is the output of the dump. If it does not help reproduce the error,
> then I can try to share the original database file itself.
>
> PRAGMA foreign_keys=OFF;
> BEGIN TRANSACTION;
> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT NULL, 'Flag' INTEGER NOT NULL );
> ANALYZE sqlite_master;
> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
> COMMIT;
>

Got it this time. Considering this one:

  SELECT * FROM message WHERE tag IN
  (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)
  AND flag=1 ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to external sorts
(ORDER BY clauses that cannot use indexes) than it did in 3.6.23.1.

In both cases, SQLite assumes that scanning the full table implies visiting
460132 rows. In both versions, this plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that scanning
the index requires visiting (30*1275)=38250 rows. Basic cost of 38250, plus
some insignificant amount for the 30 seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in this case
38250). SQLite rounds up the log10() expression to 5, so the penalty is
roughly 191250. Total cost of 229500. It then gets a 50% discount for using
a covering index, so the overall cost is roughly 115000. Making it
preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a covering index
(instead, there would be another penalty if the index were not a covering
index). For a total cost of roughly 612000. So this version of SQLite does
a full table scan.

None of this jumps out as obviously incorrect. In practice, how much slower
is 3.7.16 at running the query above? What does:

  SELECT count(*) FROM message WHERE tag IN (....) AND flag=1;

return? Is it close to the 38250 that SQLite is using as an estimate when
planning the query?

Thanks,
Dan.

>
> Thanks!
> Selen
>
> ________________________________
> From: Dan Kennedy<danielk1...@gmail.com>
> To: sqlite-users@sqlite.org
> Sent: Wednesday, January 16, 2013 12:05 PM
> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>
> On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
>> I attach a small database where it is possible to reproduce the
>> issue. I deleted all irrelevant tables and all the tuples in the
>> Message table to keep the file size small but had run ANALYZE before
>> doing that.
>
> Mailing list does not allow attachments. Can you either upload the
> db somewhere, or include the output of ".dump" in the body of the
> message if it is small enough? Thanks.
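
The cost figures in Dan's reply can be re-derived with a few lines of Python. This is only a sketch of the arithmetic quoted in the message, not SQLite's actual planner code: the function name and its parameters are made up for illustration, and the "insignificant" cost of the 30 seeks is ignored.

```python
import math

# Figures taken from the sqlite_stat1 row in the dump: '460132 1289 1275 1'.
TABLE_ROWS = 460132        # rows in Message; cost of the full table scan
ROWS_PER_TAG_FLAG = 1275   # average rows per distinct (Tag, Flag) pair


def index_plan_cost(n_in, sort_factor, covering_discount):
    """Hypothetical helper: estimated cost of the index plan plus external sort.

    n_in              -- number of elements in the IN(...) list
    sort_factor       -- 1 for 3.6.23.1, 3 for 3.7.16 (per the message)
    covering_discount -- 0.5 for 3.6.23.1, 1.0 (no discount) for 3.7.16
    """
    n_row = n_in * ROWS_PER_TAG_FLAG
    # log10(nRow) is rounded up, as described in the message.
    sort_penalty = sort_factor * n_row * math.ceil(math.log10(n_row))
    return (n_row + sort_penalty) * covering_discount


cost_3_6_23_1 = index_plan_cost(30, sort_factor=1, covering_discount=0.5)
cost_3_7_16 = index_plan_cost(30, sort_factor=3, covering_discount=1.0)

for version, cost in (("3.6.23.1", cost_3_6_23_1), ("3.7.16", cost_3_7_16)):
    choice = "index" if cost < TABLE_ROWS else "full table scan"
    print(f"{version}: index plan cost {cost:.0f} "
          f"vs table scan {TABLE_ROWS} -> {choice}")
```

The full table scan walks the INTEGER PRIMARY KEY, so it already satisfies ORDER BY id and carries no sort penalty. That is why the tripled sort penalty alone is enough to flip the decision.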
>>
>> This is the query to reproduce with 3.7.15.2:
>>
>> EXPLAIN QUERY PLAN
>> SELECT * FROM message WHERE tag IN
>> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ) AND flag=1
>> ORDER BY id LIMIT 200;
>>
>> I get this result:
>>
>> selectId  order  from  detail
>> 0         0      0     SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows)
>> 0         0      0     EXECUTE LIST SUBQUERY 1
>>
>> Hope someone can help.
>>
>> - Selen
>>
>> ________________________________
>> From: Selen Schabenberger<selen_oz...@yahoo.com>
>> To: General Discussion of SQLite Database<sqlite-users@sqlite.org>; Richard Hipp<d...@sqlite.org>
>> Sent: Wednesday, January 2, 2013 12:22 PM
>> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>>
>> Hi Richard,
>>
>> I tested the whole scenario one more time with the new SQLite version.
>> As you suggested, I put a plus sign in front of the Flag column, and
>> that really made the query much faster by using the multi-column index
>> (Tag, Flag, Id) instead of the primary index on the Id column. However,
>> what I don't get is this: I had actually removed the single-column
>> index on Flag before and run ANALYZE. How come the query optimizer
>> makes a different decision when I put a + in front of a column which
>> is not indexed on its own? Is there another way to improve this query,
>> other than using the + sign? I would really appreciate any suggestions.
>>
>> Happy new year!
>>
>> Regards,
>> Selen
>>
>> --- On Fri, 12/14/12, Richard Hipp<d...@sqlite.org> wrote:
>>
>> From: Richard Hipp<d...@sqlite.org>
>> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>> To: "Selen Schabenberger"<selen_oz...@yahoo.com>, "General Discussion of SQLite Database"<sqlite-users@sqlite.org>
>> Date: Friday, December 14, 2012, 3:09 PM
>>
>> On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger<selen_oz...@yahoo.com> wrote:
>>
>> Hi All,
>>
>> I am observing some strange behaviour on my database when I execute a
>> query with an IN operator having more than "22" expressions. My table
>> structure looks basically as follows:
>>
>> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag"
>> INTEGER NOT NULL, "Flag" INTEGER )
>>
>> I have a multi-column index on (Tag, Flag, Id) as well as a single
>> column index on the Flag column.
>>
>> My guess is that the single-column index on Flag is misleading the
>> query optimizer. You can probably fix this by either (1) running
>> ANALYZE or (2) adding a "+" in front of the "Flag" column name in the
>> WHERE clause of your query, like this: "... +Flag=1 ..."
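
For anyone who wants to try the "+" suggestion, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the simplified three-column schema from the dump above and fills it with made-up rows, so the timings and the plans it prints will not match the original database. The plans also depend on the SQLite version linked into Python, so no particular plan is claimed here. What the sketch does show is that the unary "+" only changes how the planner may use the term, never which rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Message (Id   INTEGER PRIMARY KEY NOT NULL,
                          Tag  INTEGER NOT NULL,
                          Flag INTEGER NOT NULL);
    CREATE INDEX IDX_MSGS_TAG_FLAG_ID ON Message (Tag, Flag, Id);
""")
# Made-up sample data, purely for illustration.
conn.executemany("INSERT INTO Message (Tag, Flag) VALUES (?, ?)",
                 [(i % 50, i % 2) for i in range(1000)])
conn.execute("ANALYZE")
conn.commit()

tags = ", ".join(str(t) for t in range(1, 31))
plain = (f"SELECT * FROM Message WHERE Tag IN ({tags}) AND Flag=1 "
         "ORDER BY Id LIMIT 200")
# Unary plus keeps the Flag term from being used as an index constraint.
plus = plain.replace("Flag=1", "+Flag=1")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


rows_plain = conn.execute(plain).fetchall()
rows_plus = conn.execute(plus).fetchall()

print("plain :", plan(plain))
print("+Flag :", plan(plus))
print("same rows:", rows_plain == rows_plus)
```

Comparing the two printed plans on the real database is the quickest way to see whether the "+" still changes the planner's choice on a given SQLite version.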
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users