The entry for the Tag index in sqlite_stat1:
460132 1289 The results are returned in 163 ms on the network share. - Selen ________________________________ From: Dan Kennedy <danielk1...@gmail.com> To: General Discussion of SQLite Database <sqlite-users@sqlite.org> Sent: Wednesday, January 16, 2013 2:56 PM Subject: Re: [sqlite] Multi-column index is not used with IN operator On 01/16/2013 08:48 PM, Selen Schabenberger wrote: >>However, in 3.7.16, the penalty for the external sort is >>(3*nRow*log10(nRow)) and there is no discount for using a >>covering index (instead, there would be another penalty if the >>index were not a covering index). For a total cost of roughly >>612000. So this version of SQLite does a full table scan. > > If the index consists of only the Tag, then this index is used. When you create the index on just the Tag column, what does the corresponding sqlite_stat1 entry look like? Dan. You > said, there would be a penalty if the index was not a covering index. > But seems like it is a discount for 3.7.15.2? > > - Selen > > ------------------------------------------------------------------------ > *From:* Dan Kennedy <danielk1...@gmail.com> > *To:* sqlite-users@sqlite.org > *Sent:* Wednesday, January 16, 2013 1:12 PM > *Subject:* Re: [sqlite] Multi-column index is not used with IN operator > > On 01/16/2013 06:25 PM, Selen Schabenberger wrote: > > Below is the output of the dump. If it does not help reproduce the > error, then I can try to share the original database file itself. > > > > PRAGMA foreign_keys=OFF; > > BEGIN TRANSACTION; > > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' > INTEGER NOT NULL, 'Flag' INTEGER NOT NULL ); > > ANALYZE sqlite_master; > > INSERT INTO "sqlite_stat1" > VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1'); > > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id'); > > COMMIT; > > > > Got it this time. Considering this one: > > SELECT * FROM message > WHERE tag IN > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) > > AND flag=1 > ORDER BY id LIMIT 200; > > Looks like the query planner now assigns a higher cost to > external sorts (ORDER BY clauses that cannot use indexes) than it > did in 3.6.23.1. In both cases, SQLite assumes that scanning the > full table implies visiting 460132 rows. In both versions, this > plan is assigned a cost of 460132. > > If there are 30 elements in the IN(...) set, SQLite assumes that > scanning the index requires visiting (30*1275)=38250 rows. > Basic cost of 38250, plus some insignificant amount for the 30 > seek operations required. > > In version 3.6.23.1, the penalty for the external sort is > (nRow*log10(nRow)), where nRow is the number of rows to sort (in > this case 38250). SQLite rounds up the log10() expression to 5, > so the penalty is roughly 191250. Total cost of 229500. It then > gets a 50% discount for using a covering index, so the overall > cost is roughly 115000. Making it preferable to use the index. > > However, in 3.7.16, the penalty for the external sort is > (3*nRow*log10(nRow)) and there is no discount for using a > covering index (instead, there would be another penalty if the > index were not a covering index). For a total cost of roughly > 612000. So this version of SQLite does a full table scan. > > None of this jumps out as obviously incorrect. In practice, how > much slower is 3.7.16 at running the query above? > > What does: > > SELECT count(*) FROM message WHERE tag IN (....) AND flag=1; > > return? Is it close to the 38250 that SQLite is using as an > estimate when planning the query? > > Thanks, > Dan. > > > > > > > > Thanks! > > Selen > > > > ________________________________ > > From: Dan Kennedy<danielk1...@gmail.com <mailto:danielk1...@gmail.com>> > > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > > Sent: Wednesday, January 16, 2013 12:05 PM > > Subject: Re: [sqlite] Multi-column index is not used with IN operator > > > > On 01/16/2013 05:13 PM, Selen Schabenberger wrote: > >> I attach a small database where it is possible to reproduce the > >> issue. I deleted all irrelevant tables and all the tuples in the > >> Message table to keep the file size small but had run ANALYZE before > >> doing that. > > > > Mailing list does not allow attachments. Can you either upload the > > db somewhere, or include the output of ".dump" in the body of the > > message if it is small enough? Thanks. > > > > > > > > > > > > >> > >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN > >> > >> SELECT * FROM message WHERE tag IN > >> > (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 > >> > >> > > ) AND flag=1 > >> ORDER BY id LIMIT 200; > >> > >> I get this result: selectId order from detail > >> > >> > >> 0 0 0 SCAN TABLE Message USING INTEGER > >> PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE > >> LIST SUBQUERY 1 > >> > >> > >> Hope someone can help. > >> > >> > >> - Selen > >> > >> > >> > >> ________________________________ From: Selen > >> Schabenberger<selen_oz...@yahoo.com <mailto:selen_oz...@yahoo.com>> > To: General Discussion of SQLite > >> Database<sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>>; > Richard Hipp<d...@sqlite.org <mailto:d...@sqlite.org>> Sent: > >> Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite] > >> Multi-column index is not used with IN operator > >> > >> Hi Richard, I tested the whole scenario one more time with the new > >> SQLite version. As you suggested I put a plus sign in front of the > >> Flag column and that really made the query much faster by using the > >> multi column index (Tag, Flag, Id) instead of the primary index on > >> the Id column. However what I don't get is, I actually had removed > >> that single column index on the Flag before and run ANALZE. How come > >> the query optimizer makes another decision when I put a + in front of > >> a column which is not indexed alone? Is there another way to improve > >> this query, other than using the + sign? I would really appreciate > >> any suggestions. Happy new year! Regards,Selen > >> > >> --- On Fri, 12/14/12, Richard Hipp<d...@sqlite.org > <mailto:d...@sqlite.org>> wrote: > >> > >> From: Richard Hipp<d...@sqlite.org <mailto:d...@sqlite.org>> Subject: > Re: [sqlite] Multi-column > >> index is not used with IN operator To: "Selen > >> Schabenberger"<selen_oz...@yahoo.com > <mailto:selen_oz...@yahoo.com>>, "General Discussion of SQLite > >> Database"<sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>> > Date: Friday, December 14, 2012, > >> 3:09 PM > >> > >> > >> > >> On Thu, Dec 13, 2012 at 10:06 AM, Selen > >> Schabenberger<selen_oz...@yahoo.com <mailto:selen_oz...@yahoo.com>> > wrote: > >> > >> > >> Hi All, > >> > >> > >> > >> I am observing some strange behaviour on my database when I execute a > >> query with an IN operator having more than "22" expressions. My table > >> structure looks basically as follows: > >> > >> > >> > >> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" > >> INTEGER NOT NULL, "Flag" INTEGER ) > >> > >> > >> > >> > >> > >> I have a multi-column index on (Tag, Flag, Id) as well as a single > >> column index on the Flag column. > >> > >> My guess is that the single-column index on Flag is misleading the > >> query optimizer. You can probably fix this by either (1) running > >> ANALYZE or (2) adding a "+" in front of the "Flag" column name in the > >> WHERE clause of your query, like this: "... +Flag=1 ..." > >> > >> > >> > >> > >> > >> > >> _______________________________________________ sqlite-users mailing > >> list sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > _______________________________________________ > > sqlite-users mailing list > > sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users