On 01/16/2013 08:48 PM, Selen Schabenberger wrote:
However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

If the index consists of only the Tag, then this index is used.

When you create the index on just the Tag column, what does the
corresponding sqlite_stat1 entry look like?

Dan.




 You
said, there would be a penalty if the index was not a covering index.
But seems like it is a discount for 3.7.15.2?

- Selen

------------------------------------------------------------------------
*From:* Dan Kennedy <danielk1...@gmail.com>
*To:* sqlite-users@sqlite.org
*Sent:* Wednesday, January 16, 2013 1:12 PM
*Subject:* Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
 > Below is the output of the dump. If it does not help reproduce the
error, then I can try to share the original database file itself.
 >
 > PRAGMA foreign_keys=OFF;
 > BEGIN TRANSACTION;
 > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag'
INTEGER NOT NULL, 'Flag' INTEGER NOT NULL );
 > ANALYZE sqlite_master;
 > INSERT INTO "sqlite_stat1"
VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
 > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
 > COMMIT;
 >

Got it this time. Considering this one:

SELECT * FROM message
WHERE tag IN
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)

AND flag=1
ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

SELECT count(*) FROM message WHERE tag IN (....) AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.




 >
 > Thanks!
 > Selen
 >
 > ________________________________
 > From: Dan Kennedy<danielk1...@gmail.com <mailto:danielk1...@gmail.com>>
 > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 > Sent: Wednesday, January 16, 2013 12:05 PM
 > Subject: Re: [sqlite] Multi-column index is not used with IN operator
 >
 > On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
 >> I attach a small database where it is possible to reproduce the
 >> issue. I deleted all irrelevant tables and all the tuples in the
 >> Message table to keep the file size small but had run ANALYZE before
 >> doing that.
 >
 > Mailing list does not allow attachments. Can you either upload the
 > db somewhere, or include the output of ".dump" in the body of the
 > message if it is small enough? Thanks.
 >



 >
 >
 >
 >>
 >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
 >>
 >> SELECT * FROM message WHERE tag IN
 >>
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 > ) AND flag=1
 >> ORDER BY id LIMIT 200;
 >>
 >> I get this result: selectId order from detail
 >>
 >>
 >> 0 0 0 SCAN TABLE Message USING INTEGER
 >> PRIMARY KEY (~4601 rows) 0 0 0 EXECUTE
 >> LIST SUBQUERY 1
 >>
 >>
 >> Hope someone can help.
 >>
 >>
 >> - Selen
 >>
 >>
 >>
 >> ________________________________ From: Selen
 >> Schabenberger<selen_oz...@yahoo.com <mailto:selen_oz...@yahoo.com>>
To: General Discussion of SQLite
 >> Database<sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>>;
Richard Hipp<d...@sqlite.org <mailto:d...@sqlite.org>> Sent:
 >> Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite]
 >> Multi-column index is not used with IN operator
 >>
 >> Hi Richard, I tested the whole scenario one more time with the new
 >> SQLite version. As you suggested I put a plus sign in front of the
 >> Flag column and that really made the query much faster by using the
 >> multi column index (Tag, Flag, Id) instead of the primary index on
 >> the Id column. However what I don't get is, I actually had removed
 >> that single column index on the Flag before and run ANALZE. How come
 >> the query optimizer makes another decision when I put a + in front of
 >> a column which is not indexed alone? Is there another way to improve
 >> this query, other than using the + sign? I would really appreciate
 >> any suggestions. Happy new year! Regards,Selen
 >>
 >> --- On Fri, 12/14/12, Richard Hipp<d...@sqlite.org
<mailto:d...@sqlite.org>> wrote:
 >>
 >> From: Richard Hipp<d...@sqlite.org <mailto:d...@sqlite.org>> Subject:
Re: [sqlite] Multi-column
 >> index is not used with IN operator To: "Selen
 >> Schabenberger"<selen_oz...@yahoo.com
<mailto:selen_oz...@yahoo.com>>, "General Discussion of SQLite
 >> Database"<sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>>
Date: Friday, December 14, 2012,
 >> 3:09 PM
 >>
 >>
 >>
 >> On Thu, Dec 13, 2012 at 10:06 AM, Selen
 >> Schabenberger<selen_oz...@yahoo.com <mailto:selen_oz...@yahoo.com>>
wrote:
 >>
 >>
 >> Hi All,
 >>
 >>
 >>
 >> I am observing some strange behaviour on my database when I execute a
 >> query with an IN operator having more than "22" expressions. My table
 >> structure looks basically as follows:
 >>
 >>
 >>
 >> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag"
 >> INTEGER NOT NULL, "Flag" INTEGER )
 >>
 >>
 >>
 >>
 >>
 >> I have a multi-column index on (Tag, Flag, Id) as well as a single
 >> column index on the Flag column.
 >>
 >> My guess is that the single-column index on Flag is misleading the
 >> query optimizer. You can probably fix this by either (1) running
 >> ANALYZE or (2) adding a "+" in front of the "Flag" column name in the
 >> WHERE clause of your query, like this: "... +Flag=1 ..."
 >>
 >>
 >>
 >>
 >>
 >>
 >> _______________________________________________ sqlite-users mailing
 >> list sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 >
 > _______________________________________________
 > sqlite-users mailing list
 > sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 > _______________________________________________
 > sqlite-users mailing list
 > sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 >

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to