My original query looks like the one below. The Message table has 10 columns 
in total, one of which is a BLOB. 

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 
1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 
1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 
1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 
1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 
) ) and Flag=1)  limit 200


If the database file is on a network share (which is most likely the case), 
the query takes ~22000 ms to return its results (with the index, 300 ms!). On a 
local drive it takes ~300 ms (with the index, 10 ms). In total, 101 rows match 
the WHERE clause.

Thanks!
Selen



________________________________
 From: Dan Kennedy <danielk1...@gmail.com>
To: sqlite-users@sqlite.org 
Sent: Wednesday, January 16, 2013 1:12 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
> Below is the output of the dump. If it does not help reproduce the error, 
> then I can try to share the original database file itself.
>
> PRAGMA foreign_keys=OFF;
> BEGIN TRANSACTION;
> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
> NULL, 'Flag' INTEGER NOT NULL );
> ANALYZE sqlite_master;
> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 
> 1289 1275 1');
> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
> COMMIT;
>

Got it this time. Considering this one:

   SELECT * FROM message
   WHERE tag IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,
                 21,22,23,24,25,26,27,28,29,30)
   AND flag=1
   ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is 38250*5 = 191250. Adding the basic cost of 38250
gives a total cost of 229500. It then gets a 50% discount for using
a covering index, so the overall cost is roughly 115000, well below
the 460132 for the full table scan. That makes it preferable to use
the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). The penalty is therefore
3*38250*5 = 573750, for a total cost of roughly 612000, which
exceeds the 460132 for the full table scan. So this version of
SQLite does a full table scan.
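
The arithmetic in the two paragraphs above can be rechecked with a short
script. This is only a sketch of the cost figures as described in this
message, not SQLite's actual planner code; the function and constant names
are made up for illustration, and the only inputs are the numbers quoted
here (460132 table rows, 1275 rows per Tag/Flag pair, 30 IN elements).

```python
import math

TABLE_ROWS = 460132        # full-table-scan cost, from the sqlite_stat1 entry
ROWS_PER_TAG_FLAG = 1275   # estimated rows per (Tag, Flag) pair, same entry
IN_LIST_SIZE = 30          # number of elements in the IN(...) set


def index_rows(in_list_size=IN_LIST_SIZE, rows_per_key=ROWS_PER_TAG_FLAG):
    """Rows the planner expects to visit when using the index."""
    return in_list_size * rows_per_key


def sort_penalty(n_row, factor):
    """factor * nRow * log10(nRow), with log10() rounded up as described."""
    return factor * n_row * math.ceil(math.log10(n_row))


def cost_3_6_23_1(n_row):
    """Basic cost plus sort penalty, halved for the covering index."""
    return (n_row + sort_penalty(n_row, 1)) / 2


def cost_3_7_16(n_row):
    """Basic cost plus the tripled sort penalty, with no discount."""
    return n_row + sort_penalty(n_row, 3)


n = index_rows()
for label, cost in (("3.6.23.1", cost_3_6_23_1(n)), ("3.7.16", cost_3_7_16(n))):
    choice = "index" if cost < TABLE_ROWS else "full table scan"
    print(f"{label}: index cost {cost:.0f} vs scan {TABLE_ROWS} -> {choice}")
```

Changing IN_LIST_SIZE shows how the preferred plan under this simplified
model depends on the length of the IN list.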

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

   SELECT count(*) FROM message WHERE tag IN (....) AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?
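
One way to make that comparison is sketched below with Python's sqlite3
module. It uses the schema from the dump, but the rows are hypothetical
sample data (60 tags, two flag values, five rows each) because the original
table contents were not posted. The estimate is computed the same way as
above: IN-list size times the third number in the sqlite_stat1 entry. The
plan that gets printed depends on the SQLite version linked into Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Message (Id INTEGER PRIMARY KEY NOT NULL,
                          Tag INTEGER NOT NULL, Flag INTEGER NOT NULL);
    CREATE INDEX IDX_MSGS_TAG_FLAG_ID ON Message (Tag, Flag, Id);
""")

# Hypothetical sample data, standing in for the real Message rows.
sample = [(tag, flag) for tag in range(1, 61) for flag in (0, 1) for _ in range(5)]
conn.executemany("INSERT INTO Message (Tag, Flag) VALUES (?, ?)", sample)
conn.execute("ANALYZE")  # populates sqlite_stat1 from the sample data

tags = list(range(1, 31))
where = "Tag IN ({}) AND Flag=1".format(",".join("?" * len(tags)))

# Actual number of matching rows.
actual = conn.execute(
    "SELECT count(*) FROM Message WHERE " + where, tags).fetchone()[0]

# Planner-style estimate: IN-list size * average rows per (Tag, Flag) pair.
stat = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx='IDX_MSGS_TAG_FLAG_ID'").fetchone()[0]
rows_per_tag_flag = int(stat.split()[2])
estimate = len(tags) * rows_per_tag_flag

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Message WHERE " + where
    + " ORDER BY Id LIMIT 200", tags).fetchall()

print("actual:", actual, "estimate:", estimate)
for row in plan:
    print(row[-1])  # the 'detail' column of each plan step
```

Against the real database, only the count(*) query and the sqlite_stat1
lookup are needed; the sample data and the ANALYZE call are there just to
make the sketch self-contained.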

Thanks,
Dan.




>
> Thanks!
> Selen
>
> ________________________________
>   From: Dan Kennedy<danielk1...@gmail.com>
> To: sqlite-users@sqlite.org
> Sent: Wednesday, January 16, 2013 12:05 PM
> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>
> On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
>> I attach a small database where it is possible to reproduce the
>> issue. I deleted all irrelevant tables and all the tuples in the
>> Message table to keep the file size small but had run ANALYZE before
>> doing that.
>
> Mailing list does not allow attachments. Can you either upload the
> db somewhere, or include the output of ".dump" in the body of the
> message if it is small enough? Thanks.
>



>
>
>
>>
>> This is the query to reproduce with 3.7.15.2:
>>
>> EXPLAIN QUERY PLAN
>> SELECT * FROM message WHERE tag IN
>> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ) AND flag=1
>> ORDER BY id LIMIT 200;
>>
>> I get this result:
>>
>> selectId   order      from       detail
>> 0          0          0          SCAN TABLE Message USING INTEGER PRIMARY KEY (~4601 rows)
>> 0          0          0          EXECUTE LIST SUBQUERY 1
>>
>>
>> Hope someone can help.
>>
>>
>> - Selen
>>
>>
>>
>> ________________________________
>> From: Selen Schabenberger<selen_oz...@yahoo.com>
>> To: General Discussion of SQLite Database<sqlite-users@sqlite.org>; Richard Hipp<d...@sqlite.org>
>> Sent: Wednesday, January 2, 2013 12:22 PM
>> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>>
>> Hi Richard,
>>
>> I tested the whole scenario one more time with the new
>> SQLite version. As you suggested, I put a plus sign in front of the
>> Flag column, and that really made the query much faster by using the
>> multi-column index (Tag, Flag, Id) instead of the primary index on
>> the Id column. However, what I don't get is this: I had actually
>> removed the single-column index on Flag beforehand and run ANALYZE.
>> How come the query optimizer makes a different decision when I put
>> a + in front of a column which is not indexed on its own? Is there
>> another way to improve this query, other than using the + sign? I
>> would really appreciate any suggestions.
>>
>> Happy new year!
>>
>> Regards,
>> Selen
>>
>> --- On Fri, 12/14/12, Richard Hipp<d...@sqlite.org>   wrote:
>>
>> From: Richard Hipp<d...@sqlite.org>
>> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>> To: "Selen Schabenberger"<selen_oz...@yahoo.com>, "General Discussion of SQLite Database"<sqlite-users@sqlite.org>
>> Date: Friday, December 14, 2012, 3:09 PM
>>
>>
>>
>> On Thu, Dec 13, 2012 at 10:06 AM, Selen
>> Schabenberger<selen_oz...@yahoo.com>   wrote:
>>
>>
>> Hi All,
>>
>>
>>
>> I am observing some strange behaviour on my database when I execute a
>> query with an IN operator having more than "22" expressions. My table
>> structure looks basically as follows:
>>
>>
>>
>> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag"
>> INTEGER NOT NULL, "Flag" INTEGER )
>>
>>
>>
>>
>>
>> I have a multi-column index on  (Tag, Flag, Id) as well as a single
>> column index on the Flag column.
>>
>> My guess is that the single-column index on Flag is misleading the
>> query optimizer.  You can probably fix this by either (1) running
>> ANALYZE or (2) adding a "+" in front of the "Flag" column name in the
>> WHERE clause of your query, like this:   "... +Flag=1 ..."
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>

