Re: [sqlite] Multi-column index is not used with IN operator

2013-01-17 Thread Dan Kennedy

On 01/17/2013 09:17 PM, Selen Schabenberger wrote:

I was trying to avoid the INDEXED BY or the "+" operator as I cannot
make sure that the search with the index will always (for every
database created by my application) be faster than with the PK. I
prefer the optimizer does the cost calculations and makes the choice.


Do you want me to write another ticket for the enhancement you
mentioned?


Don't worry about it for now. Changes to the optimizer are things
that need to be considered really carefully. It's too easy to
improve one type of query to the detriment of others.

Dan.




-Selen




 From: Dan
Kennedy To: General Discussion of SQLite
Database Sent: Thursday, January 17, 2013
2:38 PM Subject: Re: [sqlite] Multi-column index is not used with
IN operator

On 01/17/2013 06:32 PM, Selen Schabenberger wrote:


Dan, I have just realised that with the sqlite_stat3 table, the
query optimizer uses the INDEX IDX_TAG. When I drop this table,
the PK is used as you said. But the concatenated index is still
not used with the stat3.


Looks like with stat3 enabled the estimate of the number of rows
matched by the WHERE clause is better. So it uses INDEX_IDX_TAG.
But not the composite index as the stat3 estimation is disabled if
the query uses more than the first column of the index (in this
case it uses two - 'Tag' and 'Flag'). That's probably something
that could be enhanced at some point.


Knowing that the query could be 60 times faster, is there
anything I can do to speed up this query or you think it is a bug
in the optimizer?


You could figure out why the stat1 data is deceptive in this case
and whether there is anything that can be done about it.

Or add an INDEXED BY clause to the query.

Adding a unary '+' operator in front of 'Id' might work too.

Dan.


___ sqlite-users mailing
list sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-17 Thread Selen Schabenberger
I was trying to avoid the INDEXED BY or the "+" operator as I cannot make sure 
that the search with the index will always (for every database created by my 
application) be faster than with the PK. I prefer the optimizer does the cost 
calculations and makes the choice. 

Do you want me to write another ticket for the enhancement you mentioned?

-Selen



>
> From: Dan Kennedy 
>To: General Discussion of SQLite Database  
>Sent: Thursday, January 17, 2013 2:38 PM
>Subject: Re: [sqlite] Multi-column index is not used with IN operator
> 
>On 01/17/2013 06:32 PM, Selen Schabenberger wrote:
>>
>> Dan, I have just realised that with the sqlite_stat3 table, the query
>> optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used
>> as you said. But the concatenated index is still not used with the stat3.
>
>Looks like with stat3 enabled the estimate of the number of rows matched
>by the WHERE clause is better. So it uses INDEX_IDX_TAG. But not the
>composite index as the stat3 estimation is disabled if the query uses
>more than the first column of the index (in this case it uses two -
>'Tag' and 'Flag'). That's probably something that could be enhanced at
>some point.
>
>> Knowing that the query could be 60 times faster, is there anything I can
>> do to speed up this query or you think it is a bug in the optimizer?
>
>You could figure out why the stat1 data is deceptive in this case and
>whether there is anything that can be done about it.
>
>Or add an INDEXED BY clause to the query.
>
>Adding a unary '+' operator in front of 'Id' might work too.
>
>Dan.
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-17 Thread Dan Kennedy

On 01/17/2013 06:32 PM, Selen Schabenberger wrote:


Dan, I have just realised that with the sqlite_stat3 table, the query
optimizer uses the INDEX IDX_TAG. When I drop this table, the PK is used
as you said. But the concatenated index is still not used with the stat3.


Looks like with stat3 enabled the estimate of the number of rows matched
by the WHERE clause is better. So it uses INDEX_IDX_TAG. But not the
composite index as the stat3 estimation is disabled if the query uses
more than the first column of the index (in this case it uses two -
'Tag' and 'Flag'). That's probably something that could be enhanced at
some point.


Knowing that the query could be 60 times faster, is there anything I can
do to speed up this query or you think it is a bug in the optimizer?


You could figure out why the stat1 data is deceptive in this case and
whether there is anything that can be done about it.

Or add an INDEXED BY clause to the query.

Adding a unary '+' operator in front of 'Id' might work too.

Dan.





- Selen


*From:* Selen Schabenberger 
*To:* General Discussion of SQLite Database
; Dan Kennedy 
*Sent:* Wednesday, January 16, 2013 3:46 PM
    *Subject:* Re: [sqlite] Multi-column index is not used with IN operator

Sorry, wrong query wrong result.

But still when I add the order by, the index is used:

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886,
1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922,
1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012,
1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042,
1148079, 1148136, 1148138, 1148191, 1148232, 1148234, 1167643,
1167659, 1167660, 1167663, 1167667, 1167671, 1167675 ) ) and Flag=1)
order by Id limit 200


selectId order from detail
0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~33 rows)
0 0 0 EXECUTE LIST SUBQUERY 1
0 0 0 &n bsp; USE TEMP B-TREE FOR ORDER BY





 >
 > From: Dan Kennedy mailto:danielk1...@gmail.com>>
 >To: General Discussion of SQLite Database mailto:sqlite-users@sqlite.org>>
 >Sent: Wednesday, January 16, 2013 3:40 PM
     >Subject: Re: [sqlite] Multi-column index is not used with IN operator
 >
 >On 01/16/2013 09:22 PM, Selen Schabenberger wrote:
 >> PRAGMA foreign_keys=OFF;
 >> BEGIN TRANSACTION;
 >> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag'
INTEGER
 >> NOT NULL, 'Flag' INTEGER NOT NULL );
 >> ANALYZE sqlite_master;
 >> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132
1289');
 >> INSERT INTO "sqlite_stat1"
 >> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
 >> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag',
'Id');
 >> CREATE INDEX IDX_TAG on Message (Tag);
 >> COMMIT;
 >>
 >> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886,
 >> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922,
1147912,
 >> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015,
1148016,
 >> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136,
1148138,
 >> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663,
1167667,
 >> 1167671, 1167675 ) ) and Flag=1) limit 200
 >>
 >> selectId order from detail
 >> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows)
 >> 0 0 0 EXECUTE LIST SUBQUERY 1
 >> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY
 >
 >The query and the result above don't match.
 >
 >If I add the ORDER BY clause to the query it uses the IPK index.
 >
 >Dan.
 >___
 >sqlite-users mailing list
 >sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
 >
 >
 >
___
sqlite-users mailing list
sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-17 Thread Selen Schabenberger


Dan, I have just realised that with the sqlite_stat3 table, the query optimizer 
uses the  INDEX IDX_TAG. When I drop this table, the PK is used as you said. 
But the concatenated index is still not used with the stat3.

Knowing that the query could be 60 times faster, is there anything I can do to 
speed up this query or you think it is a bug in the optimizer? 

- Selen



>
> From: Selen Schabenberger 
>To: General Discussion of SQLite Database ; Dan 
>Kennedy  
>Sent: Wednesday, January 16, 2013 3:46 PM
>Subject: Re: [sqlite] Multi-column index is not used with IN operator
> 
>Sorry, wrong query wrong result.
>
>But still when I add the order by, the index is used:
>
>Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 
>1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 
>1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 
>1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 
>1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 
>1167675 ) ) and Flag=1) order by Id limit 200
>
>
>selectId   order  from   detail
>0  0  0  SEARCH TABLE Message USING INDEX IDX_TAG 
>(Tag=?) (~33 rows)
>0  0  0  EXECUTE LIST SUBQUERY 1   
>0  0  0  USE TEMP B-TREE FOR ORDER BY  
>
>
>
>
>
>>
>> From: Dan Kennedy 
>>To: General Discussion of SQLite Database  
>>Sent: Wednesday, January 16, 2013 3:40 PM
>>Subject: Re: [sqlite] Multi-column index is not used with IN operator
>> 
>>On 01/16/2013 09:22 PM, Selen Schabenberger wrote:
>>> PRAGMA foreign_keys=OFF;
>>> BEGIN TRANSACTION;
>>> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER
>>> NOT NULL, 'Flag' INTEGER NOT NULL );
>>> ANALYZE sqlite_master;
>>> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289');
>>> INSERT INTO "sqlite_stat1"
>>> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
>>> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
>>> CREATE INDEX IDX_TAG on Message (Tag);
>>> COMMIT;
>>>
>>> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886,
>>> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912,
>>> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016,
>>> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138,
>>> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667,
>>> 1167671, 1167675 ) ) and Flag=1) limit 200
>>>
>>> selectId order from detail
>>> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows)
>>> 0 0 0 EXECUTE LIST SUBQUERY 1
>>> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY
>>
>>The query and the result above don't match.
>>
>>If I add the ORDER BY clause to the query it uses the IPK index.
>>
>>Dan.
>>___
>>sqlite-users mailing list
>>sqlite-users@sqlite.org
>>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>>
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger
Sorry, wrong query wrong result.

But still when I add the order by, the index is used:

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 
1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 
1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 
1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 
1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 
) ) and Flag=1) order by Id limit 200


selectId   order  from   detail
0  0  0  SEARCH TABLE Message USING INDEX IDX_TAG 
(Tag=?) (~33 rows)
0  0  0  EXECUTE LIST SUBQUERY 1   
0  0  0  USE TEMP B-TREE FOR ORDER BY  





>
> From: Dan Kennedy 
>To: General Discussion of SQLite Database  
>Sent: Wednesday, January 16, 2013 3:40 PM
>Subject: Re: [sqlite] Multi-column index is not used with IN operator
> 
>On 01/16/2013 09:22 PM, Selen Schabenberger wrote:
>> PRAGMA foreign_keys=OFF;
>> BEGIN TRANSACTION;
>> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER
>> NOT NULL, 'Flag' INTEGER NOT NULL );
>> ANALYZE sqlite_master;
>> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289');
>> INSERT INTO "sqlite_stat1"
>> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
>> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
>> CREATE INDEX IDX_TAG on Message (Tag);
>> COMMIT;
>>
>> Select * from Message where ((Tag in ( 1146883, 1146884, 1146886,
>> 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912,
>> 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016,
>> 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138,
>> 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667,
>> 1167671, 1167675 ) ) and Flag=1) limit 200
>>
>> selectId order from detail
>> 0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows)
>> 0 0 0 EXECUTE LIST SUBQUERY 1
>> 0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY
>
>The query and the result above don't match.
>
>If I add the ORDER BY clause to the query it uses the IPK index.
>
>Dan.
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Dan Kennedy

On 01/16/2013 09:22 PM, Selen Schabenberger wrote:

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER
NOT NULL, 'Flag' INTEGER NOT NULL );
ANALYZE sqlite_master;
INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289');
INSERT INTO "sqlite_stat1"
VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
CREATE INDEX IDX_TAG on Message (Tag);
COMMIT;

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886,
1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912,
1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016,
1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138,
1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667,
1167671, 1167675 ) ) and Flag=1) limit 200

selectId order from detail
0 0 0 SEARCH TABLE Message USING INDEX IDX_TAG (Tag=?) (~432 rows)
0 0 0 EXECUTE LIST SUBQUERY 1
0 0 0 & nbsp; USE TEMP B-TREE FOR ORDER BY


The query and the result above don't match.

If I add the ORDER BY clause to the query it uses the IPK index.

Dan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
NULL, 'Flag' INTEGER NOT NULL );
ANALYZE sqlite_master;
INSERT INTO "sqlite_stat1" VALUES('Message','IDX_TAG','460132 1289');
INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 
1275 1');
CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
CREATE INDEX IDX_TAG on Message (Tag);
COMMIT;

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 
1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 
1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 
1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 
1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 
) ) and Flag=1)  limit 200


selectId   order  from   detail
0  0  0  SEARCH TABLE Message USING INDEX IDX_TAG 
(Tag=?) (~432 rows)
0  0  0  EXECUTE LIST SUBQUERY 1   
0  0  0  USE TEMP B-TREE FOR ORDER BY  


- Selen



>________
> From: Dan Kennedy 
>To: sqlite-users@sqlite.org 
>Sent: Wednesday, January 16, 2013 3:14 PM
>Subject: Re: [sqlite] Multi-column index is not used with IN operator
> 
>On 01/16/2013 09:04 PM, Selen Schabenberger wrote:
>>
>>
>> The entry for the Tag index in sqlite_stat1:
>>
>> 460132 1289
>>
>> The results are returned in 163 ms on the network share.
>
>I can't reproduce that. Once I add the entry to sqlite_stat1
>it uses the IPK index. Can you post the new .dump of the schema
>and query again?
>
>Dan.
>
>
>
>>
>> - Selen
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Dan Kennedy

On 01/16/2013 09:04 PM, Selen Schabenberger wrote:



The entry for the Tag index in sqlite_stat1:

460132 1289

The results are returned in 163 ms on the network share.


I can't reproduce that. Once I add the entry to sqlite_stat1
it uses the IPK index. Can you post the new .dump of the schema
and query again?

Dan.





- Selen



  From: Dan Kennedy
To: General Discussion of SQLite Database
Sent: Wednesday, January 16, 2013 2:56 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 08:48 PM, Selen Schabenberger wrote:

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.


If the index consists of only the Tag, then this index is used.


When you create the index on just the Tag column, what does the
corresponding sqlite_stat1 entry look like?

Dan.




   You

said, there would be a penalty if the index was not a covering index.
But seems like it is a discount for 3.7.15.2?

- Selen


*From:* Dan Kennedy
*To:* sqlite-users@sqlite.org
*Sent:* Wednesday, January 16, 2013 1:12 PM
*Subject:* Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
   >  Below is the output of the dump. If it does not help reproduce the
error, then I can try to share the original database file itself.
   >
   >  PRAGMA foreign_keys=OFF;
   >  BEGIN TRANSACTION;
   >  CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag'
INTEGER NOT NULL, 'Flag' INTEGER NOT NULL );
   >  ANALYZE sqlite_master;
   >  INSERT INTO "sqlite_stat1"
VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
   >  CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
   >  COMMIT;
   >

Got it this time. Considering this one:

SELECT * FROM message
WHERE tag IN
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)

AND flag=1
ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

SELECT count(*) FROM message WHERE tag IN () AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.




   >
   >  Thanks!
   >  Selen
   >
   >  
   >  From: Dan Kennedymailto:danielk1...@gmail.com>>
   >  To: sqlite-users@sqlite.org<mailto:sqlite-users@sqlite.org>
   >  Sent: Wednesday, January 16, 2013 12:05 PM
   >  Subject: Re: [sqlite] Multi-column index is not used with IN operator
   >
   >  On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
   >>  I attach a small database where it is possible to reproduce the
   >>  issue. I deleted all irrelevant tables and all the tuples in the
   >>  Message table to keep the file size small but had run ANALYZE before
   >>  doing that.
   >
   >  Mailing list does not allow attachments. Can you either upload the
   >  db somewhere, or include the output of ".dump" in the body of the
   >  message if it is small enough? Thanks.
   >



   >
   >
   >
   >>
   >>  This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
   >>
   >>  SELECT * FROM mes

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger


The entry for the Tag index in sqlite_stat1:

460132 1289

The results are returned in 163 ms on the network share.

- Selen



 From: Dan Kennedy 
To: General Discussion of SQLite Database  
Sent: Wednesday, January 16, 2013 2:56 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
On 01/16/2013 08:48 PM, Selen Schabenberger wrote:
>>However, in 3.7.16, the penalty for the external sort is
>>(3*nRow*log10(nRow)) and there is no discount for using a
>>covering index (instead, there would be another penalty if the
>>index were not a covering index). For a total cost of roughly
>>612000. So this version of SQLite does a full table scan.
>
> If the index consists of only the Tag, then this index is used.

When you create the index on just the Tag column, what does the
corresponding sqlite_stat1 entry look like?

Dan.




  You
> said, there would be a penalty if the index was not a covering index.
> But seems like it is a discount for 3.7.15.2?
>
> - Selen
>
> 
> *From:* Dan Kennedy 
> *To:* sqlite-users@sqlite.org
> *Sent:* Wednesday, January 16, 2013 1:12 PM
> *Subject:* Re: [sqlite] Multi-column index is not used with IN operator
>
> On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
>  > Below is the output of the dump. If it does not help reproduce the
> error, then I can try to share the original database file itself.
>  >
>  > PRAGMA foreign_keys=OFF;
>  > BEGIN TRANSACTION;
>  > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag'
> INTEGER NOT NULL, 'Flag' INTEGER NOT NULL );
>  > ANALYZE sqlite_master;
>  > INSERT INTO "sqlite_stat1"
> VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
>  > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
>  > COMMIT;
>  >
>
> Got it this time. Considering this one:
>
> SELECT * FROM message
> WHERE tag IN
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)
>
> AND flag=1
> ORDER BY id LIMIT 200;
>
> Looks like the query planner now assigns a higher cost to
> external sorts (ORDER BY clauses that cannot use indexes) than it
> did in 3.6.23.1. In both cases, SQLite assumes that scanning the
> full table implies visiting 460132 rows. In both versions, this
> plan is assigned a cost of 460132.
>
> If there are 30 elements in the IN(...) set, SQLite assumes that
> scanning the index requires visiting (30*1275)=38250 rows.
> Basic cost of 38250, plus some insignificant amount for the 30
> seek operations required.
>
> In version 3.6.23.1, the penalty for the external sort is
> (nRow*log10(nRow)), where nRow is the number of rows to sort (in
> this case 38250). SQLite rounds up the log10() expression to 5,
> so the penalty is roughly 191250. Total cost of 229500. It then
> gets a 50% discount for using a covering index, so the overall
> cost is roughly 115000. Making it preferable to use the index.
>
> However, in 3.7.16, the penalty for the external sort is
> (3*nRow*log10(nRow)) and there is no discount for using a
> covering index (instead, there would be another penalty if the
> index were not a covering index). For a total cost of roughly
> 612000. So this version of SQLite does a full table scan.
>
> None of this jumps out as obviously incorrect. In practice, how
> much slower is 3.7.16 at running the query above?
>
> What does:
>
> SELECT count(*) FROM message WHERE tag IN () AND flag=1;
>
> return? Is it close to the 38250 that SQLite is using as an
> estimate when planning the query?
>
> Thanks,
> Dan.
>
>
>
>
>  >
>  > Thanks!
>  > Selen
>  >
>  > 
>  > From: Dan Kennedymailto:danielk1...@gmail.com>>
>  > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
>  > Sent: Wednesday, January 16, 2013 12:05 PM
>  > Subject: Re: [sqlite] Multi-column index is not used with IN operator
>  >
>  > On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
>  >> I attach a small database where it is possible to reproduce the
>  >> issue. I deleted all irrelevant tables and all the tuples in the
>  >> Message table to keep the file size small but had run ANALYZE before
>  >> doing that.
>  >
>  > Mailing list does not allow attachments. Can you either upload the
>  > db somewhere, or include the output of ".dump" in the body of the
>  > m

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Dan Kennedy

On 01/16/2013 08:48 PM, Selen Schabenberger wrote:

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.


If the index consists of only the Tag, then this index is used.


When you create the index on just the Tag column, what does the
corresponding sqlite_stat1 entry look like?

Dan.




 You

said, there would be a penalty if the index was not a covering index.
But seems like it is a discount for 3.7.15.2?

- Selen


*From:* Dan Kennedy 
*To:* sqlite-users@sqlite.org
*Sent:* Wednesday, January 16, 2013 1:12 PM
*Subject:* Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
 > Below is the output of the dump. If it does not help reproduce the
error, then I can try to share the original database file itself.
 >
 > PRAGMA foreign_keys=OFF;
 > BEGIN TRANSACTION;
 > CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag'
INTEGER NOT NULL, 'Flag' INTEGER NOT NULL );
 > ANALYZE sqlite_master;
 > INSERT INTO "sqlite_stat1"
VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 1275 1');
 > CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
 > COMMIT;
 >

Got it this time. Considering this one:

SELECT * FROM message
WHERE tag IN
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)

AND flag=1
ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

SELECT count(*) FROM message WHERE tag IN () AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.




 >
 > Thanks!
 > Selen
 >
 > ____
 > From: Dan Kennedymailto:danielk1...@gmail.com>>
 > To: sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
 > Sent: Wednesday, January 16, 2013 12:05 PM
 > Subject: Re: [sqlite] Multi-column index is not used with IN operator
 >
 > On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
 >> I attach a small database where it is possible to reproduce the
 >> issue. I deleted all irrelevant tables and all the tuples in the
 >> Message table to keep the file size small but had run ANALYZE before
 >> doing that.
 >
 > Mailing list does not allow attachments. Can you either upload the
 > db somewhere, or include the output of ".dump" in the body of the
 > message if it is small enough? Thanks.
 >



 >
 >
 >
 >>
 >> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
 >>
 >> SELECT * FROM message WHERE tag IN
 >>
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >>
 >>
 >
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
 >&g

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger
>However, in 3.7.16, the penalty for the external sort is
>(3*nRow*log10(nRow)) and there is no discount for using a
>covering index (instead, there would be another penalty if the
>index were not a covering index). For a total cost of roughly
>612000. So this version of SQLite does a full table scan.


If the index consists of only the Tag, then this index is used. You said, there 
would be a penalty if the index was not a covering index. But seems like it is 
a discount for 3.7.15.2?

- Selen



 From: Dan Kennedy 
To: sqlite-users@sqlite.org 
Sent: Wednesday, January 16, 2013 1:12 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
> Below is the output of the dump. If it does not help reproduce the error, 
> then I can try to share the original database file itself.
>
> PRAGMA foreign_keys=OFF;
> BEGIN TRANSACTION;
> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
> NULL, 'Flag' INTEGER NOT NULL );
> ANALYZE sqlite_master;
> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 
> 1289 1275 1');
> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
> COMMIT;
>

Got it this time. Considering this one:

   SELECT * FROM message
   WHERE tag IN 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)
 
AND flag=1
   ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

   SELECT count(*) FROM message WHERE tag IN () AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.




>
> Thanks!
> Selen
>
> ____
>   From: Dan Kennedy
> To: sqlite-users@sqlite.org
> Sent: Wednesday, January 16, 2013 12:05 PM
> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>
> On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
>> I attach a small database where it is possible to reproduce the
>> issue. I deleted all irrelevant tables and all the tuples in the
>> Message table to keep the file size small but had run ANALYZE before
>> doing that.
>
> Mailing list does not allow attachments. Can you either upload the
> db somewhere, or include the output of ".dump" in the body of the
> message if it is small enough? Thanks.
>



>
>
>
>>
>> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
>>
>> SELECT * FROM message WHERE tag IN
>> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger


My original query looks like this one and the Message table has in total 10 
columns in it (there is one BLOB). 

Select * from Message where ((Tag in ( 1146883, 1146884, 1146886, 1146888, 
1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 1147912, 1147914, 
1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 1148016, 1148018, 
1148020, 1148022, 1148040, 1148042, 1148079, 1148136, 1148138, 1148191, 
1148232, 1148234, 1167643, 1167659, 1167660, 1167663, 1167667, 1167671, 1167675 
) ) and Flag=1)  limit 200


If the database file is on the network share (which is most likely the case), 
then it takes ~22000 ms (with the index 300ms!) to return the results. On the 
local drive it is ~300 ms (with the index 10 ms). There are in total 101 rows 
that match the where clause.

Thanks!
Selen




 From: Dan Kennedy 
To: sqlite-users@sqlite.org 
Sent: Wednesday, January 16, 2013 1:12 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
On 01/16/2013 06:25 PM, Selen Schabenberger wrote:
> Below is the output of the dump. If it does not help reproduce the error, 
> then I can try to share the original database file itself.
>
> PRAGMA foreign_keys=OFF;
> BEGIN TRANSACTION;
> CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
> NULL, 'Flag' INTEGER NOT NULL );
> ANALYZE sqlite_master;
> INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 
> 1289 1275 1');
> CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
> COMMIT;
>

Got it this time. Considering this one:

   SELECT * FROM message
   WHERE tag IN 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30)
 
AND flag=1
   ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

   SELECT count(*) FROM message WHERE tag IN () AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.




>
> Thanks!
> Selen
>
> ________
>   From: Dan Kennedy
> To: sqlite-users@sqlite.org
> Sent: Wednesday, January 16, 2013 12:05 PM
> Subject: Re: [sqlite] Multi-column index is not used with IN operator
>
> On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
>> I attach a small database where it is possible to reproduce the
>> issue. I deleted all irrelevant tables and all the tuples in the
>> Message table to keep the file size small but had run ANALYZE before
>> doing that.
>
> Mailing list does not allow attachments. Can you either upload the
> db somewhere, or include the output of ".dump" in the body of the
> message if it is small enough? Thanks.
>



>
>
>
>>
>> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
>>
>> SELECT * FROM message WHERE tag IN
>> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>>
>>
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
&g

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger


I attach a small database where it is possible to reproduce the issue. I 
deleted all irrelevant tables and all the tuples in the Message table to keep 
the file size small but had run ANALYZE before doing that.

This is the query to reproduce with 3.7.15.2:
EXPLAIN QUERY PLAN

SELECT * FROM message
WHERE tag IN 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
) AND flag=1
ORDER BY id LIMIT 200;

I get this result:
selectId   order  from   detail                        

0  0  0  SCAN TABLE Message USING INTEGER PRIMARY KEY 
(~4601 rows)
0  0  0  EXECUTE LIST SUBQUERY 1   


Hope someone can help.


- Selen



 From: Selen Schabenberger 
To: General Discussion of SQLite Database ; Richard 
Hipp  
Sent: Wednesday, January 2, 2013 12:22 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
Hi Richard,
I tested the whole scenario one more time with the new SQLite version. As you 
suggested I put a plus sign in front of the Flag column and that really made 
the query much faster by using the multi column index (Tag, Flag, Id) instead 
of the primary index on the Id column. However what I don't get is, I actually 
had removed that single column index on the Flag before and run ANALZE. How 
come the query optimizer makes another decision when I put a + in front of a 
column which is not indexed alone?
Is there another way to improve this query, other than using the + sign? I 
would really appreciate any suggestions.
Happy new year!
Regards,Selen

--- On Fri, 12/14/12, Richard Hipp  wrote:

From: Richard Hipp 
Subject: Re: [sqlite] Multi-column index is not used with IN operator
To: "Selen Schabenberger" , "General Discussion of 
SQLite Database" 
Date: Friday, December 14, 2012, 3:09 PM



On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger  
wrote:


Hi All,



I am observing some strange behaviour on my database when I execute a query 
with an IN operator having more than "22" expressions. My table structure looks 
basically as follows:



CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
NULL, "Flag" INTEGER )





I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
index on the Flag column.

My guess is that the single-column index on Flag is misleading the query 
optimizer.  You can probably fix this by either (1) running ANALYZE or (2) 
adding a "+" in front of the "Flag" column name in the WHERE clause of your 
query, like this:   "... +Flag=1 ..."



-- 
D. Richard Hipp
d...@sqlite.org

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Dan Kennedy

On 01/16/2013 06:25 PM, Selen Schabenberger wrote:

Below is the output of the dump. If it does not help reproduce the error, then 
I can try to share the original database file itself.

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
NULL, 'Flag' INTEGER NOT NULL );
ANALYZE sqlite_master;
INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 
1275 1');
CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
COMMIT;



Got it this time. Considering this one:

  SELECT * FROM message
  WHERE tag IN 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) 
AND flag=1

  ORDER BY id LIMIT 200;

Looks like the query planner now assigns a higher cost to
external sorts (ORDER BY clauses that cannot use indexes) than it
did in 3.6.23.1. In both cases, SQLite assumes that scanning the
full table implies visiting 460132 rows. In both versions, this
plan is assigned a cost of 460132.

If there are 30 elements in the IN(...) set, SQLite assumes that
scanning the index requires visiting (30*1275)=38250 rows.
Basic cost of 38250, plus some insignificant amount for the 30
seek operations required.

In version 3.6.23.1, the penalty for the external sort is
(nRow*log10(nRow)), where nRow is the number of rows to sort (in
this case 38250). SQLite rounds up the log10() expression to 5,
so the penalty is roughly 191250. Total cost of 229500. It then
gets a 50% discount for using a covering index, so the overall
cost is roughly 115000. Making it preferable to use the index.

However, in 3.7.16, the penalty for the external sort is
(3*nRow*log10(nRow)) and there is no discount for using a
covering index (instead, there would be another penalty if the
index were not a covering index). For a total cost of roughly
612000. So this version of SQLite does a full table scan.

None of this jumps out as obviously incorrect. In practice, how
much slower is 3.7.16 at running the query above?

What does:

  SELECT count(*) FROM message WHERE tag IN () AND flag=1;

return? Is it close to the 38250 that SQLite is using as an
estimate when planning the query?

Thanks,
Dan.






Thanks!
Selen

____________________
  From: Dan Kennedy
To: sqlite-users@sqlite.org
Sent: Wednesday, January 16, 2013 12:05 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator

On 01/16/2013 05:13 PM, Selen Schabenberger wrote:

I attach a small database where it is possible to reproduce the
issue. I deleted all irrelevant tables and all the tuples in the
Message table to keep the file size small but had run ANALYZE before
doing that.


Mailing list does not allow attachments. Can you either upload the
db somewhere, or include the output of ".dump" in the body of the
message if it is small enough? Thanks.











This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN

SELECT * FROM message WHERE tag IN
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



) AND flag=1

ORDER BY id LIMIT 200;

I get this result: selectId   order  from   detail


0  0  0  SCAN TABLE Message USING INTEGER
PRIMARY KEY (~4601 rows) 0  0  0  EXECUTE
LIST SUBQUERY 1


Hope someone can help.


- Selen



____ From: Selen
Schabenberger  To: General Discussion of SQLite
Database; Richard Hipp  Sent:
Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite]
Multi-column index is not used with IN operator

Hi Richard, I tested the whole scenario one more time with the new
SQLite version. As you suggested I put a plus sign in front of the
Flag column and that really made the query much faster by using the
multi column index (Tag, Flag, Id) instead of the primary inde

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger
Below is the output of the dump. If it does not help reproduce the error, then 
I can try to share the original database file itself.

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE 'Message' ('Id' INTEGER PRIMARY KEY NOT NULL, 'Tag' INTEGER NOT 
NULL, 'Flag' INTEGER NOT NULL );
ANALYZE sqlite_master;
INSERT INTO "sqlite_stat1" VALUES('Message','IDX_MSGS_TAG_FLAG_ID','460132 1289 
1275 1');
CREATE INDEX 'IDX_MSGS_TAG_FLAG_ID' on 'Message' ('Tag', 'Flag', 'Id');
COMMIT;


Thanks!
Selen

________________
 From: Dan Kennedy 
To: sqlite-users@sqlite.org 
Sent: Wednesday, January 16, 2013 12:05 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
On 01/16/2013 05:13 PM, Selen Schabenberger wrote:
> I attach a small database where it is possible to reproduce the
> issue. I deleted all irrelevant tables and all the tuples in the
> Message table to keep the file size small but had run ANALYZE before
> doing that.

Mailing list does not allow attachments. Can you either upload the
db somewhere, or include the output of ".dump" in the body of the
message if it is small enough? Thanks.




>
> This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN
>
> SELECT * FROM message WHERE tag IN
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
>
>
) AND flag=1
> ORDER BY id LIMIT 200;
>
> I get this result: selectId   order      from       detail
>
>
> 0          0          0          SCAN TABLE Message USING INTEGER
> PRIMARY KEY (~4601 rows) 0          0          0          EXECUTE
> LIST SUBQUERY 1
>
>
> Hope someone can help.
>
>
> - Selen
>
>
>
>  From: Selen
> Schabenberger To: General Discussion of SQLite
> Database; Richard Hipp Sent:
> Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite]
> Multi-column index is not used with IN operator
>
> Hi Richard, I tested the whole scenario one more time with the new
> SQLite version. As you suggested I put a plus sign in front of the
> Flag column and that really made the query much faster by using the
> multi column index (Tag, Flag, Id) instead of the primary index on
> the Id column. However what I don't get is, I actually had removed
> that single column index on the Flag before and run ANALZE. How come
> the query optimizer makes another decision when I put a + in front of
> a column which is not indexed alone? Is there another way to improve
> this query, other than using the + sign? I would really appreciate
> any suggestions. Happy new year! Regards,Selen
>
> --- On Fri, 12/14/12, Richard Hipp  wrote:
>
> From: Richard Hipp Subject: Re: [sqlite] Multi-column
> index is not used with IN operator To: "Selen
> Schabenberger", "General Discussion of SQLite
> Database" Date: Friday, December 14, 2012,
> 3:09 PM
>
>
>
> On Thu, Dec 13, 2012 at 10:06 AM, Selen
> Schabenberger  wrote:
>
>
> Hi All,
>
>
>
> I am observing some strange behaviour on my database when I execute a
> query with an IN operator having more than "22" expressions. My table
> structure looks basically as follows:
>
>
>
> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag"
> INTEGER NOT NULL, "Flag" INTEGER )
>
>
>
>
>
> I have a multi-column index on  (Tag, Flag, Id) as well as a single
> column index on the Flag column.
>
> My guess is that the single-column index on Flag is misleading the
> query optimizer.  You c

Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Dan Kennedy

On 01/16/2013 05:13 PM, Selen Schabenberger wrote:

I attach a small database where it is possible to reproduce the
issue. I deleted all irrelevant tables and all the tuples in the
Message table to keep the file size small but had run ANALYZE before
doing that.


Mailing list does not allow attachments. Can you either upload the
db somewhere, or include the output of ".dump" in the body of the
message if it is small enough? Thanks.






This is the query to reproduce with 3.7.15.2: EXPLAIN QUERY PLAN

SELECT * FROM message WHERE tag IN
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30



) AND flag=1

ORDER BY id LIMIT 200;

I get this result: selectId   order  from   detail


0  0  0  SCAN TABLE Message USING INTEGER
PRIMARY KEY (~4601 rows) 0  0  0  EXECUTE
LIST SUBQUERY 1


Hope someone can help.


- Selen



 From: Selen
Schabenberger To: General Discussion of SQLite
Database; Richard Hipp Sent:
Wednesday, January 2, 2013 12:22 PM Subject: Re: [sqlite]
Multi-column index is not used with IN operator

Hi Richard, I tested the whole scenario one more time with the new
SQLite version. As you suggested I put a plus sign in front of the
Flag column and that really made the query much faster by using the
multi column index (Tag, Flag, Id) instead of the primary index on
the Id column. However what I don't get is, I actually had removed
that single column index on the Flag before and run ANALZE. How come
the query optimizer makes another decision when I put a + in front of
a column which is not indexed alone? Is there another way to improve
this query, other than using the + sign? I would really appreciate
any suggestions. Happy new year! Regards,Selen

--- On Fri, 12/14/12, Richard Hipp  wrote:

From: Richard Hipp Subject: Re: [sqlite] Multi-column
index is not used with IN operator To: "Selen
Schabenberger", "General Discussion of SQLite
Database" Date: Friday, December 14, 2012,
3:09 PM



On Thu, Dec 13, 2012 at 10:06 AM, Selen
Schabenberger  wrote:


Hi All,



I am observing some strange behaviour on my database when I execute a
query with an IN operator having more than "22" expressions. My table
structure looks basically as follows:



CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag"
INTEGER NOT NULL, "Flag" INTEGER )





I have a multi-column index on  (Tag, Flag, Id) as well as a single
column index on the Flag column.

My guess is that the single-column index on Flag is misleading the
query optimizer.  You can probably fix this by either (1) running
ANALYZE or (2) adding a "+" in front of the "Flag" column name in the
WHERE clause of your query, like this:   "... +Flag=1 ..."






___ sqlite-users mailing
list sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-16 Thread Selen Schabenberger
I attach a small database where it is possible to reproduce the issue. I 
deleted all irrelevant tables and all the tuples in the Message table to keep 
the file size small but had run ANALYZE before doing that.

This is the query to reproduce with 3.7.15.2:
EXPLAIN QUERY PLAN

SELECT * FROM message
WHERE tag IN 
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
            
,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
) AND flag=1
ORDER BY id LIMIT 200;

I get this result:
selectId   order  from   detail                        

0  0  0  SCAN TABLE Message USING INTEGER PRIMARY KEY 
(~4601 rows)
0  0  0  EXECUTE LIST SUBQUERY 1   


Hope someone can help.


- Selen




 From: Selen Schabenberger 
To: General Discussion of SQLite Database ; Richard 
Hipp  
Sent: Wednesday, January 2, 2013 12:22 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 
Hi Richard,
I tested the whole scenario one more time with the new SQLite version. As you 
suggested I put a plus sign in front of the Flag column and that really made 
the query much faster by using the multi column index (Tag, Flag, Id) instead 
of the primary index on the Id column. However what I don't get is, I actually 
had removed that single column index on the Flag before and run ANALZE. How 
come the query optimizer makes another decision when I put a + in front of a 
column which is not indexed alone?
Is there another way to improve this query, other than using the + sign? I 
would really appreciate any suggestions.
Happy new year!
Regards,Selen

--- On Fri, 12/14/12, Richard Hipp  wrote:

From: Richard Hipp 
Subject: Re: [sqlite] Multi-column index is not used with IN operator
To: "Selen Schabenberger" , "General Discussion of 
SQLite Database" 
Date: Friday, December 14, 2012, 3:09 PM



On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger  
wrote:


Hi All,



I am observing some strange behaviour on my database when I execute a query 
with an IN operator having more than "22" expressions. My table structure looks 
basically as follows:



CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
NULL, "Flag" INTEGER )





I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
index on the Flag column.

My guess is that the single-column index on Flag is misleading the query 
optimizer.  You can probably fix this by either (1) running ANALYZE or (2) 
adding a "+" in front of the "Flag" column name in the WHERE clause of your 
query, like this:   "... +Flag=1 ..."



-- 
D. Richard Hipp
d...@sqlite.org

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2013-01-02 Thread Selen Schabenberger
Hi Richard,
I tested the whole scenario one more time with the new SQLite version. As you 
suggested I put a plus sign in front of the Flag column and that really made 
the query much faster by using the multi column index (Tag, Flag, Id) instead 
of the primary index on the Id column. However what I don't get is, I actually 
had removed that single column index on the Flag before and run ANALZE. How 
come the query optimizer makes another decision when I put a + in front of a 
column which is not indexed alone?
Is there another way to improve this query, other than using the + sign? I 
would really appreciate any suggestions.
Happy new year!
Regards,Selen

--- On Fri, 12/14/12, Richard Hipp  wrote:

From: Richard Hipp 
Subject: Re: [sqlite] Multi-column index is not used with IN operator
To: "Selen Schabenberger" , "General Discussion of 
SQLite Database" 
Date: Friday, December 14, 2012, 3:09 PM



On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger  
wrote:


Hi All,



I am observing some strange behaviour on my database when I execute a query 
with an IN operator having more than "22" expressions. My table structure looks 
basically as follows:



CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
NULL, "Flag" INTEGER )





I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
index on the Flag column.

My guess is that the single-column index on Flag is misleading the query 
optimizer.  You can probably fix this by either (1) running ANALYZE or (2) 
adding a "+" in front of the "Flag" column name in the WHERE clause of your 
query, like this:   "... +Flag=1 ..."



-- 
D. Richard Hipp
d...@sqlite.org

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2012-12-14 Thread Selen Schabenberger
Actually I'm already running ANALYZE with the SQLITE_ENABLE_STAT3 enabled. By 
the way I observe different behaviour with different SQLite versions. Version 
3.6.19 (testing with Sqliteman 1.2.2)  is selecting the multi-column index, 
whereas the v3.7 primary index.

- Selen



 From: Hick Gunter 
To: 'Selen Schabenberger' ; 'General Discussion of 
SQLite Database'  
Sent: Friday, December 14, 2012 3:26 PM
Subject: AW: [sqlite] Multi-column index is not used with IN operator
 
From what I understand, SQLite performs query planning based on cost estimates.

A full table scan costs less per record than an index lookup.

1) (scan cost) * (IN Lookup)
        is compared with
2) (IN size) * (key lookup) * (sort factor)

IN Lookup cost is logarithmic; IN size is linear. The number 22 appears to be 
where a plot of the costs cross over.

ANALYZE will improve SQLite's cost estimates.

Gunter

-Ursprüngliche Nachricht-
Von: Selen Schabenberger [mailto:selen_oz...@yahoo.com]
Gesendet: Donnerstag, 13. Dezember 2012 16:07
An: sqlite-users@sqlite.org
Betreff: [sqlite] Multi-column index is not used with IN operator

Hi All,

I am observing some strange behaviour on my database when I execute a query 
with an IN operator having more than "22" expressions. My table structure looks 
basically as follows:

CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
NULL, "Flag" INTEGER )


I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
index on the Flag column.


If I execute the following query on this table, the Messages table is scanned 
using the primary key and the results are returned in 20 seconds.

> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, 
> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 
> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 
> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 
> 1167663, 1167667, 1167671, 1167675 )  and Flag=1) order by Messages.Id limit 
> 0, 100

>> Scan table Messages using integer primary key
>> Execute list subquery1

However if I reduce the number of expressions in the IN operator, the index on 
the (Tag, Flag, Id) is used.

> explain query plan Select Messages.MessageId from Messages where (AspTag in ( 
> 1146883, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 
> 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 
> 1148016, 1148018, 1148020, 1148022, 1148040, 1148042)  and PduFlag=1) order 
> by MessageId  limit 0, 100

>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Execute list subquery1
>> Use temp b-tree for order by

If I rewrite the first query in multiple IN operators connecting with OR then I 
can again use my multi-column index and it is much more efficient than the full 
table scan.

> explain query plan Select Messages.Id from Messages where (Tag in ( 
> 1148138,1148191, 1148232, 1148234, 1167643, 1167659,1146883, 1146884, 
> 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 
> 1147912, 1147914, 1147968, 1147970, 1147976, 1147978 )  or Tag in (1148012, 
> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
> 1148136, 1167660, 1167663, 1167667, 1167671, 1167675) ) and Flag=1 order by 
> Messages.Id  limit 0, 100
>> Execute list subquery1
>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Execute list subquery1
>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Use temp b-tree for order by



Does anybody have any idea why it behaves so? Are there any restrictions with 
the IN operator?

Regards,
-Selen
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2012-12-14 Thread Hick Gunter
>From what I understand, SQLite performs query planning based on cost estimates.

A full table scan costs less per record than an index lookup.

1) (scan cost) * (IN Lookup)
is compared with
2) (IN size) * (key lookup) * (sort factor)

IN Lookup cost is logarithmic; IN size is linear. The number 22 appears to be 
where a plot of the costs cross over.

ANALYZE will improve SQLite's cost estimates.

Gunter

-Ursprüngliche Nachricht-
Von: Selen Schabenberger [mailto:selen_oz...@yahoo.com]
Gesendet: Donnerstag, 13. Dezember 2012 16:07
An: sqlite-users@sqlite.org
Betreff: [sqlite] Multi-column index is not used with IN operator

Hi All,

I am observing some strange behaviour on my database when I execute a query 
with an IN operator having more than "22" expressions. My table structure looks 
basically as follows:

CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
NULL, "Flag" INTEGER )


I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
index on the Flag column.


If I execute the following query on this table, the Messages table is scanned 
using the primary key and the results are returned in 20 seconds.

> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, 
> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 
> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 
> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 
> 1167663, 1167667, 1167671, 1167675 )  and Flag=1) order by Messages.Id limit 
> 0, 100

>> Scan table Messages using integer primary key
>> Execute list subquery1

However if I reduce the number of expressions in the IN operator, the index on 
the (Tag, Flag, Id) is used.

> explain query plan Select Messages.MessageId from Messages where (AspTag in ( 
> 1146883, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 
> 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 1148015, 
> 1148016, 1148018, 1148020, 1148022, 1148040, 1148042)  and PduFlag=1) order 
> by MessageId  limit 0, 100

>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Execute list subquery1
>> Use temp b-tree for order by

If I rewrite the first query in multiple IN operators connecting with OR then I 
can again use my multi-column index and it is much more efficient than the full 
table scan.

> explain query plan Select Messages.Id from Messages where (Tag in ( 
> 1148138,1148191, 1148232, 1148234, 1167643, 1167659,1146883, 1146884, 
> 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 1146922, 
> 1147912, 1147914, 1147968, 1147970, 1147976, 1147978 )  or Tag in (1148012, 
> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
> 1148136, 1167660, 1167663, 1167667, 1167671, 1167675) ) and Flag=1 order by 
> Messages.Id  limit 0, 100
>> Execute list subquery1
>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Execute list subquery1
>> Search table Messages using covering index IDX_TAG_FLAG_ID
>> Use temp b-tree for order by



Does anybody have any idea why it behaves so? Are there any restrictions with 
the IN operator?

Regards,
-Selen
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2012-12-14 Thread Selen Schabenberger
Reordering the index worked but actually I need the index in (Tag, Flag, Id) 
order. The Flag column has a very low cardinality, and my queries are mostly 
filtering Tag & Flag or the Tag alone. 

When I drop the index on the Flag, and leave the index on (Tag, Flag, Id), the 
query planner still chooses the primary key on the Id column.

-Selen



From: Richard Hipp 
To: Selen Schabenberger ; General Discussion of SQLite 
Database  
Sent: Friday, December 14, 2012 3:09 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator




On Thu, Dec 13, 2012 at 10:06 AM, Selen 
Schabenberger  wrote:

Hi All,
>
>I am observing some strange behaviour on my database when I execute a query 
>with an IN operator having more than "22" expressions. My table structure 
>looks basically as follows:
>
>CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER NOT 
>NULL, "Flag" INTEGER )
>
>
>I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
>index on the Flag column.
>

My guess is that the single-column index on Flag is misleading the query 
optimizer.  You can probably fix this by either (1) running ANALYZE or (2) 
adding a "+" in front of the "Flag" column name in the WHERE clause of your 
query, like this:   "... +Flag=1 ..."

-- 
D. Richard Hipp
d...@sqlite.org



 From: Simon Slavin 
To: Selen Schabenberger ; General Discussion of SQLite 
Database  
Sent: Friday, December 14, 2012 2:11 PM
Subject: Re: [sqlite] Multi-column index is not used with IN operator
 

On 13 Dec 2012, at 3:06pm, Selen Schabenberger  wrote:

>> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, 
>> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 
>> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 
>> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
>> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 
>> 1167663, 1167667, 1167671, 1167675 )  and Flag=1) order by Messages.Id limit 
>> 0, 100

Just out of interest, that form where you go

... WHERE (Tag in (a,b,c) AND Flag=1) ...

Can you try a version where it uses instead

... WHERE Tag in (a,b,c) AND Flag=1 ...

or

... WHERE (Tag in (a,b,c)) AND (Flag=1) ...

I'm trying to predict how the parsing works here but I don't know whether I 
figured it out.  Another thing to try is to reverse the order of your index

> I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
> index on the Flag column.

Can you add a new one, or change your existing one, to (Flag, Tag, Id) ?

> Are there any restrictions with the IN operator?

Nothing that you got anywhere close to.  It's meant to be able to handle text 
and lists far longer than that.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2012-12-14 Thread Richard Hipp
On Thu, Dec 13, 2012 at 10:06 AM, Selen Schabenberger  wrote:

> Hi All,
>
> I am observing some strange behaviour on my database when I execute a
> query with an IN operator having more than "22" expressions. My table
> structure looks basically as follows:
>
> CREATE TABLE "Messages" ("Id" INTEGER PRIMARY KEY NOT NULL, "Tag" INTEGER
> NOT NULL, "Flag" INTEGER )
>
>
> I have a multi-column index on  (Tag, Flag, Id) as well as a single column
> index on the Flag column.
>

My guess is that the single-column index on Flag is misleading the query
optimizer.  You can probably fix this by either (1) running ANALYZE or (2)
adding a "+" in front of the "Flag" column name in the WHERE clause of your
query, like this:   "... +Flag=1 ..."

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multi-column index is not used with IN operator

2012-12-14 Thread Simon Slavin

On 13 Dec 2012, at 3:06pm, Selen Schabenberger  wrote:

>> explain query plan Select Messages.Id from Messages where (Tag in ( 1146883, 
>> 1146884, 1146886, 1146888, 1146892, 1146894, 1146896, 1146898, 1146920, 
>> 1146922, 1147912, 1147914, 1147968, 1147970, 1147976, 1147978, 1148012, 
>> 1148015, 1148016, 1148018, 1148020, 1148022, 1148040, 1148042, 1148079, 
>> 1148136, 1148138, 1148191, 1148232, 1148234, 1167643, 1167659, 1167660, 
>> 1167663, 1167667, 1167671, 1167675 )  and Flag=1) order by Messages.Id limit 
>> 0, 100

Just out of interest, that form where you go

... WHERE (Tag in (a,b,c) AND Flag=1) ...

Can you try a version where it uses instead

... WHERE Tag in (a,b,c) AND Flag=1 ...

or

... WHERE (Tag in (a,b,c)) AND (Flag=1) ...

I'm trying to predict how the parsing works here but I don't know whether I 
figured it out.  Another thing to try is to reverse the order of your index

> I have a multi-column index on  (Tag, Flag, Id) as well as a single column 
> index on the Flag column.

Can you add a new one, or change your existing one, to (Flag, Tag, Id) ?

> Are there any restrictions with the IN operator?

Nothing that you got anywhere close to.  It's meant to be able to handle text 
and lists far longer than that.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users