Thanks for answering both of my questions.

I guess this is similar to the 'Skip-Scan Optimization' mentioned in the 
documentation. That is what I assumed and explains the difference in query 
performance. Scanning the table once is faster than scanning the table for each 
and every value of column 'a' (and the table is too large for any sort of cache 
to be useful)...

I can try to see if the ANALYZE statement makes a difference. But it would only 
be useful if I can copy the 'impact' of this statement to other database files 
with equal table definitions. Because in my application, I am constantly 
creating new database file, filling them, and eventually deleting them.

Can I copy the "statistics tables" from one database file to another?

-----Original Message-----
From: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Hick Gunter
Sent: dinsdag 3 maart 2015 13:08
To: 'General Discussion of SQLite Database'
Subject: Re: [sqlite] full table scan ignores PK sort order?

The subquery is the index access (partial table scan), which is performed once 
for each and every value in your IN list for the column a (in effect, the IN 
list is transformed into an ephemeral table and joined to your test table).

Since you did not declare an index for your primary key constraint, SQLite has 
to invent one.

If you insert a representative data set and run ANALYZE then the query plan may 
well change to something that suits the shape of your data better.

-----Urspr?ngliche Nachricht-----
Von: Jan Asselman [mailto:jan.asselman at iba-benelux.com]
Gesendet: Dienstag, 03. M?rz 2015 11:27
An: General Discussion of SQLite Database
Betreff: Re: [sqlite] full table scan ignores PK sort order?

Thank you! I think I get it:

- primary key is nothing but a UNIQUE constraint (in my case comparable to a 
separate index == table with columns a, b and rowID)
- the full table scan returns rows in rowID order, which is the order in which 
the rows were added to the table

There is just one more thing I would like to understand:

Most of my queries are in the form
"SELECT * FROM test WHERE a == ? AND b < ?;"
and use the primary key index so that the rows are returned in the expected 
order without using the ORDER BY statement.

It's only a special case where the query "SELECT * FROM test WHERE b < ? AND c 
> ?;"
is used. That's why I decided not to add a separate index to column b.

What I was trying, in order to prevent a full table scan, is force the use of 
the primary key index for this query and have sqlite allow all possible values 
for column a:
"SELECT * FROM test WHERE a IN (<<256 different values>>) AND b < ? AND c > ?;"

The query plan confirms the use of the primary key index:

0, 0, 0, SEARCH TABLE data USING INDEX sqlite_autoindex_test_1 (a=? AND b<?) 0, 
0, 0, EXECUTE LIST SUBQUERY 1

 - Why is there a subquery?
 - For a query that reads all rows in the table, the throughput was a lot worse 
than executing a full table scan (25x slower). Is this behavior explainable?

Very much appreciated!
Jan Asselman

Reply via email to