Hello,

I have a question regarding the caching mechanism in SQLite, especially when 
using multiple connections.
My understanding was that each connection has its own page cache (as long as I 
don't use SQLITE_OPEN_SHAREDCACHE).
However, I wrote some tests and the results didn't meet my expectations...

So what I tested:
* I have two FTS4 tables (each table contains about a million entries and has 
10 columns; 5 columns are in the FTS index, the other columns are marked 
'notindexed')
* My application executes queries (with the MATCH operator) against both tables
     * Each query touches only a single table (no joins), and the queries to 
the two tables are interleaved, so the order is: run query1 on table1, run 
query2 on table2, run query3 on table1, run query4 on table2, and so on
* From the application's point of view the tables are READ-ONLY, i.e. the 
content is not modified during the application's lifetime, so connections are 
opened with sqlite3_open_v2(db_path, &p_db, SQLITE_OPEN_READONLY, NULL)
* I wanted to compare the performance of my queries across different 
approaches; the total time to execute all queries (10 different queries) was 
measured:
     1) Run each query on a NEW connection (open connection, execute query, 
close connection):
         * Here I expected the worst performance
     2) Run all queries on the same connection (a single connection kept open 
for the whole test):
         * This was my first implementation, and I wanted to improve on its 
performance
         * My assumption was that in this approach the page cache would not be 
reused, since the queries are interleaved and each consecutive query hits a 
different table
     3) Create two connections (to the same DB) and dedicate each connection 
to queries against a single table:
         * That is, connection1 is used only for queries to table1 and 
connection2 only for queries to table2
         * I expected the best performance here, because I thought each 
connection would keep its own cache, so the caches would be maximally reused

Results:
Approach 1 was indeed the slowest. However, approach 3 was unexpectedly 
slightly worse than approach 2.

Moreover, when I ran similar tests with other, artificial queries that all 
returned a lot of entries, approach 1 was even the best - almost 10% better 
than 2 and 3!
I really cannot understand how that is possible. Does SQLite have some other, 
global cache?
Or is it possible that "overwriting" the cache content (at least in approach 
2) can slow a query down so much that it becomes slower than using a new 
connection (= a cold cache) for each query?
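As a side note on the per-connection-cache assumption: the page cache size 
(PRAGMA cache_size) really is a per-connection setting, which can be verified 
directly with two connections to the same database file:

```python
# Demonstrate that PRAGMA cache_size is per-connection: setting it on
# one connection does not affect another connection to the same file.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "cache.db")
con_a = sqlite3.connect(path)
con_b = sqlite3.connect(path)

con_a.execute("PRAGMA cache_size = 2000")   # in pages; affects only con_a
con_b.execute("PRAGMA cache_size = 16000")  # affects only con_b

size_a = con_a.execute("PRAGMA cache_size").fetchone()[0]
size_b = con_b.execute("PRAGMA cache_size").fetchone()[0]
assert (size_a, size_b) == (2000, 16000)

con_a.close()
con_b.close()
```

This only confirms that the caches are separate; it says nothing about why 
approach 3 was slower than approach 2 in my measurements.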

Unfortunately I cannot post the exact queries / tables / tests from my project 
here (it's confidential :-) ), but I would at least like to check whether my 
thinking is correct or whether I have misunderstood something.
Would you expect approach 2 to be faster than approach 3? If yes, please 
explain why.

Many thanks,
Milan
