Re: [sqlite] Distinguishing between sqlite3_stmts

Roger Binns Mon, 10 Nov 2008 13:01:10 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Douglas E. Fajardo wrote:
> John Stanton has correctly pointed out that there is a programming model here 
> an application 
> effectively does the cacheing itself by precompiling statements at startup. 
> In this situation, 
> the proposed cacheing feature represents double work, and double memory (or 
> worse).


Precompiling statements is not caching.  There would not be double work
even if the application implements its own caching.

The way a cache would work is that the sqlite3 pointer would contain an
extra data structure to store cached statements.  There is already a
hash type that is appropriate and would consume about 28 bytes of memory
if empty.

When sqlite3_prepare is called, the cache is consulted.  If there is no
entry then the current prepare is run.  When sqlite3_finalize is called
then instead of freeing, the statement is put in the statement cache.

>  John Stanton has correctly pointed out that there is a programming model 
> here an application effectively does the cacheing itself by precompiling
statements at startup.

That is not caching and it would be unaffected by any caching scheme.
sqlite3_finalize would never be called so nothing would enter the cache.
 sqlite3_prepare would be called a few times but the queries would not
be in the cache.  This is no different than today.

> In this situation, the proposed cacheing feature represents double
work, and double memory (or worse).

There is never double work.  The point of a statement cache is to avoid
calling prepare!

If the application had a cache and SQLite had a cache then you still
wouldn't have the same statement in both caches since it would be on
eviction from the application cache that the SQLite cache would see a
statement.

There is one way to increase memory consumption, but only a bit.  That
is on prepare to add the statement to the cache immediately rather than
waiting till finalize.  If the same query is used again then you
duplicate the statement in the cache.

This can be mitigated by splitting the current sqlite3_stmt into two
pieces.  One piece is the byte code and other meta-data and the second
piece is the data that changes while the statement executes (registers
etc).  If a statement is only in use once then the extra memory
consumption would consist of a pointer between the two pieces.  If the
statement is used multiple times then you get a memory saving since the
static piece would only exist once rather than per statement as is the
case now.  You would also save on the copy of the query string each
sqlite3_prepare_v2 makes since you would only need it once.

>  My proposal of a statement cache, which Rodger Binns supports,

(*Roger* - no 'd') I don't necessarily support a SQLite statement cache,
but I do think it important the issue is discussed and whichever way
things go is documented in the ticket so that it doesn't have to be
discussed again :-)

I already have a statement cache in my code (265 lines of executable
code - no counting of comments, declarations etc).  It works well, is
threadsafe and re-entrant and handles multiple statements correctly.
Getting everything right is hard especially the corner cases.  (Just
because it runs doesn't make it right!)

If everyone is implementing their own statement cache then I strongly
support SQLite doing it instead since it will be way better and not
wasting every developer's time.

Looking at the costs:

* A hash table in sqlite3 * (about 28 bytes empty)

* (Option A) A lazy cache that puts finalized statements into cache

* (Option B) An eager cache that puts prepared statements into cache at
prepare time, but splits into static and dynamic parts

For a simple program that calls prepare at startup then the costs are
negligible.  Option A would never result in any entries (finalize not
called).  Option B would but each stmt would grow by a pointer and a
reference count over today's stmt.  If the same query is prepared
multiple times (eg if it was used in multiple threads) then Option A has
the same memory consumption and Option B *saves* memory.

Since caches could be compiled out, or run time switchable that would
satisfy everyone anyway.  If caches are as prevalent as seems to be the
case then that is a very strong argument for the code to be part of the
core.  As SQLite is being used in more and more places and wrapped for
more and more languages and environments then the cache makes even more
sense.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkkYoQEACgkQmOOfHg372QT0jgCdEyXuvxrzbEZgtai+WwSlA2uT
T5EAnApwMA69jgOhuPerQ07utRD5iqX0
=Zuue
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Distinguishing between sqlite3_stmts

Reply via email to