Re: [sqlite] Thread safety guarantees

Cory Nelson Fri, 02 Sep 2005 21:15:59 -0700

I can't speak for Linux but as long as only one thread is working on a
connection at once, Windows is fine.


On 9/2/05, Igor Tandetnik <[EMAIL PROTECTED]> wrote:
> I'm trying to piece together the thread safety guarantees that SQLite
> provides. They don't appear to be spelled out explicitly. So I'm trying
> to infer the rules from the documents describing SQLite internals, as
> well as the recent "file locks on linux have thread affinity" story.
> 
> Here is my understanding of the situation - I would greatly appreciate
> it if SQLite experts would confirm or deny it.
> 
> For the purposes of this discussion, every connection handle (sqlite*)
> and every statement handle (sqlite_stmt*) can be in one of two states -
> "safe" and "unsafe". Various API calls transfer handles between these
> states. While in a safe state, a handle can be passed freely between
> threads. As soon as a call puts a handle into an unsafe state, all
> further calls must arrive on the same thread until the handle becomes
> safe again. Various levels of thread safety are determined by exactly
> what calls transition handles between states.
> 
> I could think of three thread safety levels, arranged from weakest to
> strongest. My question boils down to which level describes reality most
> closely.
> 
> 1. All handles are always unsafe, from sqlite3_open to sqlite3_close and
> from sqlite3_prepare to sqlite3_finalize. All  calls referring to a
> particular connection and all statements associated with it must occur
> on the same thread.
> 
> This is obviously a safe assumption, but also least useful in many
> practical situations.
> 
> 
> 2. A connection is safe as long as there is no activity on it - there is
> no open transaction and no statements. The connection is born safe by
> sqlite3_open, becomes unsafe as soon as a "begin transaction" is
> executed or sqlite3_prepare is called, and becomes safe again when the
> transaction (if any) is committed or rolled back and the last statement
> is finalized.
>     A statement is always unsafe.
> 
> This assumption allows creating a connection pool used by a thread pool.
> A worker thread grabs an idle connection, executes a batch of statements
> on it, and returns it back to the pool where another thread can now use
> it.
> 
> I'm not sure how useful this optimization is. I have some experience
> with traditional client-server databases, where establishing a
> connection is a pretty expensive operation and connection pooling is
> important. How expensive is sqlite3_open? Does the answer change if the
> client code has to register a few custom collations, custom functions
> and such every time it opens a connection? Is it worth it to maintain a
> connection pool, or is it fine to just open a new connection every time
> I need one?
> 
> 
> 3. A statement is safe right after sqlite3_prepare, becomes unsafe on
> the first sqlite3_step call, and safe again after sqlite3_reset. In
> other words, a statement can be tranferred between threads as long as it
> does not touch actual data.
>     A connection is safe as long as there are no open transactions or
> unsafe queries. As soon as a transaction opens or one statement is being
> stepped through, all activity should happen on the same thread. Once the
> activity stops (but there may still be freshly prepared or reset
> statements), the connection is safe again and can be transferred to a
> new thread.
> 
> This assumption allows creating a pool of objects that encapsulate an
> open connection together with a bunch of prepared statements - poor
> man's stored procedures if you will. I believe this may prove useful in
> some situations. E.g. imagine a system that receives a stream of records
> over the network and needs to insert them into the database. It
> maintains a thread pool where worker can execute a job consiting of
> inserting a single record. There are only a few different kinds of
> insert statements (one for each table). It would help if a worker thread
> can grab a connection and use a pre-compiled statement to execute its
> job. Saves some time preparing the same queries over and over.
> 
> 
> So, which assumption is correct? It appears to me that all three are
> compatible with "linux thread-unsafe locks" issue, but I'd like to
> receive confirmation.  And of course, there's a big chance I'm missing
> something obvious.
> 
> 
> Also, how does sqlite3_interrupt fit into the picture? It's clearly only
> useful if one can call it on a thread different from the one that's busy
> executing a statement on the connection. Can it be assumed that
> sqlite3_interrupt can be called from any thread at any time?
> 
> 
> A somewhat unrelated note: I think it would be useful to introduce a
> function that clones an existing connection - that is, opens a new
> connection to the same database as the existing one, and registers all
> the same custom collations, custom functions, an authorizer, maybe a
> busy handler, maybe a set of ATTACHed databases and so on. Even nicer
> would be an ability to clone a prepared statement so that the copy is
> associated with a different connection to the same database.
> 
> Igor Tandetnik
> 
> 


-- 
Cory Nelson
http://www.int64.org

Re: [sqlite] Thread safety guarantees

Reply via email to