Hi,
Just for your information, I've written a multi-threaded application
that accesses a SQLite database (well, four of them in fact). All
threads (many) use the same handle as retrieved from sqlite3_open. Each
thread locks the database by entering a critical section, performs a
query, fetches results in the case of a SELECT, and leaves the critical
section again. So no two threads simultaneously access the same db. For
performance reasons (both locking and indexing) I split my database
into four databases.
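In rough pseudo-code the pattern looks like this (sketched in Python with the stdlib sqlite3 module, since my actual code is Delphi; the lock plays the role of the critical section, and check_same_thread=False stands in for sharing one handle across threads):

```python
# Sketch of the pattern above: many threads share one connection handle,
# and a lock (the "critical section") serializes every query against it.
import sqlite3
import threading

db_lock = threading.Lock()
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE events (thread_name TEXT, n INTEGER)")

def worker(name, count):
    for n in range(count):
        with db_lock:                      # enter critical section
            conn.execute("INSERT INTO events VALUES (?, ?)", (name, n))
        # critical section left; another thread may now use the handle

threads = [threading.Thread(target=worker, args=(f"t{i}", 100))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with db_lock:
    total = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(total)  # -> 400 (4 threads x 100 inserts each)
```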
Initially I let each thread start and end a transaction each time.
However, again for performance reasons, I decided to use another
method. A transaction is started only at application startup. For many
minutes, I let the journal file grow, filled by many threads; then,
once in a while, the transaction is committed and restarted, so that
the journal file gets merged with the database file. Although this all
seems to work perfectly on Windows, I feel obliged to warn against this
practice if your data is actually _important_. However, in my case I
was fighting performance, and didn't really care if some data was lost
in the case of a computer crash. As long as the database itself does
not get corrupted, that is fine with me. If your data is important, be
sure to start/end transactions at the cost of (much) more disk access.
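The periodic-commit scheme boils down to something like this (again a Python sketch rather than my actual Delphi code; the commit threshold and counter are illustrative, and isolation_level=None is Python's way of getting manual BEGIN/COMMIT control):

```python
# Sketch of the periodic-commit scheme: one long-running transaction
# absorbs many writes, and only every COMMIT_EVERY operations is it
# committed and a new one begun.  Writes since the last commit are lost
# on a crash, but the database file itself stays consistent.
import sqlite3
import threading

COMMIT_EVERY = 1000              # illustrative threshold

lock = threading.Lock()
conn = sqlite3.connect(":memory:", check_same_thread=False,
                       isolation_level=None)   # manual transactions
conn.execute("CREATE TABLE log (msg TEXT)")
conn.execute("BEGIN")            # transaction opened once, up front
pending = 0

def write(msg):
    """Called by many threads; commits only once in a while."""
    global pending
    with lock:
        conn.execute("INSERT INTO log VALUES (?)", (msg,))
        pending += 1
        if pending >= COMMIT_EVERY:
            conn.execute("COMMIT")   # merge journal into the db file
            conn.execute("BEGIN")    # immediately start the next batch
            pending = 0

for i in range(2500):
    write(f"message {i}")
# 2000 rows are now committed; 500 sit in the still-open transaction
```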
So, concluding: if you have a decent thread locking mechanism, you can
safely use SQLite and do anything you like as if a single thread were
accessing the db. On Windows, that is. Development environment: Delphi
5/7/2005 + libsql.
Although not an exact answer to your questions, I hope this information
is useful.
regards,
rene
Igor Tandetnik wrote:
I'm trying to piece together the thread safety guarantees that SQLite
provides. They don't appear to be spelled out explicitly. So I'm
trying to infer the rules from the documents describing SQLite
internals, as well as the recent "file locks on linux have thread
affinity" story.
Here is my understanding of the situation - I would greatly appreciate
it if SQLite experts would confirm or deny it.
For the purposes of this discussion, every connection handle (sqlite*)
and every statement handle (sqlite_stmt*) can be in one of two states
- "safe" and "unsafe". Various API calls transfer handles between
these states. While in a safe state, a handle can be passed freely
between threads. As soon as a call puts a handle into an unsafe state,
all further calls must arrive on the same thread until the handle
becomes safe again. Various levels of thread safety are determined by
exactly what calls transition handles between states.
I could think of three thread safety levels, arranged from weakest to
strongest. My question boils down to which level describes reality
most closely.
1. All handles are always unsafe, from sqlite3_open to sqlite3_close
and from sqlite3_prepare to sqlite3_finalize. All calls referring to
a particular connection and all statements associated with it must
occur on the same thread.
This is obviously a safe assumption, but also least useful in many
practical situations.
2. A connection is safe as long as there is no activity on it - there
is no open transaction and no statements. The connection is born safe
by sqlite3_open, becomes unsafe as soon as a "begin transaction" is
executed or sqlite3_prepare is called, and becomes safe again when the
transaction (if any) is committed or rolled back and the last
statement is finalized.
A statement is always unsafe.
This assumption allows creating a connection pool used by a thread
pool. A worker thread grabs an idle connection, executes a batch of
statements on it, and returns it back to the pool where another thread
can now use it.
I'm not sure how useful this optimization is. I have some experience
with traditional client-server databases, where establishing a
connection is a pretty expensive operation and connection pooling is
important. How expensive is sqlite3_open? Does the answer change if
the client code has to register a few custom collations, custom
functions and such every time it opens a connection? Is it worth it to
maintain a connection pool, or is it fine to just open a new
connection every time I need one?
3. A statement is safe right after sqlite3_prepare, becomes unsafe on
the first sqlite3_step call, and safe again after sqlite3_reset. In
other words, a statement can be transferred between threads as long as
it does not touch actual data.
A connection is safe as long as there are no open transactions or
unsafe queries. As soon as a transaction opens or one statement is
being stepped through, all activity should happen on the same thread.
Once the activity stops (but there may still be freshly prepared or
reset statements), the connection is safe again and can be transferred
to a new thread.
This assumption allows creating a pool of objects that encapsulate an
open connection together with a bunch of prepared statements - poor
man's stored procedures if you will. I believe this may prove useful
in some situations. E.g. imagine a system that receives a stream of
records over the network and needs to insert them into the database.
It maintains a thread pool where a worker can execute a job consisting of
inserting a single record. There are only a few different kinds of
insert statements (one for each table). It would help if a worker
thread can grab a connection and use a pre-compiled statement to
execute its job. Saves some time preparing the same queries over and
over.
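The "poor man's stored procedures" object I have in mind would look roughly like this (a Python sketch; Python's sqlite3 module has no explicit prepare/reset, but its per-connection statement cache, controlled by the cached_statements argument and keyed by SQL text, approximates the compile-once/reuse behavior):

```python
# Sketch of assumption 3's pattern: a poolable object bundling one
# connection with a fixed set of per-table INSERT statements.  In the
# C API these would be sqlite3_prepare'd once and reset after each use;
# here, identical SQL text hits sqlite3's internal statement cache, so
# each statement is compiled once and merely rebound on reuse.
import sqlite3

class PreparedConnection:
    STATEMENTS = {                   # illustrative, one per table
        "users":  "INSERT INTO users  VALUES (?, ?)",
        "orders": "INSERT INTO orders VALUES (?, ?)",
    }

    def __init__(self, path):
        self.conn = sqlite3.connect(
            path, check_same_thread=False,
            cached_statements=len(self.STATEMENTS) + 8)

    def insert(self, table, row):
        self.conn.execute(self.STATEMENTS[table], row)
        self.conn.commit()           # connection returns to "safe" state

db = PreparedConnection(":memory:")
db.conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
db.conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
db.insert("users", (1, "alice"))
db.insert("orders", (1, 9.99))
```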
So, which assumption is correct? It appears to me that all three are
compatible with "linux thread-unsafe locks" issue, but I'd like to
receive confirmation. And of course, there's a big chance I'm missing
something obvious.
Also, how does sqlite3_interrupt fit into the picture? It's clearly
only useful if one can call it on a thread different from the one
that's busy executing a statement on the connection. Can it be assumed
that sqlite3_interrupt can be called from any thread at any time?
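The cross-thread usage I have in mind looks like this (sketched with Python's stdlib sqlite3 module, whose Connection.interrupt() wraps sqlite3_interrupt; whether this is officially guaranteed is exactly the question):

```python
# Sketch of cross-thread sqlite3_interrupt: one thread grinds through a
# long-running query while a second thread calls interrupt() on the same
# connection, making the query fail with an "interrupted" error.
import sqlite3
import threading
import time

conn = sqlite3.connect(":memory:", check_same_thread=False)

def interrupt_soon():
    time.sleep(0.2)
    conn.interrupt()        # wraps sqlite3_interrupt(); called from a
                            # different thread than the running query

t = threading.Thread(target=interrupt_soon)
t.start()

interrupted = False
try:
    # A deliberately expensive query: a huge recursive-CTE cross join.
    conn.execute("""
        WITH RECURSIVE n(x) AS
            (SELECT 1 UNION ALL SELECT x + 1 FROM n LIMIT 100000)
        SELECT COUNT(*) FROM n a, n b
    """).fetchone()
except sqlite3.OperationalError:
    interrupted = True      # typically "interrupted"
t.join()
print(interrupted)  # -> True
```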
A somewhat unrelated note: I think it would be useful to introduce a
function that clones an existing connection - that is, opens a new
connection to the same database as the existing one, and registers all
the same custom collations, custom functions, an authorizer, maybe a
busy handler, maybe a set of ATTACHed databases and so on. Even nicer
would be an ability to clone a prepared statement so that the copy is
associated with a different connection to the same database.
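Today the clone has to be done by hand, replaying every registration onto a fresh connection. A rough sketch of such a helper (Python stdlib sqlite3; the clone_connection name and the registry structure are hypothetical illustrations, not any existing API, and with ":memory:" each connection gets its own database, whereas in practice a shared file path would be used):

```python
# Hand-rolled version of the proposed "clone connection" helper: open a
# new connection to the same database and re-register the same custom
# functions and collations on it.
import sqlite3

def clone_connection(path, functions, collations):
    conn = sqlite3.connect(path, check_same_thread=False)
    for name, narg, func in functions:
        conn.create_function(name, narg, func)
    for name, cmp in collations:
        conn.create_collation(name, cmp)
    return conn

# Registry of everything that must be replayed onto each new connection.
functions  = [("double_it", 1, lambda v: v * 2)]
collations = [("reverse", lambda a, b: (a < b) - (a > b))]

a = clone_connection(":memory:", functions, collations)
b = clone_connection(":memory:", functions, collations)

# Both connections now expose the same custom SQL surface:
val = b.execute("SELECT double_it(21)").fetchone()[0]
print(val)  # -> 42
```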
Igor Tandetnik