I'm doing some stress testing of an app that uses sqlite-amalgamation-3140100 compiled under gcc 4.8.5 (x86_64-posix-seh-rev0, Built by MinGW-W64 project) on Windows 7 x64 with the compile options:
#define SQLITE_MAX_MMAP_SIZE 0 #define SQLITE_OMIT_AUTORESET 1 #define SQLITE_OMIT_DEPRECATED 1 #define HAVE_MALLOC_USABLE_SIZE 1 #define HAVE_USLEEP 1 #define SQLITE_ENABLE_API_ARMOR 1 #define SQLITE_DEBUG 1 #define SQLITE_MEMDEBUG 1 SQLite is initialized with: sqlite3_config(SQLITE_CONFIG_MULTITHREAD) sqlite3_config(SQLITE_CONFIG_URI, 1) sqlite3_initialize() sqlite3_enable_shared_cache(1) and then a WAL database is created with: sqlite3_open("wal.db?cache=private", &db) sqlite3_busy_timeout(db, 0x70000000) sqlite3_exec(db, "PRAGMA page_size = 4096;", NULL, NULL, NULL) sqlite3_exec(db, "PRAGMA journal_mode = WAL;", NULL, NULL, NULL) sqlite3_exec(db, "PRAGMA PRAGMA wal_autocheckpoint = 0;", NULL, NULL, NULL) The app has multiple threads reading and writing the database using prepared statements, with each thread using its own sqlite3_stmt handles. The app also has a worker thread that sleeps for about a second and then calls either sqlite3_wal_checkpoint_v2(db, NULL, SQLITE_CHECKPOINT_PASSIVE, NULL, NULL) or sqlite3_wal_checkpoint_v2(db, NULL, SQLITE_CHECKPOINT_TRUNCATE, NULL, NULL), with about 10 seconds between the TRUNCATE checkpoints. Before writing or calling SQLITE_CHECKPOINT_TRUNCATE, a thread acquires an application-level write mutex that was put in to avoid having to handle LOCKED and BUSY errors. However, when reading the database, or when calling SQLITE_CHECKPOINT_PASSIVE, the application-level write mutex is not acquired, so reading the database and executing SQLITE_CHECKPOINT_PASSIVE can both happen concurrently with one thread that is writing to the database. When load testing, the app hammers the database for about 10 seconds and then, when executing sqlite3_step for a "begin exclusive;" statement on the WAL database, it hits the assertion on line 58000 of sqlite.c which reads: assert( p->sharable==0 || p->locked==0 || sqlite3_mutex_held(p->pBt->mutex) ); At the time it triggers this assertion, there is only one thread that has acquired the application-level write mutex and is attempting to execute "begin exclusive;"--any other writer thread has by this point executed "commit;" and released the application-level write mutex--and the worker thread that calls sqlite3_wal_checkpoint_v2 is at the time sleeping, not executing sqlite3_wal_checkpoint_v2. If I turn off SQLITE_DEBUG, then instead of the assertion, I eventually get an SQLITE_CORRUPT error. If I change my worker thread to acquire the application-level write mutex before calling SQLITE_CHECKPOINT_PASSIVE, then the assertion and SQLITE_CORRUPT errors disappear and the app runs fine. The effect of this change is the the SQLITE_CHECKPOINT_PASSIVE will never attempt to run at the same time another thread is inside a "begin exclusive;" block. Note however that would seem to defeat the purpose of SQLITE_CHECKPOINT_PASSIVE, which should be able to run even while another thread is writing to the WAL database. I was clued in to try this because under earlier load testing when SQLITE_DEBUG was not enabled, the call to SQLITE_CHECKPOINT_PASSIVE would very intermittently return SQLITE_LOCKED, even though that is not a documented return value for SQLITE_CHECKPOINT_PASSIVE. It was that anomalous behavior that led me to try putting SQLITE_CHECKPOINT_PASSIVE inside the application-level write mutex to see what would happen. Any thoughts on what the problem might be? Thanks much, Allen _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users