Hello,
I stumbled over some quite strange case.
Here's the easiest test case that triggers the behaviour:
One process performs very long reads from a db (multiple joins, so the cartesian product is *very* large, and the reader needs a while to complete).
Another process performs a "BEGIN TRANSACTION", then executes lots of "INSERT INTO ... VALUES".
At some point, this process will end up in sqlite3pager_get, when it tries to read some page from the database file (the main file, not a temp file or a journal). It detects that the page is not in the page cache (it ends up in the 'else' branch of if( pPg==0 )'). It runs down to the block of code covered by the following comment:
/* Write the page to the database file if it is dirty. */
In this block, pager_write_pagelist( pPg ) returns with SQLITE_BUSY. As a consequence, the changes are rolled back and SQLITE_IOERR is returned.
And here seems to be the problem:
First, the database file is locked, so I don't understand why the SQLITE_BUSY value isn't propagated back to the caller. If SQLITE_BUSY would be returned, then the application could restart the command.
Seconds, sqlite3VdbeHalt decides to perform a sqlite3BtreeRollbackStmt, so only the last command should be rolled back. However, this is not what happens! In fact, all commands back to the beginning of the transaction are rolled back; the transaction, however is not closed. Doesn't this violate the default rollback behaviour (roll back last command, keep transaction open)? As a consequence, even if the application would get SQLITE_BUSY, it couldn't properly react on it.
There are other places in sqlite3pager_get where SQLITE_IOERR are returned; I've not checked whether these can also be triggered by the db being locked or if they indicate serious problem.
If someone is interested, I can send a test case for this problem.
This all was done with SQLite version 3.0.8.
Best regards
Rolf Schäuble