> On Tue, 27 May 2014 06:40:08 -0400, Richard Hipp <d...@sqlite.org> said:

> Nothing has changed with SQLITE_PROTOCOL since 3.7.6.

> In big bold letters at the top of the ticket entry form, it says:  "Tickets
> for issues that have not been previously discussed on the mailing list will
> usually be deleted without comment."

> If you are getting PROTOCOL errors, then the thing to do is to describe the
> circumstances under which the errors occur so that we can begin to try to
> understand what is going wrong.

We occasionally see PROTOCOL errors in one of our applications, a somewhat
atypical caching proxy that uses SQLite (3.7.7.1) to index its cached data.
I will try to describe the most pertinent details.  Apologies for the very
long email.

The application runs on dedicated mid to high-end Linux (modified RHEL5)
servers with several (dozen) disks.  There is an SQLite cache-index
database per disk, on an xfs filesystem, accessed concurrently by a writer
thread and several (usually 5) reader threads, all with their own database
handles.  Write requests are dispatched to the writer thread, read requests
are sent to the readers in round-robin fashion.

All databases are configured with 16K pages and WAL journalling.  We have
our own busy handler, which increments some counters (not involving
locking) but is otherwise identical to sqliteDefaultBusyCallback.  The busy
timeout is normally set to 1000ms.

We also have our own WAL commit hook, whose behaviour is different to
SQLite's.  When the number of WAL pages is less than 1000, the callback
returns immediately.  Between 1000 and 1500 pages, it tries a PASSIVE
checkpoint.  Above 1500 pages it tries a RESTART checkpoint.  Any
SQLITE_BUSY errors returned by a RESTART checkpoint attempt are ignored up
to a yet higher threshold of uncheckpointed pages.

Because the disks containing the cache-index databases may be physically
ejected at any time, and because SQLite's -shm WAL index is mmapped, and,
further, because our application is not prepared to handle SIGBUS, we
arrange for the -shm files to be stored outside the filesystems that
contain the database files themselves on non-removable storage.


Most PROTOCOL errors we see occur in the lab, under stress tests, on
hardware that tends to have slow I/O, large disks and relatively little
RAM, with the application taking several dozen major page faults per
second.  There are long I/O queues and lots of churn in the page cache.
The PROTOCOL errors occur very infrequently, perhaps ~15 times in several
million successfully executed statements.

The query that tends to fail with SQLITE_PROTOCOL most often is a SELECT
that implements the main cache look-up -- unsurprisingly, the query that
gets run most often.  With SQLite's global logging enabled, whenever we
have observed this error, the log message has looked like so:

  sqlite message (15): statement aborts at 29: [SELECT a1, a2, a3, a4, a5,
  a6, a7, a8 FROM table_a WHERE (a3=? AND a2=?) ORDER BY a1 ASC] locking 
protocol

The statement always aborts at the same place (probably not surprising, see
the explain output below).  Less often, a PROTOCOL error occurs while
executing one of the write queries.  Here is an example log message for the
query that stores data:

  sqlite message (15): statement aborts at 64: [INSERT OR REPLACE INTO
  table_b (b1, b2, b3, b4, b5, b6) VALUES (?,?,?,?,?,?)] locking protocol

The same comment applies to this query w.r.t. where it aborts.


A particularly vexing aspect of this problem is that it occasionally causes
our application to enter a state in which, for one or more databases,
RESTART checkpoints always fail with SQLITE_BUSY and the WAL keeps growing.
Whenever this has come about, our logs have shown that a read and a write
query, usually the two mentioned above, have previously failed with
SQLITE_PROTOCOL at about the same time (well, in the same second).  (We can
tell that both queries executed on the same db from some additional error
logging that I have omitted).

Because this is highly correlated to the PROTOCOL errors and because at
this point we are fairly certain that we always reset statements, roll
transactions back and so on when we handle errors, there is the remote
possibility that there is a SQLite bug hiding in this area.  This has
prompted us to try and tweak our application's behaviour to stop the
PROTOCOL errors from occurring.

So far it seems that PROTOCOL errors are caused by high concurrency,
specifically by contention for the WAL reader locks.  We can make the
errors go away by decreasing the number of readers to (e.g.) 2 -- but this
may have some still-unquantified performance impact on our application.

We can also make them go away by rebuilding SQLite with a bigger
(e.g. doubled) SQLITE_SHM_NLOCK and keep the same number of threads.  In
either case we don't yet consider our tests conclusive -- and in any case
our understanding of the code in wal.c is highly incomplete to say the
least :-)  But we'd very much like to get to the bottom of this.


That's all there is for now.  I'm sure there are important bits I've left
out.  Please feel free to ask.


Explain output for the queries mentioned above.

explain SELECT a1, a2, a3, a4, a5, a6, a7, a8
FROM table_a WHERE (a3=? AND a2=?) ORDER BY a1 ASC

  addr  opcode         p1    p2    p3    p4             p5  comment
  ----  -------------  ----  ----  ----  -------------  --  -------------
  0     Trace          0     0     0                    00
  1     Noop           0     0     0                    00
  2     Variable       1     1     0                    00
  3     Variable       2     2     0                    00
  4     Goto           0     29    0                    00
  5     OpenRead       0     10    0     9              00
  6     OpenRead       2     11    0     keyinfo(3,BINARY,BINARY)  00
  7     SCopy          1     3     0                    00
  8     IsNull         3     26    0                    00
  9     SCopy          2     4     0                    00
  10    IsNull         4     26    0                    00
  11    Affinity       3     2     0     aa             00
  12    SeekGe         2     26    3     2              00
  13    IdxGE          2     26    3     2              01
  14    IdxRowid       2     5     0                    00
  15    Seek           0     5     0                    00
  16    IdxRowid       2     6     0                    00
  17    Column         2     1     7                    00
  18    Column         2     0     8                    00
  19    Column         0     3     9                    00
  20    Column         0     4     10                   00
  21    Column         0     5     11                   00
  22    Column         0     7     12                   00
  23    Column         0     8     13                   00
  24    ResultRow      6     8     0                    00
  25    Next           2     13    0                    00
  26    Close          0     0     0                    00
  27    Close          2     0     0                    00
  28    Halt           0     0     0                    00
  29    Transaction    0     0     0                    00
  30    VerifyCookie   0     10    0                    00
  31    TableLock      0     10    0     table_a        00
  32    Goto           0     5     0                    00

explain INSERT OR REPLACE INTO table_b (b1, b2, b3, b4, b5, b6)
VALUES (?,?,?,?,?,?)

  addr  opcode         p1    p2    p3    p4             p5  comment
  ----  -------------  ----  ----  ----  -------------  --  -------------
  0     Trace          0     0     0                    00
  1     Goto           0     64    0                    00
  2     OpenWrite      0     6     0     6              00
  3     OpenWrite      1     9     0     keyinfo(1,BINARY)  00
  4     OpenWrite      2     8     0     keyinfo(2,BINARY,BINARY)  00
  5     OpenWrite      3     7     0     keyinfo(2,BINARY,BINARY)  00
  6     NewRowid       0     4     0                    00
  7     Variable       1     5     0                    00
  8     Variable       2     6     0                    00
  9     Variable       3     7     0                    00
  10    Variable       4     8     0                    00
  11    Variable       5     9     0                    00
  12    Variable       6     10    0                    00
  13    SCopy          10    11    0                    00
  14    SCopy          4     12    0                    00
  15    MakeRecord     11    2     1     bb             00
  16    SCopy          4     13    0                    00
  17    IsUnique       1     31    13    11             00
  18    NotExists      0     31    13                   00
  19    Rowid          0     12    0                    00
  20    Column         0     5     11                   00
  21    IdxDelete      1     11    2                    00
  22    Rowid          0     16    0                    00
  23    Column         0     0     14                   00
  24    Column         0     3     15                   00
  25    IdxDelete      2     14    3                    00
  26    Rowid          0     16    0                    00
  27    Column         0     0     14                   00
  28    Column         0     1     15                   00
  29    IdxDelete      3     14    3                    00
  30    Delete         0     0     0                    00
  31    SCopy          5     14    0                    00
  32    SCopy          8     15    0                    00
  33    SCopy          4     16    0                    00
  34    MakeRecord     14    3     2     ddb            00
  35    SCopy          5     14    0                    00
  36    SCopy          6     15    0                    00
  37    SCopy          4     16    0                    00
  38    MakeRecord     14    3     3     ddb            00
  39    SCopy          4     13    0                    00
  40    IsUnique       3     54    13    14             00
  41    NotExists      0     54    13                   00
  42    Rowid          0     15    0                    00
  43    Column         0     5     14                   00
  44    IdxDelete      1     14    2                    00
  45    Rowid          0     19    0                    00
  46    Column         0     0     17                   00
  47    Column         0     3     18                   00
  48    IdxDelete      2     17    3                    00
  49    Rowid          0     19    0                    00
  50    Column         0     0     17                   00
  51    Column         0     1     18                   00
  52    IdxDelete      3     17    3                    00
  53    Delete         0     0     0                    00
  54    IdxInsert      3     3     0                    00
  55    IdxInsert      2     2     0                    00
  56    IdxInsert      1     1     0                    00
  57    MakeRecord     5     6     13    ddddbb         00
  58    Insert         0     13    4     table_b        0b
  59    Close          0     0     0                    00
  60    Close          1     0     0                    00
  61    Close          2     0     0                    00
  62    Close          3     0     0                    00
  63    Halt           0     0     0                    00
  64    Transaction    0     1     0                    00
  65    VerifyCookie   0     10    0                    00
  66    TableLock      0     6     1     table_b        00
  67    Goto           0     2     0                    00


Rgds,

--
Stelios.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to