[sqlite] recurrent failure mode

James K. Lowden Wed, 25 Feb 2015 22:00:39 -0500

On Wed, 25 Feb 2015 16:26:45 -0800
Dave Dyer <ddyer-sqlite at real-me.net> wrote:


> >Do you have any multi-access things going on ?  Two or more
> >computers, applications, processes or threads trying to access the
> >database at the same time ?
> 
> No, but it would be normal for the database to be on a different
> computer than the sqlite client, and be using whatever networked 
> file system is common.  The culprit clients seem to be macs, we're
> still seeking more information about the specifics.

You might want to read my message on the topic from the list archives,
dated Sat, 31 Jan 2015.  

Although "bugs" are frequently blamed, in fact the semantics of
networked filesystems are different from that of local filesystems.
Making a database work on a network filesystem might be possible, but
requires considerable work to support cache coherency.  Expecting
SQLite to do that is not too different from expecting to read a letter
the moment it's dropped in the mailbox across town.  

Just consider this scenario:  two clients open the same database, where
the file is on a fileserver somewhere on the network.  Each one does,
say, "select * from T" and peruses the data.  Then, 

        client A inserts record 17:
                lock record
                insert row
                free lock

All good.  Now,

        client B inserts record 17:
                lock record
                insert row
                free lock

Should result in a primary key violation, right?  No.  

Client A has updated his *local* cache, his in-memory image of some
part of the database.  That's not a SQLite cache; that's the kernel's
filebuffer cache representing the state of some part of the filesystem.
It's completely correct from A's point of view.  SQLite depends on that
cache being correct, and it is: from A's point of view.  

If B is on the same machine as A, they share a single, kernel-provided
filebuffer cache, and when B attempt to insert the duplicate record,
SQLite will see A's record and reject the insert.  

N machines is N caches.  When A flushes his cache with sync(2), how
long before B learns of the change?  With N=1 (same machine), B
learns instantaneously.  With N > 1?  NFS promises only that B will
see A's changes after closing and reopening the file.  

When B is on a different machine, the local representation of the state
of the filesystem does not include A's update.  SQLite examines the
"file", sees no record 17, and updates its local image.  When the
kernel eventually flushes B's update, A's local cache becomes stale.  

With some "luck", you can actually go on like this for a while with no
one noticing.  As long as different clients are updating different
parts of the database and fortuitously refereshing their caches (by
re-reading updated parts that the network isn't helpfully caching,
too), it can seem to sort of work.  Failure is guaranteed in the most
important scenario: when SQLite requires a coherent cache and doesn't
have one.  

So it really doesn't matter if the locking mechanism on network
filesystems are perfect.  By *design*, they make weaker promises than
Posix.  Expecting them to do something they're documented not to do is
asking for trouble.  Which is to say, unfortunately, that you got what
you asked for.  :-/

HTH.  

--jkl

[sqlite] recurrent failure mode

Reply via email to