"Isaac Raway" <[EMAIL PROTECTED]> wrote:
> 
> At any rate, anyone have experience syncing SQLite DB files?
> 

I have done this on two separate projects.

In the first case, the databases to be synced all had a fixed
set of records (a few hundred thousand rows).  New rows were
never added or deleted.  The only changes to the database were
value changes in individual columns of certain tables.  To do
synchronization in this case, I timestamped every change.  When
the two databases were brought together to synchronize, I just
chose the most recent change for each record.  (Complications
arise when the same record changes in different ways in the
two databases during the same second.  But we will gloss over
those details here.)
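
In rough outline, that merge can be done entirely in SQL.  Here
is a minimal sketch, assuming an illustrative table
t(id INTEGER PRIMARY KEY, value, last_changed) where
last_changed holds the change timestamp, and with the other
copy attached as "other" (the names are mine, for illustration
only, not my actual schema):

   ATTACH DATABASE 'remote.db' AS other;

   -- For each record, keep whichever side changed most recently.
   -- (No inserts or deletes here - the record set is fixed.)
   UPDATE t
      SET value = (SELECT o.value
                     FROM other.t AS o WHERE o.id = t.id),
          last_changed = (SELECT o.last_changed
                            FROM other.t AS o WHERE o.id = t.id)
    WHERE EXISTS (SELECT 1 FROM other.t AS o
                   WHERE o.id = t.id
                     AND o.last_changed > t.last_changed);

Run the same statement in the other direction (or just copy the
merged file back) and both copies end up identical.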

In the second case, values could be changed as in the first case,
but in addition, new records could be added.  For that project,
I set the RowID for each newly created row equal to a 64-bit
random number from a good source of randomness.  (The random()
function built into SQLite works for this - no coincidence.)
Then to synchronize, I just take the union of both databases, 
and in cases where both databases contain the same record, I 
take the most recently changed version.
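
A sketch of what that looks like in SQL, again with illustrative
names (a table t with value and last_changed columns, the other
copy attached as "other"):

   -- New rows get a random 64-bit rowid.
   INSERT INTO t(rowid, value, last_changed)
        VALUES (random(), 'some value', strftime('%s','now'));

   -- Union merge: pull in every row from the other copy that is
   -- missing here, or that has changed more recently there.
   ATTACH DATABASE 'remote.db' AS other;

   INSERT OR REPLACE INTO t(rowid, value, last_changed)
        SELECT o.rowid, o.value, o.last_changed
          FROM other.t AS o
          LEFT JOIN t ON t.rowid = o.rowid
         WHERE t.rowid IS NULL
            OR o.last_changed > t.last_changed;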

If the disconnected applications both insert a row and happen
to choose the same random RowID, you get a collision.
In my particular application, new rows were sufficiently 
infrequent and the consequences of a collision were 
sufficiently mild that using 64-bits of randomness was
sufficient for me not to have to worry about this.  If you 
are doing many, many inserts or if you really need to guarantee 
that collisions will not occur, then you need to go to 128 
or more bits of randomness for the record ids.  Of course,
there is always the chance of a collision, but if you use
good randomness and a large enough unique ID, the probability
of a collision can be made vanishingly small - much less than
the probability of errors from other sources such as, for 
example, a giant meteor striking earth and wiping out 
civilization.
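
To put rough numbers on that (my back-of-the-envelope
arithmetic using the standard birthday approximation, not
anything measured): the chance that n random 64-bit ids contain
a duplicate is about n*n/2^65.  For a million inserts that is
roughly 3 in 100 million; with 128-bit ids the same million
inserts gives odds of roughly 1 in 10^27.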

Some people are upset by the concept of computer software
that works with "high probability".  They want to be able
to prove that a failure "never" occurs.  To this idea, I have 
three rebuttals:

  (1) For all practical purposes, "one failure every
      10 billion years" is the same as "never fails".

  (2) Nothing on a computer is certain.  When you add
      1 and 1 on a computer you get 2 with high probability.
      But if you believe in quantum physics (and the people
      who designed your CPU likely do) then you will
      recognize that there is a non-zero probability of
      getting a different result.  So having a non-zero
      failure probability is not a new thing in computer
      programming - it is just something that is not
      acknowledged very often.

  (3) The chance of failure because of random id collision
      is kazillions of times smaller than the chance of failure
      because you have a logic error somewhere else in your
      code.

Returning to the topic at hand:  Deletes are still a problem.
If deleting a row leaves nothing behind, then when syncing
two databases A and B you cannot tell whether the row was
added to A or deleted from B.  To work around this, I do not
actually delete rows - I just mark them as "unused".
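
A sketch of that, continuing with the illustrative columns from
above plus an "unused" flag (again, my naming):

   -- "Delete" = set the tombstone flag; bumping the timestamp
   -- lets the deletion propagate through the same merge as any
   -- other change.
   UPDATE t
      SET unused = 1,
          last_changed = strftime('%s','now')
    WHERE rowid = :id;

   -- Normal queries simply skip tombstoned rows.
   SELECT * FROM t WHERE unused = 0;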

--
D. Richard Hipp  <[EMAIL PROTECTED]>

