Re: Checkpointing on read only databases

Adam Kocoloski Wed, 23 Apr 2014 08:36:49 -0700

There's an algorithm in the BigCouch codebase for storing up to N checkpoints 
with exponentially increasing granularity (in terms of sequence values) between 
them. It strikes a nice balance between checkpoint document size and ability to 
resume with minimal replay.
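
As a rough illustration of the idea (this is not the BigCouch code, which is not
quoted in this thread), here is a minimal Python sketch of one way to keep at
most N checkpoints whose spacing grows exponentially with age. The function
names, the power-of-two bucketing and the plain integer sequence values are all
assumptions made for the example.

```python
def add_checkpoint(history, new_seq, max_checkpoints):
    """Return a pruned, newest-first checkpoint list that includes new_seq.

    Hypothetical scheme: older checkpoints are grouped into buckets by their
    distance from new_seq (bucket k covers distances 2**k .. 2**(k+1) - 1)
    and only the oldest checkpoint in each bucket survives.
    """
    if history and new_seq <= max(history):
        raise ValueError("new_seq must be greater than every stored checkpoint")

    oldest_per_bucket = {}
    for seq in sorted(history):                 # visit oldest first
        distance = new_seq - seq                # always >= 1 here
        bucket = distance.bit_length() - 1      # floor(log2(distance))
        oldest_per_bucket.setdefault(bucket, seq)

    kept = [new_seq] + sorted(oldest_per_bucket.values(), reverse=True)
    return kept[:max_checkpoints]               # drop the oldest beyond N


def latest_common(source_history, target_history):
    """Newest sequence recorded on both sides, or 0 to replay from the start."""
    common = set(source_history) & set(target_history)
    return max(common) if common else 0


history = []
for seq in range(1, 11):
    history = add_checkpoint(history, seq, max_checkpoints=4)
print(history)
```

With checkpoints written at sequences 1 through 10 and N = 4, the list ends up
as [10, 9, 7, 5]: dense near the head, sparse further back. If the two sides
disagree on the newest entry, latest_common falls back to an older shared
checkpoint instead of replaying everything.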


Adam

> On Apr 23, 2014, at 11:28 AM, Calvin Metcalf <[email protected]> wrote:
> 
> with the rolling hash thingy, a checkpoint document could store more than
> one database hash, e.g. the last 5 but totally up to whoever is storing
> the checkpoint.  This would cover the case where you stop the replication
> after one of the dbs has stored the checkpoint but before the other one
> has.
> 
> 
>> On Tue, Apr 15, 2014 at 9:21 PM, Dale Harvey <[email protected]> wrote:
>> 
>> ah, yeh got it now, cheers
>> 
>> 
>>> On 16 April 2014 02:17, Calvin Metcalf <[email protected]> wrote:
>>> 
>>> Your source database is up to seq 10, but the box it's on catches fire. You
>>> have a backup though, but it's at seq 8; same UUID though, but you'll miss the
>>> next 2 seqs.
>>>> On Apr 15, 2014 8:57 PM, "Dale Harvey" <[email protected]> wrote:
>>>> 
>>>> Sorry, still don't understand the problem here
>>>> 
>>>> The uuid is stored inside the database file, you either have the same data
>>>> and the same uuid, or none of them?
>>>> 
>>>> 
>>>> On 15 April 2014 19:54, Calvin Metcalf <[email protected]> wrote:
>>>> 
>>>>> I think the problem is not as much deleting and recreating a database but
>>>>> wiping a virtual machine and restoring from a backup, now you have more or
>>>>> less gone back in time with the target database and it has different stuff
>>>>> but the same uuid.
>>>>> 
>>>>> 
>>>>>> On Tue, Apr 15, 2014 at 2:32 PM, Dale Harvey <[email protected]> wrote:
>>>>> 
>>>>>> I don't understand the problem with per-db uuids, so the uuid isn't
>>>>>> multivalued nor is it queried
>>>>>> 
>>>>>>   A is read-only, B is client, B starts replication from A
>>>>>>   B reads the db uuid from A / itself, generates a replication_id, stores
>>>>>>   on B
>>>>>>   try to fetch replication checkpoint, if successful we query changes from
>>>>>>   since?
>>>>>> 
>>>>>> In pouch we store the uuid along with the data, so file-based backups aren't
>>>>>> a problem, seems couchdb could / should do that too
>>>>>> 
>>>>>> This also fixes the problem mentioned on the mailing list, and one I have
>>>>>> run into personally where people forward db requests but not server
>>>>>> requests via a proxy
>>>>>> 
>>>>>> 
>>>>>> On 15 April 2014 19:18, Calvin Metcalf <[email protected]> wrote:
>>>>>> 
>>>>>>> except there is no way to calculate that from outside the database, as
>>>>>>> changes only ever gives the most recent document version.
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, Apr 13, 2014 at 9:47 PM, Calvin Metcalf <[email protected]> wrote:
>>>>>>> 
>>>>>>>> oo didn't think of that, yeah uuids wouldn't hurt, though the more I
>>>>>>>> think about the rolling hashing on revs, the more I like that
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Sun, Apr 13, 2014 at 6:00 PM, Adam Kocoloski <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Yes, but then sysadmins have to be very very careful about restoring from
>>>>>>>>> a file-based backup. We run the risk that {uuid, seq} could be
>>>>>>>>> multi-valued, which diminishes its value considerably.
>>>>>>>>> 
>>>>>>>>> I like the UUID in general -- we've added them to our internal shard
>>>>>>>>> files at Cloudant -- but on their own they're not a bulletproof solution
>>>>>>>>> for read-only incremental replications.
>>>>>>>>> 
>>>>>>>>> Adam
>>>>>>>>> 
>>>>>>>>>> On Apr 13, 2014, at 5:16 PM, Calvin Metcalf <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> I mean if you're going to add new features to couch you could just have the
>>>>>>>>>> db generate a random uuid on creation that would be different if it was
>>>>>>>>>> deleted and recreated
>>>>>>>>>>> On Apr 13, 2014 1:59 PM, "Adam Kocoloski" <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Other thoughts:
>>>>>>>>>>> 
>>>>>>>>>>> - We could enhance the authorization system to have a role that allows
>>>>>>>>>>> updates to _local docs but nothing else. It wouldn't make sense for
>>>>>>>>>>> completely untrusted peers, but it could give peace of mind to sysadmins
>>>>>>>>>>> trying to execute replications with the minimum level of access possible.
>>>>>>>>>>> 
>>>>>>>>>>> - We could teach the sequence index to maintain a rolling hash
>>>>>>>>>>> of the {id,rev} pairs that comprise the database up to that sequence,
>>>>>>>>>>> record that in the replication checkpoint document, and check that it's
>>>>>>>>>>> unchanged on resume. It's a new API enhancement and it grows the amount of
>>>>>>>>>>> information stored with each sequence, but it completely closes off the
>>>>>>>>>>> probabilistic edge case associated with simply checking that the {id, rev}
>>>>>>>>>>> associated with the checkpointed sequence has not changed. Perhaps overkill
>>>>>>>>>>> for what is admittedly a pretty low-probability event.
>>>>>>>>>>> 
>>>>>>>>>>> Adam
>>>>>>>>>>> 
>>>>>>>>>>> On Apr 13, 2014, at 1:50 PM, Adam Kocoloski <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Yeah, this is a subtle little thing. The main reason we checkpoint on
>>>>>>>>>>>> both source and target and compare is to cover the case where the source
>>>>>>>>>>>> database is deleted and recreated in between replication attempts. If that
>>>>>>>>>>>> were to happen and the replicator just resumes blindly from the checkpoint
>>>>>>>>>>>> sequence stored on the target then the replication could permanently miss
>>>>>>>>>>>> some documents written to the new source.
>>>>>>>>>>>> 
>>>>>>>>>>>> I'd love to have a robust solution for incremental replication of
>>>>>>>>>>>> read-only databases. To first order a UUID on the source database that was
>>>>>>>>>>>> fixed at create time could do the trick, but we'll run into trouble with
>>>>>>>>>>>> file-based backup and restores. If a database file is restored to a point
>>>>>>>>>>>> before the latest replication checkpoint we'd again be in a position of
>>>>>>>>>>>> potentially permanently missing updates.
>>>>>>>>>>>> 
>>>>>>>>>>>> Calvin's suggestion of storing e.g. {seq, id, rev} instead of simply seq
>>>>>>>>>>>> as the checkpoint information would dramatically reduce the likelihood of
>>>>>>>>>>>> that type of permanent skip in the replication, but it's only a
>>>>>>>>>>>> probabilistic answer.
>>>>>>>>>>>> 
>>>>>>>>>>>> Adam
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Apr 13, 2014, at 1:31 PM, Calvin Metcalf <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Though currently we have the opposite problem, right, if we delete the
>>>>>>>>>>>>> target db? (this is me brainstorming)
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Could we store last rev in addition to last seq?
>>>>>>>>>>>>>> On Apr 13, 2014 1:15 PM, "Dale Harvey" <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If the src database was to be wiped, when we restarted replication nothing
>>>>>>>>>>>>>> would happen until the source database caught up to the previously written
>>>>>>>>>>>>>> checkpoint
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> create A, write 5 documents
>>>>>>>>>>>>>> replicate 5 documents A -> B, write checkpoint 5 on B
>>>>>>>>>>>>>> destroy A
>>>>>>>>>>>>>> write 4 documents
>>>>>>>>>>>>>> replicate A -> B, pick up checkpoint from B and to ?since=5
>>>>>>>>>>>>>> .. no documents written
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> https://github.com/pouchdb/pouchdb/blob/master/tests/test.replication.js#L771 is
>>>>>>>>>>>>>> our test that covers it
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 13 April 2014 18:02, Calvin Metcalf <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> If we were to unilaterally switch to checkpoint on target, what would
>>>>>>>>>>>>>>> happen? Replications in progress would lose their place?
>>>>>>>>>>>>>>>> On Apr 13, 2014 11:21 AM, "Dale Harvey" <[email protected]> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So with checkpointing we write the checkpoint to both A and B and verify
>>>>>>>>>>>>>>>> they match before using the checkpoint
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> What happens if the src of the replication is read only?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> As far as I can tell couch will just checkout a checkpoint_commit_error and
>>>>>>>>>>>>>>>> carry on from the start. The only improvement I can think of is the user
>>>>>>>>>>>>>>>> specifies they know the src is read only and to only use the target
>>>>>>>>>>>>>>>> checkpoint, we can 'possibly' make that happen automatically if the src
>>>>>>>>>>>>>>>> specifically fails the write due to permissions.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> -Calvin W. Metcalf
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> -Calvin W. Metcalf
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> -Calvin W. Metcalf
> 
> 
> 
> -- 
> -Calvin W. Metcalf
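
For reference, the rolling hash over {id,rev} pairs proposed in the quoted
thread, combined with the suggestion to keep the last few hashes in the
checkpoint, could look roughly like the Python sketch below. It is an
assumption-laden illustration, not an existing CouchDB or PouchDB API: the
"rolling hash" is modelled as a SHA-256 hash chain, the function names are
made up, and the hashes are assumed to be computed inside the database, where
every (seq, id, rev) entry of the sequence index is visible.

```python
import hashlib


def roll(prev_digest: bytes, doc_id: str, rev: str) -> bytes:
    """Fold one {id, rev} pair into the running digest (hypothetical format)."""
    h = hashlib.sha256()
    h.update(prev_digest)
    for field in (doc_id, rev):
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))  # length prefix avoids ambiguity
        h.update(data)
    return h.digest()


def hashes_by_seq(changes):
    """Map each seq to the hex digest of everything up to and including it.

    `changes` is an iterable of (seq, id, rev) tuples in sequence order.
    """
    digest = b"\x00" * 32
    result = {}
    for seq, doc_id, rev in changes:
        digest = roll(digest, doc_id, rev)
        result[seq] = digest.hex()
    return result


def resume_seq(stored, source_hashes):
    """Newest stored (seq, hash) pair the source still agrees with, else 0."""
    for seq, digest in sorted(stored, reverse=True):
        if source_hashes.get(seq) == digest:
            return seq
    return 0


# Source reaches seq 10; the checkpoint keeps the last 5 hashes.
original = [(n, "doc%d" % n, "1-abc%d" % n) for n in range(1, 11)]
orig_hashes = hashes_by_seq(original)
stored = [(n, orig_hashes[n]) for n in (6, 7, 8, 9, 10)]

# Source is restored from a backup at seq 8, then receives different writes.
restored = original[:8] + [(9, "other", "1-zzz"), (10, "more", "1-yyy")]
print(resume_seq(stored, hashes_by_seq(restored)))
```

In this example the replication would resume from seq 8, the last point at
which the restored source and the stored checkpoint agree. A source that was
deleted and recreated shares no hash with the checkpoint, so resume_seq returns
0 and the replication starts over.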
