Would it be safer to keep a low and a high watermark for the
update_seq in memory? What I mean is that the db writer will never
write out an update_seq that is more than N higher than the last
committed one; if a write would force it to do so, it first fsyncs and
resets high_seq relative to the new last_committed_seq. This way you can genuinely
ensure that you don't reuse an update_seq. In practice we could allow
a large delta, one that is larger than the number of fsyncs we expect
to manage in the commit interval.
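
A rough sketch of what I have in mind, as a toy Python model rather than
anything resembling the real db writer. The names (SeqWriter, max_delta,
recover) are made up for illustration, fsync() is a stand-in for writing
and syncing the header, and I am assuming that an unclean restart resumes
from last_committed_seq + N and commits that value before accepting writes:

```python
class SeqWriter:
    """Toy model: never hands out an update_seq more than max_delta
    past the last committed (fsync'ed) one."""

    def __init__(self, committed_seq=0, max_delta=1000):
        self.max_delta = max_delta
        self.committed_seq = committed_seq  # low watermark: known to be on disk
        self.update_seq = committed_seq     # last update_seq handed out
        self.forced_fsyncs = 0

    def fsync(self):
        # Stand-in for writing the header and syncing it to disk.
        self.committed_seq = self.update_seq

    def write(self):
        # High watermark is committed_seq + max_delta; commit before crossing it.
        if self.update_seq + 1 > self.committed_seq + self.max_delta:
            self.fsync()
            self.forced_fsyncs += 1
        self.update_seq += 1
        return self.update_seq

    @classmethod
    def recover(cls, committed_seq, max_delta=1000):
        # Assumption: after an unclean shutdown, skip past every seq that
        # could have been handed out, and treat the bumped value as committed
        # before any new write is accepted.
        return cls(committed_seq + max_delta, max_delta)
```

Since no update_seq above committed_seq + max_delta is ever handed out,
resuming from that value cannot collide with anything a replicator saw
before the crash.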

Your idea to just bump the update_seq "significantly" mostly pans out
(I know a system that does precisely this) but it would be a data loss
scenario when it doesn't pan out.

B.

On Mon, Apr 12, 2010 at 3:54 AM, Adam Kocoloski
<[email protected]> wrote:
> Currently a DB update_seq can be reused if there's a power failure before the 
> header is sync'ed to disk.  This adds some extra complexity and overhead to 
> the replicator, which must confirm before saving a checkpoint that the source 
> update_seq it is recording will not be reused later.  It does this by issuing 
> an ensure_full_commit call to the source DB, which may be a pretty expensive 
> operation if the source has a constant write load.
>
> Should we try to fix that?  One way to do so would be to start at a 
> significantly higher update_seq than the committed one whenever the DB is 
> opened after an "unclean" shutdown; that is, one where the DB header is not 
> the last term stored in the file.  Although, I suppose that's not an ironclad 
> test for data loss -- it might be the case that none of the lost updates were 
> written to the file.  I suppose we could "bump" the update_seq on every 
> startup.
>
> Adam
>
>
