Edward Ned Harvey wrote on Wed, Nov 10, 2010 at 00:28:48 -0500:
> > From: Daniel Shahaf [mailto:d...@daniel.shahaf.name]
> > 
> > Can you compare the contents of /path/to/file/foo/bar between the master
> > and mirror, as of the last revision successfully synced to the mirror?
> 
> The latest rev which synced without reporting any error was 5045.  It was
> trying to go from 5045 to 5046 when it triggered the checksum failure.
> 
> I checked the history of the file in question, and it was changed in ~200
> different revs.  But the revs of interest are:  in 4390, it synced to the
> slave without reporting any error, however, from 4390 onward, if I checkout
> from the slave and master, the two files differ.  And the next rev where
> this file was changed was 5046, which is when svnsync notices the checksum
> mismatch, and dies.
> 

Okay.
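
To pin down where the two copies first diverge, one could walk the revisions
that touched the file and compare checksums of `svn cat` output from both
repositories. A rough sketch, assuming both repositories are reachable by URL;
the helper names and the URLs in the usage note are hypothetical, not part of
Subversion:

```python
import hashlib
import subprocess


def svn_cat_cmd(repo_url, path, rev):
    """Build the `svn cat` command for one file at one revision.

    The revision is used both as operative (-r) and peg (@REV) revision,
    so the lookup does not depend on the path still existing at HEAD.
    """
    target = f"{repo_url.rstrip('/')}/{path.lstrip('/')}@{rev}"
    return ["svn", "cat", "-r", str(rev), target]


def run_svn(cmd):
    """Run the command and return its stdout as bytes (needs `svn` on PATH)."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout


def first_divergent_rev(master_url, mirror_url, path, revs, fetch=run_svn):
    """Return the first revision in `revs` where the file differs, else None."""
    for rev in revs:
        master_sum = hashlib.md5(fetch(svn_cat_cmd(master_url, path, rev))).hexdigest()
        mirror_sum = hashlib.md5(fetch(svn_cat_cmd(mirror_url, path, rev))).hexdigest()
        if master_sum != mirror_sum:
            return rev
    return None
```

For example, `first_divergent_rev("file:///srv/master", "file:///srv/mirror",
"path/to/file/foo/bar", [4389, 4390, 5045])` would be expected to report 4390
if the mirror's copy went bad in that revision.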

> It would seem, all of this behavior could be explained by a simple
> undetected hardware error.  During sync of 4390, the slave wrote some bits
> to disk, which got written wrongly.  It is known that disks will do this
> rarely.  This is one of the huge arguments in favor of ZFS and BTRFS and
> filesystem checksumming in general.  Such filesystems detect and correct
> data corruption which would have otherwise passed silently...  Which seems
> to be what happened in my case.
> 

Yes, the question is whether the failures reported in this thread are
just a bunch of hardware errors, or something deeper.

> All servers and clients are running 1.6.12.  However, at the time when 4390
> was committed...  The master was 1.6.12, but the slave was probably 1.5.7
> 
> 
> > If you create a fresh mirror and svnsync it, from r0 to that revision,
> > does the file /path/to/file/foo/bar in the fresh mirror differ from the
> > one in the master?
> 
> No problems.  Although ... I didn't let it sync from rev 0.  (That would be
> impossibly time consuming...  weeks....)  I did as mentioned before.
> Transferred a backup of the master to the slave, and used it as the "seed"
> for the sync, so I only needed to sync the last 100 revs or something like
> that...
> 

That would mean that the "last changed revision" --- r4390 --- is
contained in the seed and wasn't re-svnsync'd.  If we suspect that
svnsync committed a bogus r4390 to the slave, we'd better start with
a slave that /doesn't/ already have a known-good r4390...

Of course, you can take that backup and use it to produce a repository
whose youngest revision is earlier than r4390.
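
One way to do that is to dump revisions 0 through 4389 from the backup and
load them into a fresh repository. A minimal sketch, assuming `svnadmin` is on
PATH; the function names and any paths passed in are hypothetical. It only
truncates the history and does not set up the new repository as an svnsync
mirror:

```python
import subprocess


def truncate_cmds(backup_repo, new_repo, last_rev):
    """Build the svnadmin commands that copy r0..last_rev into a new repository."""
    return (
        ["svnadmin", "create", new_repo],
        ["svnadmin", "dump", "--quiet", "-r", f"0:{last_rev}", backup_repo],
        ["svnadmin", "load", "--quiet", new_repo],
    )


def truncate_repo(backup_repo, new_repo, last_rev):
    """Create new_repo and fill it with r0..last_rev taken from backup_repo."""
    create, dump, load = truncate_cmds(backup_repo, new_repo, last_rev)
    subprocess.run(create, check=True)
    # Pipe the dump stream straight into the loader, like `dump | load`.
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    subprocess.run(load, stdin=dumper.stdout, check=True)
    dumper.stdout.close()
    if dumper.wait() != 0:
        raise subprocess.CalledProcessError(dumper.returncode, dump)
```

Calling `truncate_repo("/backup/repo", "/tmp/seed", 4389)` should leave a
repository whose youngest revision is r4389, so that r4390 would have to be
svnsync'd again.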

