On Mon, Mar 02, 2015 at 04:02:37PM +0800, Liu Yuan wrote:
> From: Liu Yuan <liuy...@cmss.chinamobile.com>
>
> One of the accelerations for recovery is that we try to recover the object
> from the local node as much as possible. The implementation is
> straightforward:
>
> 1 first, get the hash of the object to be recovered from the stale
>   directory, if any
> 2 then compare the fingerprint with the one on the remote node
> 3 if they are identical, we can safely recover the object from the local
>   stale directory.
>
> But this logic is never executed in the following case:
>
> 0 sheep tries to recover object A at epoch 5; we denote it as A.5
> 1 but sheep finds we have a local copy A.2 due to multiple node events
> 2 then sheep gets the fingerprint of A.2 and compares it with the remote
>   node
> 3 the fingerprints are identical, so this sheep tries to recover it from A.2
> 4 if, unfortunately, A.5 is also calculated onto this node, even though
>   this sheep doesn't have it, our code will first try to link A.5
> 5 A.5 is not there, and before we actually try to link A.2, sheep bails
>   out because ->link(A.5) returns an error.
>
> The fix is easy: just try ->link(A.2) before ->link(A.5).
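
For anyone following the thread, the ordering problem in the quoted description can be sketched as a small toy model. This is only an illustration and not the sheepdog code: `StaleStore`, `recover_old_order` and `recover_new_order` are hypothetical names, and `link()` here just mimics a ->link() call that fails when the requested epoch is not in the stale directory.

```python
class StaleStore:
    """Toy stand-in for a sheep's local stale directory (hypothetical)."""

    def __init__(self, stale_copies):
        # Set of (oid, epoch) pairs present in the stale directory.
        self.stale = set(stale_copies)
        # Object ids that have been linked back into the working directory.
        self.working = set()

    def link(self, oid, epoch):
        """Mimic ->link(): fail if oid.epoch is not in the stale directory."""
        if (oid, epoch) not in self.stale:
            return False
        self.working.add(oid)
        return True


def recover_old_order(store, oid, target_epoch, verified_epoch):
    """Old order as described: link the target epoch first, bail on error."""
    if not store.link(oid, target_epoch):
        # We return here and never reach the verified local copy.
        return False
    return True


def recover_new_order(store, oid, target_epoch, verified_epoch):
    """Fixed order: link the fingerprint-verified local copy first."""
    if store.link(oid, verified_epoch):
        return True
    # Fall back to the target epoch only if the verified copy is unusable.
    return store.link(oid, target_epoch)


# The case from the description: only A.2 exists locally, target is A.5.
store = StaleStore({("A", 2)})
print(recover_old_order(store, "A", 5, 2))
print(recover_new_order(store, "A", 5, 2))
```

With only A.2 in the stale directory, the old order gives up at link(A.5), while the new order recovers from A.2 first.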

Hitoshi?...ping...

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
https://lists.wpkg.org/mailman/listinfo/sheepdog