[Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: recovery_fs: change init routine to create new dir hierarchy

GerritHub Wed, 15 Nov 2017 17:33:04 -0800

From Jeff Layton <jlay...@redhat.com>:

Jeff Layton has uploaded this change for review. ( https://review.gerrithub.io/387709 )

Change subject: recovery_fs: change init routine to create new dir hierarchy
......................................................................

recovery_fs: change init routine to create new dir hierarchy

The current recovery scheme is fiddly and doesn't tolerate faults well.
If you restart ganesha while it's in its grace period, you can lose
recovery entries. It's also susceptible to crashes that occur when
trying to load the entries from disk. It opens two different databases
and copies entries between them. That can fail midstream, and we end up
with a mess.

The key to fixing this is recognizing that no client can set new
state until the grace period is lifted. There is no benefit to ensuring
the durability of new client records that did not exist in the previous
database until then. The pre-crash database is always authoritative
until that point.

Rework the code around this principle:

When we start ganesha, create a new empty temporary directory with a
unique name to record incoming clients. We'll always create new client
records in this directory, both for brand new clients and those
performing reclaim.
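
A minimal sketch of this startup step, assuming a POSIX system. The helper
name create_new_recovery_dir() and the "recov.XXXXXX" name template are
hypothetical and not taken from the patch; mkdtemp() is just one way to get
a unique, freshly created directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: create a new, empty, uniquely named recovery
 * directory under `root`. On success the full path is left in `out` and
 * 0 is returned; on failure -1 is returned with errno set.
 */
static int create_new_recovery_dir(const char *root, char *out, size_t outlen)
{
	assert(root != NULL && out != NULL);

	/* Build the template; the trailing XXXXXX is required by mkdtemp(). */
	int n = snprintf(out, outlen, "%s/recov.XXXXXX", root);

	if (n < 0 || (size_t)n >= outlen) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* mkdtemp() fills in the suffix and creates the directory (mode 0700). */
	if (mkdtemp(out) == NULL)
		return -1;

	return 0;
}
```

Because the directory is new on every start, a crash during the grace
period only leaves an unused directory behind; the previous database is
not modified.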

Reclaim of previous state, however, must be gated on the previous recovery
db, so we read those records into memory at startup. We do _not_,
however, preemptively create entries for these records in the new
recovery dir. That happens naturally as the clients connect to
reclaim their state.
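
The gating could be sketched roughly as follows. This is an illustration
only: it assumes one directory entry per client in the old db, which is a
simplification of the real on-disk layout, and struct old_clients,
load_old_clients() and may_reclaim() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory copy of the previous recovery db. */
struct old_clients {
	char **names;
	size_t count;
};

static void free_old_clients(struct old_clients *oc)
{
	for (size_t i = 0; i < oc->count; i++)
		free(oc->names[i]);
	free(oc->names);
	oc->names = NULL;
	oc->count = 0;
}

/*
 * Read every entry name in old_dir into memory. A missing directory is
 * treated as an empty db (nobody may reclaim). Nothing is written to
 * disk here. Returns 0 on success, -1 on error.
 */
static int load_old_clients(const char *old_dir, struct old_clients *oc)
{
	assert(old_dir != NULL && oc != NULL);
	oc->names = NULL;
	oc->count = 0;

	DIR *d = opendir(old_dir);
	if (d == NULL)
		return errno == ENOENT ? 0 : -1;

	struct dirent *de;
	while ((de = readdir(d)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;

		char **grown = realloc(oc->names,
				       (oc->count + 1) * sizeof(*grown));
		if (grown != NULL)
			oc->names = grown;
		char *copy = strdup(de->d_name);

		if (grown == NULL || copy == NULL) {
			free(copy);
			closedir(d);
			free_old_clients(oc);
			return -1;
		}
		oc->names[oc->count++] = copy;
	}
	closedir(d);
	return 0;
}

/* A client may reclaim only if it was recorded in the previous db. */
static bool may_reclaim(const struct old_clients *oc, const char *name)
{
	for (size_t i = 0; i < oc->count; i++)
		if (strcmp(oc->names[i], name) == 0)
			return true;
	return false;
}
```

The new recovery dir is deliberately absent from this sketch: the loader
only reads, and a record in the new dir would be created later, when a
client that passes may_reclaim() actually connects.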

Once we're ready to lift the grace period, we must atomically switch to
the new database from the old, so that reclaim after a subsequent crash
will use the new db. This must be done before handing out new state.

This is tricky as the recovery_fs driver uses a directory here,
and we can't rename one directory on top of another non-empty one.
What we do instead is create a symlink that points to the new recovery
database and then rename that symlink on top of the old one (if any).
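
The symlink swap can be sketched like this, assuming POSIX symlink() and
rename() semantics. The function name switch_recovery_link() and the link
names "current" and "current.tmp" are hypothetical and chosen only for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: make <root>/current point at `newdir`.
 * rename() replaces an existing symlink atomically, so a reader sees
 * either the old target or the new one, never a missing link.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int switch_recovery_link(const char *root, const char *newdir)
{
	char link_path[4096];
	char tmp_path[4096];
	int n;

	assert(root != NULL && newdir != NULL);

	n = snprintf(link_path, sizeof(link_path), "%s/current", root);
	if (n < 0 || (size_t)n >= sizeof(link_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	n = snprintf(tmp_path, sizeof(tmp_path), "%s/current.tmp", root);
	if (n < 0 || (size_t)n >= sizeof(tmp_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* Clear a temporary link left behind by an earlier crash. */
	if (unlink(tmp_path) < 0 && errno != ENOENT)
		return -1;

	/* Create the new link under a temporary name ... */
	if (symlink(newdir, tmp_path) < 0)
		return -1;

	/* ... then atomically move it over the old link, if any. */
	if (rename(tmp_path, link_path) < 0) {
		int saved = errno;

		unlink(tmp_path);
		errno = saved;
		return -1;
	}
	return 0;
}
```

The sketch covers only atomicity of the switch. An implementation that
also needs the switch to survive power loss would likely have to fsync
the parent directory before handing out new state.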

This scheme naturally handles the case where the server crashes within its
grace period. That can happen an arbitrary number of times and we can
still allow clients to reclaim until we reach the end of a grace period.

For now, we do not handle the takeover case. That is doable, but will
take a bit more work. This patch also doesn't handle the transition from
the old directory handling scheme.

Change-Id: I886b310aa74cff356dd30ff44e8a63941ed579bd
Signed-off-by: Jeff Layton <jlay...@redhat.com>
---
M src/SAL/recovery/recovery_fs.c
1 file changed, 62 insertions(+), 183 deletions(-)



  git pull ssh://review.gerrithub.io:29418/ffilz/nfs-ganesha refs/changes/09/387709/1
-- 
To view, visit https://review.gerrithub.io/387709
To unsubscribe, visit https://review.gerrithub.io/settings

Gerrit-Project: ffilz/nfs-ganesha
Gerrit-Branch: next
Gerrit-MessageType: newchange
Gerrit-Change-Id: I886b310aa74cff356dd30ff44e8a63941ed579bd
Gerrit-Change-Number: 387709
Gerrit-PatchSet: 1
Gerrit-Owner: Jeff Layton <jlay...@redhat.com>

_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel