Hi,

I have had these patches lying around for a long time, and it's time to
bring them up. The series makes three changes to dlm recovery handling:

1. The dlm_lsop_recover_prep() callback should be called once after the
   lockspace is stopped, and not again if the lockspace is already
   stopped while recovery is running. This changes the possible
   sequence:

     dlm_lsop_recover_prep()
     ...
     dlm_lsop_recover_prep()
     dlm_lsop_recover_done()

   so that only one prep call is possible:

     dlm_lsop_recover_prep()
     dlm_lsop_recover_done()

2. When a lockspace is created, new_lockspace() waits until the members
   have been pinged successfully and then returns to the caller.
   However, recovery might still be running at that point. Almost all
   dlm users work around this by waiting for the
   dlm_lsop_recover_done() callback to know that the lockspace can be
   used. This series lets new_lockspace() wait until recovery is done.
   This should be backwards compatible with the existing dlm users;
   they can drop their own handling if they want.

3. There are two ways recovery can be triggered. Either somebody called
   new_lockspace(), in which case a waiter waits until recovery is
   done, or it is a completely asynchronous process, e.g. nodes joining
   or leaving the lockspace. In the async case no caller waits for dlm
   recovery to finish, so there is no error handling that reacts to
   possible recovery errors. This patch series introduces a "best
   effort" approach that simply retries the recovery (with schedule())
   on error and hopes the error gets resolved. If it is not resolved
   within 5 retries, panic() will fence the node.

- Alex

Alexander Aring (7):
  fs: dlm: add notes for recovery and membership handling
  fs: dlm: call dlm_lsop_recover_prep once
  fs: dlm: let new_lockspace() wait until recovery
  fs: dlm: handle recovery result outside of ls_recover
  fs: dlm: handle recovery -EAGAIN case as retry
  fs: dlm: change -EINVAL recovery error to -EAGAIN
  fs: dlm: add WARN_ON for non waiter case

 fs/dlm/dlm_internal.h |  4 +--
 fs/dlm/lock.c         |  5 +++-
 fs/dlm/lockspace.c    |  9 ++++---
 fs/dlm/member.c       | 30 +++++++++++-----------
 fs/dlm/recoverd.c     | 60 ++++++++++++++++++++++++++++++++++++++++---
 5 files changed, 82 insertions(+), 26 deletions(-)

-- 
2.31.1
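
For illustration only, the retry handling described in item 3 can be
sketched in userspace C. This is not the code from the patches:
recover_with_retry(), RECOVERY_MAX_RETRIES, and the fake_recovery helper
are hypothetical names, and counting one initial attempt plus up to 5
retries is an assumption about what "5 retries" means. Where the sketch
returns -EAGAIN after the last retry, the series would call panic()
to fence the node.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed limit, taken from the "5 retries" in the cover letter. */
#define RECOVERY_MAX_RETRIES 5

/* Hypothetical stand-in for one recovery attempt: 0 on success,
 * a negative errno value on failure. */
typedef int (*recover_fn)(void *ctx);

/*
 * Run one recovery attempt and retry it while it fails with -EAGAIN.
 * Returns 0 on success, any other error immediately, and -EAGAIN once
 * the retries are used up (the point where the series would panic()).
 */
int recover_with_retry(recover_fn attempt, void *ctx)
{
	int retries = 0;

	assert(attempt != NULL);

	for (;;) {
		int rv = attempt(ctx);

		if (rv != -EAGAIN)
			return rv;	/* success or non-retryable error */

		if (retries == RECOVERY_MAX_RETRIES)
			return -EAGAIN;	/* retries exhausted */

		retries++;
		/* The kernel code would schedule() here before retrying. */
	}
}

/* Test helper: fails with -EAGAIN a fixed number of times, then succeeds. */
struct fake_recovery {
	int failures_left;
	int calls;
};

int fake_attempt(void *ctx)
{
	struct fake_recovery *f = ctx;

	f->calls++;
	if (f->failures_left > 0) {
		f->failures_left--;
		return -EAGAIN;
	}
	return 0;
}
```

With a fake_recovery that fails twice, recover_with_retry() returns 0
after three calls; with one that never stops failing, it gives up with
-EAGAIN after six calls (one attempt plus five retries).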