Hi,

On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahri...@redhat.com> wrote:
>
> This patch changes a -EINVAL error for dlm_master_lookup() to -EAGAIN.
> It is a critical error which should not happened, if it happens there
> exists an issue. However we still track those issues inside the lock but
> if they happen we try to run recovery again if those issues will get
> resolved. If not recovery has a logic to fail this node after several
> retries.
>
> Signed-off-by: Alexander Aring <aahri...@redhat.com>
> ---
>  fs/dlm/lock.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 226822f49d30..ad32a883c1fd 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1018,7 +1018,10 @@ int dlm_master_lookup(struct dlm_ls *ls, int
> from_nodeid, char *name, int len,
>                           from_nodeid, dir_nodeid, our_nodeid, hash,
>                           ls->ls_num_nodes);
>                 *r_nodeid = -1;
> -               return -EINVAL;
> +               /* this case should never occur, we try again
> +                * to hope it got resolved
> +                */
> +               return -EAGAIN;

I moved this -EAGAIN return into dlm_recover_directory(), where it is
now used when dlm_master_lookup() returns -EINVAL. The reason is that
dlm_master_lookup() is also used in non-recovery handling, whereas
dlm_recover_directory() is used in recovery handling only.

There was once an issue where dlm_recover_directory() returned -EINVAL
in recovery handling. This patch tries to resolve that by assuming it is
a temporary issue that occurs when exchanging messages or scheduling
some other tasks. Unfortunately, there was no more information on how
the issue was triggered.

- Alex
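
For illustration only, the placement described above might look roughly
like the sketch below. It is not the actual patch: master_lookup_stub(),
recover_directory_step() and normal_lookup_step() are hypothetical
stand-ins for dlm_master_lookup(), the recovery-only caller and a
non-recovery caller, and the bad_case parameter is an assumption used to
trigger the "should never occur" path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for dlm_master_lookup(): returns -EINVAL when the
 * "should never occur" case is hit, 0 otherwise.
 */
int master_lookup_stub(int bad_case)
{
	return bad_case ? -EINVAL : 0;
}

/* Hypothetical recovery-only caller: -EINVAL is assumed to be possibly
 * temporary here, so it is turned into -EAGAIN to request a retry.
 */
int recover_directory_step(int bad_case)
{
	int error = master_lookup_stub(bad_case);

	/* the stub only ever returns 0 or a negative errno value */
	assert(error <= 0);

	if (error == -EINVAL)
		error = -EAGAIN;

	return error;
}

/* Hypothetical non-recovery caller: the error is passed through unchanged. */
int normal_lookup_step(int bad_case)
{
	return master_lookup_stub(bad_case);
}
```

With this split, only the recovery path sees -EAGAIN, while non-recovery
callers still get the original -EINVAL.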