Hi,

On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahri...@redhat.com> wrote:
>
> This patch changes the -EINVAL error returned by dlm_master_lookup() to
> -EAGAIN. It is a critical error which should not happen; if it does
> happen, there is an underlying issue. We still track those issues inside
> the lock, but if they occur we try to run recovery again in the hope that
> they get resolved. If not, recovery has logic to fail this node after
> several retries.
>
> Signed-off-by: Alexander Aring <aahri...@redhat.com>
> ---
>  fs/dlm/lock.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 226822f49d30..ad32a883c1fd 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1018,7 +1018,10 @@ int dlm_master_lookup(struct dlm_ls *ls, int from_nodeid, char *name, int len,
>                           from_nodeid, dir_nodeid, our_nodeid, hash,
>                           ls->ls_num_nodes);
>                 *r_nodeid = -1;
> -               return -EINVAL;
> +               /* this case should never occur, we try again
> +                * to hope it got resolved
> +                */
> +               return -EAGAIN;

I moved this -EAGAIN return into dlm_recover_directory(), which now
returns -EAGAIN if dlm_master_lookup() returns -EINVAL.
dlm_master_lookup() is also used in non-recovery handling, whereas
dlm_recover_directory() is used in recovery handling only. There was
once an issue where dlm_recover_directory() returned -EINVAL in recovery
handling, and this patch tries to work around it by assuming it is a
temporary issue caused by exchanging messages or scheduling some other
tasks. Unfortunately, there is no further information on how this issue
was triggered.
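
To illustrate the idea, here is a minimal standalone sketch, not the
actual fs/dlm code. Both function names and the simulate_bad_state
parameter are hypothetical stand-ins: the lookup keeps returning -EINVAL
for every caller, and only the recovery-side caller translates it to
-EAGAIN so that recovery is retried.

```c
#include <errno.h>

/* Hypothetical stand-in for dlm_master_lookup(): the "should never
 * occur" case still returns -EINVAL, for recovery and non-recovery
 * callers alike.
 */
int master_lookup_stub(int simulate_bad_state)
{
	return simulate_bad_state ? -EINVAL : 0;
}

/* Hypothetical stand-in for the recovery-only caller
 * (dlm_recover_directory() in this mail). Only here is -EINVAL
 * translated to -EAGAIN, on the assumption that the problem is
 * temporary and a recovery retry may resolve it.
 */
int recover_directory_stub(int simulate_bad_state)
{
	int error = master_lookup_stub(simulate_bad_state);

	if (error == -EINVAL)
		error = -EAGAIN;

	return error;
}
```

Keeping the translation in the recovery path means non-recovery callers
of the lookup still see -EINVAL, as before.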

- Alex
