Hi,

On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahri...@redhat.com> wrote:
>
> This patch adds a WARN_ON if recovery hits a critical error but no
> caller was waiting in dlm_new_lockspace(). This can occur e.g. if a
> node got fenced. The WARN_ON signals that we should investigate this
> case, since it should not occur.
>
> Signed-off-by: Alexander Aring <aahri...@redhat.com>
> ---
>  fs/dlm/recoverd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index eeb221c175a2..240267568aab 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -311,6 +311,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
>
>                                 /* let new_lockspace() get aware of critical error */
>                                 ls->ls_recovery_result = error;
> +                               WARN_ON(completion_done(&ls->ls_recovery_done));

I will drop this patch. I think it can race, because
dlm_new_lockspace() triggers recovery and only then waits. The race is
unlikely, but I think the better approach here is to look at the debug
messages to see why recovery fails. The debug messages may need to be
improved depending on the case, and I will send patches if any
information is missing.

- Alex
