Hi,

On Fri, Jun 10, 2022 at 1:06 PM Alexander Aring <aahri...@redhat.com> wrote:
>
> This patch adds a WARN_ON if recovery hits a critical error but no
> caller was waiting in dlm_new_lockspace(), this can occur e.g. if a
> node got fences. The WARN_ON signals us to investigate into this case
> that it should not occur.
>
> Signed-off-by: Alexander Aring <aahri...@redhat.com>
> ---
>  fs/dlm/recoverd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index eeb221c175a2..240267568aab 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -311,6 +311,7 @@ static void do_ls_recovery(struct dlm_ls *ls)
>
>                 /* let new_lockspace() get aware of critical error */
>                 ls->ls_recovery_result = error;
> +               WARN_ON(completion_done(&ls->ls_recovery_done));

I will drop this patch. I think it can race, because dlm_new_lockspace()
triggers recovery and then waits. The race is unlikely, but I think the
better approach here is to look at the debug messages to see why recovery
fails. The debug messages may need to be improved depending on the case,
and I will send patches if any information is missing.

- Alex