On 03/11/17 09:40, NeilBrown wrote:
> 

Hi Neil, and thanks for taking the time to post the patch.

> Currently if the autofs kernel module gets an error when
> writing to the pipe which links to the daemon, then it
> marks the whole mountpoint as catatonic, and it will stop working.
> 
> It is possible that the error is transient.  This can happen
> if the daemon is slow and more than 16 requests queue up.
> If a subsequent process tries to queue a request, and is then signalled,
> the write to the pipe will return -ERESTARTSYS and autofs
> will take that as total failure.

Indeed it does.

And given the problems with half a dozen (or so) user space
applications consuming large amounts of CPU under heavy mount
and umount activity, this could happen more easily than we
expect.

> 
> So change the code to assess -ERESTARTSYS and -ENOMEM as transient
> failures which only abort the current request, not the whole
> mountpoint.

This looks good to me.
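
For anyone following along, here is a rough user-space sketch of the
behaviour described above. It is only an illustration, not the kernel
code: write_all(), classify_write_result() and the notify_action enum
are hypothetical names, write(2) stands in for the kernel's pipe write,
and ERESTARTSYS is defined by hand because it is not exported to user
space. The idea is that the write helper reports 0 or a negative errno,
and the caller fails only the current request on -ENOMEM or
-ERESTARTSYS, going catatonic on anything else.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * ERESTARTSYS is kernel-internal and is not exported by the user-space
 * <errno.h>; 512 is the value from include/linux/errno.h.
 */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical names, for illustration only. */
enum notify_action {
	NOTIFY_OK,           /* packet fully written */
	NOTIFY_FAIL_REQUEST, /* transient: fail just this request */
	NOTIFY_CATATONIC,    /* anything else: give up on the mountpoint */
};

/* Write the whole buffer; return 0 on success or a negative errno. */
static int write_all(int fd, const void *buf, size_t bytes)
{
	const char *data = buf;
	ssize_t wr = 0;

	while (bytes > 0 && (wr = write(fd, data, bytes)) > 0) {
		data += wr;
		bytes -= (size_t)wr;
	}
	if (bytes == 0)
		return 0;
	/* A zero-length write should not happen; treat it as -EIO. */
	return wr < 0 ? -errno : -EIO;
}

/* Map the helper's result onto the three outcomes of the switch. */
static enum notify_action classify_write_result(int ret)
{
	switch (ret) {
	case 0:
		return NOTIFY_OK;
	case -ENOMEM:
	case -ERESTARTSYS:
		return NOTIFY_FAIL_REQUEST;
	default:
		return NOTIFY_CATATONIC;
	}
}
```

The point of returning the errno rather than a boolean is that the
caller can then tell a transient failure from a fatal one.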

> 
> Signed-off-by: NeilBrown <ne...@suse.com>
> ---
> 
> Do people think this should go to -stable?
> It isn't a crash or a data corruption, but having autofs mountpoints
> suddenly stop working is rather inconvenient.

Perhaps that's a good idea, given that the CPU usage problem I
referred to above has been around for a while now.

> 
> Thanks,
> NeilBrown
> 
> 
>  fs/autofs4/waitq.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> index 4ac49d038bf3..8fc41705c7cd 100644
> --- a/fs/autofs4/waitq.c
> +++ b/fs/autofs4/waitq.c
> @@ -81,7 +81,8 @@ static int autofs4_write(struct autofs_sb_info *sbi,
>               spin_unlock_irqrestore(&current->sighand->siglock, flags);
>       }
>  
> -     return (bytes > 0);
> +     /* if 'wr' returned 0 (impossible) we assume -EIO (safe) */
> +     return bytes == 0 ? 0 : wr < 0 ? wr : -EIO;
>  }
>  
>  static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
> @@ -95,6 +96,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
>       } pkt;
>       struct file *pipe = NULL;
>       size_t pktsz;
> +     int ret;
>  
>       pr_debug("wait id = 0x%08lx, name = %.*s, type=%d\n",
>                (unsigned long) wq->wait_queue_token,
> @@ -169,7 +171,18 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
>       mutex_unlock(&sbi->wq_mutex);
>  
>       if (autofs4_write(sbi, pipe, &pkt, pktsz))
> +     switch (ret = autofs4_write(sbi, pipe, &pkt, pktsz)) {
> +     case 0:
> +             break;
> +     case -ENOMEM:
> +     case -ERESTARTSYS:
> +             /* Just fail this one */
> +             autofs4_wait_release(sbi, wq->wait_queue_token, ret);
> +             break;
> +     default:
>               autofs4_catatonic_mode(sbi);
> +             break;
> +     }
>       fput(pipe);
>  }
>  
> 
