In article <20150619083656.gt19...@homeworld.netbsd.org>,
Emmanuel Dreyfus  <m...@netbsd.org> wrote:
>Hi
>
>I have encountered a bug with the NetBSD NFS client. Despite a mount with
>-o intr,soft, we can hit a situation where a process remains hung in the
>kernel because the NFS server is gone.
>
>This happens when ioflush does its duty, with the following code path:
>sync_fsync / nfs_sync / VOP_FSYNC / nfs_fsync / nfs_flush / VOP_PUTPAGES
>
>VOP_PUTPAGES has flags = PGO_ALLPAGES|PGO_FREE. It then goes through
>genfs_putpages and genfs_do_putpages, and gets stuck in:
>
>       /* Wait for output to complete. */
>       if (!wasclean && !async && vp->v_numoutput != 0) {
>               while (vp->v_numoutput != 0)
>                       cv_wait(&vp->v_cv, slock);
>       }
>
>This cv_wait() is timeout-less and uninterruptible. ioflush will
>sleep there forever, holding the vnode lock. Any other process doing
>I/O on the filesystem will sleep in tstile waiting for the vnode
>lock with this path:
>sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter

Yes, but ioflush is not a user process... An interruptible mount
means that a user process can interrupt a syscall doing an NFS
operation. No other operating system I know of takes this to mean
that you can unmount the filesystem or make delayed writes abort
and fail.

Having said that, yes it is a problem that you need to reboot
because an NFS server is gone, and we should make umount -f work
properly in that case. I don't think that we should introduce umount
-l (like Linux) unless there is a compelling reason to do so.

christos
