On Fri, Jun 03, 2016 at 07:23:47PM -0700, Mark Johnston wrote:
> Hi,
> 
> I've recently observed a hang in a multi-threaded process that had hit
> an assertion failure and was attempting to dump core. One thread was
> sleeping interruptibly on an advisory lock with TDF_SBDRY set (our
> filesystem sets VFCF_SBDRY). SIGABRT caused the receipient thread to
> suspend other threads with thread_single(SINGLE_NO_EXIT), which fails
> to interrupt the sleeping thread, resulting in the hang.
> 
> My question is, why does the SA_CORE handler not force all threads to
> the user boundary before attempting to dump core? It must do so later
> anyway in order to exit. As I understand it, TDF_SBDRY is intended to
> avoid deadlocks that can occur when stopping a process, but in this
> case we don't stop the process with the intention of resuming it, so it
> seems erroneous to apply this flag.

Does your fs both set TDF_SBDRY and call lf_advlock()/lf_advlockasync() ?
This cannot work, regardless of the mode of single-threading.  TDF_SBDRY
makes such sleep non-interruptible by any single-threading request, on
the promise that the thread owns some resources (typically vnode locks).
I.e. changing the mode would not help.

I see two reasons to use SINGLE_NO_EXIT for coredumping.  It allows
coredump writer to record more exact state of the process into the notes.

Another one is that SINGLE_NO_EXIT is generally faster and more
reliable than SINGLE_BOUNDARY. Some states are already good enough for
SINGLE_NO_EXIT, while require more work to get into SINGLE_BOUNDARY. In
other words, core dump write starts earlier.

It might be not very significant reasons. 

>From what I see in the code, our NFS client has similar issue of calling
lf_advlock() with TDF_SBDRY set.  Below is the patch to fix that.
Similar bug existed in our fifofs, see r277321.

diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
index 2a8afa9..98625ee 100644
--- a/sys/fs/nfsclient/nfs_clvnops.c
+++ b/sys/fs/nfsclient/nfs_clvnops.c
@@ -2992,7 +2992,7 @@ nfs_advlock(struct vop_advlock_args *ap)
        struct proc *p = (struct proc *)ap->a_id;
        struct thread *td = curthread;  /* XXX */
        struct vattr va;
-       int ret, error = EOPNOTSUPP;
+       int sbdry, ret, error = EOPNOTSUPP;
        u_quad_t size;
        
        if (NFS_ISV4(vp) && (ap->a_flags & (F_POSIX | F_FLOCK)) != 0) {
@@ -3087,7 +3087,10 @@ nfs_advlock(struct vop_advlock_args *ap)
                if ((VFSTONFS(vp->v_mount)->nm_flag & NFSMNT_NOLOCKD) != 0) {
                        size = VTONFS(vp)->n_size;
                        NFSVOPUNLOCK(vp, 0);
+                       sbdry = sigallowstop();
                        error = lf_advlock(ap, &(vp->v_lockf), size);
+                       if (sbdry)
+                               sigdeferstop();
                } else {
                        if (nfs_advlock_p != NULL)
                                error = nfs_advlock_p(ap);
@@ -3114,7 +3117,7 @@ nfs_advlockasync(struct vop_advlockasync_args *ap)
 {
        struct vnode *vp = ap->a_vp;
        u_quad_t size;
-       int error;
+       int error, sbdry;
        
        if (NFS_ISV4(vp))
                return (EOPNOTSUPP);
@@ -3124,7 +3127,10 @@ nfs_advlockasync(struct vop_advlockasync_args *ap)
        if ((VFSTONFS(vp->v_mount)->nm_flag & NFSMNT_NOLOCKD) != 0) {
                size = VTONFS(vp)->n_size;
                NFSVOPUNLOCK(vp, 0);
+               sbdry = sigallowstop();
                error = lf_advlockasync(ap, &(vp->v_lockf), size);
+               if (sbdry)
+                       sigdeferstop();
        } else {
                NFSVOPUNLOCK(vp, 0);
                error = EOPNOTSUPP;
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to