On Monday February 5, [EMAIL PROTECTED] wrote:
> Seems recently, on both redhat 6.1 and 7.0 using kernel 2.4.1-ac3, I
> ran into this problem:
> 
> Stopping NFS says the following in the kernel logs:
> 
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> nfsd: terminating on signal 9
> svc: server socket destroy delayed
> 
> And restarting NFS has the following error message:
> 
> root:~> /etc/rc.d/init.d/nfs start
> Starting NFS services:                                     [  OK  ]
> Starting NFS quotas:                                       [  OK  ]
> Starting NFS mountd:                                       [  OK  ]
> Starting NFS daemon: nfssvc: Address already in use
>                                                            [FAILED]

How repeatable is this?  Is the server SMP?

There does seem to be a possible problem with sk_inuse not being
updated atomically, so a race between an increment and a decrement
could lose one of them.
svc_sock_release seems to often be called with no more protection than
the BKL, and it decrements sk_inuse.

svc_sock_enqueue, on the other hand increments sk_inuse, and is
protected by sv_lock, but not, I think, by the BKL, as it is called by
a networking layer callback.  So there might be a possibility for a
race here.

The attached patch might fix it, so if you are having reproducable
problems, it might be worth applying this patch.

Trond: any comments?

NeilBrown

[ a better fix would be to make sk_inuse atomic_t ]

--- net/sunrpc/svcsock.c        2001/02/05 23:45:54     1.1
+++ net/sunrpc/svcsock.c        2001/02/05 23:48:12
@@ -211,16 +211,22 @@
 svc_sock_release(struct svc_rqst *rqstp)
 {
        struct svc_sock *svsk = rqstp->rq_sock;
+       struct svc_serv *serv = svsk->sk_server;
 
        if (!svsk)
                return;
        svc_release_skb(rqstp);
        rqstp->rq_sock = NULL;
+
+       spin_lock_bh(&serv->sv_lock);
        if (!--(svsk->sk_inuse) && svsk->sk_dead) {
+               spin_unlock_bh(&serv->sv_lock);
                dprintk("svc: releasing dead socket\n");
                sock_release(svsk->sk_sock);
                kfree(svsk);
        }
+       else
+               spin_unlock_bh(&serv->sv_lock);
 }
 
 /*
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Reply via email to