Re: NFS locking question
On Tue, Aug 15, 2006 at 06:11:14AM -0400, Kris Kennaway wrote:
> I was unable to obtain confirmation from anyone else (including the
> submitter who previously claimed it was necessary, and my own testing)
> that the patch actually solved a problem. Since it involves reverting
> useful functionality, someone would need to obtain further debugging
> from your system (tcpdump traces before/after, etc) to determine what
> it's actually solving.

If I can help with further debugging I would be only too happy to do so,
but I'll have to set up a new box for testing. I have to use my
workstation for ... working :)

with regards,
--
Morten A. Middelthon

If you cannot convince them, confuse them. -- Harry S Truman
Re: NFS locking question
On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > Hi, all!
> >
> > In our local office network we have a rather old FreeBSD 5.2.1
> > server acting as an NFS server for several other systems, mostly
> > running 6.0. From time to time we experience processes on the NFS
> > clients hanging in state "D" with wchan "lockd" when accessing
> > files over NFS.
>
> Try the attached patch on the 6.0 machines:
>
> Index: usr.sbin/rpc.lockd/lock_proc.c

<snip>

Hi,

I have been plagued with this NFS lockd issue for quite some time now.
It has kept me from installing FreeBSD 6.x on our workstations at work.
I just tried applying your patch to my own 6.1-RELEASE-p3 workstation,
and so far it has been working nicely. Has anyone else had the same
experience? If so, maybe it should go into production?

with regards,
--
Morten A. Middelthon

For every complex problem, there is a solution that is simple, neat,
and wrong. -- H.L. Mencken
Re: NFS locking question
On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > Hi, all!
> > >
> > > In our local office network we have a rather old FreeBSD 5.2.1
> > > server acting as an NFS server for several other systems, mostly
> > > running 6.0. From time to time we experience processes on the NFS
> > > clients hanging in state "D" with wchan "lockd" when accessing
> > > files over NFS.
> >
> > Try the attached patch on the 6.0 machines:
> >
> > Index: usr.sbin/rpc.lockd/lock_proc.c
>
> <snip>
>
> Hi,
>
> I have been plagued with this NFS lockd issue for quite some time now.
> It has kept me from installing FreeBSD 6.x on our workstations at work.
> I just tried applying your patch to my own 6.1-RELEASE-p3 workstation,
> and so far it has been working nicely. Has anyone else had the same
> experience? If so, maybe it should go into production?

I was unable to obtain confirmation from anyone else (including the
submitter who previously claimed it was necessary, and my own testing)
that the patch actually solved a problem. Since it involves reverting
useful functionality, someone would need to obtain further debugging
from your system (tcpdump traces before/after, etc) to determine what
it's actually solving.

Kris
Re: NFS locking question
On 8/15/06, Kris Kennaway wrote:
> On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> > On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > > Hi, all!
> > > >
> > > > In our local office network we have a rather old FreeBSD 5.2.1
> > > > server acting as an NFS server for several other systems, mostly
> > > > running 6.0. From time to time we experience processes on the
> > > > NFS clients hanging in state "D" with wchan "lockd" when
> > > > accessing files over NFS.
> > >
> > > Try the attached patch on the 6.0 machines:
> > >
> > > Index: usr.sbin/rpc.lockd/lock_proc.c
> >
> > <snip>
> >
> > Hi,
> >
> > I have been plagued with this NFS lockd issue for quite some time
> > now. It has kept me from installing FreeBSD 6.x on our workstations
> > at work. I just tried applying your patch to my own 6.1-RELEASE-p3
> > workstation, and so far it has been working nicely. Has anyone else
> > had the same experience? If so, maybe it should go into production?
>
> I was unable to obtain confirmation from anyone else (including the
> submitter who previously claimed it was necessary, and my own testing)
> that the patch actually solved a problem. Since it involves reverting
> useful functionality, someone would need to obtain further debugging
> from your system (tcpdump traces before/after, etc) to determine what
> it's actually solving.
>
> Kris

In my experience, rpc.lockd dies on its own on both server and client.
When this happens, every process that wants to lock a file gets stuck
in lockd (top will show it). In my case, rpc.lockd dies because a write
fails and a SIGPIPE is then generated. Two months ago, bin/97768 was
submitted and rodrigc@ committed it (also MFC'ed to RELENG_6). The fix
from that PR makes rpc.lockd ignore SIGPIPE (since the server/client
code already takes care of the failed-write case). Since applying it,
I'm quite happy with NFS locking.

Regards,
Rong-En Fan
Re: NFS locking question
On Tue, Aug 15, 2006 at 07:49:10PM +0800, Rong-en Fan wrote:
> On 8/15/06, Kris Kennaway wrote:
> > On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> > > On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > > > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > > > Hi, all!
> > > > >
> > > > > In our local office network we have a rather old FreeBSD
> > > > > 5.2.1 server acting as an NFS server for several other
> > > > > systems, mostly running 6.0. From time to time we experience
> > > > > processes on the NFS clients hanging in state "D" with wchan
> > > > > "lockd" when accessing files over NFS.
> > > >
> > > > Try the attached patch on the 6.0 machines:
> > > >
> > > > Index: usr.sbin/rpc.lockd/lock_proc.c
> > >
> > > <snip>
> > >
> > > Hi,
> > >
> > > I have been plagued with this NFS lockd issue for quite some time
> > > now. It has kept me from installing FreeBSD 6.x on our
> > > workstations at work. I just tried applying your patch to my own
> > > 6.1-RELEASE-p3 workstation, and so far it has been working nicely.
> > > Has anyone else had the same experience? If so, maybe it should go
> > > into production?
> >
> > I was unable to obtain confirmation from anyone else (including the
> > submitter who previously claimed it was necessary, and my own
> > testing) that the patch actually solved a problem. Since it involves
> > reverting useful functionality, someone would need to obtain further
> > debugging from your system (tcpdump traces before/after, etc) to
> > determine what it's actually solving.
> >
> > Kris
>
> In my experience, rpc.lockd dies on its own on both server and client.
> When this happens, every process that wants to lock a file gets stuck
> in lockd (top will show it). In my case, rpc.lockd dies because a
> write fails and a SIGPIPE is then generated. Two months ago, bin/97768
> was submitted and rodrigc@ committed it (also MFC'ed to RELENG_6). The
> fix from that PR makes rpc.lockd ignore SIGPIPE (since the
> server/client code already takes care of the failed-write case). Since
> applying it, I'm quite happy with NFS locking.

That's good to hear, but there are many other fundamental problems with
the current rpc.lockd implementation in various situations, which can
probably only be solved by a substantial rewrite.

Kris
Re: NFS locking question
On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> Hi, all!
>
> In our local office network we have a rather old FreeBSD 5.2.1
> server acting as an NFS server for several other systems, mostly
> running 6.0. From time to time we experience processes on the NFS
> clients hanging in state "D" with wchan "lockd" when accessing
> files over NFS.

Try the attached patch on the 6.0 machines:

Index: usr.sbin/rpc.lockd/lock_proc.c
===================================================================
RCS file: /home/ncvs/src/usr.sbin/rpc.lockd/lock_proc.c,v
retrieving revision 1.18
retrieving revision 1.17
diff -u -u -r1.18 -r1.17
--- usr.sbin/rpc.lockd/lock_proc.c	3 Feb 2005 22:21:19 -0000	1.18
+++ usr.sbin/rpc.lockd/lock_proc.c	9 Oct 2004 15:36:13 -0000	1.17
@@ -62,8 +62,6 @@
 #define	CLIENT_CACHE_SIZE	64	/* No. of client sockets cached */
 #define	CLIENT_CACHE_LIFETIME	120	/* In seconds */
 
-#define	getrpcaddr(rqstp)	(struct sockaddr *)(svc_getrpccaller((rqstp)->rq_xprt)->buf)
-
 static void	log_from_addr(const char *, struct svc_req *);
 static void	log_netobj(netobj *obj);
 static int	addrcmp(struct sockaddr *, struct sockaddr *);
@@ -196,7 +194,7 @@
 {
 	CLIENT *client;
 	struct timeval retry_time, time_now;
-	int error, i;
+	int i;
 	const char *netid;
 	struct netconfig *nconf;
 	char host[NI_MAXHOST];
@@ -243,11 +241,9 @@
 	 * Need a host string for clnt_tp_create. Use NI_NUMERICHOST
 	 * to avoid DNS lookups.
 	 */
-	error = getnameinfo(host_addr, host_addr->sa_len, host, sizeof host,
-	    NULL, 0, NI_NUMERICHOST);
-	if (error != 0) {
-		syslog(LOG_ERR, "unable to get name string for caller: %s",
-		    gai_strerror(error));
+	if (getnameinfo(host_addr, host_addr->sa_len, host, sizeof host,
+	    NULL, 0, NI_NUMERICHOST) != 0) {
+		syslog(LOG_ERR, "unable to get name string for caller");
 		return NULL;
 	}
 
@@ -566,7 +562,8 @@
 
 	res.cookie = arg->cookie;
 	res.stat.stat = getlock(&arg4, rqstp, LOCK_ASYNC | LOCK_MON);
-	transmit_result(NLM_LOCK_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_LOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -620,7 +617,8 @@
 	 * a lock to cancel, so this call always fails.
 	 */
 	res.stat.stat = unlock(&arg4, LOCK_CANCEL);
-	transmit_result(NLM_CANCEL_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_CANCEL_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -667,7 +665,8 @@
 
 	res.stat.stat = unlock(&arg4, 0);
 	res.cookie = arg->cookie;
-	transmit_result(NLM_UNLOCK_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_UNLOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -724,7 +723,8 @@
 	    nlm_granted, NULL, NLM_VERS) == 0 ?
 	    nlm_granted : nlm_denied;
 
-	transmit_result(NLM_GRANTED_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_GRANTED_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -1067,7 +1067,8 @@
 
 	res.cookie = arg->cookie;
 	res.stat.stat = getlock(arg, rqstp, LOCK_MON | LOCK_ASYNC | LOCK_V4);
-	transmit4_result(NLM4_LOCK_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_LOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -1115,7 +1116,8 @@
 	 * a lock to cancel, so this call always fails.
 	 */
 	res.stat.stat = unlock(&arg->alock, LOCK_CANCEL | LOCK_V4);
-	transmit4_result(NLM4_CANCEL_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_CANCEL_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -1156,7 +1158,8 @@
 
 	res.stat.stat = unlock(&arg->alock, LOCK_V4);
 	res.cookie = arg->cookie;
-	transmit4_result(NLM4_UNLOCK_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_UNLOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -1212,7 +1215,8 @@
 	res.stat.stat = lock_answer(arg->alock.svid, &arg->cookie,
 	    nlm4_granted, NULL, NLM_VERS4) == 0 ?
 	    nlm4_granted : nlm4_denied;
-	transmit4_result(NLM4_GRANTED_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_GRANTED_RES, &res,
+	    (struct sockaddr *)svc_getrpccaller(rqstp->rq_xprt)->buf);
 	return (NULL);
 }
 

Kris
Re: NFS locking question
Hi, Kris!

On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> Try the attached patch on the 6.0 machines:

Thanks. Will do. But:

> Index: usr.sbin/rpc.lockd/lock_proc.c
> ...

So, rpc.lockd _is_ needed on the client? What about the statement in
rc.conf(5) then, claiming it was only started on servers?

Thanks,
Patrick
--
punkt.de GmbH         Internet - Dienstleistungen - Beratung
Vorholzstr. 25        Tel. 0721 9109 -0 Fax: -100
76137 Karlsruhe       http://punkt.de
Re: NFS locking question
> So, rpc.lockd _is_ needed on the client? What about the statement in
> rc.conf(5) then, claiming it was only started on servers?

Yes, rpc.lockd and rpc.statd are needed on the client. I suspect it's
the statements in rc.conf(5) that are wrong.

Miguel
Re: NFS locking question
On Tue, Feb 28, 2006 at 12:02:51PM +0100, Patrick M. Hausen wrote:
> Hi, Kris!
>
> On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > Try the attached patch on the 6.0 machines:
>
> Thanks. Will do. But:
>
> > Index: usr.sbin/rpc.lockd/lock_proc.c
> > ...
>
> So, rpc.lockd _is_ needed on the client?

Yes. If you weren't doing that, do that instead.

> What about the statement in rc.conf(5) then, claiming it was only
> started on servers?

I only see this (correct) statement in that manpage:

     rpc_lockd_enable
                 (bool) If set to ``YES'' and also an NFS server or client,
                 run rpc.lockd(8) at boot time.

Kris
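
Since rpc.lockd is needed on the client, one way to check whether locking works on a given mount is a small probe that takes and releases an advisory lock with fcntl(2). The sketch below is only an illustration under stated assumptions: `probe_lock` is a hypothetical helper, and the path is supplied by the caller. On a local file system it returns immediately. On an NFS mount whose lock daemon is not responding, the call may itself block, which is the symptom described in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to take and then release an exclusive advisory lock covering the
 * whole file at path, using the non-blocking F_SETLK request.
 * Returns 0 if the lock was acquired, otherwise the errno of the
 * failing call (for example EAGAIN or EACCES if another process holds
 * a conflicting lock).
 */
int
probe_lock(const char *path)
{
	struct flock fl;
	int fd, saved;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		return (errno);

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;		/* exclusive (write) lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;			/* 0 means "through end of file" */

	if (fcntl(fd, F_SETLK, &fl) == -1) {
		saved = errno;
		close(fd);
		return (saved);
	}

	fl.l_type = F_UNLCK;		/* release the lock again */
	(void)fcntl(fd, F_SETLK, &fl);
	close(fd);
	return (0);
}
```

Running the probe against a file on the NFS mount from a client is a quick way to tell a working lock path from one where the process ends up waiting on lockd.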
Re: NFS locking question
Kris Kennaway wrote:
> I only see this (correct) statement in that manpage:
>
>      rpc_lockd_enable
>                  (bool) If set to ``YES'' and also an NFS server or client,
>                  run rpc.lockd(8) at boot time.

.. and it will fail to run (at all) on a box configured only as a
server because /dev/nfslock isn't created unless the client is there :-(

Michael
Re: NFS locking question
On Tue, Feb 28, 2006 at 09:46:30AM -0500, Michael Butler wrote:
> Kris Kennaway wrote:
> > I only see this (correct) statement in that manpage:
> >
> >      rpc_lockd_enable
> >                  (bool) If set to ``YES'' and also an NFS server or client,
> >                  run rpc.lockd(8) at boot time.
>
> .. and it will fail to run (at all) on a box configured only as a
> server because /dev/nfslock isn't created unless the client is there :-(

Yeah, I think there's a PR about that.

Kris