Re: NFS locking question

2006-08-16 Thread Morten A. Middelthon
On Tue, Aug 15, 2006 at 06:11:14AM -0400, Kris Kennaway wrote:
> I was unable to obtain confirmation from anyone else (including the
> submitter who previously claimed it was necessary, and my own testing)
> that the patch actually solved a problem.  Since it involves reverting
> useful functionality, someone would need to obtain further debugging
> from your system (tcpdump traces before/after, etc) to determine what
> it's actually solving.

I would be only too happy to help with further debugging, but I'll have to
set up a new box for testing. I have to use my workstation for ...
working :)

with regards,

-- 
Morten A. Middelthon

If you cannot convince them, confuse them.
-- Harry S Truman




Re: NFS locking question

2006-08-15 Thread Morten A. Middelthon
On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > Hi, all!
> >
> > In our local office network we have a rather old FreeBSD 5.2.1
> > server acting as an NFS server for several other systems, mostly
> > running 6.0.
> >
> > From time to time we experience processes on the NFS clients
> > hanging in statd D with wchan lockd when accessing files
> > over NFS.
>
> Try the attached patch on the 6.0 machines:
>
> Index: usr.sbin/rpc.lockd/lock_proc.c
<snip>

Hi,

I have been plagued with this NFS lockd issue for quite some time now. It has
kept me from installing FreeBSD 6.x on our workstations at work. I just tried
applying your patch to my own 6.1-RELEASE-p3 workstation, and so far it has
been working nicely. Has anyone else had the same experience? If so, maybe it
should go into production?

with regards,

-- 
Morten A. Middelthon

For every complex problem, there is a solution that is simple, neat,
and wrong.
-- H.L. Mencken




Re: NFS locking question

2006-08-15 Thread Kris Kennaway
On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > Hi, all!
> > >
> > > In our local office network we have a rather old FreeBSD 5.2.1
> > > server acting as an NFS server for several other systems, mostly
> > > running 6.0.
> > >
> > > From time to time we experience processes on the NFS clients
> > > hanging in statd D with wchan lockd when accessing files
> > > over NFS.
> >
> > Try the attached patch on the 6.0 machines:
> >
> > Index: usr.sbin/rpc.lockd/lock_proc.c
> <snip>
>
> Hi,
>
> I have been plagued with this NFS lockd issue for quite some time now. It has
> kept me from installing FreeBSD 6.x on our workstations at work. I just tried
> applying your patch to my own 6.1-RELEASE-p3 workstation, and so far it has
> been working nicely. Has anyone else had the same experience? If so, maybe it
> should go into production?

I was unable to obtain confirmation from anyone else (including the
submitter who previously claimed it was necessary, and my own testing)
that the patch actually solved a problem.  Since it involves reverting
useful functionality, someone would need to obtain further debugging
from your system (tcpdump traces before/after, etc) to determine what
it's actually solving.

Kris




Re: NFS locking question

2006-08-15 Thread Rong-en Fan

On 8/15/06, Kris Kennaway [EMAIL PROTECTED] wrote:

> On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> > On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > > Hi, all!
> > > >
> > > > In our local office network we have a rather old FreeBSD 5.2.1
> > > > server acting as an NFS server for several other systems, mostly
> > > > running 6.0.
> > > >
> > > > From time to time we experience processes on the NFS clients
> > > > hanging in statd D with wchan lockd when accessing files
> > > > over NFS.
> > >
> > > Try the attached patch on the 6.0 machines:
> > >
> > > Index: usr.sbin/rpc.lockd/lock_proc.c
> > <snip>
> >
> > Hi,
> >
> > I have been plagued with this NFS lockd issue for quite some time now. It has
> > kept me from installing FreeBSD 6.x on our workstations at work. I just tried
> > applying your patch to my own 6.1-RELEASE-p3 workstation, and so far it has
> > been working nicely. Has anyone else had the same experience? If so, maybe it
> > should go into production?
>
> I was unable to obtain confirmation from anyone else (including the
> submitter who previously claimed it was necessary, and my own testing)
> that the patch actually solved a problem.  Since it involves reverting
> useful functionality, someone would need to obtain further debugging
> from your system (tcpdump traces before/after, etc) to determine what
> it's actually solving.
>
> Kris


In my experience, rpc.lockd dies on its own, on both server and client.
When that happens, every process that tries to lock a file gets stuck in
lockd (top will show this). In my case, rpc.lockd died because a write
failed and a SIGPIPE was generated. Two months ago, bin/97768 was submitted
and rodrigc@ committed the fix (also MFC'ed to RELENG_6). The fix ignores
SIGPIPE, since the server and client code already handles the failed-write
case. Since applying it, I have been quite happy with NFS locking.

Regards,
Rong-En Fan
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
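
For readers unfamiliar with the failure mode described above, here is a
minimal sketch of the general technique (ignoring SIGPIPE so that a failed
write is reported through errno instead of killing the daemon). It is not
the change committed for bin/97768; the function names ignore_sigpipe() and
write_to_closed_pipe() are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200112L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE for the whole process, so that a write to a closed
 * pipe or socket returns -1 with errno == EPIPE instead of killing
 * the process. Returns 0 on success, -1 on failure.
 */
int
ignore_sigpipe(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = SIG_IGN;
	sigemptyset(&sa.sa_mask);
	return (sigaction(SIGPIPE, &sa, NULL));
}

/*
 * Hypothetical demonstration helper: write one byte to a pipe whose
 * read end is already closed. Returns 0 if the write succeeded, the
 * errno value if it failed, or -1 if the pipe could not be created.
 */
int
write_to_closed_pipe(void)
{
	int fds[2], saved;

	if (pipe(fds) != 0)
		return (-1);
	close(fds[0]);			/* no reader is left */
	saved = (write(fds[1], "x", 1) < 0) ? errno : 0;
	close(fds[1]);
	return (saved);
}
```

With the signal ignored, the caller has to check the return value of every
write; that is only safe when, as described above, the surrounding code
already handles the failed-write case.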


Re: NFS locking question

2006-08-15 Thread Kris Kennaway
On Tue, Aug 15, 2006 at 07:49:10PM +0800, Rong-en Fan wrote:
> On 8/15/06, Kris Kennaway [EMAIL PROTECTED] wrote:
> > On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
> > > On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
> > > > On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> > > > > Hi, all!
> > > > >
> > > > > In our local office network we have a rather old FreeBSD 5.2.1
> > > > > server acting as an NFS server for several other systems, mostly
> > > > > running 6.0.
> > > > >
> > > > > From time to time we experience processes on the NFS clients
> > > > > hanging in statd D with wchan lockd when accessing files
> > > > > over NFS.
> > > >
> > > > Try the attached patch on the 6.0 machines:
> > > >
> > > > Index: usr.sbin/rpc.lockd/lock_proc.c
> > > <snip>
> > >
> > > Hi,
> > >
> > > I have been plagued with this NFS lockd issue for quite some time now. It has
> > > kept me from installing FreeBSD 6.x on our workstations at work. I just tried
> > > applying your patch to my own 6.1-RELEASE-p3 workstation, and so far it has
> > > been working nicely. Has anyone else had the same experience? If so, maybe it
> > > should go into production?
> >
> > I was unable to obtain confirmation from anyone else (including the
> > submitter who previously claimed it was necessary, and my own testing)
> > that the patch actually solved a problem.  Since it involves reverting
> > useful functionality, someone would need to obtain further debugging
> > from your system (tcpdump traces before/after, etc) to determine what
> > it's actually solving.
> >
> > Kris
>
> In my experience, rpc.lockd dies on its own, on both server and client.
> When that happens, every process that tries to lock a file gets stuck in
> lockd (top will show this). In my case, rpc.lockd died because a write
> failed and a SIGPIPE was generated. Two months ago, bin/97768 was submitted
> and rodrigc@ committed the fix (also MFC'ed to RELENG_6). The fix ignores
> SIGPIPE, since the server and client code already handles the failed-write
> case. Since applying it, I have been quite happy with NFS locking.

That's good to hear, but there are many other fundamental problems
with the current rpc.lockd implementation in various situations, which
can probably only be solved by a substantial rewrite.

Kris





Re: NFS locking question

2006-02-28 Thread Kris Kennaway
On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
> Hi, all!
>
> In our local office network we have a rather old FreeBSD 5.2.1
> server acting as an NFS server for several other systems, mostly
> running 6.0.
>
> From time to time we experience processes on the NFS clients
> hanging in statd D with wchan lockd when accessing files
> over NFS.

Try the attached patch on the 6.0 machines:

Index: usr.sbin/rpc.lockd/lock_proc.c
===
RCS file: /home/ncvs/src/usr.sbin/rpc.lockd/lock_proc.c,v
retrieving revision 1.18
retrieving revision 1.17
diff -u -u -r1.18 -r1.17
--- usr.sbin/rpc.lockd/lock_proc.c  3 Feb 2005 22:21:19 -   1.18
+++ usr.sbin/rpc.lockd/lock_proc.c  9 Oct 2004 15:36:13 -   1.17
@@ -62,8 +62,6 @@
 #define	CLIENT_CACHE_SIZE	64	/* No. of client sockets cached */
 #define	CLIENT_CACHE_LIFETIME	120	/* In seconds */
 
-#define	getrpcaddr(rqstp)	(struct sockaddr *)(svc_getrpccaller((rqstp)->rq_xprt)->buf)
-
 static void	log_from_addr(const char *, struct svc_req *);
 static void	log_netobj(netobj *obj);
 static int	addrcmp(struct sockaddr *, struct sockaddr *);
@@ -196,7 +194,7 @@
 {
 	CLIENT *client;
 	struct timeval retry_time, time_now;
-	int error, i;
+	int i;
 	const char *netid;
 	struct netconfig *nconf;
 	char host[NI_MAXHOST];
@@ -243,11 +241,9 @@
 	 * Need a host string for clnt_tp_create. Use NI_NUMERICHOST
 	 * to avoid DNS lookups.
 	 */
-	error = getnameinfo(host_addr, host_addr->sa_len, host, sizeof host,
-	    NULL, 0, NI_NUMERICHOST);
-	if (error != 0) {
-		syslog(LOG_ERR, "unable to get name string for caller: %s",
-		    gai_strerror(error));
+	if (getnameinfo(host_addr, host_addr->sa_len, host, sizeof host,
+	    NULL, 0, NI_NUMERICHOST) != 0) {
+		syslog(LOG_ERR, "unable to get name string for caller");
 		return NULL;
 	}
 
@@ -566,7 +562,8 @@
 
 	res.cookie = arg->cookie;
 	res.stat.stat = getlock(&arg4, rqstp, LOCK_ASYNC | LOCK_MON);
-	transmit_result(NLM_LOCK_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_LOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -620,7 +617,8 @@
 	 * a lock to cancel, so this call always fails.
 	 */
 	res.stat.stat = unlock(&arg4, LOCK_CANCEL);
-	transmit_result(NLM_CANCEL_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_CANCEL_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -667,7 +665,8 @@
 	res.stat.stat = unlock(&arg4, 0);
 	res.cookie = arg->cookie;
 
-	transmit_result(NLM_UNLOCK_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_UNLOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -724,7 +723,8 @@
 		nlm_granted, NULL, NLM_VERS) == 0 ?
 		nlm_granted : nlm_denied;
 
-	transmit_result(NLM_GRANTED_RES, &res, getrpcaddr(rqstp));
+	transmit_result(NLM_GRANTED_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -1067,7 +1067,8 @@
 
 	res.cookie = arg->cookie;
 	res.stat.stat = getlock(arg, rqstp, LOCK_MON | LOCK_ASYNC | LOCK_V4);
-	transmit4_result(NLM4_LOCK_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_LOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 
 	return (NULL);
 }
@@ -1115,7 +1116,8 @@
 	 * a lock to cancel, so this call always fails.
 	 */
 	res.stat.stat = unlock(&arg->alock, LOCK_CANCEL | LOCK_V4);
-	transmit4_result(NLM4_CANCEL_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_CANCEL_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -1156,7 +1158,8 @@
 	res.stat.stat = unlock(&arg->alock, LOCK_V4);
 	res.cookie = arg->cookie;
 
-	transmit4_result(NLM4_UNLOCK_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_UNLOCK_RES, &res,
+	    (struct sockaddr *)svc_getcaller(rqstp->rq_xprt));
 	return (NULL);
 }
 
@@ -1212,7 +1215,8 @@
 	res.stat.stat = lock_answer(arg->alock.svid, &arg->cookie,
 		nlm4_granted, NULL, NLM_VERS4) == 0 ?
 		nlm4_granted : nlm4_denied;
-	transmit4_result(NLM4_GRANTED_RES, &res, getrpcaddr(rqstp));
+	transmit4_result(NLM4_GRANTED_RES, &res,
+	    (struct sockaddr *)svc_getrpccaller(rqstp->rq_xprt)->buf);
 	return (NULL);
 }
 
Kris
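
One hunk of the patch above changes how a getnameinfo() failure is
reported: revision 1.18 logs the reason via gai_strerror(), revision 1.17
does not. The following is a minimal standalone sketch of that call
pattern, not code from lock_proc.c. The helpers numeric_host() and
demo_numeric_host() are hypothetical, and the address length is passed
explicitly rather than read from sa_len so that the sketch also builds on
systems without that field.

```c
#define _POSIX_C_SOURCE 200112L

#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: format addr as a numeric host string, using
 * NI_NUMERICHOST to avoid DNS lookups. Returns 0 on success or the
 * getnameinfo() error code; on failure, *errmsg (if not NULL) is set
 * to the gai_strerror() text for that code.
 */
int
numeric_host(const struct sockaddr *addr, socklen_t addrlen,
    char *host, size_t hostlen, const char **errmsg)
{
	int error;

	error = getnameinfo(addr, addrlen, host, (socklen_t)hostlen,
	    NULL, 0, NI_NUMERICHOST);
	if (error != 0 && errmsg != NULL)
		*errmsg = gai_strerror(error);
	return (error);
}

/*
 * Hypothetical demonstration: format the IPv4 loopback address.
 * On systems whose sockaddr_in has a sin_len field, it may also
 * need to be set to sizeof sin.
 */
int
demo_numeric_host(char *host, size_t hostlen)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof sin);
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	return (numeric_host((const struct sockaddr *)&sin, sizeof sin,
	    host, hostlen, NULL));
}
```

Keeping the error code in a variable, as revision 1.18 does, is what makes
the gai_strerror() text available for the log message.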



Re: NFS locking question

2006-02-28 Thread Patrick M. Hausen
Hi, Kris!

On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:

> Try the attached patch on the 6.0 machines:

Thanks. Will do.

But:

> Index: usr.sbin/rpc.lockd/lock_proc.c
> ...

So, rpc.lockd _is_ needed on the client?
What about the statement in rc.conf(5) then, claiming it was
only started on servers?

Thanks,
Patrick
-- 
punkt.de GmbH Internet - Dienstleistungen - Beratung
Vorholzstr. 25        Tel. 0721 9109 -0 Fax: -100
76137 Karlsruhe       http://punkt.de


Re: NFS locking question

2006-02-28 Thread Miguel Lopes Santos Ramos
> So, rpc.lockd _is_ needed on the client?
> What about the statement in rc.conf(5) then, claiming it was
> only started on servers?

Yes, rpc.lockd and rpc.statd are needed on the client.
I suspect it's the statements in rc.conf(5) that are wrong.

Miguel


Re: NFS locking question

2006-02-28 Thread Kris Kennaway
On Tue, Feb 28, 2006 at 12:02:51PM +0100, Patrick M. Hausen wrote:
> Hi, Kris!
>
> On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
>
> > Try the attached patch on the 6.0 machines:
>
> Thanks. Will do.
>
> But:
>
> > Index: usr.sbin/rpc.lockd/lock_proc.c
> > ...
>
> So, rpc.lockd _is_ needed on the client?

Yes.  If you weren't running it there, do that instead of applying the patch.

> What about the statement in rc.conf(5) then, claiming it was
> only started on servers?

I only see this (correct) statement in that manpage:

 rpc_lockd_enable
 (bool) If set to ``YES'' and also an NFS server or client,
 run rpc.lockd(8) at boot time.

Kris




Re: NFS locking question

2006-02-28 Thread Michael Butler

Kris Kennaway wrote:

> I only see this (correct) statement in that manpage:
>
>  rpc_lockd_enable
>  (bool) If set to ``YES'' and also an NFS server or client,
>  run rpc.lockd(8) at boot time.


... and it will fail to run at all on a box configured only as a server,
because /dev/nfslock isn't created unless the NFS client is present :-(


Michael
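
A quick way to confirm this condition on a given box is to check for the
device node before starting the daemon. The sketch below is only an
illustration: the helper is_char_device() is hypothetical, and it assumes
/dev/nfslock appears as a character device when the NFS client is present.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Device path named in this thread. */
#define NFSLOCK_DEVICE	"/dev/nfslock"

/*
 * Hypothetical helper: return 1 if path exists and is a character
 * device, 0 otherwise (including when stat() fails).
 */
int
is_char_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return (0);
	return (S_ISCHR(st.st_mode) ? 1 : 0);
}
```

Calling is_char_device(NFSLOCK_DEVICE) and getting 0 on a server-only
machine would match the behaviour described above.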




Re: NFS locking question

2006-02-28 Thread Kris Kennaway
On Tue, Feb 28, 2006 at 09:46:30AM -0500, Michael Butler wrote:
> Kris Kennaway wrote:
> > I only see this (correct) statement in that manpage:
> >
> >  rpc_lockd_enable
> >  (bool) If set to ``YES'' and also an NFS server or client,
> >  run rpc.lockd(8) at boot time.
>
> .. and it will fail to run (at all) on a box configured only as a
> server because /dev/nfslock isn't created unless the client is there :-(

Yeah, I think there's a PR about that.

Kris

