Re: mbuf leakage in NFSv3 writes, possible?
Well, backing out now is not really an option... But given my past history with NFS, and knowledge of this site, I think I have a fair idea where the leak is: the NFSv3 "commit" handler. Why do I think this? Simple: this problem started when a user started running a large job on our Origin 2k. Prior to that our server had been up for 30-ish days without any problems; since his job started it requires a boot-a-day (mbuf clusters are up to 8k). Also supporting this is the fact that the clusters are used at a fairly constant rate. Following that hunch, I ran tcpdump against that host for TCP traffic and noticed a fairly steady stream of "commit" NFS traffic. I realize none of this is a smoking gun, but that is where my hunch lies.

How is mbuf cluster cleanup done? If I knew, I might have a shot in heck at locating this problem.

BTW: updated netstat -m for the machine:

4855/5344 mbufs in use:
        4848 mbufs allocated to data
        7 mbufs allocated to packet headers
4774/4850/8704 mbuf clusters in use (current/peak/max)
10368 Kbytes allocated to network (97% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

That's a lot of buffer ;)

--
David Cross                               | email: cro...@cs.rpi.edu
Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd
Rensselaer Polytechnic Institute,         | Ph: 518.276.2860
Department of Computer Science            | Fax: 518.276.4033
I speak only for myself.                  | WinNT:Linux::Linux:FreeBSD

To Unsubscribe: send mail to majord...@freebsd.org with "unsubscribe freebsd-hackers" in the body of the message
Re: mbuf leakage in NFSv3 writes, possible?
I'd say that there is a definite leak somewhere. The question is where. I'll run some buildworld tests with the latest -current to see if I can cause mbufs to leak. I suspect that the problem may be related to a specific situation, though, such as a particular type of FS op failure. Unfortunately, I do not have any -stable machines set up at the moment that I can use for NFS testing.

You could try backing out the nfs_serv.c and related patches from your -stable source to see if that fixes this particular problem. If it does, then we will know where to look.

-Matt
Matthew Dillon

:Ok, here are some real stats
:
:"w" is the read-only machine, it services everything that "s" (the
:read-write machine) does... in fact it services more.
:
:*w crossd $ strings -a /kernel | grep \^___maxusers
:___maxusers 96
:*w crossd $ uname -a
:FreeBSD w.cs.rpi.edu 3.2-STABLE FreeBSD 3.2-STABLE #1: Tue Jun 29 09:36:32 EDT 1999 r...@w.cs.rpi.edu:/usr/src/sys/compile/WOBBLE i386
:*w crossd $ uptime
: 1:43PM up 24 days, 2:08, 3 users, load averages: 0.00, 0.00, 0.00
:*w crossd $ netstat -m
:106/2688 mbufs in use:
:        85 mbufs allocated to data
:        21 mbufs allocated to packet headers
:64/426/2048 mbuf clusters in use (current/peak/max)
:1188 Kbytes allocated to network (11% in use)
:0 requests for memory denied
:0 requests for memory delayed
:0 calls to protocol drain routines
:
:*s crossd $ uname -a
:FreeBSD s.cs.rpi.edu 3.2-STABLE FreeBSD 3.2-STABLE #0: Thu Jul 22 18:12:21 EDT 1999 r...@phoenix.cs.rpi.edu:/usr/src/sys/compile/STAGGER i386
:*s crossd $ strings -a /kernel | grep \^___maxusers
:___maxusers 512
:*s crossd $ uptime
: 1:43PM up 19:23, 2 users, load averages: 0.02, 0.01, 0.00
:*s crossd $ netstat -m
:3629/4096 mbufs in use:
:        3621 mbufs allocated to data
:        8 mbufs allocated to packet headers
:3550/3660/8704 mbuf clusters in use (current/peak/max)
:7832 Kbytes allocated to network (96% in use)
:0 requests for memory denied
:0 requests for memory delayed
:0 calls to protocol drain routines
Re: mbuf leakage in NFSv3 writes, possible?
Ok, here are some real stats.

"w" is the read-only machine; it services everything that "s" (the read-write machine) does... in fact it services more.

*w crossd $ strings -a /kernel | grep \^___maxusers
___maxusers 96
*w crossd $ uname -a
FreeBSD w.cs.rpi.edu 3.2-STABLE FreeBSD 3.2-STABLE #1: Tue Jun 29 09:36:32 EDT 1999 r...@w.cs.rpi.edu:/usr/src/sys/compile/WOBBLE i386
*w crossd $ uptime
 1:43PM up 24 days, 2:08, 3 users, load averages: 0.00, 0.00, 0.00
*w crossd $ netstat -m
106/2688 mbufs in use:
        85 mbufs allocated to data
        21 mbufs allocated to packet headers
64/426/2048 mbuf clusters in use (current/peak/max)
1188 Kbytes allocated to network (11% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

*s crossd $ uname -a
FreeBSD s.cs.rpi.edu 3.2-STABLE FreeBSD 3.2-STABLE #0: Thu Jul 22 18:12:21 EDT 1999 r...@phoenix.cs.rpi.edu:/usr/src/sys/compile/STAGGER i386
*s crossd $ strings -a /kernel | grep \^___maxusers
___maxusers 512
*s crossd $ uptime
 1:43PM up 19:23, 2 users, load averages: 0.02, 0.01, 0.00
*s crossd $ netstat -m
3629/4096 mbufs in use:
        3621 mbufs allocated to data
        8 mbufs allocated to packet headers
3550/3660/8704 mbuf clusters in use (current/peak/max)
7832 Kbytes allocated to network (96% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

--
David Cross
Re: mbuf leakage in NFSv3 writes, possible?
:> I do not think those changes have been backported to -STABLE.
:
:julian      1999/06/30 15:05:20 PDT
:
:  Modified files:        (Branch: RELENG_3)
:    sys/nfs              nfs_serv.c nfs_subs.c nfs_syscalls.c
:                         nfsm_subs.h
:  Log:
:  MFC: Bring in NFS cleanups by Matt.
:
:David.

Whup! So much for that idea!

-Matt
Matthew Dillon
Re: mbuf leakage in NFSv3 writes, possible?
On Fri, Jul 23, 1999 at 09:06:01AM -0700, Matthew Dillon wrote:
> There is a good chance the leakage is in nfs_serv.c, which I fixed for
> -current.
>
> I do not think those changes have been backported to -STABLE.

julian      1999/06/30 15:05:20 PDT

  Modified files:        (Branch: RELENG_3)
    sys/nfs              nfs_serv.c nfs_subs.c nfs_syscalls.c
                         nfsm_subs.h
  Log:
  MFC: Bring in NFS cleanups by Matt.

David.
Re: mbuf leakage in NFSv3 writes, possible?
:"David E. Cross" wrote:
:> Well, I just -STABLED the server to see if it fixed it, but I was certainly
:> running out. The server had only 3000-ish mbuf chains, and it would go
:> through them all in a day.
:
:Well, have you tried increasing the number of available mbufs to see if
:you reach a point of stability? Assuming you have enough physical RAM you
:could do 15k mbufs on -STABLE without a problem. Check LINT for the
:nmbclusters option if you need help with it.
:
:Good luck,
:
:Doug

Well, the cache shouldn't eat up *that* many mbufs! The problem is likely to be real. There is a good chance the leakage is in nfs_serv.c, which I fixed for -current.

I do not think those changes have been backported to -STABLE.

-Matt
Matthew Dillon
Re: mbuf leakage in NFSv3 writes, possible?
"David E. Cross" wrote:
> Well, I just -STABLED the server to see if it fixed it, but I was certainly
> running out. The server had only 3000-ish mbuf chains, and it would go
> through them all in a day.

Well, have you tried increasing the number of available mbufs to see if you reach a point of stability? Assuming you have enough physical RAM you could do 15k mbufs on -STABLE without a problem. Check LINT for the nmbclusters option if you need help with it.

Good luck,

Doug
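[Editor's note: the nmbclusters knob Doug mentions is a kernel config option. The two cluster maxima visible in the netstat output elsewhere in this thread, 2048 and 8704, appear to match the historical default formula 512 + 16 * maxusers for maxusers of 96 and 512 respectively. A sketch of the override, with an illustrative value:]

```
# In the kernel configuration file (see LINT), then config/rebuild/reboot:
options  NMBCLUSTERS=15360    # ~30 MB of cluster memory at 2 KB each
```

[Raising the ceiling only buys time if the leak is real; with clusters pinned and never freed, any max is eventually exhausted.]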
Re: mbuf leakage in NFSv3 writes, possible?
Well, I just -STABLED the server to see if it fixed it, but I was certainly running out. The server had only 3000-ish mbuf chains, and it would go through them all in a day.

--
David Cross
Re: mbuf leakage in NFSv3 writes, possible?
:I have 2 NFS servers. One is primarily read-only, the other read-write; they
:service the same clients (the read-only services more). They are (were) of
:the same build. I have a problem on the read/write server where it chews
:through mbuf clusters (it goes through about 3k in a day). Especially late
:at night the machine is not busy, and right now it is also not busy, yet every
:minute or so it goes through a few mbuf clusters. The rate is about 300
:clusters per 108 minutes. Does it sound reasonable that there is an mbuf leak
:in the NFS code somewhere?
:
:--
:David Cross

The server side caches mbuf chains to hold replies to NFS requests. This is done because it is quite common for requests to be repeated. The question is whether you are simply seeing the effect of this caching, or whether you have an actual mbuf leak. Does the mbuf/memory usage stabilize after a while, or do you actually run out?

-Matt
Matthew Dillon