Re: Route table leaks
As I recall, Joe Greco wrote:
> Hell, I've been seeing this for well over a year.  The last time I mentioned
> it, everybody seemed to think I was nuts.  :-)
>
> FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
>
> routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256

Doesn't happen on 2.2...

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
chad> uname -v
FreeBSD 2.2.8-STABLE #5: Sun Sep 19 19:18:11 MST 1999 [EMAIL PROTECTED]:/usr/src/sys/compile/freeway
chad> uptime
12:26PM up 80 days, 13:51, 3 users, load averages: 1.06, 1.03, 1.00
chad> vmstat -m | grep routetbl
routetbl 46 6K 19K 18468K 23860 0 16,32,64,128,256
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

        -crl
--
Chad R. Larson (CRL15)   602-953-1392   Brother, can you paradigm?
[EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
DCF, Inc. - 14623 North 49th Place, Scottsdale, Arizona 85254-2207

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: Route table leaks
At 12:17 AM +0100 1999/12/10, Brad Knowles wrote:

> In -CURRENT, I would say that this could probably be committed,
> if John feels safe.  I am not yet convinced that it should be
> committed to -STABLE, although things do look good so far.

Well, things continue to look good:

Fri Dec 10 10:59:55 CET 1999
netstat -ran | wc -l
121
vmstat -m | grep routetbl | grep K
routetbl 246 34K 35K 40960K 2750 0 16,32,64,128,256
uptime
10:59AM up 16:08, 0 users, load averages: 3.49, 3.83, 3.61

Fri Dec 10 11:00:56 CET 1999
netstat -ran | wc -l
120
vmstat -m | grep routetbl | grep K
routetbl 244 34K 35K 40960K 2750 0 16,32,64,128,256
uptime
11:00AM up 16:09, 0 users, load averages: 3.41, 3.81, 3.62

Looking at our stats for yesterday on this machine, we came pretty
close to setting some new records for volume, and did quite a lot of
articles.

At this stage, given that this patch has fixed John's problems, that
the previous patch appears to have fixed Joe's problems, and that I
seem to be running fine after almost a day, I'd feel more comfortable
if John decides he wants to commit this patch to -STABLE.

When that happens, I'll cvsup & rebuild all the machines I can, so
that they can all get the benefit of this patch and the other changes
that have gone in recently.

Thanks!

--
  These are my opinions -- not to be taken as official Skynet policy
|o| Brad Knowles, <[EMAIL PROTECTED]>    Belgacom Skynet NV/SA |o|
|o| Systems Architect, News & FTP Admin   Rue Col. Bourg, 124   |o|
|o| Phone/Fax: +32-2-706.11.11/12.49      B-1140 Brussels       |o|
|o| http://www.skynet.be                  Belgium               |o|
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Unix is like a wigwam -- no Gates, no Windows, and an Apache inside.
Unix is very user-friendly.  It's just picky who its friends are.
Re: Route table leaks
Brad Knowles wrote:
>
> In -CURRENT, I would say that this could probably be committed,
> if John feels safe.  I am not yet convinced that it should be
> committed to -STABLE, although things do look good so far.

Just to clarify, I committed it to -current already this morning.

John
Re: Route table leaks
At 3:00 PM -0800 1999/12/9, Julian Elischer wrote:

> so can it be committed?

In -CURRENT, I would say that this could probably be committed,
if John feels safe.  I am not yet convinced that it should be
committed to -STABLE, although things do look good so far.

--
  These are my opinions -- not to be taken as official Skynet policy
|o| Brad Knowles, <[EMAIL PROTECTED]>    Belgacom Skynet NV/SA |o|
|o| Systems Architect, News & FTP Admin   Rue Col. Bourg, 124   |o|
|o| Phone/Fax: +32-2-706.11.11/12.49      B-1140 Brussels       |o|
|o| http://www.skynet.be                  Belgium               |o|
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Unix is like a wigwam -- no Gates, no Windows, and an Apache inside.
Unix is very user-friendly.  It's just picky who its friends are.
Re: Route table leaks
so can it be committed?

On Thu, 9 Dec 1999, Joe Greco wrote:

> The patch previously mentioned has completely fixed my problem, as far as I
> can tell.
>
> routetbl 131 17K 25K 40960K 936240 0 16,32,64,128,256
>
> after a day of uptime.
>
> > here's mine..
> > this is from a single homed machine, with a default route. it's also a IRC
> > server (irc.stanford.edu), with a LOT of filtering of inbound traffic.
> >
> > FreeBSD 3.3-STABLE #8: Sat Nov 27 17:15:49 PST 1999
> >
> > 11:33PM up 2 days, 20:41, 1 user, load averages: 0.03, 0.03, 0.00
> >
> > routetbl 205 29K 10489K 10489K 34799600 0 16,32,64,128,256
> >
> > note that the table maxed out at some point (during a DoS attack.)
> >
> > root-irc.stanford.edu-[11:34pm-52]#t> netstat -ran | wc
> > 70 409 4741
> >
> > looks like it leaked 135 in 2.8 days..
> >
> >
> >-- Welcome My Son, Welcome To The Machine --
> > Bob Vaughan | techie@{w6yx|tantivy}.stanford.edu | [EMAIL PROTECTED]
> > | P.O. Box 9792, Stanford, Ca 94309-9792
> > -- I am Me, I am only Me, And no one else is Me, What could be simpler? --
> >
> >
>
Re: Route table leaks
The patch previously mentioned has completely fixed my problem, as far as I
can tell.

routetbl 131 17K 25K 40960K 936240 0 16,32,64,128,256

after a day of uptime.

> here's mine..
> this is from a single homed machine, with a default route. it's also a IRC
> server (irc.stanford.edu), with a LOT of filtering of inbound traffic.
>
> FreeBSD 3.3-STABLE #8: Sat Nov 27 17:15:49 PST 1999
>
> 11:33PM up 2 days, 20:41, 1 user, load averages: 0.03, 0.03, 0.00
>
> routetbl 205 29K 10489K 10489K 34799600 0 16,32,64,128,256
>
> note that the table maxed out at some point (during a DoS attack.)
>
> root-irc.stanford.edu-[11:34pm-52]#t> netstat -ran | wc
> 70 409 4741
>
> looks like it leaked 135 in 2.8 days..
>
>
>-- Welcome My Son, Welcome To The Machine --
> Bob Vaughan | techie@{w6yx|tantivy}.stanford.edu | [EMAIL PROTECTED]
>| P.O. Box 9792, Stanford, Ca 94309-9792
> -- I am Me, I am only Me, And no one else is Me, What could be simpler? --
>
--
... Joe

---
Joe Greco - Systems Administrator                [EMAIL PROTECTED]
Solaria Public Access UNIX - Milwaukee, WI       414/342-4847
Re: Route table leaks
here's mine..
this is from a single homed machine, with a default route. it's also a IRC
server (irc.stanford.edu), with a LOT of filtering of inbound traffic.

FreeBSD 3.3-STABLE #8: Sat Nov 27 17:15:49 PST 1999

11:33PM up 2 days, 20:41, 1 user, load averages: 0.03, 0.03, 0.00

routetbl 205 29K 10489K 10489K 34799600 0 16,32,64,128,256

note that the table maxed out at some point (during a DoS attack.)

root-irc.stanford.edu-[11:34pm-52]#t> netstat -ran | wc
70 409 4741

looks like it leaked 135 in 2.8 days..

-- Welcome My Son, Welcome To The Machine --
Bob Vaughan | techie@{w6yx|tantivy}.stanford.edu | [EMAIL PROTECTED]
            | P.O. Box 9792, Stanford, Ca 94309-9792
-- I am Me, I am only Me, And no one else is Me, What could be simpler? --
Re: Route table leaks
>
> > > > Please use 'netstat -rna' to get a listing of *all* the routes, including
> > > > the temporary ones, not just the non-temporary routes.
> >
> FWIW, another datapoint:
>
> set$ netstat -ran | wc -l
> 15
> set$ vmstat -m | grep routetbl|grep K
          Type InUse MemUse HighUse Limit Requests Limit Limit Size(s)
> routetbl 35 5K 18K 26535K 5110 0
> 16,32,64,128,256
> set$ uname -a
> FreeBSD set.spradley.org 3.3-STABLE FreeBSD 3.3-STABLE #2: Wed Oct 6
> 19:10:52 CDT 1999 [EMAIL PROTECTED]:/scratch/source/src/sys/
> compile/Set i386
> set$ uptime
> 10:21PM up 9 days, 3:24, 0 users, load averages: 0.05, 0.11, 0.06
>
>
> This is my desktop at home, used for reading mail and surfing the web,
> no routed or gated, mostly idle. 26.5 Mbytes looks kinda high to me...

Go back and read the headings of vmstat -m: you're only using 5K for
routes; the 26.5M is the limit on the vmspace.  The 18K was the highest
usage.

So you look pretty normal at 5K/35 -> 146 bytes/route.

--
Rod Grimes - KD7CAX @ CN85sl - (RWG25)               [EMAIL PROTECTED]
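The column reading above can be sketched in C. This is only an illustration: the struct and function names (malloc_stat, parse_vmstat_row, bytes_per_block) are made up here, the column order is taken from the "Type InUse MemUse HighUse Limit" header quoted in this thread, and 1K is assumed to be 1024 bytes.

```c
#include <stdio.h>

/* One "vmstat -m" row, using the column order from the header quoted
 * in this thread: Type InUse MemUse HighUse Limit ...
 * All names in this sketch are hypothetical. */
struct malloc_stat {
    char type[32];
    long inuse;       /* blocks currently allocated */
    long memuse_kb;   /* memory in use right now, in KB */
    long highuse_kb;  /* high-water mark, in KB */
    long limit_kb;    /* ceiling for this malloc type, in KB */
};

/* Parse a row whose columns are separated by whitespace, e.g.
 * "routetbl 35 5K 18K 26535K 5110 0 16,32,64,128,256".
 * Returns 0 on success, -1 if the first five fields do not match
 * (for instance when two columns have run together). */
static int parse_vmstat_row(const char *line, struct malloc_stat *out)
{
    int n = sscanf(line, "%31s %ld %ldK %ldK %ldK",
                   out->type, &out->inuse, &out->memuse_kb,
                   &out->highuse_kb, &out->limit_kb);
    return n == 5 ? 0 : -1;
}

/* Average bytes per allocated block, assuming 1K = 1024 bytes. */
static double bytes_per_block(const struct malloc_stat *s)
{
    return s->inuse > 0 ? (s->memuse_kb * 1024.0) / s->inuse : 0.0;
}
```

Fed the desktop row from this message, the sketch reports InUse 35 and MemUse 5K, so bytes_per_block() comes out a little over 146, the same figure as the 5K/35 estimate. The 26535K value lands in limit_kb, not in either usage field.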
Re: Route table leaks
> > > Please use 'netstat -rna' to get a listing of *all* the routes, including
> > > the temporary ones, not just the non-temporary routes.

FWIW, another datapoint:

set$ netstat -ran | wc -l
15
set$ vmstat -m | grep routetbl|grep K
routetbl 35 5K 18K 26535K 5110 0 16,32,64,128,256
set$ uname -a
FreeBSD set.spradley.org 3.3-STABLE FreeBSD 3.3-STABLE #2: Wed Oct 6 19:10:52 CDT 1999 [EMAIL PROTECTED]:/scratch/source/src/sys/compile/Set i386
set$ uptime
10:21PM up 9 days, 3:24, 0 users, load averages: 0.05, 0.11, 0.06

This is my desktop at home, used for reading mail and surfing the web,
no routed or gated, mostly idle.  26.5 Mbytes looks kinda high to me...
Re: Route table leaks
On Wed, 8 Dec 1999, Joe Greco wrote:

> >
> > :
> > :At 1:26 PM -0600 1999/12/8, Joe Greco wrote:
> > :
> > :>> vmstat -m | grep routetbl|grep K
> > :> routetbl289178 40961K 40961K 40960K 4357410 0
> > :>16,32,64,128,256
> > :>> netstat -rn | wc -l
> > :>16
> >
> > Please use 'netstat -rna' to get a listing of *all* the routes, including
> > the temporary ones, not just the non-temporary routes.
> >
> > -Matt
>
> > netstat -rna |wc -l
> 17
> > netstat -rn | wc -l
> 16
> > arp -an |wc -l
>0

A quite heavily loaded web server:

# netstat -ran | wc -l
106

(that was 177 a few mins ago).

# vmstat -m | grep routetbl|grep K
routetbl 228 32K 649K 19661K 11209800 0 16,32,64,128,256
# uname -r
2.2.2-RELEASE
# uptime
11:55PM up 23 days, 12:31, 1 user, load averages: 0.04, 0.03, 0.00
#

Slightly more atomic measurement:

# netstat -ran | wc -l ; vmstat -m | grep routetbl | grep '[0-9]K'
175
routetbl 354 49K 649K 19661K 11211060 0 16,32,64,128,256

--
Internet Vision         Internet Consultancy    Tel: 0171 589 4500
60 Albert Court         & Web development       Fax: 0171 589 4522
Prince Consort Road     [EMAIL PROTECTED]
London SW7 2BE          http://www.ivision.co.uk/
Re: Route table leaks
> >
> :
> :At 1:26 PM -0600 1999/12/8, Joe Greco wrote:
> :
> :>> vmstat -m | grep routetbl|grep K
> :> routetbl289178 40961K 40961K 40960K 4357410 0
> :>16,32,64,128,256
> :>> netstat -rn | wc -l
> :>16
>
> Please use 'netstat -rna' to get a listing of *all* the routes, including
> the temporary ones, not just the non-temporary routes.
>
> -Matt

> netstat -rna |wc -l
17
> netstat -rn | wc -l
16
> arp -an |wc -l
0

(yes, really)

I'm not sure a more recent box would be different.

... Joe

---
Joe Greco - Systems Administrator                [EMAIL PROTECTED]
Solaria Public Access UNIX - Milwaukee, WI       414/342-4847
Re: Route table leaks
:
:At 1:26 PM -0600 1999/12/8, Joe Greco wrote:
:
:>> vmstat -m | grep routetbl|grep K
:> routetbl289178 40961K 40961K 40960K 4357410 0
:>16,32,64,128,256
:>> netstat -rn | wc -l
:>16

Please use 'netstat -rna' to get a listing of *all* the routes, including
the temporary ones, not just the non-temporary routes.

-Matt
Re: Route table leaks
At 1:26 PM -0600 1999/12/8, Joe Greco wrote:

>> vmstat -m | grep routetbl|grep K
> routetbl289178 40961K 40961K 40960K 4357410 0
>16,32,64,128,256
>> netstat -rn | wc -l
>16

I had never looked at this on my machines (main news peering
server in the Top 100, one Intel EtherExpress Pro 10/100+ 100-Base-TX
interface with a default route, running 3.2-RELEASE):

$ vmstat -m | grep routetbl | grep K
routetbl 246 34K 36K 40960K 9200 0 16,32,64,128,256
$ netstat -nr | wc -l
13
$ uptime
9:07PM up 7 days, 8:06, 1 user, load averages: 2.87, 3.14, 3.15
$ ps axl | grep ':' | wc -l
379

> 289178 blocks (and 40960K - that's 40MB) in use to support 16 routes (that
> is 2.5MB of memory used per listed route) is a bit on the excessive side.

This machine hasn't been up very long and is running an application
profile that I assume is somewhat similar to yours (although I'm sure
yours is much more heavily tuned, as well as loaded), but 2,835.692
bytes per route (36K/13) still seems a bit excessive.

I've got another machine (an internal mailing list server, very
very lightly loaded, one Intel EtherExpress Pro 10/100+ 100-Base-TX
interface with a default route, running 3.0-RELEASE) that looks much
more reasonable:

$ vmstat -m | grep routetbl | grep K
routetbl 32 4K 8K 10400K 132120 0 16,32,64,128,256
$ netstat -nr | wc -l
11
$ uptime
9:25PM up 135 days, 11:04, 1 user, load averages: 0.02, 0.01, 0.00
$ ps axl | grep ':' | wc -l
30

However, even 744.727 bytes per route (8K/11) seems a little
higher than what I would expect, although this is *much* better than
almost 3KB/route, and especially better than 2,621,504.000
bytes/route (40MB/16).

The 312.402 bytes/route (20.731MB/69585) that Mike reported
seems much more realistic.

> I'd think that inbound connections are less likely to be an issue than
> outbound ones, as inbound connections get really heavily exercised on
> things like web servers.  But that is off-the-top-of-my-head speculation,
> and I've nothing to support that theory.

Unfortunately, I don't have any FreeBSD web servers here that I
can test that theory with.  I'm trying to get more FreeBSD production
servers installed here, but progress has been rather slow -- I can
only roll them in as old servers need to be replaced, and as FreeBSD
supports the hardware & software I need to use in order to support
the application.

--
  These are my opinions -- not to be taken as official Skynet policy
|o| Brad Knowles, <[EMAIL PROTECTED]>    Belgacom Skynet NV/SA |o|
|o| Systems Architect, News & FTP Admin   Rue Col. Bourg, 124   |o|
|o| Phone/Fax: +32-2-706.11.11/12.49      B-1140 Brussels       |o|
|o| http://www.skynet.be                  Belgium               |o|
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Unix is like a wigwam -- no Gates, no Windows, and an Apache inside.
Unix is very user-friendly.  It's just picky who its friends are.
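The per-route figures in this message each divide a "vmstat -m" column (in KB) by a line count from "netstat -nr | wc -l". A minimal C sketch of that arithmetic follows; the function name bytes_per_route is made up for illustration, and 1K is assumed to be 1024 bytes.

```c
#include <stdio.h>

/* Bytes of routetbl memory per listed route.
 *   kbytes: a MemUse or HighUse figure from "vmstat -m", in KB
 *   routes: a line count from "netstat -nr | wc -l"
 * Hypothetical helper; returns 0 when there are no routes. */
static double bytes_per_route(long kbytes, long routes)
{
    if (routes <= 0)
        return 0.0;
    return (kbytes * 1024.0) / routes;
}
```

With these inputs the numbers in the message come back out: 36K over 13 lines is about 2835.69, 8K over 11 is about 744.73, 40961K over 16 is 2621504, and 21229K over 69585 is about 312.40. Since the netstat line count includes the header lines, the true per-route cost is probably somewhat higher than each of these.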
Re: Route table leaks
> > Have any of you been seeing route table leaks in -current?  I noticed
> > this week that cvsup-master.freebsd.org is suffering from them.  I
> > actually had to reboot it because it couldn't allocate any more.  From
> > the "vmstat -m" output:
> >
> > Memory statistics by type                          Type  Kern
> >         Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
> > [...]
> >     routetbl150907 21221K 21221K 21221K 4621840 0 16,32,64,128,256
> > [...]
> > I can think of some experiments to try in order to start to diagnose
> > it.  But first, have any of you seen this problem?
>
> Hell, I've been seeing this for well over a year.  The last time I mentioned
> it, everybody seemed to think I was nuts.  :-)

:-)

> FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
>
> routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256
>

Mine has leaked very much, and this is on a bgp4 gated box:

routetbl143395 19599K 21961K 32768K 23449660 0 16,32,64,128,256

Note the request counts vs total table size, oh and:

{104}% netstat -ran | wc
69398 418030 4862684
{105}% uptime
11:20AM up 7 days, 8:15, 1 user, load averages: 0.21, 0.06, 0.02
{106}% uname -a
FreeBSD br1 3.3-STABLE FreeBSD 3.3-STABLE #0: Tue Nov 23 20:15:59 PST 1999

I haven't leaked away as much as you have, so it seems that actually
having the full routing table reduces it :-)

> When it gets like that, it starts losing the ability to add further ARP
> table entries and essentially starts going randomly deaf to local hosts
> (and to a lesser extent remote hosts).

That's what I have seen on 3 occasions now: you get a "can't allocate
llinfo" error from arpresolve/arplookup:

/var/log/messages.1.gz:Dec 1 18:01:18 br1 /kernel: arpresolve: can't allocate llinfo for 205.238.40.30rt

Note the bad printf output, that ``rt'' really is in my syslogs :-(

>
> I've also seen it on a 3.3-RELEASE box, but it's not currently happening
> to any of them right now.
>
> Machines in question are SMP boxes, and get hit fairly heavily in various
> Usenet news server roles.  Seems to happen quite a bit more often on boxes
> that talk to a wide variety of host types, and I can't recall having seen
> it on boxes that only talk to other FreeBSD boxes.  But that could also be
> because the network environment is much more controlled internally.
>

Running a few full-blown IBGP and EBGP sessions carrying 2 or more views
of the full 68K-route Internet routing table, it takes about 7 to 10 days
of route churn on a large KVM space kernel to cause the llinfo problem...
or at least I think this is what I have been seeing since I upgraded our
3.2 systems to 3.3-stable about 3 weeks ago.  Before this we were getting
at least 30-day uptimes (about all I'd let it get before some other change
had us rebooting, not due to a problem on the boxes).

--
Rod Grimes - KD7CAX @ CN85sl - (RWG25)               [EMAIL PROTECTED]
Re: Route table leaks
In article <[EMAIL PROTECTED]>,
Joe Greco <[EMAIL PROTECTED]> wrote:
>
> Hell, I've been seeing this for well over a year.  The last time I mentioned
> it, everybody seemed to think I was nuts.  :-)
>
> FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
>
> routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256

Yes, the leak is real.  I posted a patch for it in this thread.  I'm
going to commit it soon.  Why don't you try the patch?  It fixed the
leaks 100% on cvsup-master.freebsd.org.

Yes, the leaks are probably present on -stable too.

John
--
  John Polstra                                       [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up."  -- Nora Ephron
Re: Route table leaks
> At 08:51 AM 12/8/99 -0600, Joe Greco wrote:
> >Most of which are routes pointing at the 3 private-net interfaces on the
> >machine.
>
> The info was provided more as a comparison, that quantity of routes do not
> necessary mean leak ?  Or perhaps it does.  But after 90 days, you would
> think the problem would have been hit no ?

My _point_ was that this issue (or some variant) has been around for some
time.  I suspect it doesn't have to do with packet forwarding, but does
somehow have to do with machines that actually establish or receive TCP
connections.  Why this only affects certain types of systems, I don't know.

Certainly a large number of routes doesn't mean anything.  However,

> vmstat -m | grep routetbl|grep K
routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256
> netstat -rn | wc -l
16

289178 blocks (and 40960K - that's 40MB) in use to support 16 routes (that
is 2.5MB of memory used per listed route) is a bit on the excessive side.

Your example was more along the lines of 20MB to support 65000 routes, only
a few hundred bytes per route, which is roughly on the order of what I'd
expect per route.

I'd think that inbound connections are less likely to be an issue than
outbound ones, as inbound connections get really heavily exercised on
things like web servers.  But that is off-the-top-of-my-head speculation,
and I've nothing to support that theory.

... Joe

---
Joe Greco - Systems Administrator                [EMAIL PROTECTED]
Solaria Public Access UNIX - Milwaukee, WI       414/342-4847
Re: Route table leaks
> >Hell, I've been seeing this for well over a year.  The last time I mentioned
> >it, everybody seemed to think I was nuts.  :-)
> >
> >FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
> >
> > routetbl289178 40961K 40961K 40960K 4357410 0
> 16,32,64,128,256
>
> Well, I havent seen problems of this nature (yet), but for reference,
>
> > netstat -nr | wc
>69585 419164 4875822
>
> routetbl143718 19653K 21229K 21229K 65271520 0 16,32,64,128,256
>
> FreeBSD 3.3-RC #0: Wed Sep 8 13:37:19 EDT 1999
> uptime
> 9:44AM up 90 days, 20:35, 2 users, load averages: 0.00, 0.01, 0.00
>
> This is a border router with 2 views of the net running defaultless.

See my other email.  And now, upon further thought: having full routes
without a default means the cloning code doesn't get used much, since you
already have real routes :-).  Thus your problem would be less.

Hmm, let me go to a box running ``defaulted'' yet producing several 100k
connections/day and see how bad its route space looks.

routetbl 329 45K 1532K 42709K 5040600 0 16,32,64,128,256
:rgrimes{100}% netstat -ran | wc
69 403 4675

Yep... looks like it leaked 329-69==260 in 17 days uptime :-(

--
Rod Grimes - KD7CAX @ CN85sl - (RWG25)               [EMAIL PROTECTED]
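The estimate used above (routetbl InUse minus the "netstat -ran" line count, spread over the uptime) can be written out as a small C sketch. The function names are made up for illustration, and the result is only a rough indicator: InUse counts every allocation of the routetbl malloc type, not just route entries, and the netstat line count includes header lines.

```c
#include <stdio.h>

/* Blocks allocated under the routetbl malloc type beyond the routes
 * that are still visible in the routing table.  Hypothetical helper;
 * clamps at zero so a healthy box never reports a negative leak. */
static long leaked_entries(long inuse, long visible_routes)
{
    long diff = inuse - visible_routes;
    return diff > 0 ? diff : 0;
}

/* Average leak rate in entries per day of uptime. */
static double leak_rate_per_day(long leaked, double uptime_days)
{
    return uptime_days > 0.0 ? leaked / uptime_days : 0.0;
}
```

For the box in this message, leaked_entries(329, 69) is 260, and over 17 days that is roughly 15 entries a day. The irc.stanford.edu figures elsewhere in the thread (205 in use, 70 lines, 2.8 days) give 135 entries, or about 48 a day.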
Re: Route table leaks
At 08:51 AM 12/8/99 -0600, Joe Greco wrote:
>Most of which are routes pointing at the 3 private-net interfaces on the
>machine.

The info was provided more as a comparison: a large quantity of routes
does not necessarily mean a leak?  Or perhaps it does.  But after 90 days,
you would think the problem would have been hit, no?

        ---Mike

Mike Tancsa,                              tel +1 519 651 3400
Network Administrator,                    [EMAIL PROTECTED]
Sentex Communications                     www.sentex.net
Cambridge, Ontario Canada
Re: Route table leaks
> >Hell, I've been seeing this for well over a year.  The last time I mentioned
> >it, everybody seemed to think I was nuts.  :-)
> >
> >FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
> >
> > routetbl289178 40961K 40961K 40960K 4357410 0
> 16,32,64,128,256
>
> Well, I havent seen problems of this nature (yet), but for reference,
>
> > netstat -nr | wc
>69585 419164 4875822
>
> routetbl143718 19653K 21229K 21229K 65271520 0 16,32,64,128,256
>
> FreeBSD 3.3-RC #0: Wed Sep 8 13:37:19 EDT 1999
> uptime
> 9:44AM up 90 days, 20:35, 2 users, load averages: 0.00, 0.01, 0.00
>
> This is a border router with 2 views of the net running defaultless.

Yeah, nice :-), but the machine I'm looking at is one with a default route
and

> netstat -rn | wc -l
16

Most of which are routes pointing at the 3 private-net interfaces on the
machine.

... Joe

---
Joe Greco - Systems Administrator                [EMAIL PROTECTED]
Solaria Public Access UNIX - Milwaukee, WI       414/342-4847
Re: Route table leaks
>Hell, I've been seeing this for well over a year.  The last time I mentioned
>it, everybody seemed to think I was nuts.  :-)
>
>FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999
>
> routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256

Well, I haven't seen problems of this nature (yet), but for reference,

> netstat -nr | wc
69585 419164 4875822

routetbl143718 19653K 21229K 21229K 65271520 0 16,32,64,128,256

FreeBSD 3.3-RC #0: Wed Sep 8 13:37:19 EDT 1999
uptime
9:44AM up 90 days, 20:35, 2 users, load averages: 0.00, 0.01, 0.00

This is a border router with 2 views of the net running defaultless.

        ---Mike

Mike Tancsa,                              tel +1 519 651 3400
Network Administrator,                    [EMAIL PROTECTED]
Sentex Communications                     www.sentex.net
Cambridge, Ontario Canada
Re: Route table leaks
> Have any of you been seeing route table leaks in -current?  I noticed
> this week that cvsup-master.freebsd.org is suffering from them.  I
> actually had to reboot it because it couldn't allocate any more.  From
> the "vmstat -m" output:
>
> Memory statistics by type                          Type  Kern
>         Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
> [...]
>     routetbl150907 21221K 21221K 21221K 4621840 0 16,32,64,128,256
> [...]
> I can think of some experiments to try in order to start to diagnose
> it.  But first, have any of you seen this problem?

Hell, I've been seeing this for well over a year.  The last time I mentioned
it, everybody seemed to think I was nuts.  :-)

FreeBSD 3.0-19981015-BETA #1: Tue Jan 12 03:30:56 CST 1999

routetbl289178 40961K 40961K 40960K 4357410 0 16,32,64,128,256

When it gets like that, it starts losing the ability to add further ARP
table entries and essentially starts going randomly deaf to local hosts
(and to a lesser extent remote hosts).

I've also seen it on a 3.3-RELEASE box, but it's not currently happening
to any of them right now.

Machines in question are SMP boxes, and get hit fairly heavily in various
Usenet news server roles.  Seems to happen quite a bit more often on boxes
that talk to a wide variety of host types, and I can't recall having seen
it on boxes that only talk to other FreeBSD boxes.  But that could also be
because the network environment is much more controlled internally.

... Joe

---
Joe Greco - Systems Administrator                [EMAIL PROTECTED]
Solaria Public Access UNIX - Milwaukee, WI       414/342-4847
Re: Route table leaks
In article <[EMAIL PROTECTED]>,
Garrett Wollman <[EMAIL PROTECTED]> wrote:
> < said:
>
> > The route disappears from the routing table, but it is
> > not freed.  (The Leak.)
>
> Actually, no.
>
> > Now cause some packets to travel on the connection.  A new cloned
> > route is created and added to the routing table.
>
> Here is where the leak happens, as demonstrated by your patch.

By "The Leak" I meant the moment when a routing table entry becomes
unreferenced without being freed.  But it's not worth arguing about.

> (Great detective work, BTW.)

Thanks!  I definitely learned a lot about the routing code. :-)

> > 1. Do I really need the splnet calls around RTFREE?
>
> I don't think so.  All calls into the routing code should already be
> protected by splnet.

Good -- that's what I was hoping you'd say.  I'll go over all the
calls, and if I'm reasonably convinced that they're already at splnet
then I'll test it for awhile without the hopefully-redundant splnet
calls.  I ran it that way here at home on a very lightly loaded system
for a day or so without problems.  But that doesn't prove much.

> We may have to revisit this in the future for finer-grained locking.

Good point.

> > 2. To eliminate all the duplicated code, shall I make rtalloc just
> > call rtalloc_ign(ro, 0UL)?  I assume that was avoided originally for
> > performance reasons, but now there's more code than before.
>
> Actually, it was avoided originally because it was easier to just cut
> and paste.  There is no inherent reason for the duplication, although
> Matt's suggestion of topologically sorting the routines so that GCC
> will have a chance at inlining is not a bad one.

Thanks, that makes me feel better.

> (I'd actually like to find all the calls to rtalloc() and simply add
> an extra argument to them.  I can't fathom why I didn't do that five
> years ago)

I'm just wondering how much of a "standard" kernel API rtalloc() is.
I.e., might 3rd-party drivers call it?  The only one I have a copy of
is the driver from ET Inc., and it doesn't call any of the routing
functions.

John
--
  John Polstra                                       [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up."  -- Nora Ephron
Re: Route table leaks
< said:

> The route disappears from the routing table, but it is
> not freed.  (The Leak.)

Actually, no.

> Now cause some packets to travel on the connection.  A new cloned
> route is created and added to the routing table.

Here is where the leak happens, as demonstrated by your patch.
(Great detective work, BTW.)

> 1. Do I really need the splnet calls around RTFREE?

I don't think so.  All calls into the routing code should already be
protected by splnet.  We may have to revisit this in the future for
finer-grained locking.

> 2. To eliminate all the duplicated code, shall I make rtalloc just
> call rtalloc_ign(ro, 0UL)?  I assume that was avoided originally for
> performance reasons, but now there's more code than before.

Actually, it was avoided originally because it was easier to just cut
and paste.  There is no inherent reason for the duplication, although
Matt's suggestion of topologically sorting the routines so that GCC
will have a chance at inlining is not a bad one.

(I'd actually like to find all the calls to rtalloc() and simply add
an extra argument to them.  I can't fathom why I didn't do that five
years ago)

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]    | O Siem / The fires of freedom
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA| - Susan Aglukark and Chad Irschick
Re: Route table leaks
:>
:>    Yes.  Because the route table may be flushed from an interrupt in
:>    a low memory situation.
:
:I guess I didn't state the question very well.  I realize that RTFREE
:has to be executed at splnet.  But I think it's likely that rtalloc
:and rtalloc_ign are always called at splnet or better.  If that's the
:case and it's already required, then adding the redundant splnet calls
:would be obfuscatory.  I'd rather add a comment instead.

I ran across this situation in a number of places while fixing the
VM system.  If you want to get rid of the splnet() calls you have to
document the procedure containing the calls in the comments above
the procedure, adding something like:

/*
 * fubar:	fubar the kernel
 *
 *	This procedure must be called at splnet()
 *	This procedure does not block
 *	This procedure must
 */

And then make sure that all calls to the procedure indeed occur at
splnet().  Then you can get rid of the splnet() calls within the procedure.

For examples of the type of documentation necessary, look at
vm/swap_pager.c.

So what it comes down to are the requirements you wish to impose on the
official use of the procedure.  Note that making spl*() calls when the
current process is already at that spl level does not impose any real
overhead.

:
:John
:--
:  John Polstra                                       [EMAIL PROTECTED]

					-Matt
					Matthew Dillon
					<[EMAIL PROTECTED]>
Re: Route table leaks
In article <[EMAIL PROTECTED]>,
Matthew Dillon <[EMAIL PROTECTED]> wrote:
> :1. Do I really need the splnet calls around RTFREE?
>
>     Yes.  Because the route table may be flushed from an interrupt in
>     a low memory situation.

I guess I didn't state the question very well.  I realize that RTFREE
has to be executed at splnet.  But I think it's likely that rtalloc
and rtalloc_ign are always called at splnet or better.  If that's the
case and it's already required, then adding the redundant splnet calls
would be obfuscatory.  I'd rather add a comment instead.

> :2. To eliminate all the duplicated code, shall I make rtalloc just
> :call rtalloc_ign(ro, 0UL)?  I assume that was avoided originally for
> :performance reasons, but now there's more code than before.
> :
>     Hmm.  One trick I used in the VM code was to put the common code in an
>     inline static function and leave the external functions broken out to
>     avoid an unnecessary call chain.

OK, that's a possibility.  I was hoping our network-meister (Yo,
Garrett!) would give me a sign as to whether it would be worthwhile or
not.

John
--
  John Polstra                                       [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up."  -- Nora Ephron
Re: Route table leaks
:+		s = splnet();
:+		RTFREE(rt);
:+		splx(s);
:...
:+		s = splnet();
:+		RTFREE(rt);
:+		splx(s);
:+	}
: 	ro->ro_rt = rtalloc1(&ro->ro_dst, 1, ignore);
: }
:
:
:Now for my questions:
:
:1.  Do I really need the splnet calls around RTFREE?

    Yes.  Because the route table may be flushed from an interrupt in
    a low memory situation.

:2.  To eliminate all the duplicated code, shall I make rtalloc just
:call rtalloc_ign(ro, 0UL)?  I assume that was avoided originally for
:performance reasons, but now there's more code than before.
:
:John
:--
:  John Polstra                                   [EMAIL PROTECTED]

    Hmm.  One trick I used in the VM code was to put the common code in
    an inline static function and leave the external functions broken
    out to avoid an unnecessary call chain.

    So, for example, if rtalloc() and rtalloc_ign() require a bunch of
    extra code prior to calling rtalloc1(), then a good solution would
    be to put the bulk of that code or perhaps even all of it in an
    inline and then have rtalloc() and rtalloc_ign() both call the
    inline with appropriate arguments.

    Remember that inline function calls *WILL* optimize constants passed
    as arguments.  It's a very effective way to genericize a block of
    code without creating any procedural recursion.

                                        -Matt
                                        Matthew Dillon
                                        <[EMAIL PROTECTED]>
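The inline-function suggestion can be sketched in userland roughly as
follows. This is an illustration under stated assumptions, not the code
that was committed: struct entry and struct cache are simplified
stand-ins for struct rtentry and struct route, lookup1() stands in for
rtalloc1(), and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for struct rtentry / struct route (assumptions). */
struct entry {
	int up;		/* models the RTF_UP flag */
	int refcnt;	/* references currently held on this entry */
};

struct cache {
	struct entry *cur;		/* models ro->ro_rt */
	unsigned long last_mask;	/* mask passed to the last lookup */
};

struct entry the_entry = { 1, 0 };
int lookups;

/* Stand-in for rtalloc1(): hands out the_entry and takes a reference. */
struct entry *
lookup1(struct cache *c, unsigned long ignore)
{
	assert(c != NULL);
	lookups++;
	c->last_mask = ignore;
	the_entry.refcnt++;
	return &the_entry;
}

/*
 * Common body, written once.  Because it is inline, a constant mask
 * passed by a caller can be folded in by the compiler.
 */
static inline void
cache_fill(struct cache *c, unsigned long ignore)
{
	struct entry *e;

	if ((e = c->cur) != NULL) {
		if (e->up)
			return;		/* cached entry is still usable */
		e->refcnt--;		/* drop the stale reference first */
	}
	c->cur = lookup1(c, ignore);
}

/* The external entry points stay separate: no extra call chain. */
void
cache_alloc(struct cache *c)
{
	cache_fill(c, 0UL);
}

void
cache_alloc_ign(struct cache *c, unsigned long ignore)
{
	cache_fill(c, ignore);
}
```

Both wrappers compile down to their own copy of the shared body, so the
duplication disappears from the source without one exported function
calling the other.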
Re: Route table leaks
I think I've finally found the route table leak.  At least I found _a_
leak, and I think it's the one that's been plaguing cvsup-master.  I
have a question or two (see below) before I commit the fix.

Here's how to leak a route table entry.  Establish a TCP connection
with another machine so that you have a cloned route to that host.
With the connection idle, use "route delete" to remove the cloned
route.  The route disappears from the routing table, but it is not
freed.  (The Leak.)  Now cause some packets to travel on the
connection.  A new cloned route is created and added to the routing
table.  Each time you do that, you leak a struct rtentry and also a
32-byte chunk that's used to hold a couple of address structures.

Routed is doing these route deletions regularly on cvsup-master.  I
haven't tried to figure out why.

The leak is in rtalloc() and rtalloc_ign(), and here's the patch I'm
using to fix it:

Index: route.c
===================================================================
RCS file: /home/ncvs/src/sys/net/route.c,v
retrieving revision 1.53
diff -u -r1.53 route.c
--- route.c	1999/08/28 00:48:28	1.53
+++ route.c	1999/11/27 01:21:56
@@ -88,8 +88,16 @@
 rtalloc(ro)
 	register struct route *ro;
 {
-	if (ro->ro_rt && ro->ro_rt->rt_ifp && (ro->ro_rt->rt_flags & RTF_UP))
-		return;				 /* XXX */
+	struct rtentry *rt;
+	int s;
+
+	if ((rt = ro->ro_rt) != NULL) {
+		if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP)
+			return;
+		s = splnet();
+		RTFREE(rt);
+		splx(s);
+	}
 	ro->ro_rt = rtalloc1(&ro->ro_dst, 1, 0UL);
 }
 
@@ -98,8 +106,16 @@
 	register struct route *ro;
 	u_long ignore;
 {
-	if (ro->ro_rt && ro->ro_rt->rt_ifp && (ro->ro_rt->rt_flags & RTF_UP))
-		return;				 /* XXX */
+	struct rtentry *rt;
+	int s;
+
+	if ((rt = ro->ro_rt) != NULL) {
+		if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP)
+			return;
+		s = splnet();
+		RTFREE(rt);
+		splx(s);
+	}
 	ro->ro_rt = rtalloc1(&ro->ro_dst, 1, ignore);
 }
 

The original code was jamming a new pointer into ro->ro_rt, but it
didn't free the old rtentry that was referenced there.

Now for my questions:

1.  Do I really need the splnet calls around RTFREE?

2.  To eliminate all the duplicated code, shall I make rtalloc just
call rtalloc_ign(ro, 0UL)?  I assume that was avoided originally for
performance reasons, but now there's more code than before.

John
--
  John Polstra                                   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                    Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up."    -- Nora Ephron
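The shape of the bug can be reproduced in a small userland model. This
is only a sketch: struct entry, struct cache, and the helper names are
hypothetical stand-ins (entry_release() plays the role of RTFREE, and
clearing the up field models what "route delete" does to a cached
route). The leaky variant overwrites a stale cached pointer; the fixed
variant releases it first, as the patch does.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for a reference-counted route entry (assumption). */
struct entry {
	int refcnt;	/* number of holders */
	int up;		/* models RTF_UP; cleared when the route is deleted */
};

/* Models the cached pointer in struct route (ro->ro_rt). */
struct cache {
	struct entry *cur;
};

int live_entries;	/* entries allocated and not yet freed */

/* Allocate a fresh entry holding one reference for the caller. */
struct entry *
entry_lookup(void)
{
	struct entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->refcnt = 1;
	e->up = 1;
	live_entries++;
	return e;
}

/* Drop one reference; free the entry when the last one goes away. */
void
entry_release(struct entry *e)
{
	if (--e->refcnt == 0) {
		free(e);
		live_entries--;
	}
}

/* Old behaviour: a stale pointer is overwritten and never released. */
void
cache_refresh_leaky(struct cache *c)
{
	if (c->cur != NULL && c->cur->up)
		return;
	c->cur = entry_lookup();
}

/* Patched behaviour: release the stale entry before replacing it. */
void
cache_refresh_fixed(struct cache *c)
{
	struct entry *e;

	if ((e = c->cur) != NULL) {
		if (e->up)
			return;
		entry_release(e);
	}
	c->cur = entry_lookup();
}
```

In the leaky variant every delete-then-refresh cycle leaves one more
entry allocated with nothing pointing at it, which matches the steady
growth reported for routetbl.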
Re: Route table leaks
I've been working on the cvsup-master route table leaks.  I haven't
found the bug yet, but I've got some clues now.  If this info inspires
a Eureka! from any of you, please let me know.

I started by running this script to print out key information every 2
seconds while I ran a test:

    #! /bin/sh

    while :; do
        date "+%H:%M:%S"
        vmstat -m | grep 'routetbl[ ]' || exit
        netstat -ran | egrep 'default|206\.213\.73\.12' || exit
        echo "=" || exit
        sleep 2 || exit
    done

(Yes, this is crude.  But remember, the machine is 900 miles away and
I'm trying not to disrupt service too much.)

The output began like this:

    19:41:56
    routetbl  3682   518K    522K  21221K    11473    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1823       60      wb0
    =
    19:41:58
    routetbl  3682   518K    522K  21221K    11473    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1823       60      wb0

I.e., there were 3682 route table structures in use, and 1823
references to the default route.

Then I made a connection to the CVSup server from one of my own
machines (206.213.73.12):

    19:42:00
    routetbl  3684   518K    522K  21221K    11475    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1824       60      wb0
    206.213.73.12      204.216.27.17      UGHW        1       64      wb0

So far, so good.  A cloned route was created and the refcount on the
default route was incremented by one.  Two new route table entries
were allocated, and that seems to be normal and OK.

I immediately closed the connection, and it entered the TIME_WAIT
state on cvsup-master.  The script output remained as above for 60
seconds (2 * MSL), after which it changed to this:

    19:42:59
    routetbl  3684   518K    522K  21221K    11475    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1824       62      wb0
    206.213.73.12      204.216.27.17      UGHW3       0       64      wb0   3600

This still looks OK.  The cloned route has gained the "3" flag and a
1-hour expiration time.  That is because the TIME_WAIT state has
ended, the PCB has been discarded, and the cloned route is now being
managed by the caching code in "sys/netinet/in_rmx.c".  (That's what
the "3" flag signifies.)  The basic idea of this code (as I understand
it) is to keep cloned routes for dead connections around for awhile in
case they are needed again soon.  That's useful since the cloned
routes contain RTT estimates and so forth.

Now we get to the interesting part.  One would expect the route to
remain cached for 3600 seconds.  There are ways that in_rmx.c can
expire it sooner than that, but I confirmed that those situations
(e.g., too many cached routes) aren't arising.  Nevertheless, the
route is deleted after roughly another 200 seconds:

    19:46:14
    routetbl  3684   518K    522K  21221K    11478    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1824       64      wb0
    206.213.73.12      204.216.27.17      UGHW3       0       64      wb0   3405
    =
    19:46:16
    routetbl  3682   518K    522K  21221K    11480    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1823       64      wb0

Using DDB and some of routed's tracing options I determined that
routed is deleting the route.  More on that later.

Anyway, given that the route is being deleted, things still look OK in
the above.  The route is deleted, the 2 route table entries are freed
again, and the refcount on the default route is decremented back to
its original value.  But look what happens in the next 2 seconds:

    19:46:18
    routetbl  3684   518K    522K  21221K    11482    0     0  16,32,64,128,256
    default            204.216.27.17      UGc      1824       64      wb0

The 2 entries were allocated again and the refcount on the default
route was incremented.  Why?  I don't know (yet).  But the numbers
remain that way thereafter.  That's the leak, and I can reproduce it
reliably on cvsup-master.

Unfortunately, I cannot reproduce the problem here on my own -current
machine.  I tried to simulate the environment as accurately as
possible, including running routed.  On my machine, routed deletes the
cached route before it has expired too, but the leak doesn't happen.

One other thing.  Back on cvsup-master, I changed rc.conf so that it
sets the default route statically, and I disabled routed.  That has
completely eliminated the route table leak.

Any ideas?  Using DDB remotely through a console switch really isn't
much fun.  I'd prefer a Eureka! from somebody. :-)

John
--
  John Polstra                                   [EMAIL PROTECTED]
  John D. Polstra & Co., Inc.                    Seattle, Washington USA
  "No matter how cynical I get, I just can't keep up."    -- Nora Ephron
RE: Route table leaks
Garrett Wollman wrote:
> Now things start to make sense:
>
> root@xyz(55)$ netstat -f inet -n | fgrep CLOSING | wc -l
>     1287
>
> (this machine still has the bug that Jonathan Lemon fixed).  Now it's
> clear what's going on.  The ``missing'' routes have been deleted from
> the routing table, but have not yet been freed because these old PCBs
> still hold a reference.

Unfortunately, that doesn't seem to be it on cvsup-master:

cvsup-master# netstat -nf inet
Active Internet connections
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp        0      0  204.216.27.25.1379     204.216.27.21.5998     ESTABLISHED
tcp        0  17520  204.216.27.25.1378     204.216.27.21.5999     ESTABLISHED
tcp        0      0  204.216.27.25.5999     194.151.64.11.3112     ESTABLISHED
tcp        0     60  204.216.27.25.22       206.213.73.13.3626     ESTABLISHED
tcp        0      0  204.216.27.25.1377     204.216.27.21.5998     TIME_WAIT
tcp        0      0  204.216.27.25.1376     204.216.27.21.5999     TIME_WAIT
tcp        0     72  204.216.27.25.5999     195.205.77.13.2154     ESTABLISHED
tcp        0     12  204.216.27.25.5999     193.193.193.113.3678   ESTABLISHED
udp        0      0  127.0.0.1.123          *.*
udp        0      0  204.216.27.25.123      *.*

cvsup-master# vmstat -m | grep 'routetbl '
routetbl  2274   320K    320K  21221K     7424    0     0  16,32,64,128,256

cvsup-master# netstat -ran
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use  Netif Expire
default            204.216.27.17      UGc       1119      11    wb0
127.0.0.1          127.0.0.1          UH           3      28    lo0
130.240.16.109     204.216.27.17      UGHW         1     119    wb0
193.193.193.113    204.216.27.17      UGHW         1     111    wb0
194.151.64.11      204.216.27.17      UGHW         1      41    wb0
195.205.77.13      204.216.27.17      UGHW         1     172    wb0
204.216.27.3       204.216.27.17      UGHW3        0      13    wb0   3563
204.216.27.16/28   link#1             UC           0       0    wb0
204.216.27.17      0:0:c:4:2e:2e      UHLW         7       0    wb0    453
204.216.27.18      0:a0:c9:97:e8:ae   UHLW         0      88    wb0   1056
204.216.27.20      link#1             UHLW         2      94    wb0
204.216.27.21      0:a0:c9:a6:e:a6    UHLW         2  485678    wb0   1054
204.216.27.26      link#1             UHLW         2       0    wb0
204.216.27.27      0:a0:c9:a5:f3:7f   UHLW         2   90723    wb0      3
204.216.27.31      ff:ff:ff:ff:ff:ff  UHLWb        0       1    wb0
204.216.27.192/28  204.216.27.26      UGc          0       0    wb0
204.216.27.224/28  204.216.27.26      UGc          0       0    wb0
206.213.73.13      204.216.27.17      UGHW         1      58    wb0

John
RE: Route table leaks
< said:

>> [quoting me:]
>> What does `netstat -ran' say?  You're not seeing all the routes
>> without the `-a' flag.

> It lists some additional routes with -a, but not many.  Here's the
> latest output (still growing, as you can see):

> cvsup-master# vmstat -m | grep 'routetbl '
>      routetbl   822   115K    115K  21221K     2669    0     0  16,32,64,128,256

Hmmm.  On one of my machines:

Memory statistics by type                          Type  Kern
        Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
    routetbl   171    24K    184K  10366K   976273    0     0  16,32,64,128,256

Looks fine.  Another machine says:

    routetbl  2755   384K    394K  42708K   928043    0     0  16,32,64,128,256

It also tells me:

root@xyz(49)$ netstat -ran | wc -l
     118
root@xyz(50)$ netstat -ran | fgrep default
default            18.24.10.3         UGc2513963 ti0
root@xyz(51)$ netstat -f inet -n | wc -l
    1331

Now things start to make sense:

root@xyz(55)$ netstat -f inet -n | fgrep CLOSING | wc -l
    1287

(this machine still has the bug that Jonathan Lemon fixed).  Now it's
clear what's going on.  The ``missing'' routes have been deleted from
the routing table, but have not yet been freed because these old PCBs
still hold a reference.

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
[EMAIL PROTECTED]    | O Siem / The fires of freedom
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA|                     - Susan Aglukark and Chad Irschick
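Why an allocator count can exceed the number of routes a table dump
shows can be illustrated with a small model. It is a sketch under
assumptions: a fixed pool stands in for the kernel allocator, "holders"
stand in for PCBs, and all names are hypothetical. An entry deleted
from the table stays allocated until its last holder lets go.

```c
#include <assert.h>
#include <stddef.h>

#define NENT 8

/* Simplified stand-in for a route entry (assumption, not kernel code). */
struct entry {
	int allocated;	/* memory in use: what an allocator count reflects */
	int in_table;	/* reachable from the table: what a dump shows */
	int refcnt;	/* the table's reference plus one per holder */
};

struct entry pool[NENT];

/* Entries whose memory is still in use. */
int
count_allocated(void)
{
	int i, n = 0;

	for (i = 0; i < NENT; i++)
		n += pool[i].allocated;
	return n;
}

/* Entries that a dump of the table would list. */
int
count_visible(void)
{
	int i, n = 0;

	for (i = 0; i < NENT; i++)
		n += pool[i].allocated && pool[i].in_table;
	return n;
}

/* Insert a new entry; the table itself holds the first reference. */
struct entry *
entry_add(void)
{
	int i;

	for (i = 0; i < NENT; i++) {
		if (!pool[i].allocated) {
			pool[i].allocated = 1;
			pool[i].in_table = 1;
			pool[i].refcnt = 1;
			return &pool[i];
		}
	}
	return NULL;	/* pool exhausted */
}

/* A holder (for example a connection) takes a reference. */
void
entry_hold(struct entry *e)
{
	assert(e->allocated);
	e->refcnt++;
}

/* Drop one reference; the memory is reclaimed with the last one. */
void
entry_release(struct entry *e)
{
	assert(e->refcnt > 0);
	if (--e->refcnt == 0)
		e->allocated = 0;
}

/* Remove from the table: invisible at once, freed only when unreferenced. */
void
entry_delete(struct entry *e)
{
	e->in_table = 0;
	entry_release(e);
}
```

With many lingering holders, count_allocated() stays high while
count_visible() stays small, which is the pattern in the figures quoted
above.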
RE: Route table leaks
Garrett Wollman wrote:
> < said:
>
>> are climbing steadily.  Also the references to the default route as
>> reported by "netstat -rn" are climbing.  (They went from 187 to 193 in
>> the past 2 minutes or so.)
>
> What does `netstat -ran' say?  You're not seeing all the routes
> without the `-a' flag.

It lists some additional routes with -a, but not many.  Here's the
latest output (still growing, as you can see):

cvsup-master# vmstat -m | grep 'routetbl '
routetbl   822   115K    115K  21221K     2669    0     0  16,32,64,128,256

cvsup-master# netstat -ran
Routing tables

Internet:
Destination        Gateway            Flags     Refs     Use  Netif Expire
default            204.216.27.17      UGc        393       7    wb0
127.0.0.1          127.0.0.1          UH           3      19    lo0
194.151.64.11      204.216.27.17      UGHW         1     585    wb0
195.113.19.84      204.216.27.17      UGHW3        0     508    wb0   3569
203.139.121.132    204.216.27.17      UGHW3        0     352    wb0   3556
203.178.140.4      204.216.27.17      UGHW         1     867    wb0
204.216.27.3       204.216.27.17      UGHW3        0       9    wb0   3523
204.216.27.16/28   link#1             UC           0       0    wb0
204.216.27.17      0:0:c:4:2e:2e      UHLW         7       0    wb0    261
204.216.27.18      0:a0:c9:97:e8:ae   UHLW         0      38    wb0   1005
204.216.27.20      0:0:d7:0:4:14      UHLW         2      50    wb0
204.216.27.21      0:a0:c9:a6:e:a6    UHLW         2  188782    wb0   1002
204.216.27.26      link#1             UHLW         2       0    wb0
204.216.27.27      0:a0:c9:a5:f3:7f   UHLW         2   32737    wb0   1051
204.216.27.31      ff:ff:ff:ff:ff:ff  UHLWb        0       1    wb0
204.216.27.192/28  204.216.27.26      UGc          0       0    wb0
204.216.27.224/28  204.216.27.26      UGc          0       0    wb0
206.213.73.13      204.216.27.17      UGHW         1      62    wb0

John