Re: Quggaa locking hard.
On Sat, Dec 5, 2009 at 9:11 AM, Mike Tancsa wrote:

> At 04:07 PM 12/4/2009, K. Macy wrote:
>
>> If you have a large number of routes then you will want to disable the
>> flowtable.
>
> Thanks! I will remove from boxes that act as routers / large firewalls.
> However, the high load avg is something new. Even when the box is doing
> nothing, it sits at 2.00 for some reason. This was not happening from the
> code base a week ago or so.

Just to add something really interesting to this, "ifconfig vlan101 unplumb" hangs after this has happened. It seems like it should be related.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Quggaa locking hard.
At 04:07 PM 12/4/2009, K. Macy wrote:

> If you have a large number of routes then you will want to disable the
> flowtable.

Thanks! I will remove from boxes that act as routers / large firewalls. However, the high load avg is something new. Even when the box is doing nothing, it sits at 2.00 for some reason. This was not happening from the code base a week ago or so.

        ---Mike

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
Re: Quggaa locking hard.
What is the simplest way to reproduce this? Although flowtable is not expected to help your use case, it should not cripple it.

-Kip

On Dec 4, 2009, at 6:56 AM, Mike Tancsa wrote:

> At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:
>> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
>> and not locking hard on 7.2. It seems (at this early point in the
>> investigation) that both bgpd and zebra are wedging and zebra is listed as
>> being in the "RUN" state.
>>
>> Curiously, the load is also 4.0 (exactly the number of cores in the machine)
>> even though the machine also reads 100% idle.
>
> I think I am seeing something similar on a test box. I was loading up the
> box with 200k routes to do testing with. Kernel is default, save for a few
> unused drivers removed. If I take out
>     options FLOWTABLE # per-cpu routing cache
> from the kernel, load avg is back to normal. This issue only seems to have
> come up in the past week or so as the previous kernel from ~8 days ago was OK.
Re: Quggaa locking hard.
If you have a large number of routes then you will want to disable the flowtable. The default maximum number of cacheable flows is fairly small; raising it can help at the low end, but fundamentally it's an optimization for systems that have fewer than a few thousand simultaneous peers - the common case. I do have longer-term plans for moving to lock-free L3 and L2 so that applications with large numbers of prefixes will also no longer be hampered by high locking overhead.

-Kip

On Fri, Dec 4, 2009 at 6:56 AM, Mike Tancsa wrote:

> At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:
>>
>> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
>> and not locking hard on 7.2. It seems (at this early point in the
>> investigation) that both bgpd and zebra are wedging and zebra is listed as
>> being in the "RUN" state.
>>
>> Curiously, the load is also 4.0 (exactly the number of cores in the
>> machine) even though the machine also reads 100% idle.
>
> I think I am seeing something similar on a test box. I was loading up the
> box with 200k routes to do testing with. Kernel is default, save for a few
> unused drivers removed. If I take out
>     options FLOWTABLE # per-cpu routing cache
> from the kernel, load avg is back to normal. This issue only seems to have
> come up in the past week or so as the previous kernel from ~8 days ago was
> OK.
>
> last pid:  6229;  load averages:  2.00,  2.00,  2.00    up 1+17:33:02  09:39:31
> 141 processes: 7 running, 106 sleeping, 28 waiting
> CPU:  0.0% user,  0.0% nice, 22.2% system,  0.0% interrupt, 77.8% idle
> Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
> Swap: 8192M Total, 8192M Free
>
>   PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>    22 root      76    -     0K     8K CPU3    3  41.5H 100.00% flowcleaner
>    11 root     171 ki31     0K    32K CPU2    2  41.5H 100.00% {idle: cpu2}
>    11 root     171 ki31     0K    32K CPU1    1  41.5H 100.00% {idle: cpu1}
>    11 root     171 ki31     0K    32K RUN     0  41.4H 100.00% {idle: cpu0}
>   869 root       4    0 64860K 64488K select  0   4:12   0.00% bgpd
>    11 root     171 ki31     0K    32K RUN     3   2:09   0.00% {idle: cpu3}
>    20 root      44    -     0K     8K syncer  0   1:00   0.00% syncer
>    12 root     -32    -     0K   224K WAIT    1   0:47   0.00% {swi4: clock}
>     0 root     -68    0     0K    80K -       2   0:03   0.00% {fw0_taskq}
>  1230 root      76    0  3348K  1160K ttyin   2   0:02   0.00% getty
>   863 root      96    0 24640K 24232K RUN     2   0:02   0.00% zebra
>    12 root     -32    -     0K   224K WAIT    2   0:01   0.00% {swi4: clock}
>    14 root     -16    -     0K     8K -       0   0:01   0.00% yarrow
Re: Quggaa locking hard.
At 10:46 PM 12/3/2009, Zaphod Beeblebrox wrote:

> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
> and not locking hard on 7.2. It seems (at this early point in the
> investigation) that both bgpd and zebra are wedging and zebra is listed as
> being in the "RUN" state.
>
> Curiously, the load is also 4.0 (exactly the number of cores in the machine)
> even though the machine also reads 100% idle.

I think I am seeing something similar on a test box. I was loading up the box with 200k routes to do testing with. Kernel is default, save for a few unused drivers removed. If I take out

    options FLOWTABLE # per-cpu routing cache

from the kernel, load avg is back to normal. This issue only seems to have come up in the past week or so as the previous kernel from ~8 days ago was OK.

last pid:  6229;  load averages:  2.00,  2.00,  2.00    up 1+17:33:02  09:39:31
141 processes: 7 running, 106 sleeping, 28 waiting
CPU:  0.0% user,  0.0% nice, 22.2% system,  0.0% interrupt, 77.8% idle
Mem: 98M Active, 2233M Inact, 187M Wired, 36K Cache, 112M Buf, 979M Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   22 root      76    -     0K     8K CPU3    3  41.5H 100.00% flowcleaner
   11 root     171 ki31     0K    32K CPU2    2  41.5H 100.00% {idle: cpu2}
   11 root     171 ki31     0K    32K CPU1    1  41.5H 100.00% {idle: cpu1}
   11 root     171 ki31     0K    32K RUN     0  41.4H 100.00% {idle: cpu0}
  869 root       4    0 64860K 64488K select  0   4:12   0.00% bgpd
   11 root     171 ki31     0K    32K RUN     3   2:09   0.00% {idle: cpu3}
   20 root      44    -     0K     8K syncer  0   1:00   0.00% syncer
   12 root     -32    -     0K   224K WAIT    1   0:47   0.00% {swi4: clock}
    0 root     -68    0     0K    80K -       2   0:03   0.00% {fw0_taskq}
 1230 root      76    0  3348K  1160K ttyin   2   0:02   0.00% getty
  863 root      96    0 24640K 24232K RUN     2   0:02   0.00% zebra
   12 root     -32    -     0K   224K WAIT    2   0:01   0.00% {swi4: clock}
   14 root     -16    -     0K     8K -       0   0:01   0.00% yarrow

Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike
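Taking "options FLOWTABLE" out of the kernel, as described in the message above, amounts to building from a config file that lacks that line. The following is only a rough sketch of that edit, assuming a POSIX shell. The function name "disable_flowtable" and the config name "NOFLOW" are made up for illustration, and the commands in the trailing comment are the usual FreeBSD custom-kernel steps, not something quoted in this thread.

```sh
#!/bin/sh
# Sketch: write a copy of a kernel config with "options FLOWTABLE" commented
# out. The names GENERIC and NOFLOW below are illustrative assumptions.

# disable_flowtable SRC DST
# Copies SRC to DST, prefixing any "options FLOWTABLE" line with '#'.
# All other lines are passed through unchanged.
disable_flowtable() {
    src="$1"
    dst="$2"
    sed -E 's/^([[:space:]]*options[[:space:]]+FLOWTABLE)/#\1/' "$src" > "$dst"
}

# Possible use on a FreeBSD source tree (assumed paths, not run here):
#   cd /usr/src/sys/amd64/conf
#   disable_flowtable GENERIC NOFLOW
#   cd /usr/src
#   make buildkernel KERNCONF=NOFLOW && make installkernel KERNCONF=NOFLOW
```

Commenting the line out rather than deleting it keeps the change easy to spot and to revert when comparing against the stock config.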
Re: Quggaa locking hard.
I have also seen this with a recent version of FreeBSD 8 (I know 8.0-BETA2 didn't have this problem, also I have an 8.0-RC1 without problems, but I think RC3 did have it, and I'm sure -RELEASE has it).

A few more details:

It happened both on amd64 and i386. I couldn't debug amd64 (it was a live server and we couldn't afford it), but on i386 flowcleaner was using a LOT of CPU. It seemed to happen after booting, when quagga was importing global routing tables (~300k routes) from 2 BGP sessions. At least one of the sessions seemed to finish importing routes, but the kernel routing table seemed to be growing very slowly. Doing "netstat -nr | wc -l" took way longer than usual (20-30 seconds versus 9 seconds now), and it only reported about 100k routes. Doing it again after a minute or so showed the number of routes grew by around 10k. During this time, both quagga and zebra were very slow to respond to a new telnet session opened to them.

As a workaround, I did sysctl net.inet.flowtable.enable=0. This didn't ease the load on the CPU, but having it in /etc/sysctl.conf and rebooting did help (quagga started up normally and all routes are where they should be).

Hope this helps
Alex

--- On Fri, 12/4/09, Zaphod Beeblebrox wrote:

> From: Zaphod Beeblebrox
> Subject: Quggaa locking hard.
> To: "FreeBSD Stable"
> Date: Friday, December 4, 2009, 5:46 AM
>
> I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0
> and not locking hard on 7.2. It seems (at this early point in the
> investigation) that both bgpd and zebra are wedging and zebra is listed as
> being in the "RUN" state.
>
> Curiously, the load is also 4.0 (exactly the number of cores in the machine)
> even though the machine also reads 100% idle.
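The workaround in the message above has two parts: setting the sysctl on the running system, which reportedly did not ease the CPU load by itself, and putting it in /etc/sysctl.conf so that it applies at boot, which did help. Below is a minimal sketch of the second part, assuming a POSIX shell. The helper name "persist_sysctl" is hypothetical; only the sysctl name and the file path come from the thread.

```sh
#!/bin/sh
# Sketch: persist a sysctl setting in a config file without duplicating it.

# persist_sysctl FILE SETTING
# Appends SETTING (for example "net.inet.flowtable.enable=0") to FILE unless
# an identical line is already present. FILE is created if it does not exist.
persist_sysctl() {
    conf="$1"
    setting="$2"
    if ! grep -qxF -- "$setting" "$conf" 2>/dev/null; then
        printf '%s\n' "$setting" >> "$conf"
    fi
}

# Possible use on an affected FreeBSD 8.0 host, as root (not run here):
#   sysctl net.inet.flowtable.enable=0
#   persist_sysctl /etc/sysctl.conf net.inet.flowtable.enable=0
```

The whole-line match means that running the helper again after a later edit leaves the file unchanged instead of stacking duplicate entries.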
Quggaa locking hard.
I'm still investigating this, but my quagga is locking hard on FreeBSD 8.0 and not locking hard on 7.2. It seems (at this early point in the investigation) that both bgpd and zebra are wedging and zebra is listed as being in the "RUN" state.

Curiously, the load is also 4.0 (exactly the number of cores in the machine) even though the machine also reads 100% idle.