high cpu usage on natd / dhcpd
Hi,

I have a small system running FreeBSD 8.2 that does NAT using ipfw and natd to systems attached to two interfaces: em0 and wlan0. I have a dhcpd daemon issuing leases on those interfaces. The system has an em1 interface plugged into a cable modem where it obtains a DHCP lease from an ISP.

For some reason, when traffic from the Internet terminates on the system itself (I scp a file from the computer) the natd and dhcpd processes consume significant CPU, and the throughput is less than I expect. Traffic that passes through to a computer behind the NAT flows without causing the natd or dhcpd processes to measurably consume CPU.

From top:

CPU: 10.9% user, 0.0% nice, 56.0% system, 21.1% interrupt, 12.0% idle
Mem: 225M Active, 92M Inact, 162M Wired, 556K Cache, 112M Buf, 1506M Free

  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME   WCPU COMMAND
 1222 root       1 104    0 3572K 1448K RUN     1:29 39.36% natd
 1676 root       1  62    0 5340K 3544K select  0:59 24.56% dhcpd

What is going on? My ipfw ruleset is below, and is based on the example in the FreeBSD handbook.
1 allow ip from any to any via lo0
2 allow ip from any to any via em0
3 allow ip from any to any via wlan0
00101 divert 8668 ip from any to any in via em1
00102 check-state
00110 skipto 500 tcp from any to any out via em1 setup keep-state
00111 skipto 500 udp from any to any out via em1 keep-state
00112 skipto 500 icmp from any to any out via em1 keep-state
00201 allow udp from any to any dst-port 68 in keep-state
00202 allow tcp from any to me dst-port 80 in via em1 setup keep-state
00210 allow tcp from 130.217.250.13 to me in via em1 setup keep-state
00211 allow tcp from 199.109.33.1 to me in via em1 setup keep-state
00212 allow tcp from 192.172.226.78 to me in via em1 setup keep-state
00213 allow tcp from 192.172.226.95 to me in via em1 setup keep-state
00230 allow tcp from any to me dst-port 6984 in via em1 setup keep-state
00231 allow udp from any to me dst-port 6984 in via em1
00240 allow icmp from any to me in via em1
00300 unreach filter-prohib log ip from any to any
00500 divert 8668 ip from any to any out via em1
00501 allow ip from any to any
65535 allow ip from any to any

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: 9-stable - ifmedia_set: no match for 0x0/0xfffffff
ok, i

  o used device.hints to disable both bge interfaces
  o booted successfully
  o used serial console
  o ifconfiged bge0 to the normal addresses
  o and it is working

i suspect that something sucks in bge initialization at startup. insightful, i know. sorry.

randy
Re: msk0: watchdog timeout interface hang
Hi,

On Wed, Jan 25, 2012 at 3:26 PM, Kim Culhan wrote:
> Running 10-current from 01-20-12
> the msk0 interface hung, on the console:
>
> msk0: watchdog timeout
> msk0: prefetch unit stuck?
> msk0: initialization failed: no memory for Rx buffers
>
> Verbose boot dmesg output attached.
>
Known issue affecting at least 8-STABLE, 9-STABLE (assumed) and -current. Already reported in these threads:

http://lists.freebsd.org/pipermail/freebsd-net/2011-December/030635.html
http://lists.freebsd.org/pipermail/freebsd-questions/2011-November/235646.html

 - Arnaud

> Any help is greatly appreciated.
>
> -kim
Re: 9-stable - ifmedia_set: no match for 0x0/0xfffffff
way cool. a /boot/device.hints entry of

    hint.acpi.bge.1.disable=1

did disable bge1. but now it's bge0, and i need that interface. and media are present! so i tried /etc/rc.conf

    ifconfig_bge0="198.180.150.1/25 media 1000baseTX"
    ifconfig_bge0_ipv6="inet6 2001:418:8006::1/64"
    ifconfig_bge0_alias0="inet 198.180.150.2/32"
    ifconfig_bge1="media 1000baseTX"

pcib4: irq 12 at device 28.2 on pci0
pcib0: allocated type 3 (0xd010-0xd01f) for rid 20 of pcib4
pcib4: domain 0
pcib4: secondary bus 4
pcib4: subordinate bus 4
pcib4: memory decode 0xd010-0xd01f
pcib4: no prefetched decode
ACPI: Found matching pin for 4.0.INTA at func 0: 12
pci4: on pcib4
pci4: domain=0, physical bus=4
found-> vendor=0x14e4, dev=0x1659, revid=0x11
        domain=0, bus=4, slot=0, func=0
        class=02-00-00, hdrtype=0x00, mfdev=0
        cmdreg=0x0006, statreg=0x0010, cachelnsz=8 (dwords)
        lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
        intpin=a, irq=12
        powerspec 2 supports D0 D3 current D0
        MSI supports 8 messages, 64 bit
        map[10]: type Memory, range 64, base 0xd010, size 16, enabled
pcib4: allocated memory range (0xd010-0xd010) for rid 10 of pci0:4:0:0
pcib4: matched entry for 4.0.INTA (src \_SB_.PCI0.LNKC:0)
pcib4: slot 0 INTA routed to irq 12 via \_SB_.PCI0.LNKC
pci0:4:0:0: bad VPD cksum, remain 14
bge0: mem 0xd010-0xd010 irq 12 at device 0.0 on pci4
bge0: CHIP ID 0x4101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
miibus0: on bge0
brgphy0: PHY 1 on miibus0
brgphy0: OUI 0x001018, model 0x0018, rev. 0
brgphy0: no media present
ifmedia_set: no match for 0x0/0xfff
panic: ifmedia_set
KDB: stack backtrace:
#0 0xc05bc257 at kdb_backtrace+0x47
#1 0xc058db2f at panic+0xaf
#2 0xc063e3d1 at ifmedia_set+0x41
#3 0xc04e94fa at miibus_mediainit+0x8a
#4 0xc04e227f at brgphy_attach+0x3bf
#5 0xc05b5f6f at device_attach+0x36f
#6 0xc05b745c at device_probe_and_attach+0x2c
#7 0xc05b7489 at bus_generic_attach+0x19
#8 0xc04e9987 at miibus_attach+0xd7
#9 0xc05b5f6f at device_attach+0x36f
#10 0xc05b745c at device_probe_and_attach+0x2c
#11 0xc05b7489 at bus_generic_attach+0x19
#12 0xc04e9f0c at mii_attach+0x40c
#13 0xc04db0f3 at bge_attach+0x3a93
#14 0xc05b5f6f at device_attach+0x36f
#15 0xc05b745c at device_probe_and_attach+0x2c
#16 0xc05b7489 at bus_generic_attach+0x19
#17 0xc049e984 at acpi_pci_attach+0x194
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.

randy
Re: kern/164475: [gre] gre misses RUNNING flag after a reboot
Old Synopsis: gre misses RUNNING flag after a reboot
New Synopsis: [gre] gre misses RUNNING flag after a reboot

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Jan 26 02:23:58 UTC 2012
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=164475
Re: kern/164495: [igb] connect double head igb to switch cause system to halt
Old Synopsis: connect double head igb to switch cause system to halt
New Synopsis: [igb] connect double head igb to switch cause system to halt

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Jan 26 02:23:09 UTC 2012
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=164495
9-stable - ifmedia_set: no match for 0x0/0xfffffff
day old i386 current

bge1: mem 0xd020-0xd020 irq 10 at device 0.0 on pci5
bge1: CHIP ID 0x4101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
miibus1: on bge1
brgphy1: PHY 1 on miibus1
brgphy1: OUI 0x001018, model 0x0018, rev. 0
brgphy1: no media present
ifmedia_set: no match for 0x0/0xfff
panic: ifmedia_set
KDB: stack backtrace:
#0 0xc05bc257 at kdb_backtrace+0x47
#1 0xc058db2f at panic+0xaf
#2 0xc063e3d1 at ifmedia_set+0x41
#3 0xc04e94fa at miibus_mediainit+0x8a
#4 0xc04e227f at brgphy_attach+0x3bf
#5 0xc05b5f6f at device_attach+0x36f
#6 0xc05b745c at device_probe_and_attach+0x2c
#7 0xc05b7489 at bus_generic_attach+0x19
#8 0xc04e9987 at miibus_attach+0xd7
#9 0xc05b5f6f at device_attach+0x36f
#10 0xc05b745c at device_probe_and_attach+0x2c
#11 0xc05b7489 at bus_generic_attach+0x19
#12 0xc04e9f0c at mii_attach+0x40c
#13 0xc04db0f3 at bge_attach+0x3a93
#14 0xc05b5f6f at device_attach+0x36f
#15 0xc05b745c at device_probe_and_attach+0x2c
#16 0xc05b7489 at bus_generic_attach+0x19
#17 0xc049e984 at acpi_pci_attach+0x194
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.

randy
Re: low network speed
Eugene M. Zheganin wrote:
> Hi.
>
> I'm suffering from low network performance on one of my FreeBSDs.
> I have an i386 8.2-RELEASE machine with an fxp(4) adapter. It's
> connected through a bunch of catalysts 2950 to another 8.2. While other
> machines in this server room using the same sequence of switches and the
> same target source server (which, btw, is equipped with an em(4) and a
> gigabit link via catalyst 3750) show sufficient speed, this particular
> machine while using scp starts with a speed of 200 Kbytes/sec and while
> copying the file shows speed about 600-800 Kbytes/sec.
>
> I've added this tweak to the sysctl:
>
> net.local.stream.recvspace=196605
> net.local.stream.sendspace=196605
> net.inet.tcp.sendspace=196605
> net.inet.tcp.recvspace=196605
> net.inet.udp.recvspace=196605
> kern.ipc.maxsockbuf=2621440
> kern.ipc.somaxconn=4096
> net.inet.tcp.sendbuf_max=524288
> net.inet.tcp.recvbuf_max=524288
>
> With these settings the copying starts at 9.5 Mbytes/sec speed, but
> then, as file is copying, drops down to 3.5 Megs/sec in about two-three
> minutes.
>
> Is there some way to maintain 9.5 Mbytes/sec (I like this speed more)?
>
You might want to try disabling the hardware checksumming via ifconfig.
(I very vaguely recall doing that for a fxp(4) interface some time ago,
but am probably completely wrong. :-)

rick

>
> Thanks.
> Eugene.
>
> P.S. This machine also runs zfs, I don't know if it's important but I
> decided to mention it.
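A minimal sketch of what rick's suggestion could look like on the command line. The interface name fxp0 is an assumption (the thread does not name it), and the helper csum_off_cmd is hypothetical; it only prints the command so it can be reviewed before being run as root. The -rxcsum and -txcsum parameters are the ifconfig(8) switches for clearing receive and transmit checksum offload; whether a given fxp(4) chip offers these capabilities at all has to be checked in the "options=" line of the ifconfig output.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) the ifconfig(8) command that
# turns off RX and TX checksum offload on the interface given as $1.
csum_off_cmd() {
    printf 'ifconfig %s -rxcsum -txcsum\n' "$1"
}

# "fxp0" is an assumed interface name; substitute the real one.
csum_off_cmd fxp0   # prints: ifconfig fxp0 -rxcsum -txcsum
```

If the scp rate stays steady with offload disabled, that would point at the checksum offload path rather than the socket buffer tuning.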
Problem with nat traversal
I have a problem with NAT traversal.

The server is directly connected to the Internet, the client is behind a gateway that uses NAT.

The problem is that the server tries to respond to the client's internal private address 192.168.1.10 (and the ISP sends icmp messages back to the server, telling it that it blocks 192.168 addresses).

(I don't have access to the real output from tcpdump right now...)

tcpdump on the server shows something like this:

client-ext-ip > srv-ext-ip UDP 500
srv-ext-ip UDP 500 > client-ext-ip
client-ext-ip > srv-ext-ip UDP 500
srv-ext-ip UDP 500 > client-ext-ip
client-ext-ip > srv-ext-ip UDP 4500
srv-ext-ip 4500 > client-INT-ip UDP
icmp from isp-router telling client-INT-ip is filtered
client-ext-ip > srv-ext-ip UDP 4500
srv-ext-ip 4500 > client-INT-ip UDP
icmp from isp-router telling client-INT-ip is filtered
client-ext-ip > srv-ext-ip UDP 4500
srv-ext-ip 4500 > client-INT-ip UDP
icmp from isp-router telling client-INT-ip is filtered

windump on the client with win7 shows something like this:

client-ext-ip > srv-ext-ip UDP 500
srv-ext-ip UDP 500 > client-ext-ip
client-ext-ip > srv-ext-ip UDP 500
srv-ext-ip UDP 500 > client-ext-ip
client-ext-ip > srv-ext-ip UDP 4500
client-ext-ip > srv-ext-ip UDP 4500
client-ext-ip > srv-ext-ip UDP 4500

I get the same problem with

FreeBSD 8.1R i386 + ipsec-tools 0.8.0
FreeBSD 8.2R amd64 + ipsec-tools 0.7.3
FreeBSD 8.2R amd64 + ipsec-tools 0.8.0

I have compiled the kernel with

options IPSEC
options IPSEC_DEBUG
options IPSEC_FILTERTUNNEL
options IPSEC_NAT_T
device crypto
device enc

and I have "nat_traversal on" in racoon.conf.

Why is the server trying to send packets to the client's internal address?
Re: "ifconfig media off"?
On Sat, Jan 21, 2012 at 12:58:08AM +0100, Stefan Bethke wrote:
> On 14.12.2011 at 02:16, Marius Strobl wrote:
>
> > On Tue, Dec 13, 2011 at 10:53:48AM -0800, YongHyeon PYUN wrote:
> >> On Tue, Dec 13, 2011 at 11:04:51AM +0100, Stefan Bethke wrote:
> >>> On 13.12.2011 at 03:50, YongHyeon PYUN wrote:
> >>>
> >>>> On Tue, Dec 13, 2011 at 12:56:22AM +0100, Stefan Bethke wrote:
> >>>>> I'm currently writing a driver to configure an ethernet switch chip
> >>>>> (see TL-WR1043ND on -embedded).
> >>>>>
> >>>>> I noticed that there doesn't seem to be a way to power down a phy right
> >>>>> now through the ifconfig media command.
> >>>>>
> >>>>> Would there be objections to extend the media subtype definitions to
> >>>>> include an "off", "poweroff" or "down" media subtype, and add code to
> >>>>> the relevant phy drivers to power down the phy for this media subtype?
> >>>>>
> >>>>> The difference between media subtype "none" and this new one would be
> >>>>> that there will be no link, even if there is a physical connection.
> >>>>> With media subtype "none", a 10 MBit/s half-duplex connection is
> >>>>> established, potentially confusing the remote end about the
> >>>>> availability of this link. On the local side, the link is down, so no
> >>>>> packets are exchanged.
> >>>>>
> >>>> I think "none" means "isolated" so should have no established link
> >>>> and probably you can also power down the PHY.
> >>>> I vaguely guess the PHY of switch chip does not correctly support
> >>>> isolated mode so you may have wanted to power down.
> >>>
> >>>
> >>> After looking at the code a bit more, I think the common code just
> >>> doesn't set the BMCR_PDOWN (but clears it when bringing up the PHY).
> >>>
> >>
> >> Yes, and most PHYs could be powered down when BMCR_ISO is chosen.
> >> I'm not sure whether this could be applied to hardwares that
> >> support multiple PHYs (i.e. internal and external transceivers)
> >> though. Marius may have some opinions on this (CCed).
> >> However powering down PHY with BMCR_ISO looks natural to me.
> >>
> >>> Index: sys/dev/mii/mii_physubr.c
> >>> ===
> >>> --- sys/dev/mii/mii_physubr.c (revision 228402)
> >>> +++ sys/dev/mii/mii_physubr.c (working copy)
> >>> @@ -58,7 +58,7 @@
> >>> */
> >>> static const struct mii_media mii_media_table[MII_NMEDIA] = {
> >>> /* None */
> >>> - { BMCR_ISO, ANAR_CSMA,
> >>> + { BMCR_ISO | BMCR_PDOWN,ANAR_CSMA,
> >>> 0, },
> >>>
> >>> /* 10baseT */
> >>>
> >>> I've opened kern/163240.
> >>> http://www.freebsd.org/cgi/query-pr.cgi?pr=163240
>
> I'd like to revisit this. Just to reiterate my motivation for the change: I
> want to be able to indicate to the remote end that my station is not active.
> With the PHY just isolated from the MII, the link stays up and functional
> (and even autoneg continues to work), so the remote has no indication that
> it's just shouting into a void.

Yes, I understand the motivation and generally agree that this should be
implemented. IMO the above is just a quick-hack though and no proper
solution, on the other hand I neither see a need to grow an "off" media
for this.

>
> > I don't think powering down the PHY along with IFM_NONE especially
> > in that way is a good idea for several reasons:
> > - It's incomplete as not all PHY drivers use mii_phy_add_media()/
> >   mii_phy_setmedia().
> > - Even for those that do IFM_NONE isn't added when the PHY driver
> >   sets MIIF_NOISOLATE (for some PHYs BMCR_ISO either just doesn't
> >   work as especially the built-in ones probably have been designed
> >   with only single-PHY configurations in mind or even wedges the
> >   chip up to the point that even a reset doesn't get it working
> >   again). In general though, BMCR_ISO and BMCR_PDOWN are orthogonal
> >   (even in IEEE 802.3-2008 as far as I can see), i.e. while BMCR_ISO
> >   might be broken, BMCR_PDOWN could work (actually I'd expect
> >   BMCR_PDOWN to be less fragile than BMCR_ISO).
>
> I didn't expect my suggestion to be the be-all end-all, only a quick and easy
> way to allow compliant PHYs to be powered down, and I'm not sure why a
> "complete" solution is required. I'd assume that PHYs setting MIIF_NOISOLATE
> have specific requirements, so it's OK to not have the power-down option
> available there. (Plus I don't have hardware I could test that case on).

I wouldn't call it "specific requirements". The PHY drivers I've flagged
with MIIF_NOISOLATE so far fall into one of two categories:
a) Setting BMCR_ISO just doesn't have any effect and the PHY happily
   continues to pass traffic. Setting MIIF_NOISOLATE in this case is
   done in order to not add a non-working "none" media.
b) Upon setting BMCR_ISO the hardware wedges up in a way that a
   power-cycle is required in order to get it into a working state
   again. MIIF_NOISOLATE is set here in order to protect the users
   from s
Re: kern/164490: [pfil] Incorrect IP checksum on pfil pass from ip_output()
Old Synopsis: Incorrect IP checksum on pfil pass from ip_output()
New Synopsis: [pfil] Incorrect IP checksum on pfil pass from ip_output()

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Jan 25 19:58:14 UTC 2012
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=164490
Re: low network speed
On 25/01/2012 06:27, Eugene M. Zheganin wrote:
> Hi.
>
> I'm suffering from low network performance on one of my FreeBSDs.
> I have an i386 8.2-RELEASE machine with an fxp(4) adapter. It's
> connected through a bunch of catalysts 2950 to another 8.2.

Another thing to try would be to upgrade both ends to 8-STABLE and try the high-performance network buffer sizing in ssh (enabled by default).
livelock with full loaded em(4)
Hello.

I have test boxes with an em(4) network card - Intel 82563EB.
FreeBSD version - 8.2-STABLE from 2012-01-15, amd64.

When this NIC is fully loaded, a livelock occurs - the system is unresponsive even from the local console.

To generate load I use netsend from /usr/src/tools/tools/netrate/, but other traffic sources (e.g. TCP instead of UDP) cause the same problem.

Two conditions are needed for this livelock:

1. With full NIC load, the kernel thread "em1 taskq" hogs the CPU.

top -zISHP for interface load a bit less than full. Traffic is generated by

# netsend 172.16.0.2 9001 8500 14300 3600

where 14300 is packets per second:

112 processes: 10 running, 82 sleeping, 20 waiting
CPU 0: 0.0% user, 0.0% nice, 27.1% system, 0.0% interrupt, 72.9% idle
CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 2: 2.3% user, 0.0% nice, 97.7% system, 0.0% interrupt, 0.0% idle
CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 4: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 5: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 6: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 7: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 26M Active, 378M Inact, 450M Wired, 132K Cache, 63M Buf, 15G Free
Swap: 8192M Total, 8192M Free

  PID USERNAME    PRI NICE  SIZE   RES STATE C  TIME    WCPU COMMAND
 7737 ayuzhaninov 119    0 5832K 1116K CPU2  2  0:04 100.00% netsend
    0 root        -68    0    0K  144K -     0  2:17  22.27% {em1 taskq}

top -zISHP for full interface load (some drops occur). Load is generated by

# netsend 172.16.0.2 9001 8500 14400 3600

112 processes: 11 running, 81 sleeping, 20 waiting
CPU 0: 0.0% user, 0.0% nice, 100% system, 0.0% interrupt, 0.0% idle
CPU 1: 4.1% user, 0.0% nice, 95.9% system, 0.0% interrupt, 0.0% idle
CPU 2: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 4: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 5: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 6: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
CPU 7: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 26M Active, 378M Inact, 450M Wired, 132K Cache, 63M Buf, 15G Free
Swap: 8192M Total, 8192M Free

  PID USERNAME    PRI NICE  SIZE   RES STATE C  TIME    WCPU COMMAND
    0 root        -68    0    0K  144K CPU0  0  2:17 100.00% {em1 taskq}
 7759 ayuzhaninov 119    0 5832K 1116K CPU1  1  0:01 100.00% netsend

So pps increased from 14300 to 14400 (0.7%), but CPU load from the "em1 taskq" thread increased from 27.1% to 100.00%.

This is at least strange, but the system still works fine until I run sysctl dev.cpu.0.temperature.

2. The sysctl handler code for coretemp must be executed on the target CPU, e.g. for dev.cpu.0.temperature the code is executed on CPU0.

If CPU0 is fully loaded by "em1 taskq", the sysctl handler for dev.cpu.0.temperature acquires the Giant mutex and then tries to run code on CPU0, but it can't - CPU0 is busy.

If the Giant mutex is held for a long time, the system is unresponsive. In my case the Giant mutex is acquired when sysctl dev.cpu.0.temperature starts and is held the whole time netsend is running.

This seems to be a scheduler problem:

1. Why does "em1 taskq" run only on CPU0 (there is no affinity set for this thread)?

# procstat -k 0 | egrep '(PID|em1)'
  PID    TID COMM   TDNAME    KSTACK
    0 100038 kernel em1 taskq
# cpuset -g -t 100038
tid 100038 mask: 0, 1, 2, 3, 4, 5, 6, 7

2. Why is "em1 taskq" not preempted to execute the sysctl handler code? This is not a short-term condition - if netsend is running for an hour, "em1 taskq" is not preempted for an hour - sysctl is in the running state all this time but doesn't get a chance to be executed.

--
Anton Yuzhaninov

P. S. I tried to use EM_MULTIQUEUE, but this doesn't help in my case.
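As a rough cross-check of the "full load" figures in the livelock report above, the UDP payload rate implied by the two netsend invocations can be worked out directly. This is only a sketch under stated assumptions: it takes the third netsend argument (8500) to be the payload size in bytes and the fourth to be packets per second (the latter is stated in the report, the former is an assumption), and it ignores UDP, IP and Ethernet header overhead. The function name payload_mbit is hypothetical.

```shell
#!/bin/sh
# payload_mbit PAYLOAD_BYTES PACKETS_PER_SEC
# Prints the UDP payload throughput in Mbit/s (10^6 bit/s), one decimal place.
# Header and framing overhead are deliberately not included.
payload_mbit() {
    awk -v size="$1" -v pps="$2" \
        'BEGIN { printf "%.1f\n", size * pps * 8 / 1000000 }'
}

payload_mbit 8500 14300   # prints 972.4 (rate just below saturation)
payload_mbit 8500 14400   # prints 979.2 (rate at which drops start)
```

Both figures sit just under 1000 Mbit/s before any headers are added, which is consistent with the report that a 0.7% increase in packet rate is enough to push a gigabit interface from nearly full to saturated.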
Re: Ethernet Switch Framework
On 25.01.2012 at 08:12, Adrian Chadd wrote:

> So when will you two have something consensus-y to commit? :-)
>
> What I'm hoping for is:
>
> * some traction on the MII bus / MDIO bus split and tidyup from stb, which is
>   nice;
> * ray's switch API for speaking to userland with;
> * agreeing on whether the correct place to put the driver(s) is where stb,
>   ray, or a mix of both approaches says so.
>
> I've been (mostly) trying to stay out of this to see where both of you have
> gone. I think we've made some good progress; now it's time to solidify a
> design for the first pass of what we want in -HEAD and figure out how to move
> forward.

My suggestion is to take my bus attachment code (incl. mdio and miiproxy) and ray's ioctl and userland code.

Aleksandr's approach for the driver attachment is to have a generic switch "bus" driver that abstracts the mii, i2c, memory mapped I/O, etc. busses the devices are physically attached to, and present a uniform register file to the chip-specific switch driver. I believe that this is unnecessarily complicated for two reasons: newbus already provides that abstraction, and chip-specific drivers usually differ in so many aspects, including their register files, that code sharing between chips will be somewhat limited anyway.

One aspect that I would enjoy looking into in more detail is how register accesses on, for example, MDIO, can be provided through the bus space API. From my cursory reading, it seems that the code currently is tailored towards register accesses that can be implemented through CPU native instructions, instead of indirectly through a controller.

Aleksandr has defined a quite comprehensive ethernet switch control API that the framework provides towards in-kernel clients as well as userland. I think it would be really helpful if we could concentrate on those API functions that can be controlled through the userland utility, have immediate use cases (for example, VLAN configuration on the TL-WR1043ND to separate the WAN from the LAN ports), and we have test hardware for. In short, don't commit dead code.

Having a description of the generic switch model that the API assumes and driver-specific documentation also wouldn't hurt. (Yes, I'm volunteering.)

Stefan

--
Stefan Bethke   Fon +49 151 14070811