Re: Slow Intel 10GbE CX4 adapter behaviour
Problem solved, I'm so embarrassed :)

The issue on 7.2 mentioned above with ixgbe (tons of "fragmentation failed" errors) was real. The issue in 8.3-RC3 was that dummynet wasn't being loaded at all, so no traffic could pass through it, despite dummynet_load="YES" being set in /boot/loader.conf. So I turned it on in /etc/rc.conf:

  dummynet_enable="YES"

and loaded it with

  kldload dummynet

to avoid a reboot. Works like a charm so far. Thanks to all!

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: igb driver RX (was TX) hangs when out of mbuf clusters
Hi Jack,

I could recreate the problem. When the problem occurs, we see

  rx_nxt_check = n
  rx_nxt_refresh = n + 1

(This was also reported in a mail from Karim.)

This means that the *whole* receive ring has no buffers anymore. This can occur if, for some amount of time, no clusters are available.

Now, outside of the driver, at some point in time, clusters are freed. I don't think that igb_refresh_mbufs() gets called, since it only gets called from igb_rxeof(), which gets called when a packet has been received, which cannot happen since the receive ring is empty. So how can the driver know? I have no idea. Maybe we can periodically check for such an event and call igb_refresh_mbufs(). Does this make sense to you?

Best regards
Michael

On Feb 9, 2011, at 8:32 AM, Jack Vogel wrote:
> Hmmm, well so much for that theory :)
>
> Jack
>
> On Tue, Feb 8, 2011 at 4:06 PM, Karim Fodil-Lemelin <fodillemlinka...@gmail.com> wrote:
>> 2011/2/8 Jack Vogel <jfvo...@gmail.com>
>>> I have been following this, and thinking about it. I still am working from a
>>> theoretical standpoint, but based on a patch I got quite a long time back and
>>> never quite grokked, I believe now that I might have a solution.
>>>
>>> The original PR and patch was kern/150516 from Beezar Liu. I was never quite
>>> comfortable with the code changes, nor convinced that it was a real issue and
>>> not a misunderstanding. However, I think now that this very report might be
>>> behind what we are seeing today.
>>>
>>> I have a slightly different approach to solving it; of course it remains to be
>>> seen if it handles it properly. Please try the patch I've attached. I'm open
>>> to further correction or polishing of the changes. And thanks to Beezar for
>>> his original report and changes. This is not for em, but if this eliminates
>>> the problem it's clearly needed in all drivers.
>>>
>>> Jack
>>
>> Hi Jack,
>>
>> Thanks for your help. I tried your patch and it didn't work, so I added a
>> couple of printfs to see if the added code was getting hit:
>>
>> --- a/freebsd/sys/dev/e1000/if_igb.c
>> +++ b/freebsd/sys/dev/e1000/if_igb.c
>> @@ -612,7 +612,7 @@ igb_attach(device_t dev)
>>              device_get_nameunit(dev));
>>  
>>          INIT_DEBUGOUT("igb_attach: end");
>> -
>> +        printf("this driver has a patch from Jack Vogel\n");
>>          return (0);
>>  
>>  err_late:
>> @@ -4131,6 +4131,7 @@ igb_rxeof(struct igb_queue *que, int count, int *done)
>>                  struct mbuf *sendmp, *mh, *mp;
>>                  struct igb_rx_buf *rxbuf;
>>                  u16 hlen, plen, hdr, vtag;
>> +                int commit;
>>                  bool eop = FALSE;
>>  
>>                  cur = &rxr->rx_base[i];
>> @@ -4255,10 +4256,23 @@ next_desc:
>>                  bus_dmamap_sync(rxr->rxdma.dma_tag, rxr->rxdma.dma_map,
>>                      BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
>>  
>> +                commit = i; /* capture the old index */
>> +
>>                  /* Advance our pointers to the next descriptor. */
>>                  if (++i == adapter->num_rx_desc)
>>                          i = 0;
>>                  /*
>> +                ** Sanity test for ring full, if this
>> +                ** happens we need to refresh immediately
>> +                ** or refresh may deadlock.
>> +                */
>> +                if (i == rxr->next_to_refresh) {
>> +                        igb_refresh_mbufs(rxr, commit);
>> +                        printf("igb_refresh_mbufs called with commit %d\n", commit);
>> +                        processed = 0;
>> +                }
>> +
>> +                /*
>>                  ** Send to the stack or LRO
>>                  */
>>                  if (sendmp != NULL) {
>>
>> Here are the results:
>>
>> # dmesg | grep Vogel
>> this driver has a patch from Jack Vogel
>> this driver has a patch from Jack Vogel
>> # netstat -m
>> 60453/52707/113160 mbufs in use (current/cache/total)
>> 48416/51584/10/10 mbuf clusters in use (current/cache/total/max)
>> 2894/690 mbuf+clusters out of packet secondary zone in use (current/cache)
>> 11946/854/12800/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
>> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>> 164834K/119760K/284595K bytes allocated to network (current/cache/total)
>> 0/339/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>> 0/4/6656 sfbufs in use (current/peak/max)
>> 0 requests for sfbufs denied
>> 0 requests for sfbufs delayed
>> 0 requests for I/O initiated by sendfile
>> 0 calls to protocol drain routines
>> # dmesg | grep commit
>>
>> At this point RX has hung. Somehow the check (i == rxr->next_to_refresh) is
>> never true in this case. Also, I did read kern/150516 and couldn't wrap my
>> head around the patch for the em driver that Beezar Liu suggested.
>>
>> Regards,
>>
>> Karim.
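The starvation Michael describes can be illustrated with a toy model. The sketch below is not the igb driver: `sim_ring`, `sim_rx_packet`, `sim_refresh` and `sim_watchdog` are hypothetical names, the ring is shrunk to 8 descriptors, and the "pool" is just a counter. It only assumes what the mail states: refresh runs solely from the receive path, and the receive path runs only when a descriptor with a buffer accepts a packet. The watchdog stands in for the suggested periodic check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_RING_SIZE 8   /* hypothetical; the real ring in the thread has 1024 */

struct sim_ring {
	bool   has_buf[SIM_RING_SIZE]; /* descriptor currently owns a buffer */
	size_t next_to_check;          /* descriptor the next packet lands in */
	size_t next_to_refresh;        /* oldest descriptor that needs a buffer */
	int    pool;                   /* free clusters available system-wide */
};

/* Start with every descriptor populated and `pool` spare clusters. */
void sim_init(struct sim_ring *r, int pool)
{
	assert(r != NULL && pool >= 0);
	for (size_t i = 0; i < SIM_RING_SIZE; i++)
		r->has_buf[i] = true;
	r->next_to_check = 0;
	r->next_to_refresh = 0;
	r->pool = pool;
}

/* Refill empty descriptors while clusters are available; returns the count. */
int sim_refresh(struct sim_ring *r)
{
	int refilled = 0;

	while (!r->has_buf[r->next_to_refresh] && r->pool > 0) {
		r->has_buf[r->next_to_refresh] = true;
		r->pool--;
		r->next_to_refresh = (r->next_to_refresh + 1) % SIM_RING_SIZE;
		refilled++;
	}
	return refilled;
}

/*
 * One incoming packet. Without a buffer the packet is dropped and the
 * receive path (and therefore the refresh) never runs.
 */
bool sim_rx_packet(struct sim_ring *r)
{
	size_t i = r->next_to_check;

	if (!r->has_buf[i])
		return false;
	r->has_buf[i] = false;                  /* buffer handed to the stack */
	r->next_to_check = (i + 1) % SIM_RING_SIZE;
	sim_refresh(r);                         /* the only refresh in the rx path */
	return true;
}

/* Periodic check, independent of packet arrival. */
int sim_watchdog(struct sim_ring *r)
{
	return sim_refresh(r);
}
```

Running the model with an empty pool drains the ring after `SIM_RING_SIZE` packets; putting clusters back into `pool` afterwards does not help until `sim_watchdog()` is called, which is the behaviour reported in the thread.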
Re: Slow Intel 10GbE CX4 adapter behaviour
On 9 February 2011 12:37, rihad <ri...@mail.ru> wrote:
> Problem solved, I'm so embarrassed :) The issue on 7.2 mentioned above with
> ixgbe (tons of "fragmentation failed" errors) was real. The issue in 8.3-RC3
> was because dummynet wasn't being loaded at all... so no traffic could pass
> on it, despite dummynet_load="YES" being set in /boot/loader.conf. So I turned
> it on in /etc/rc.conf: dummynet_enable="YES" and loaded it with kldload
> dummynet in order to do without a reboot. Works like a charm so far.
> Thanks to all!

It looks like loading dummynet.ko via /boot/loader.conf doesn't work because dummynet.ko depends on dummynet itself, but on a different version of it.

There are even more strange things:

1) dummynet.ko declares itself as version 1:
   /sys/netinet/ipfw/ip_dummynet.c: MODULE_VERSION(dummynet, 1);
2) dummynet.ko compiles the various schedulers into itself: fifo, prio, rr, etc.;
3) these schedulers presumably think they are compiled standalone, so they explicitly and strictly depend on dummynet version 3 (why?):
   /sys/netinet/ipfw/dn_sched.h: MODULE_DEPEND(name, dummynet, 3, 3, 3);

That makes the loader fail with an error like "dummynet: loading required module 'dummynet'" and, if dummynet.ko is loaded manually at the loader prompt, "module 'dummynet' exists but with wrong version".

This should fix the problem; rebuilding only dummynet should be enough.

%%%
Index: /sys/netinet/ipfw/ip_dummynet.c
===================================================================
--- /sys/netinet/ipfw/ip_dummynet.c (revision 218026)
+++ /sys/netinet/ipfw/ip_dummynet.c (working copy)
@@ -2294,7 +2294,7 @@
 #define DN_MODEV_ORD (SI_ORDER_ANY - 128) /* after ipfw */
 DECLARE_MODULE(dummynet, dummynet_mod, DN_SI_SUB, DN_MODEV_ORD);
 MODULE_DEPEND(dummynet, ipfw, 2, 2, 2);
-MODULE_VERSION(dummynet, 1);
+MODULE_VERSION(dummynet, 3);
 
 /*
  * Starting up. Done in order after dummynet_modevent() has been called.
%%%

-- 
wbr,
pluknet
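The mismatch described above comes down to a range check: MODULE_DEPEND names a minimum, preferred and maximum version, and the version declared with MODULE_VERSION has to fall inside that range. A minimal sketch of that comparison follows; `dep_range` and `version_satisfies` are hypothetical names for illustration, not the kernel linker's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the (min, preferred, max) triple of MODULE_DEPEND. */
struct dep_range {
	int min;
	int preferred;
	int max;
};

/* True when the declared module version lies within the requested range. */
bool version_satisfies(int provided, struct dep_range want)
{
	assert(want.min <= want.max);
	return provided >= want.min && provided <= want.max;
}
```

With the values from the mail, `version_satisfies(1, (struct dep_range){3, 3, 3})` is false, which matches the "wrong version" error, and `version_satisfies(3, (struct dep_range){3, 3, 3})` is true, which is what the one-line patch achieves.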
Re: Slow Intel 10GbE CX4 adapter behaviour
On 02/09/2011 05:47 PM, Sergey Kandaurov wrote:
> On 9 February 2011 12:37, rihad <ri...@mail.ru> wrote:
>> Problem solved, I'm so embarrassed :) The issue on 7.2 mentioned above with
>> ixgbe (tons of "fragmentation failed" errors) was real. The issue in 8.3-RC3
>> was because dummynet wasn't being loaded at all... so no traffic could pass
>> on it, despite dummynet_load="YES" being set in /boot/loader.conf. So I
>> turned it on in /etc/rc.conf: dummynet_enable="YES" and loaded it with
>> kldload dummynet in order to do without a reboot. Works like a charm so far.
>> Thanks to all!
>
> Looks like loading dummynet.ko via /boot/loader.conf doesn't work because
> dummynet.ko depends on dummynet.ko but of the different version.

Would dummynet_enable="YES" in rc.conf still work? We haven't yet had a chance to reboot to test that.
Re: Slow Intel 10GbE CX4 adapter behaviour
On 9 February 2011 18:15, rihad <ri...@mail.ru> wrote:
> On 02/09/2011 05:47 PM, Sergey Kandaurov wrote:
>> On 9 February 2011 12:37, rihad <ri...@mail.ru> wrote:
>>> Problem solved, I'm so embarrassed :) The issue on 7.2 mentioned above with
>>> ixgbe (tons of "fragmentation failed" errors) was real. The issue in
>>> 8.3-RC3 was because dummynet wasn't being loaded at all... so no traffic
>>> could pass on it, despite dummynet_load="YES" being set in
>>> /boot/loader.conf. So I turned it on in /etc/rc.conf:
>>> dummynet_enable="YES" and loaded it with kldload dummynet in order to do
>>> without a reboot. Works like a charm so far. Thanks to all!
>>
>> Looks like loading dummynet.ko via /boot/loader.conf doesn't work because
>> dummynet.ko depends on dummynet.ko but of the different version.
>
> Would dummynet_enable="YES" in rc.conf still work? We haven't yet had a
> chance to reboot to test that.

Yes, it would. Note that it depends on firewall_enable="YES" also being present in rc.conf.

-- 
wbr,
pluknet
route messages from NDP
Hello.

In my routing table I see entries after the Neighbor Discovery Protocol has run:

  ...
  2a02:6b8:0:401:51:4809:8158:1dcd 00:22:fb:3d:82:fe UHLW vlan438
  ...

I'd like to catch them via a routing socket when they appear.

First, I try to add a static entry:

  ndp -s 2a02:6b8:0:403::1:1 00:0e:0c:09:2e:7b

and look at the route -n monitor output:

  got message of size 240 on Wed Feb 9 17:26:50 2011
  RTM_ADD: Add Route: len 240, pid: 82741, seq 2, errno 0, flags:<HOST,DONE,STATIC>
  locks: inits:
  sockaddrs: <DST,GATEWAY>
   2a02:6b8:0:403::1:1 0.e.c.9.2e.7b

We have two sections here: DST and GATEWAY. DST is an IPv6 address (sa_family == AF_INET6) and GATEWAY is a MAC address (sa_family == AF_LINK). Just for info, the sockaddr_dl looks like this:

  $1 = {sdl_len = 54 '6', sdl_family = 18 '\022', sdl_index = 24,
    sdl_type = 135 '\207', sdl_nlen = 0 '\0', sdl_alen = 6 '\006',
    sdl_slen = 0 '\0', sdl_data = "\000\016\f\t.{", '\0' <repeats 39 times>}

Looks good. Let's wait for an NDP entry... Here it is:

  got message of size 328 on Wed Feb 9 17:27:11 2011
  RTM_ADD: Add Route: len 328, pid: 0, seq 0, errno 0, flags:<UP,HOST,DONE,LLINFO,WASCLONED>
  locks: inits:
  sockaddrs: <DST,GATEWAY,IFP,IFA>
   2a02:6b8:0:40c:daa2:5eff:fe8c:139 vlan438:0.30.48.33.4.92 fe80::230:48ff:fe33:492%vlan438

We have four sections here: DST, GATEWAY, IFP and IFA. DST is an IPv6 address, I don't care about IFP and IFA, and the GATEWAY section is empty. Let's see why:

  $1 = {sdl_len = 54 '6', sdl_family = 18 '\022', sdl_index = 8,
    sdl_type = 135 '\207', sdl_nlen = 0 '\0', sdl_alen = 0 '\0',
    sdl_slen = 0 '\0', sdl_data = '\0' <repeats 45 times>}

The family is AF_LINK (18), which is correct. But sdl_alen and sdl_data are zero. I see this for all routing messages from NDP. All created routing table entries are good (no problems there).

Why does the sockaddr_dl in the GATEWAY section have a zero address? Is it a bug?

-- 
Sem.
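A listener on the routing socket would have to tell the two GATEWAY cases above apart by looking at sdl_alen before touching the address bytes. The sketch below shows that check on a stand-in structure: `sim_sockaddr_dl` only copies the field names and the 54-byte size from the gdb output and is not the system's `struct sockaddr_dl`, and `sim_format_lladdr` is a hypothetical helper. It assumes the address bytes follow the sdl_nlen name bytes inside sdl_data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in with the fields shown in the gdb dump (8 + 46 = 54 bytes). */
struct sim_sockaddr_dl {
	unsigned char  sdl_len;
	unsigned char  sdl_family;
	unsigned short sdl_index;
	unsigned char  sdl_type;
	unsigned char  sdl_nlen;   /* interface name length */
	unsigned char  sdl_alen;   /* link-level address length */
	unsigned char  sdl_slen;
	char           sdl_data[46];
};

/*
 * Write the link-level address as colon-separated hex into `out`.
 * Returns the address length, 0 if the sockaddr carries no address
 * (the NDP case in the mail), or -1 if it does not fit.
 */
int sim_format_lladdr(const struct sim_sockaddr_dl *sdl, char *out, size_t outlen)
{
	const unsigned char *addr;
	size_t used = 0;

	assert(sdl != NULL && out != NULL && outlen > 0);
	out[0] = '\0';
	if (sdl->sdl_alen == 0)
		return 0;
	if ((size_t)sdl->sdl_nlen + sdl->sdl_alen > sizeof(sdl->sdl_data))
		return -1;

	/* Address bytes start right after the interface name bytes. */
	addr = (const unsigned char *)sdl->sdl_data + sdl->sdl_nlen;
	for (unsigned i = 0; i < sdl->sdl_alen; i++) {
		int n = snprintf(out + used, outlen - used, "%s%02x",
		    i ? ":" : "", addr[i]);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return sdl->sdl_alen;
}
```

Fed the bytes from the static entry (sdl_alen = 6, data 00 0e 0c 09 2e 7b) the helper yields the MAC address that was passed to ndp -s; fed the NDP-generated message (sdl_alen = 0) it returns 0 and leaves the output empty.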
Re: igb driver RX (was TX) hangs when out of mbuf clusters
OK, but the question is why the ring gets totally consumed this way. The ring has 1024 descriptors; it seems unintuitive that the whole quantity can be used without some being recharged. Do you see the system mbuf pool being depleted at the same time?

Since you can reproduce it, do me a favor: in rxeof, change the processed value from 8 to 4 and then 1, effectively calling refresh on every descriptor, and see if that eliminates the issue.

Thanks for your help,

Jack

On Wed, Feb 9, 2011 at 2:36 AM, Michael Tuexen <tue...@freebsd.org> wrote:
> Hi Jack,
>
> I could recreate the problem. When the problem occurs, we see
>
>   rx_nxt_check = n
>   rx_nxt_refresh = n + 1
>
> (This was also reported in a mail from Karim)
>
> This means that the *whole* receive ring has no buffers anymore. This can
> occur if, for some amount of time, no clusters are available.
>
> Now outside of the driver, at some point of time, clusters are freed. I don't
> think that igb_refresh_mbufs() gets called, since it only gets called from
> igb_rxeof(), which gets called when a packet has been received, which can not
> happen since the receive ring is empty. So how can the driver know? I have no
> idea. Maybe we can periodically check for such an event and call
> igb_refresh_mbufs(). Does this make sense to you?
>
> Best regards
> Michael
Re: if_run in hostap mode: issue with stations in the power save mode
On Tuesday 08 February 2011 10:52:53 Bernhard Schmidt wrote:
> I've combined both patches (see attachment). If I get an ACK from both of
> you, I'll try to get this into the tree ASAP.

Committed, thanks!

-- 
Bernhard
Re: igb driver RX (was TX) hangs when out of mbuf clusters
On Feb 9, 2011, at 6:35 PM, Jack Vogel wrote:
> OK, but the question is why does the ring get totally consumed this way, the
> ring has 1024 descriptors, it seems unintuitive that that whole quantity can
> be used without some being recharged. Do you see the system mbuf pool being
> depleted at the same time?

That was the test case I created: I set up a server accepting connections but not reading anything. So the driver passes the mbufs to the transport stack and they are not consumed. Then the problem occurs. Then I kill the server. Now there are mbufs available again, but the driver doesn't know. I had the impression that these were the circumstances in which the problem showed up (mbuf allocations failing).

> Since you can reproduce it, do me a favor, in rxeof, change the processed
> value from 8 to 4 and then 1, effectively call refresh every descriptor,
> see if that eliminates the issue.

I will do that. I need to see if I can do it remotely, since I'm not in my lab right now. I can do it tomorrow for sure. But I do not think that this solves the problem, since I did things very slowly and you call it at least when you are leaving rxeof.

Best regards
Michael

> Thanks for your help,
>
> Jack
Problem with re0
> Both the em and re drivers have had a lot of work done recently. Are you
> trying with 8.2RC1?

I tried with 8.2RC2 (via the fixit shell, with em): sadly, the same symptoms. The card is recognized and the driver is loaded, so ifconfig reports it as an available interface. However, neither static IP addressing nor DHCP makes it reachable on the network. The interface cannot even ping the default gateway, and this machine cannot be pinged either. I am very sad :-(
Re: bge wedging 8.2-RC1
On Mon, Feb 07, 2011 at 08:27:43PM -0600, Peter Lai wrote:
> On Feb 7, 2011 7:38 PM, Pyun YongHyeon <pyu...@gmail.com> wrote:
>> On Mon, Feb 07, 2011 at 06:09:16PM -0600, Peter Lai wrote:
>>> Hello
>>>
>>> I've got a new Dell Precision workstation here with a BCM5761 on an Intel
>>> mobo for Westmere Xeons that is wedging with an interrupt storm and will
>>> lock up the system randomly. I have turned HTT and auto power management
>>> off in the BIOS (system cannot sleep); lowest CPU ACPI state is C1.
>>>
>>> Here is dmesg:
>>>
>>> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x5761100> mem 0xf3be-0xf3be,0xf3bf-0xf3bf irq 17 at device 0.0 on pci6
>>> bge0: CHIP ID 0x05761100; ASIC REV 0x5761; CHIP REV 0x57611; PCI-E
>>> miibus0: <MII bus> on bge0
>>> brgphy0: <BCM5761 10/100/1000baseTX PHY> PHY 1 on miibus0
>>> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
>>>
>>> Here is pciconf -lv:
>>>
>>> bge0@pci0:6:0:0: class=0x02 card=0x026d1028 chip=0x168114e4 rev=0x10 hdr=0x00
>>>     vendor = 'Broadcom Corporation'
>>>     device = 'Broadcom 57XX Gigabit Integrated Controller (BCM5761)'
>>>     class = network
>>>     subclass = ethernet
>>>
>>> Here is the setup in rc.conf:
>>>
>>> ifconfig_bge0="polling -tso -vlanhwtso -vlanhwtag -vlanmtu inet 192.168.123.124 netmask 255.255.255.0"
>>>
>>> I have the card plugged into a D-Link DSS8 100mbps switch with one other
>>> 100mbps device on it (rich man's crossover cable).
>>>
>>> Before turning off TSO4 and VLAN tagging (because I don't use them), the
>>> card would do several things:
>>>
>>> 1. 1 out of 3 reboots: fail to bring the interface up. ifconfig would hang
>>> and systat/vmstat showed 800+ interrupts per second on IRQ256
>>
>> This is strange. bge(4) does not use MSI if you build bge(4) with
>> DEVICE_POLLING so seeing IRQ256 interrupts looks odd to me. Are you sure
>> bge(4) is using IRQ256?
>
> This is with GENERIC. I will rebuild with POLLING and try...

Let me know if the attached patch makes any difference on your box. The patch contains some other changes, but they wouldn't affect your BCM5761 controller. If you see a "CLKREQ enabled" message after applying the patch, let me know that too.

>>> 2. After a few hours, lock up the system, requiring a hard reboot
>>>
>>> After disabling TSO4 and VLAN stuff:
>>>
>>> bge0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>     options=80083<RXCSUM,TXCSUM,VLAN_HWCSUM,LINKSTATE>
>>>     media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>>>
>>> Everything seemed fine for about two weeks and then it suddenly started
>>> acting up again: locked up; after hard reboot and soft reboot the link will
>>> not come up and I see the interrupt storm again
>>
>> If you don't use DEVICE_POLLING, rebuild bge(4) with DEVICE_POLLING. For
>> most cases, you don't need to enable polling on intelligent controllers
>> like bge(4). I also have a BCM5761 PCIe controller which shows no such
>> issues. I know there is an edge case (send BD corruption) for
>> BCM5761/BCM5784/BCM57780 which needs to be investigated. I'm not sure
>> you're seeing that edge case though.
>>
>>> I am close to buying an Intel card to replace the bcm, but then I noticed
>>> that the main Intel desktop PCI-E card is 82574L-based and people are
>>> having the em driver wedging on that too. So now I have broken ethernet on
>>> this box; my primary link is an Atheros 5212 PCI card and I may be out of
>>> PCI slots (or else I might try a PCI Intel card).

Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h (revision 218409)
+++ sys/dev/bge/if_bgereg.h (working copy)
@@ -2004,6 +2004,11 @@
 #define BGE_EECTL_DATAOUT 0x0010
 #define BGE_EECTL_DATAIN 0x0020
 
+/* PCIe Link control register */
+#define BGE_PCIE_LNKCTL 0x7D54
+#define BGE_PCIE_LNKCTL_L1_PLL_PD_ENB 0x0008
+#define BGE_PCIE_LNKCTL_L1_PLL_PD_DIS 0x0080
+
 /* MDI (MII/GMII) access register */
 #define BGE_MDI_DATA 0x0001
 #define BGE_MDI_DIR 0x0002
@@ -2769,6 +2774,7 @@
 #define BGE_FLAG_4G_BNDRY_BUG 0x0200
 #define BGE_FLAG_RX_ALIGNBUG 0x0400
 #define BGE_FLAG_SHORT_DMA_BUG 0x0800
+#define BGE_FLAG_CLKREQ_BUG 0x1000
         uint32_t bge_phy_flags;
 #define BGE_PHY_WIRESPEED 0x0001
 #define BGE_PHY_ADC_BUG 0x0002
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c (revision 218409)
+++ sys/dev/bge/if_bge.c (working copy)
@@ -879,6 +879,8 @@
 {
         struct bge_softc *sc;
         struct mii_data *mii;
+        uint16_t lnkctl;
+
         sc = device_get_softc(dev);
         mii = device_get_softc(sc->bge_miibus);
 
@@ -905,6 +907,18 @@
                 sc->bge_link = 0;
         if (sc->bge_link == 0)
                 return;
+        /* Disable CLKREQ when controller is running at 10/100Mbps. */
+        if (sc->bge_flags & BGE_FLAG_CLKREQ_BUG) {
+                lnkctl = pci_read_config(sc->bge_dev, sc->bge_expcap +
Re: bge wedging 8.2-RC1
> Let me know if the attached patch makes any difference on your box. The
> patch contains some other changes, but they wouldn't affect your BCM5761
> controller. If you see a "CLKREQ enabled" message after applying the patch,
> let me know that too.

Can I apply this to 8.2-RC1 or should I update it to -RC3?
Re: bge wedging 8.2-RC1
On Wed, Feb 09, 2011 at 06:28:31PM -0600, Peter Lai wrote:
>> Let me know if the attached patch makes any difference on your box. The
>> patch contains some other changes, but they wouldn't affect your BCM5761
>> controller. If you see a "CLKREQ enabled" message after applying the patch,
>> let me know that too.
>
> Can I apply this to 8.2-RC1 or should I update it to -RC3?

I guess you can apply it to 8.2-RC1 without a problem.
Re: Slow Intel 10GbE CX4 adapter behaviour
On 02/09/2011 07:27 PM, Sergey Kandaurov wrote:
> On 9 February 2011 18:15, rihad <ri...@mail.ru> wrote:
>> On 02/09/2011 05:47 PM, Sergey Kandaurov wrote:
>>> On 9 February 2011 12:37, rihad <ri...@mail.ru> wrote:
>>>> Problem solved, I'm so embarrassed :) The issue on 7.2 mentioned above
>>>> with ixgbe (tons of "fragmentation failed" errors) was real. The issue in
>>>> 8.3-RC3 was because dummynet wasn't being loaded at all... so no traffic
>>>> could pass on it, despite dummynet_load="YES" being set in
>>>> /boot/loader.conf. So I turned it on in /etc/rc.conf:
>>>> dummynet_enable="YES" and loaded it with kldload dummynet in order to do
>>>> without a reboot. Works like a charm so far. Thanks to all!
>>>
>>> Looks like loading dummynet.ko via /boot/loader.conf doesn't work because
>>> dummynet.ko depends on dummynet.ko but of the different version.
>>
>> Would dummynet_enable="YES" in rc.conf still work? We haven't yet had a
>> chance to reboot to test that.
>
> Yes, it would. Note that it depends on firewall_enable="YES" also present in
> rc.conf.

Thanks, I see. Now I think that enabling dummynet through rc.conf is the official, or supported, way of enabling it on reboot, while loader.conf works a little further under the hood. I always asked myself why it was settable in two places rather than one. But now I know: the fact that dummynet can be set to load in loader.conf is more of an undesired side effect of generality.
Re: kern/154600: [tcp] [panic] Random kernel panics on tcp_output
Old Synopsis: Random kernel panics on tcp_output
New Synopsis: [tcp] [panic] Random kernel panics on tcp_output

Responsible-Changed-From-To: freebsd-amd64->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Feb 10 05:41:32 UTC 2011
Responsible-Changed-Why: reclassify and assign.

http://www.freebsd.org/cgi/query-pr.cgi?pr=154600
Re: kern/154591: [msk] [panic] if_msk driver causes kernel panic (fatal trap while in kernel mode)
Old Synopsis: if_msk driver causes kernel panic (fatal trap while in kernel mode)
New Synopsis: [msk] [panic] if_msk driver causes kernel panic (fatal trap while in kernel mode)

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Feb 10 05:43:45 UTC 2011
Responsible-Changed-Why: reassign.

http://www.freebsd.org/cgi/query-pr.cgi?pr=154591