On 09/23/2010 02:33 PM, Tom Judge wrote:
> The throttle command I am using in the tests is the one from here:
>
> http://klicman.org/throttle/
>
> On 09/23/2010 02:26 PM, Tom Judge wrote:
>
>> On 09/23/2010 01:21 PM, David Christensen wrote:
>>
>>>>>> Under testing I have yet to see a memory fragmentation issue with
>>>>>> this driver. I follow up if/when I find a problem with this again.
>>>>>>
>>>> So here we are again. The system is locking up again because of 9k
>>>> mbuf allocation failures.
>>>>
>>> Failure to allocate a new buffer should cause the driver to
>>> drop the received frame and reuse the buffer, not lock up the
>>> system. Are you seeing the lockup come from bce(4) or does
>>> it come from somewhere else due to the dropped data?
>>>
>> The lockup is not from the NIC as such, the systems have the appearance
>> of locking up as home directories are on NFS and the user information is
>> stored in a remote LDAP server. When the system starts to drop frames
>> due to lack of 9k memory regions it tends to last for a few minutes
>> (when it is really bad) and stop all traffic into the system. This
>> appears to the average user as a complete system pause.
>>
>>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>>> splitting is turned on) so that they are guarded by flow control.
>>>>>> Maybe change the low watermark for flow control when its enabled?
>>>>>>
>>>>> I'm not sure how much it would help but try changing RX low
>>>>> watermark. Default value is 32 which seems to be reasonable value.
>>>>> But it's only for 5709/5716 controllers and Linux seems to use
>>>>> different default value.
>>>>>
>>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>>
>>>> So my next task is to turn the watermark related defines into sysctls
>>>> and turn on header splitting so that I can try to tune them without
>>>> having to reboot.
>>>>
>>> Do you have flow control enabled? There are arguments both for
>>> and against flow control. For bce(4), I haven't tested flow control
>>> for quite a while and it's behavior may have changed since it is
>>> controlled by firmware. Keep an eye on the hardware statistics
>>> to see that's it's actively generating pause frames.
>>>
>> 3) With flow control enabled and header splitting on flood the server
>> with very small frames (200 bytes). (Using the same test as in case 1).
>> My aim is to tune the watermark here so that there are no frames dropped
>> due to BD shortages.
>>
Card info unhidden:

bce0: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.2.2); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.8)

So, having done a lot of testing with both flow control and header
splitting turned on, it seems that flow control may be broken when header
splitting is enabled?

I have been using the attached patch to play with the flow control water
marks. I have tried the following data points and am finding it
difficult to get flow control to kick in before the card runs out of
descriptors and starts dropping frames:

low: 16   high: 127
low: 32   high: 127
low: 64   high: 127
low: 96   high: 127
low: 32   high: 196
low: 64   high: 196
low: 128  high: 256

None of these seem to have any noticeable effect on the drop rate or on
the number of dev.bce.0.stat_FlowControlDone events in the sample period.

Thoughts?

Tom

--
TJU13-ARIN

Index: if_bce.c
===================================================================
--- if_bce.c	(revision 949)
+++ if_bce.c	(working copy)
@@ -511,6 +511,21 @@
 SYSCTL_UINT(_hw_bce, OID_AUTO, msi_enable, CTLFLAG_RDTUN, &bce_msi_enable, 0,
 "MSI-X|MSI|INTx selector");
 
+
+/* Tunable RX flow control low water mark. */
+/* Without header splitting the default is 32 */
+static int bce_rx_low_water_mark = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+TUNABLE_INT("hw.bce.rx_low_water_mark", &bce_rx_low_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_low_water_mark, CTLFLAG_RDTUN, &bce_rx_low_water_mark, 0,
+"Default RX Flow Control Low Water Mark");
+
+/* Tunable RX flow control high water mark. */
+/* Without header splitting the default is 32 */
+static int bce_rx_high_water_mark = USABLE_RX_BD / 4;
+TUNABLE_INT("hw.bce.rx_high_water_mark", &bce_rx_high_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_high_water_mark, CTLFLAG_RDTUN, &bce_rx_high_water_mark, 0,
+"Default RX Flow Control High Water Mark");
+
 /* ToDo: Add tunable to enable/disable strict MTU handling. */
 /* Currently allows "loose" RX MTU checking (i.e. sets the */
 /* H/W RX MTU to the size of the largest receive buffer, or */
@@ -1780,11 +1795,15 @@
 	}
 
 	if (mii->mii_media_active & IFM_FLAG1) {
+		BCE_PRINTF("%s(%d): Enabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Enabling TX flow control.\n", __FUNCTION__);
 		BCE_SETBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
 		sc->bce_flags |= BCE_USING_TX_FLOW_CONTROL;
 	} else {
+		BCE_PRINTF("%s(%d): Disabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Disabling TX flow control.\n", __FUNCTION__);
 		BCE_CLRBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
@@ -5414,7 +5433,7 @@
 		u32 lo_water, hi_water;
 
 		if (sc->bce_flags && BCE_USING_TX_FLOW_CONTROL) {
-			lo_water = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+			lo_water = bce_rx_low_water_mark;
 		} else {
 			lo_water = 0;
 		}
@@ -5423,11 +5442,12 @@
 			lo_water = 0;
 		}
 
-		hi_water = USABLE_RX_BD / 4;
+		hi_water = bce_rx_high_water_mark;
 
 		if (hi_water <= lo_water) {
 			lo_water = 0;
 		}
 
+		BCE_PRINTF("Setting Up Flow Control (Pre Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 		lo_water /= BCE_L2CTX_RX_LO_WATER_MARK_SCALE;
 		hi_water /= BCE_L2CTX_RX_HI_WATER_MARK_SCALE;
@@ -5436,7 +5456,8 @@
 			hi_water = 0xf;
 		else if (hi_water == 0)
 			lo_water = 0;
-
+
+		BCE_PRINTF("Setting Up Flow Control (Post Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 		val |= (lo_water << BCE_L2CTX_RX_LO_WATER_MARK_SHIFT) |
 		    (hi_water << BCE_L2CTX_RX_HI_WATER_MARK_SHIFT);
 	}

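For anyone wanting to check what the tested data points turn into before
they reach the hardware, here is a rough userland sketch of the scale and
clamp steps that are visible in the patch above. The scale factors (4 for
low, 16 for high) are my assumption and are not taken from this thread;
check if_bcereg.h in your tree. The names scale_water_marks(),
print_water_marks() and struct scaled_marks are made up for illustration,
and the sketch ignores any checks in the driver that the diff does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMED values, not confirmed in this thread; verify in if_bcereg.h. */
#define LO_WATER_MARK_SCALE	4u
#define HI_WATER_MARK_SCALE	16u
#define HI_WATER_MARK_MAX	0xfu	/* clamp visible in the patch context */

/* Hypothetical holder for the post-scaling values. */
struct scaled_marks {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Mirror the steps visible in the patch: drop the low mark if it is not
 * below the high mark, scale both, then clamp the high mark.
 */
static void
scale_water_marks(uint32_t lo_water, uint32_t hi_water,
    struct scaled_marks *out)
{
	assert(out != NULL);

	if (hi_water <= lo_water)
		lo_water = 0;

	lo_water /= LO_WATER_MARK_SCALE;
	hi_water /= HI_WATER_MARK_SCALE;

	if (hi_water > HI_WATER_MARK_MAX)
		hi_water = HI_WATER_MARK_MAX;
	else if (hi_water == 0)
		lo_water = 0;

	out->lo = lo_water;
	out->hi = hi_water;
}

/* Print the pre- and post-scaling values for one data point. */
static void
print_water_marks(uint32_t lo_water, uint32_t hi_water)
{
	struct scaled_marks m;

	scale_water_marks(lo_water, hi_water, &m);
	printf("low %u high %u -> scaled low %u high %u\n",
	    (unsigned)lo_water, (unsigned)hi_water,
	    (unsigned)m.lo, (unsigned)m.hi);
}
```

Calling print_water_marks() once per data point from a small main() gives
the post-scaling pairs. If the assumed scale factors are right, any high
water mark of 256 or more ends up clamped to 15, so raising it further
would not change what gets programmed.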
_______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"