On Tue, 12 Jun 2007, Stephan Koenig wrote:

We have some servers that have a very high packet rate in a normal
production mode, and require polling to keep the CPU load reasonable.

We have kern.hz=4000 and still see some dropped packets.

On our servers that use the Intel "em" driver, we have tuned the
driver as follows.  By default, if_em.h has:

#define EM_MIN_TXD              80
#define EM_MAX_TXD_82543        256
#define EM_MAX_TXD              4096
#define EM_DEFAULT_TXD          EM_MAX_TXD_82543

#define EM_MIN_RXD              80
#define EM_MAX_RXD_82543        256
#define EM_MAX_RXD              4096
#define EM_DEFAULT_RXD          EM_MAX_RXD_82543


We have changed EM_DEFAULT_TXD and EM_DEFAULT_RXD to 4096 -- This
solved the problem on these servers.

The question is now what to do on our servers with Broadcom "bge"
series cards.  Does anyone know how to tune this driver in a similar
manner?

1) Change BGE_SSLOTS from 256 to 512.  This corresponds to changing
   EM_DEFAULT_RXD from 256 to 4096, except the max is much smaller.
2) Don't use polling.  Polling "works" to reduce CPU by dropping packets
   on input and by reducing throughput on output.  It works particularly
   badly for bge.
3) When not using polling, change the interrupt coalescing parameters.
   These can be set to give similar behaviour to polling, without most
   of polling's losses or features.  E.g., setting interrupt moderation
   timeouts to 250 uS gives behaviour similar to polling at 4000 Hz.
   For input, the main difference is that the interrupts are high priority
   so dropping packets is less likely.  For output, the throttling
   behaviour of polling is more useful and without it bge interfaces may
   use more CPU so as to actually send packets as fast as possible.

   I use the following simple coalescing tuning in RELENG_6.  This
   essentially restores the old tuning.  The current tuning is essentially
   Linux's and is not good for FreeBSD.

% Index: if_bge.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v
% retrieving revision 1.91.2.23
% diff -u -2 -r1.91.2.23 if_bge.c
% --- if_bge.c  8 May 2007 16:18:21 -0000       1.91.2.23
% +++ if_bge.c  9 May 2007 10:09:55 -0000
% @@ -2391,7 +2391,7 @@
%       sc->bge_stat_ticks = BGE_TICKS_PER_SEC;
%       sc->bge_rx_coal_ticks = 150;
% -     sc->bge_tx_coal_ticks = 150;
% -     sc->bge_rx_max_coal_bds = 10;
% -     sc->bge_tx_max_coal_bds = 10;
% +     sc->bge_tx_coal_ticks = 1000000;
% +     sc->bge_rx_max_coal_bds = 64;
% +     sc->bge_tx_max_coal_bds = 384;
%
%       /* Set up ifnet structure */

The parameters here are:

sc->bge_rx_coal_ticks:
        Set this to 1000000/N to give essentially the same behaviour as
        polling at N Hz.  The default of 150 corresponds to polling at
        about 6667 Hz.  This is a good default.
sc->bge_tx_coal_ticks:
        Like the rx coal_ticks, but not really needed since tx
        can be controlled better by coal_bds.  I set it to 1000000
        (1 second) since it is only used to free inactive tx descriptors
        if the device becomes idle.
sc->bge_rx_max_coal_bds:
        Set this to 1 to give minimum latency and maximum CPU use.  Set it
        to 0 (infinity) or large to give bad latency but less CPU use.
        The old default of 64 gave a good tradeoff.  The current and
        RELENG_6 value of 10 is too small.  For 1500-byte packets, the
        regression in this parameter has little effect when the rx ticks
        timeout is 150 uS, since the timeout fires first for either value,
        but for tiny packets a value of 10 for this parameter asks for
        150k interrupts per second, which is too many.
sc->bge_tx_max_coal_bds:
        Set this to almost the maximum possible to give minimum CPU
        use at almost no cost to latency.  The maximum possible is 511
        (496?), but that is too aggressive.  I use 384.  The old value
        of 128 works OK too (costs ~10% more CPU than 512).  The current
        and RELENG_6 value of 10 is not good.  It costs about 100%
        more CPU than a value of 384 and has no observable good effects.
        (Above 384, there are some observable bad effects due to bus
        contention with rx.  I only tested on PCI 33MHz buses.  The
        tradeoffs might be a little different on faster buses.)

I use dynamic tuning of bge rx coalescing (by rate-limiting interrupts)
in -current.  em does this in hardware.

Bruce
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net