On Wed, 2 Mar 2011 21:12:24 +0100
Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
>| > >| One thing that seems to have a big performance impact is
>| > >| net.inet.ip.ifq.maxlen. If and only if your network cards are all
>| > >| supported by MCLGETI (ie, they show LWM/CWM/HWM values in 'systat
>| > >| mbufs'), you can try increasing ifq.maxlen until you don't see
>| > >| net.inet.ip.ifq.drops incrementing anymore under constant load.
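To make the "until drops stop incrementing" check concrete, here is a minimal sketch that compares two readings of net.inet.ip.ifq.drops taken some time apart under load. It assumes the readings are pasted in as the `name=value` lines that sysctl prints; the helper names and the sample values are hypothetical, not taken from my machines.

```python
def parse_sysctl(line):
    """Split one 'name=value' line into (name, integer value)."""
    name, _, value = line.strip().partition("=")
    return name, int(value)


def drops_still_rising(before, after):
    """True if the drop counter grew between the two readings."""
    _, old = parse_sysctl(before)
    _, new = parse_sysctl(after)
    return new > old


# Hypothetical readings taken a minute apart (values invented for illustration).
first = "net.inet.ip.ifq.drops=1200"
second = "net.inet.ip.ifq.drops=1200"
print(drops_still_rising(first, second))
```

If the result stays false over several intervals at the usual traffic level, the current ifq.maxlen is probably large enough.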
>| > 
>| > Yes all my nic interfaces have LWM/CWM/HWM values:
>| > System                        256 83771        5502
>| >                                2k   160        1252
>| > em0                      37    2k     4     4   256     4
>| > em1                     258    2k     4     4   256     4
>| > em2                  372751    2k     7     4   256     7
>| > em3                    8258    2k     4     4   256     4
>| > em4                   25072    2k    63     4   256    63
>| > em5                    3658    2k     8     4   256     8
>| > em6                  501288    2k    24     4   256    24
>| > em7                      22    2k     4     4   256     4
>| > em8                   36551    2k    23     4   256    23
>| > em9                   52053    2k     5     4   256     4
>| > 
>| Woohoo. That is a lot of livelocks you hit. In other words you are losing
>| ticks by something spinning too long in the kernel. Interfaces with a very
>| low CWM but a high pps rate are the ones you need to investigate.
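As a rough way to pick out the interfaces with a very low CWM, here is a minimal sketch that parses a few of the 'systat mbufs' rows above. It assumes the seven columns are interface, livelocks, cluster size, alive, LWM, HWM and CWM, and it uses an arbitrary threshold (CWM at most twice LWM) that is my own choice. The pps rate is not in this output, so the result still has to be compared with the netstat counters by hand.

```python
ROWS = """\
em0                      37    2k     4     4   256     4
em2                  372751    2k     7     4   256     7
em4                   25072    2k    63     4   256    63
em6                  501288    2k    24     4   256    24
"""


def parse_rows(text):
    """Turn 7-column rows into dicts; the column meaning is an assumption."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip truncated or unrelated lines
        rows.append({
            "iface": fields[0],
            "livelocks": int(fields[1]),
            "alive": int(fields[3]),
            "lwm": int(fields[4]),
            "hwm": int(fields[5]),
            "cwm": int(fields[6]),
        })
    return rows


def low_cwm(rows, factor=2):
    """Interfaces whose CWM is within `factor` times the LWM (arbitrary cut-off)."""
    return [r["iface"] for r in rows if r["cwm"] <= factor * r["lwm"]]


print(low_cwm(parse_rows(ROWS)))
```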

Hum OK.
A strange thing about the livelocks is the big difference between, for example, em2 and em4:

Name    Mtu   Network     Ipkts         Ierrs   Opkts           Oerrs Colls
em2     1500  <Link>       8868034600    42899  6562765482      0     0
em2     1500  fe80::%em2/  8868034600    42899  6562765482      0     0
em4     1500  <Link>      33934108692 19371393 20672882997      0     0
em4     1500  fe80::%em4/ 33934108692 19371393 20672882997     0     0

There are more livelocks on em2 but fewer packets (or maybe the counters were 
reset to 0 after reaching their maximum value).
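To put the difference in numbers, here is a minimal sketch that computes livelocks per million input packets from the figures above and checks whether the displayed Ipkts values still fit in 32 bits. The helper name is hypothetical. Both Ipkts values are larger than 2^32 - 1, which suggests that a simple 32-bit wrap of the packet counter is not the explanation, although this says nothing about how the livelock counter itself is stored.

```python
UINT32_MAX = 2**32 - 1

# Livelocks from 'systat mbufs', Ipkts from netstat, both as quoted above.
counters = {
    "em2": {"livelocks": 372751, "ipkts": 8868034600},
    "em4": {"livelocks": 25072, "ipkts": 33934108692},
}


def livelocks_per_million(c):
    """Livelocks per one million received packets."""
    return c["livelocks"] / c["ipkts"] * 1_000_000


for name, c in counters.items():
    wider_than_32_bits = c["ipkts"] > UINT32_MAX
    print(name, round(livelocks_per_million(c), 2), wider_than_32_bits)
```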

>| Additionally I would like to see your netstat -m and vmstat -m output.

netstat -m:
18472 mbufs in use:
        18449 mbufs allocated to data
        16 mbufs allocated to packet headers
        7 mbufs allocated to socket names and addresses
331/4188/6144 mbuf 2048 byte clusters in use (current/peak/max)
0/8/6144 mbuf 4096 byte clusters in use (current/peak/max)
0/8/6144 mbuf 8192 byte clusters in use (current/peak/max)
0/8/6144 mbuf 9216 byte clusters in use (current/peak/max)
0/8/6144 mbuf 12288 byte clusters in use (current/peak/max)
0/8/6144 mbuf 16384 byte clusters in use (current/peak/max)
0/8/6144 mbuf 65536 byte clusters in use (current/peak/max)
30704 Kbytes allocated to network (70% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

vmstat -m:
Memory statistics by bucket size
    Size   In Use   Free           Requests  HighWater  Couldfree
      16   113578 195414           32414058    1280       6712
      32   378705    687           74930489     640       6824
      64     7707    869           11878746     320      27074
     128    11411     45           36424677     160         78
     256     7875    973          328666338      80   60487950
     512     1951     65            6017929      40     413368
    1024      331    177            1947159      20     880831
    2048       57      3             496398      10          0
    4096     5164     15             260948       5     166561
    8192       36      5             226431       5      18240
   16384       12      0            8279177       5          0
   32768        5      0                 11       5          0
   65536        2      0                  2       5          0

Memory usage type by bucket size
    Size  Type(s)
      16  devbuf, pcb, routetbl, sysctl, UFS mount, dirhash, ACPI, exec,
          xform_data, VM swap, UVM amap, UVM aobj, USB, USB device, temp
      32  devbuf, pcb, routetbl, ifaddr, UFS mount, sem, dirhash, ACPI,
          ip_moptions, in_multi, exec, pfkey data, xform_data, UVM amap, USB,
      64  devbuf, pcb, routetbl, fragtbl, ifaddr, vnodes, UFS mount, dirhash,
          ACPI, proc, VFS cluster, in_multi, ether_multi, VM swap, UVM amap,
          USB, USB device, NDP, temp
     128  devbuf, pcb, routetbl, fragtbl, ifaddr, mount, sem, dirhash, ACPI,
          VFS cluster, MFS node, NFS srvsock, ip_moptions, ttys, pfkey data,
          UVM amap, USB, USB device, NDP, temp
     256  devbuf, routetbl, ifaddr, ioctlops, iov, vnodes, shm, VM map, dirhash,
          ACPI, ip_moptions, exec, UVM amap, USB, USB device, ip6_options, temp
     512  devbuf, ifaddr, sysctl, ioctlops, iov, vnodes, dirhash, file desc,
          NFS daemon, ttys, newblk, UVM amap, USB, USB device, temp
    1024  devbuf, pcb, sysctl, ioctlops, iov, mount, UFS mount, shm, ACPI, proc,
          ttys, exec, UVM amap, USB HC, crypto data, temp
    2048  devbuf, ioctlops, iov, UFS mount, ACPI, VM swap, UVM amap, UVM aobj,
    4096  devbuf, ifaddr, ioctlops, iov, proc, UVM amap, memdesc, temp
    8192  devbuf, iov, ttys, pagedep, UVM amap, USB, temp
   16384  devbuf, iov, MSDOSFS mount, temp
   32768  devbuf, UFS quota, UFS mount, ISOFS mount, inodedep
   65536  devbuf

Memory statistics by type                           Type  Kern
          Type InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
        devbuf  9428 21719K  21757K 78644K  1219316    0     0  
           pcb    57    14K     15K 78644K    75257    0     0  
      routetbl375605 11783K  11862K 78644K316114307    0     0  16,32,64,128,256
       fragtbl     0     0K      1K 78644K        6    0     0  64,128
        ifaddr   484   101K    101K 78644K      682    0     0  
        sysctl     3     2K      2K 78644K        3    0     0  16,512,1024
      ioctlops     0     0K      4K 78644K    73636    0     0  
           iov     0     0K     32K 78644K 20500234    0     0  
         mount    12     7K      7K 78644K       12    0     0  128,1024
        vnodes    51    13K    105K 78644K   444285    0     0  64,256,512
     UFS quota     1    32K     32K 78644K        1    0     0  32768
     UFS mount    25    61K     61K 78644K       25    0     0  
           shm     2     2K      2K 78644K        2    0     0  256,1024
        VM map     2     1K      1K 78644K        2    0     0  256
           sem     2     1K      1K 78644K        2    0     0  32,128
       dirhash   219    45K     87K 78644K    76185    0     0  
          ACPI  6204   717K    731K 78644K    25557    0     0  
     file desc     1     1K      1K 78644K       66    0     0  512
          proc    15    10K     10K 78644K       15    0     0  64,1024,4096
   VFS cluster     0     0K      1K 78644K    65131    0     0  64,128
      MFS node     2     1K      1K 78644K        2    0     0  128
   NFS srvsock     1     1K      1K 78644K        1    0     0  128
    NFS daemon     1     1K      1K 78644K        1    0     0  512
   ip_moptions    20     3K      3K 78644K       34    0     0  32,128,256
      in_multi   687    32K     32K 78644K     1169    0     0  32,64
   ether_multi   425    27K     27K 78644K      876    0     0  64
   ISOFS mount     1    32K     32K 78644K        1    0     0  32768
 MSDOSFS mount     1    16K     16K 78644K        1    0     0  16384
          ttys   420   308K    308K 78644K      420    0     0  
          exec     0     0K      3K 78644K  1043888    0     0  16,32,256,1024
    pfkey data     2     1K      1K 78644K        3    0     0  32,128
    xform_data     0     0K      1K 78644K    32283    0     0  16,32
       pagedep     1     8K      8K 78644K        1    0     0  8192
      inodedep     1    32K     32K 78644K        1    0     0  32768
        newblk     1     1K      1K 78644K        1    0     0  512
       VM swap     1     1K      3K 78644K        4    0     0  16,64,2048
      UVM amap132353  5150K   6938K 78644K 60829512    0     0  
      UVM aobj     2     3K      3K 78644K        2    0     0  16,2048
           USB   155    45K     45K 78644K      157    0     0  
    USB device    52    17K     17K 78644K       52    0     0  
        USB HC     1     1K      1K 78644K        1    0     0  1024
       memdesc     1     4K      4K 78644K        1    0     0  4096
   crypto data     1     1K      1K 78644K        1    0     0  1024
   ip6_options     0     0K      1K 78644K        4    0     0  256
           NDP   117    12K     12K 78644K      157    0     0  64,128
          temp   481    37K     54K 78644K101039070    0     0  

Memory Totals:  In Use    Free    Requests
                40227K   3694K    501542367
Memory resource pool statistics
Name        Size Requests Fail    InUse Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
extentpl      40      236    0      148     2     0     2     2     0     8    0
phpool        96    12690    0    10719   262     0   262   262     0     8    0
pmappl       176  1048171    0       42     3     0     3     3     0     8    1
pvpl          32 494188732   0   202861  3000    11  2989  2989     0   263  251
pdppl       4096  1048171    0       42   660   613    47    62     0     8    5
vmsppl       296  1048171    0       42     5     0     5     5     0     8    1
vmmpepl      144 4584721742  0   126526 36068 27355  8713 11753     0   304  304
vmmpekpl     144  4810977    0       37     3     0     3     3     0     8    1
aobjpl        72        1    0        1     1     0     1     1     0     8    0
amappl        72 59796600    0   124908  7097  1656  5441  5744     0    75   75
anonpl        24 135786454   0   195915  2211     0  2211  2211     0   287  157
bufpl        272  6196121    0    23830  2377   757  1620  1653     0     8    8
mbpl         256 369184744396 0   83949  5534     0  5534  5534     1   384  114
mcl2k       2048 122966085882 0     272  2094     0  2094  2094     4  3072 1952
sockpl       376 16443450    0       84    12     0    12    12     0     8    3
procpl       504  1048206    0       77    13     0    13    13     0     8    3
processpl    120  1048206    0       77     3     0     3     3     0     8    0
zombiepl     144  1048129    0        0     1     0     1     1     0     8    1
ucredpl       80    11424    0       20     1     0     1     1     0     8    0
pgrppl        40    15174    0       31     1     0     1     1     0     8    0
sessionpl     64     7308    0       28     1     0     1     1     0     8    0
pcredpl       24  1048206    0       77     1     0     1     1     0     8    0
lockfpl       88    18664    0        2     1     0     1     1     0     8    0
filepl       120 144590106   0      181     7     0     7     7     0     8    1
fdescpl      440  1048172    0       43     7     0     7     7     0     8    2
pipepl       120  3739218    0       24     2     0     2     2     0     8    1
kqueuepl     256       42    0        6     1     0     1     1     0     8    0
knotepl      104  4204807    0       34     1     0     1     1     0     8    0
sigapl       488  1048171    0       42     8     0     8     8     0     8    2
wqtasks       40 16542853    0        0     1     0     1     1     0     8    1
wdcspl       176  5111262    0        0     1     0     1     1     0     8    1
scxspl       200   128692    0        0     1     0     1     1     0     8    1
namei       1024 164173312   0        0     2     0     2     2     0     8    2
vnodes       264     5927    0     5927   396     0   396   396     0     8    0
nchpl        144 43086123    0     3416   754   621   133   220     0     8    6
ffsino       232 38122197    0     5918   353     4   349   349     0     8    0
dino1pl      128 38122197    0     5918   191     0   191   191     0     8    0
dirhash     1024    94189    0      422   736   601   135   157     0   128   29
pfrulepl    1272       16    0        2     2     0     2     2     0     8    1
pfstatepl    296    21762 23821       0   770     0   770   770     0   770  770
pfstatekeypl 104    21762    0        0   264   256     8   264     0     8    8
pfstateitempl 24    21762    0        0    61    53     8    61     0     8    8
pfrktable   1296       16    0        2     2     0     2     2     0     8    1
pfrke_plain  160       50    0       10     1     0     1     1     0     8    0
pfosfpen     112     7656    0      696   140   120    20    20     0     8    0
pfosfp        40     4477    0      407     5     0     5     5     0     8    0
pffrent       32   709698    0        0     1     0     1     1     0    40    1
pffrag        80   354531    0        0     1     0     1     1     0    20    1
rtentpl      200  7035558    0   345666 73243 55937 17306 17348     0     8    1
rttmrpl       64       43    0        0     1     0     1     1     0     8    1
ipqepl        40        3    0        0     1     0     1     1     0     8    1
ipqpl         40        3    0        0     1     0     1     1     0     8    1
tcpcbpl      552    63418    0       32    11     0    11    11     0     8    6
tcpqepl       32     8286    0        0     1     0     1     1     0    25    1
sackhlpl      24      527    0        0     1     0     1     1     0   198    1
synpl        248     1176    0        0     1     0     1     1     0     8    1
plimitpl     152     7271    0       14     1     0     1     1     0     8    0
inpcbpl      352 16368989    0       41     8     0     8     8     0     8    4

In use 138878K, total allocated 193584K; utilization 71.7%

>| If I see it right you have 83771 mbufs allocated in your system. This
>| sounds like a serious mbuf leak and could actually be the reason for your
>| bad performance. It is very well possible that most of your buffer
>| allocations fail causing the tiny rings and suboptimal performance.

Hum, yes, that seems to be a good direction to explore. What can I try in 
order to confirm this (or rule it out)?
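One thing I could do in the meantime is sample the "mbufs in use" line of netstat -m at regular intervals and see whether it only ever grows. The following is a minimal sketch of that idea, not an established diagnostic: the function names are hypothetical, and a steadily growing count would only be a hint of a leak, not proof.

```python
import re


def mbufs_in_use(netstat_m_output):
    """Extract the number from the 'N mbufs in use:' line of netstat -m."""
    match = re.search(r"^(\d+) mbufs in use", netstat_m_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'mbufs in use' line found")
    return int(match.group(1))


def only_grows(samples):
    """True if every sample is strictly larger than the previous one."""
    return len(samples) > 1 and all(b > a for a, b in zip(samples, samples[1:]))


sample = "18472 mbufs in use:\n        18449 mbufs allocated to data\n"
print(mbufs_in_use(sample))
```

Collecting one value per hour for a day and passing the list to `only_grows` would show whether the count ever comes back down when traffic drops.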

>| > I've already increased to 2048 some time ago with good effect on ifq.drops 
>| > but even when ifq.drops doesn't increase, I still have
>| > Ierrs on interfaces (I've just verified this right now) :-)
>| Having some Ierrs is not a big issue; always put them in perspective with
>| the number of packets received.
>| e.g.
>| em6     1500  <Link>      00:30:48:9c:3a:80 72007980648 143035 62166589667 0 0
>| This interface had 143035 Ierrs but it also passed 72 billion packets so
>| this is far less than 1% and not a problem.
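To put the errors in perspective as suggested, here is a minimal sketch that computes Ierrs as a percentage of received packets for the em6 example quoted above and for my em4 counters from the netstat output earlier in this mail. The function name is hypothetical.

```python
def ierr_percent(ierrs, ipkts):
    """Input errors as a percentage of input packets."""
    return 100.0 * ierrs / ipkts


# em6: figures from the quoted example; em4: figures from my netstat output.
print(f"em6: {ierr_percent(143035, 72007980648):.5f}%")
print(f"em4: {ierr_percent(19371393, 33934108692):.5f}%")
```

Both ratios are well below 1%, although em4 is noticeably higher than em6.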

Yes, I know, but I'd like to find an explanation for this as it doesn't seem 
and to be sure (as far as I can) that it doesn't hide a more or less important problem.

>| The FIFO on the card doesn't matter that much. The problem is the DMA ring
>| and the amount of slots on the ring that are actually usable. This is the
>| CWM in the systat mbuf output. MCLGETI() reduces the buffers on the ring
>| to limit the work getting into the system over a specific network card. 


>| > One of my questions is how to tell whether the system is heavily loaded.
>| > systat -s 2 vmstat gives me this information:
>| > 
>| > Proc:r  d  s  w    Csw   Trp   Sys   Int   Sof  Flt
>| >           14       149     2   509 20118    98   31
>| >                                                    
>| >    3.5%Int   0.5%Sys   0.0%Usr   0.0%Nic  96.0%Idle
>| > |    |    |    |    |    |    |    |    |    |    |
>| > 
>| > which makes me think that the system is really not very loaded but I may 
>| > a point....
>| > 
>| So you have this 3.5% Int and 0.5% Sys load and are still hitting tons of
>| LIVELOCKS (e.g. the counters increase all the time)? It really looks like
>| there is a different problem (the mentioned mbuf leak) slowing you down.


Two points which may help:
On the same hardware but with 4.7, I also see high livelock counts:
System                        256    94         271
                               2k    84         615
em0                    1520    2k     4     4   256     4
em1                     196    2k     5     4   256     4
em2                   27221    2k     6     4   256     6
em3                       1    2k     4     4   256     4
em4                       1    2k     4     4   256     4
em5                   48408    2k     7     4   256     7
em6                     379    2k     4     4   256     4
em7                       2
em8                       1    2k     4     4   256     4
em9                   55612    2k     9     4   256     9

Both systems run a generic MP kernel built with the option:
option  MPLS

Option EM_DEBUG is also set on "core3" (the 4.8 system); I added it to try 
to debug this problem.

core3 runs OSPF (v4 and v6) and bgpd with 13 peers and around 344000 
routes (3 peers feed 344000 routes each). The problem was already there 
before we ran ospf6d.


Manuel Guesdon - OXYMIUM <mgues...@oxymium.net>
4 rue Auguste Gillot  -  93200 Saint-Denis  -  France
Standard: 0 811 093 286 (Cout d'un appel local)   Fax: +33 1 7473 3971
LD Support: +33 1 7473 3973                       LD:  +33 1 7473 3980

