On Wed, 2 Mar 2011 21:12:24 +0100 Claudio Jeker <cje...@diehard.n-r-g.com> wrote:

>| > >| One thing that seems to have a big performance impact is
>| > >| net.inet.ip.ifq.maxlen. If and only if your network cards are all
>| > >| supported by MCLGETI (i.e., they show LWM/CWM/HWM values in 'systat
>| > >| mbufs'), you can try increasing ifq.maxlen until you don't see
>| > >| net.inet.ip.ifq.drops incrementing anymore under constant load.
>| >
>| > Yes, all my nic interfaces have LWM/CWM/HWM values:
>| > IFACE    LIVELOCKS  SIZE  ALIVE  LWM  HWM   CWM
>| > System              256   83771             5502
>| >                     2k      160             1252
>| > em0             37  2k        4    4  256      4
>| > em1            258  2k        4    4  256      4
>| > em2         372751  2k        7    4  256      7
>| > em3           8258  2k        4    4  256      4
>| > em4          25072  2k       63    4  256     63
>| > em5           3658  2k        8    4  256      8
>| > em6         501288  2k       24    4  256     24
>| > em7             22  2k        4    4  256      4
>| > em8          36551  2k       23    4  256     23
>| > em9          52053  2k        5    4  256      4
>|
>| Woohoo. That is a lot of livelocks you hit. In other words, you are losing
>| ticks because something is spinning too long in the kernel. Interfaces with
>| a very low CWM but a high pps rate are the ones you need to investigate.
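Claudio's measure-then-raise tuning loop can be sketched roughly like this (my illustration, not from the thread; the sysctl names are the ones discussed above, while the 10-second interval and the value 1024 are arbitrary examples):

```shell
#!/bin/sh
# Sketch only: sample net.inet.ip.ifq.drops twice and report how much
# it grew. If drops keep increasing under constant load, raise
# ifq.maxlen (e.g. `sysctl net.inet.ip.ifq.maxlen=1024`) and measure
# again until the delta stays at 0.

drop_delta() {
    # $1 = earlier counter sample, $2 = later counter sample
    echo $(( $2 - $1 ))
}

# On the live router this would be (commented out here):
#   before=$(sysctl -n net.inet.ip.ifq.drops)
#   sleep 10
#   after=$(sysctl -n net.inet.ip.ifq.drops)
#   echo "ifq drops in 10s: $(drop_delta "$before" "$after")"
```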
Hum OK. A strange thing about the livelocks is the big difference between, for
example, em2 and em4:

Name  Mtu   Network      Ipkts        Ierrs     Opkts        Oerrs  Colls
em2   1500  <Link>       8868034600   42899     6562765482   0      0
em2   1500  fe80::%em2/  8868034600   42899     6562765482   0      0
em4   1500  <Link>       33934108692  19371393  20672882997  0      0
em4   1500  fe80::%em4/  33934108692  19371393  20672882997  0      0

There are more livelocks on em2 but fewer packets (or maybe the counters were
reset to 0 after reaching their maximum value).

>| Additionally I would like to see your netstat -m and vmstat -m output.

netstat -m:

18472 mbufs in use:
        18449 mbufs allocated to data
        16 mbufs allocated to packet headers
        7 mbufs allocated to socket names and addresses
331/4188/6144 mbuf 2048 byte clusters in use (current/peak/max)
0/8/6144 mbuf 4096 byte clusters in use (current/peak/max)
0/8/6144 mbuf 8192 byte clusters in use (current/peak/max)
0/8/6144 mbuf 9216 byte clusters in use (current/peak/max)
0/8/6144 mbuf 12288 byte clusters in use (current/peak/max)
0/8/6144 mbuf 16384 byte clusters in use (current/peak/max)
0/8/6144 mbuf 65536 byte clusters in use (current/peak/max)
30704 Kbytes allocated to network (70% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

vmstat -m:

Memory statistics by bucket size
    Size   In Use    Free    Requests  HighWater  Couldfree
      16   113578  195414    32414058       1280       6712
      32   378705     687    74930489        640       6824
      64     7707     869    11878746        320      27074
     128    11411      45    36424677        160         78
     256     7875     973   328666338         80   60487950
     512     1951      65     6017929         40     413368
    1024      331     177     1947159         20     880831
    2048       57       3      496398         10          0
    4096     5164      15      260948          5     166561
    8192       36       5      226431          5      18240
   16384       12       0     8279177          5          0
   32768        5       0          11          5          0
   65536        2       0           2          5          0

Memory usage type by bucket size
    Size  Type(s)
      16  devbuf, pcb, routetbl, sysctl, UFS mount, dirhash, ACPI, exec,
          xform_data, VM swap, UVM amap, UVM aobj, USB, USB device, temp
      32  devbuf, pcb, routetbl, ifaddr, UFS mount, sem, dirhash, ACPI,
          ip_moptions, in_multi, exec, pfkey data, xform_data, UVM amap,
          USB, temp
      64  devbuf, pcb, routetbl, fragtbl, ifaddr, vnodes, UFS mount, dirhash,
          ACPI, proc, VFS cluster, in_multi, ether_multi, VM swap, UVM amap,
          USB, USB device, NDP, temp
     128  devbuf, pcb, routetbl, fragtbl, ifaddr, mount, sem, dirhash, ACPI,
          VFS cluster, MFS node, NFS srvsock, ip_moptions, ttys, pfkey data,
          UVM amap, USB, USB device, NDP, temp
     256  devbuf, routetbl, ifaddr, ioctlops, iov, vnodes, shm, VM map,
          dirhash, ACPI, ip_moptions, exec, UVM amap, USB, USB device,
          ip6_options, temp
     512  devbuf, ifaddr, sysctl, ioctlops, iov, vnodes, dirhash, file desc,
          NFS daemon, ttys, newblk, UVM amap, USB, USB device, temp
    1024  devbuf, pcb, sysctl, ioctlops, iov, mount, UFS mount, shm, ACPI,
          proc, ttys, exec, UVM amap, USB HC, crypto data, temp
    2048  devbuf, ioctlops, iov, UFS mount, ACPI, VM swap, UVM amap,
          UVM aobj, temp
    4096  devbuf, ifaddr, ioctlops, iov, proc, UVM amap, memdesc, temp
    8192  devbuf, iov, ttys, pagedep, UVM amap, USB, temp
   16384  devbuf, iov, MSDOSFS mount, temp
   32768  devbuf, UFS quota, UFS mount, ISOFS mount, inodedep
   65536  devbuf

Memory statistics by type                                Type  Kern
          Type   InUse MemUse HighUse  Limit  Requests  Limit Limit Size(s)
        devbuf    9428 21719K  21757K 78644K   1219316      0     0 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768,65536
           pcb      57    14K     15K 78644K     75257      0     0 16,32,64,128,1024
      routetbl  375605 11783K  11862K 78644K 316114307      0     0 16,32,64,128,256
       fragtbl       0     0K      1K 78644K         6      0     0 64,128
        ifaddr     484   101K    101K 78644K       682      0     0 32,64,128,256,512,4096
        sysctl       3     2K      2K 78644K         3      0     0 16,512,1024
      ioctlops       0     0K      4K 78644K     73636      0     0 256,512,1024,2048,4096
           iov       0     0K     32K 78644K  20500234      0     0 256,512,1024,2048,4096,8192,16384
         mount      12     7K      7K 78644K        12      0     0 128,1024
        vnodes      51    13K    105K 78644K    444285      0     0 64,256,512
     UFS quota       1    32K     32K 78644K         1      0     0 32768
     UFS mount      25    61K     61K 78644K        25      0     0 16,32,64,1024,2048,32768
           shm       2     2K      2K 78644K         2      0     0 256,1024
        VM map       2     1K      1K 78644K         2      0     0 256
           sem       2     1K      1K 78644K         2      0     0 32,128
       dirhash     219    45K     87K 78644K     76185      0     0 16,32,64,128,256,512
          ACPI    6204   717K    731K 78644K     25557      0     0 16,32,64,128,256,1024,2048
     file desc       1     1K      1K 78644K        66      0     0 512
          proc      15    10K     10K 78644K        15      0     0 64,1024,4096
   VFS cluster       0     0K      1K 78644K     65131      0     0 64,128
      MFS node       2     1K      1K 78644K         2      0     0 128
   NFS srvsock       1     1K      1K 78644K         1      0     0 128
    NFS daemon       1     1K      1K 78644K         1      0     0 512
   ip_moptions      20     3K      3K 78644K        34      0     0 32,128,256
      in_multi     687    32K     32K 78644K      1169      0     0 32,64
   ether_multi     425    27K     27K 78644K       876      0     0 64
   ISOFS mount       1    32K     32K 78644K         1      0     0 32768
 MSDOSFS mount       1    16K     16K 78644K         1      0     0 16384
          ttys     420   308K    308K 78644K       420      0     0 128,512,1024,8192
          exec       0     0K      3K 78644K   1043888      0     0 16,32,256,1024
    pfkey data       2     1K      1K 78644K         3      0     0 32,128
    xform_data       0     0K      1K 78644K     32283      0     0 16,32
       pagedep       1     8K      8K 78644K         1      0     0 8192
      inodedep       1    32K     32K 78644K         1      0     0 32768
        newblk       1     1K      1K 78644K         1      0     0 512
       VM swap       1     1K      3K 78644K         4      0     0 16,64,2048
      UVM amap  132353  5150K   6938K 78644K  60829512      0     0 16,32,64,128,256,512,1024,2048,4096,8192
      UVM aobj       2     3K      3K 78644K         2      0     0 16,2048
           USB     155    45K     45K 78644K       157      0     0 16,32,64,128,256,512,8192
    USB device      52    17K     17K 78644K        52      0     0 16,64,128,256,512
        USB HC       1     1K      1K 78644K         1      0     0 1024
       memdesc       1     4K      4K 78644K         1      0     0 4096
   crypto data       1     1K      1K 78644K         1      0     0 1024
   ip6_options       0     0K      1K 78644K         4      0     0 256
           NDP     117    12K     12K 78644K       157      0     0 64,128
          temp     481    37K     54K 78644K 101039070      0     0 16,32,64,128,256,512,1024,2048,4096,8192,16384

Memory Totals:  In Use    Free    Requests
                40227K   3694K  501542367

Memory resource pool statistics
          Name   Size      Requests   Fail   InUse  Pgreq  Pgrel  Npage  Hiwat  Minpg  Maxpg  Idle
      extentpl     40           236      0     148      2      0      2      2      0      8     0
        phpool     96         12690      0   10719    262      0    262    262      0      8     0
        pmappl    176       1048171      0      42      3      0      3      3      0      8     1
          pvpl     32     494188732      0  202861   3000     11   2989   2989      0    263   251
         pdppl   4096       1048171      0      42    660    613     47     62      0      8     5
        vmsppl    296       1048171      0      42      5      0      5      5      0      8     1
       vmmpepl    144    4584721742      0  126526  36068  27355   8713  11753      0    304   304
      vmmpekpl    144       4810977      0      37      3      0      3      3      0      8     1
        aobjpl     72             1      0       1      1      0      1      1      0      8     0
        amappl     72      59796600      0  124908   7097   1656   5441   5744      0     75    75
        anonpl     24     135786454      0  195915   2211      0   2211   2211      0    287   157
         bufpl    272       6196121      0   23830   2377    757   1620   1653      0      8     8
          mbpl    256  369184744396      0   83949   5534      0   5534   5534      1    384   114
         mcl2k   2048  122966085882      0     272   2094      0   2094   2094      4   3072  1952
        sockpl    376      16443450      0      84     12      0     12     12      0      8     3
        procpl    504       1048206      0      77     13      0     13     13      0      8     3
     processpl    120       1048206      0      77      3      0      3      3      0      8     0
      zombiepl    144       1048129      0       0      1      0      1      1      0      8     1
       ucredpl     80         11424      0      20      1      0      1      1      0      8     0
        pgrppl     40         15174      0      31      1      0      1      1      0      8     0
     sessionpl     64          7308      0      28      1      0      1      1      0      8     0
       pcredpl     24       1048206      0      77      1      0      1      1      0      8     0
       lockfpl     88         18664      0       2      1      0      1      1      0      8     0
        filepl    120     144590106      0     181      7      0      7      7      0      8     1
       fdescpl    440       1048172      0      43      7      0      7      7      0      8     2
        pipepl    120       3739218      0      24      2      0      2      2      0      8     1
      kqueuepl    256            42      0       6      1      0      1      1      0      8     0
       knotepl    104       4204807      0      34      1      0      1      1      0      8     0
        sigapl    488       1048171      0      42      8      0      8      8      0      8     2
       wqtasks     40      16542853      0       0      1      0      1      1      0      8     1
        wdcspl    176       5111262      0       0      1      0      1      1      0      8     1
        scxspl    200        128692      0       0      1      0      1      1      0      8     1
         namei   1024     164173312      0       0      2      0      2      2      0      8     2
        vnodes    264          5927      0    5927    396      0    396    396      0      8     0
         nchpl    144      43086123      0    3416    754    621    133    220      0      8     6
        ffsino    232      38122197      0    5918    353      4    349    349      0      8     0
       dino1pl    128      38122197      0    5918    191      0    191    191      0      8     0
       dirhash   1024         94189      0     422    736    601    135    157      0    128    29
      pfrulepl   1272            16      0       2      2      0      2      2      0      8     1
     pfstatepl    296         21762  23821       0    770      0    770    770      0    770   770
  pfstatekeypl    104         21762      0       0    264    256      8    264      0      8     8
 pfstateitempl     24         21762      0       0     61     53      8     61      0      8     8
     pfrktable   1296            16      0       2      2      0      2      2      0      8     1
   pfrke_plain    160            50      0      10      1      0      1      1      0      8     0
      pfosfpen    112          7656      0     696    140    120     20     20      0      8     0
        pfosfp     40          4477      0     407      5      0      5      5      0      8     0
       pffrent     32        709698      0       0      1      0      1      1      0     40     1
        pffrag     80        354531      0       0      1      0      1      1      0     20     1
       rtentpl    200       7035558      0  345666  73243  55937  17306  17348      0      8     1
       rttmrpl     64            43      0       0      1      0      1      1      0      8     1
        ipqepl     40             3      0       0      1      0      1      1      0      8     1
         ipqpl     40             3      0       0      1      0      1      1      0      8     1
       tcpcbpl    552         63418      0      32     11      0     11     11      0      8     6
       tcpqepl     32          8286      0       0      1      0      1      1      0     25     1
      sackhlpl     24           527      0       0      1      0      1      1      0    198     1
         synpl    248          1176      0       0      1      0      1      1      0      8     1
      plimitpl    152          7271      0      14      1      0      1      1      0      8     0
       inpcbpl    352      16368989      0      41      8      0      8      8      0      8     4

In use 138878K, total allocated 193584K; utilization 71.7%

>| If I see it right you have 83771 mbufs allocated in your system. This
>| sounds like a serious mbuf leak and could actually be the reason for your
>| bad performance. It is very well possible that most of your buffer
>| allocations fail, causing the tiny rings and suboptimal performance.

Hum, yes, that seems to be a good way to explore. What can I try in order to
confirm (or rule out) this leak?

>| > I've already increased to 2048 some time ago with good effect on
>| > ifq.drops but even when ifq.drops doesn't increase, I still have
>| > Ierrs on interfaces (I've just verified this right now) :-)
>|
>| Having some Ierrs is not a big issue; always put them in perspective with
>| the number of packets received. e.g.
>| em6 1500 <Link> 00:30:48:9c:3a:80 72007980648 143035 62166589667 0 0
>|
>| This interface had 143035 Ierrs but it also passed 72 billion packets, so
>| this is far less than 1% and not a problem.

Yes, I know, but I'd like to find an explanation for this, as it doesn't seem
'normal', and to be sure (as far as I can) that it doesn't hide a more or less
important problem :-)

>| The FIFO on the card doesn't matter that much. The problem is the DMA ring
>| and the amount of slots on the ring that are actually usable. This is the
>| CWM in the systat mbuf output. MCLGETI() reduces the buffers on the ring
>| to limit the work getting into the system over a specific network card.

OK

>| > One of my interrogations is how to know that the system is heavily
>| > loaded. systat -s 2 vmstat gives me this information:
>| >
>| > Proc:r  d    s  w   Csw    Trp  Sys  Int  Sof  Flt
>| >        14  149  2   509  20118        98   31
>| >
>| > 3.5%Int 0.5%Sys 0.0%Usr 0.0%Nic 96.0%Idle
>| > |    |    |    |    |    |    |    |    |    |    |
>| >
>| > which makes me think that the system is really not very loaded, but I
>| > may miss a point....
>|
>| So you have this 3.5% Int and 0.5% Sys load and are still hitting tons of
>| LIVELOCKS (e.g. the counters increase all the time)? It really looks like
>| there is a different problem (the mentioned mbuf leak) slowing you down.
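One low-tech way to chase the suspected leak (my suggestion, not something prescribed in the thread): sample the "mbufs in use" count from netstat -m at intervals. On a router with roughly constant traffic the count should plateau, while a leak shows up as a steady climb. A minimal sketch, assuming the netstat -m first-line format shown above:

```shell
#!/bin/sh
# Sketch: extract the leading mbuf count from netstat -m output read
# on stdin, e.g. "18472 mbufs in use:" -> 18472.
mbufs_in_use() {
    awk '/mbufs in use/ { print $1; exit }'
}

# On the live system (commented out here), log one sample per minute
# and watch whether the count keeps growing:
#   while :; do
#       printf '%s %s\n' "$(date +%s)" "$(netstat -m | mbufs_in_use)"
#       sleep 60
#   done >> /tmp/mbuf-samples.log
```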
:-)

2 points which may help:

On the same hardware but with 4.7, I have high livelocks too:

IFACE    LIVELOCKS  SIZE  ALIVE  LWM  HWM  CWM
System              256      94            271
                    2k       84            615
lo0
em0           1520  2k        4    4  256    4
em1            196  2k        5    4  256    4
em2          27221  2k        6    4  256    6
em3              1  2k        4    4  256    4
em4              1  2k        4    4  256    4
em5          48408  2k        7    4  256    7
em6            379  2k        4    4  256    4
em7              2
em8              1  2k        4    4  256    4
em9          55612  2k        9    4  256    9

The 2 systems are generic MP kernels built with options:

option MULTIPROCESSOR
option MPLS

plus option EM_DEBUG on "core3" (the 4.8 system): I've added it to try to
debug this problem.

core3 runs ospf (v4 & v6) and bgpd with 13 peers and around 344000 routes
(3 peers feed 344000 routes each). The problem was here before we ran ospf6d.

Manuel

--
______________________________________________________________________
Manuel Guesdon - OXYMIUM <mgues...@oxymium.net>
4 rue Auguste Gillot - 93200 Saint-Denis - France
Standard: 0 811 093 286 (cost of a local call)
Fax: +33 1 7473 3971   LD Support: +33 1 7473 3973   LD: +33 1 7473 3980