On Thursday, April 3, 2014, Oleg A. Arkhangelsky <syso...@yandex.ru> wrote:
> Hello all,
>
> We've got very strange behavior when testing IP packet forwarding
> performance on Sandy Bridge platform (Supermicro X9DRH with the latest
> BIOS). This is two socket E5-2690 CPU system. Using different PC we're
> generating DDoS-like traffic with rate of about 4.5 million packets per
> second. Traffic is receiving by two Intel 82599 NICs and forwarding
> using the second port of one of this NICs. All load is evenly
> distributed among two nodes, so each of 32 CPUs SI usage is virtually
> equal.
>
> Now the strangest part. Few moments after pktgen start on traffic
> generator PC, average CPU usage on SB system goes to 30-35%. No packet
> drops, no rx_missed_errors, no rx_no_dma_resources. Very nice. But SI
> usage starts to decreasing gradually. After about 10 seconds we see
> ~15% SI average among all CPUs. Still no packet drops, the same RX rate
> as in the beginning, RX packet count is equal to TX packet count. After
> some time we see that average SI usage start to go up. Peaked at
> initial 30-35% it goes down to 15% again. This pattern is repeated
> every 80 seconds. Interval is very stable. It is undoubtedly bind to
> the test start time, because if we start test, then interrupt it after
> 10 seconds and start it again we see the same 30% SI peak in a few
> moments. Then all timings will be the same.
>
> During the high load time we see this in "perf top -e cache-misses":
>
>    14017.00 24.9% __netdev_alloc_skb [kernel.kallsyms]
>     5172.00  9.2% _raw_spin_lock     [kernel.kallsyms]
>     4722.00  8.4% build_skb          [kernel.kallsyms]
>     3603.00  6.4% fib_table_lookup   [kernel.kallsyms]
>
> During the "15% load time" top is different:
>
>    11090.00 20.9% build_skb          [kernel.kallsyms]
>     4879.00  9.2% fib_table_lookup   [kernel.kallsyms]
>     4756.00  9.0% ipt_do_table       /lib/modules/3.12.15-BUILD-g2e94e30-dirty/kernel/net/ipv4/netfilter/ip_tables.ko
>     3042.00  5.7% nf_iterate         [kernel.kallsyms]
>
> And __netdev_alloc_skb is at the end of list:
>
>      911.00  0.5% __netdev_alloc_skb [kernel.kallsyms]
>
> Some info from "perf stat -a sleep 2":
>
> 15% SI:
>    28640006291 cycles          # 0.447 GHz              [83.23%]
>    38764605205 instructions    # 1.35 insns per cycle
>
> 30% SI:
>    56225552442 cycles          # 0.877 GHz              [83.23%]
>    39718182298 instructions    # 0.71 insns per cycle
>
> CPUs never go above C1 state, all cores speed from /proc/cpuinfo is
> constant at 2899.942 MHz. ASPM is disabled.
>
> All non-essential userspace apps was explicitly killed for test time,
> there was no active cron jobs too. So we should assume no interference
> with userspace.
>
> Kernel version is 3.12.15 (ixgbe 3.21.2), but we have the same behavior
> with ancient 2.6.35 (ixgbe 3.10.16). Although on 2.6.35 we sometimes
> get 160-170 seconds interval and different symbols at the "perf top"
> output (especially local_bh_enable() which is completely blows my
> mind).
>
> Does anybody have some thoughts about the reasons of this kind of
> behavior? Sandy Bridge CPU has many uncore/offcore events, which I can
> sample, maybe some of them can shed some light on such behavior?
>

Is this a NUMA system? This kind of slowdown happens when a node tries
to access memory attached to the other CPU.

Abu Raheda
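As a rough sanity check on the "perf stat" figures quoted above, the
small Python sketch below recomputes instructions per cycle for the two
phases and compares the raw counters. The counter values are copied from
the quoted output; the variable and function names are made up for
illustration only.

```python
# Counter values copied from the quoted "perf stat -a sleep 2" output.
samples = {
    "15% SI": {"cycles": 28640006291, "instructions": 38764605205},
    "30% SI": {"cycles": 56225552442, "instructions": 39718182298},
}


def ipc(sample):
    """Instructions retired per cycle for one perf stat sample."""
    return sample["instructions"] / sample["cycles"]


low, high = samples["15% SI"], samples["30% SI"]
ipc_low = ipc(low)
ipc_high = ipc(high)

# How much each counter grows between the 15% and the 30% phase.
cycle_ratio = high["cycles"] / low["cycles"]
instruction_ratio = high["instructions"] / low["instructions"]

for label, sample in samples.items():
    print(f"{label}: {ipc(sample):.2f} insns per cycle")
print(f"cycles grow by a factor of       {cycle_ratio:.2f}")
print(f"instructions grow by a factor of {instruction_ratio:.2f}")
```

The 30% phase spends roughly twice the cycles on about the same number
of instructions. That fits CPUs stalling on memory, for example on
remote-node accesses, rather than doing extra work, but on its own it
does not prove that NUMA placement is the cause.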
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies