Hi,

At 2017-12-18 18:26:28, "Paolo Abeni" <pab...@redhat.com> wrote:
>Hi,
>
>On Mon, 2017-12-18 at 12:11 +0800, zhangliping wrote:
>> From: zhangliping <zhanglipin...@baidu.com>
>>
>> Under our udp pressure performance test, after gro is disabled, rx rate
>> will be improved from ~2500kpps to ~2800kpps. We can find some difference
>> from perf report:
>> 1. gro is enabled:
>>   24.23%  [kernel]  [k] udp4_lib_lookup2
>>    5.42%  [kernel]  [k] __memcpy
>>    3.87%  [kernel]  [k] fib_table_lookup
>>    3.76%  [kernel]  [k] __netif_receive_skb_core
>>    3.68%  [kernel]  [k] ip_rcv
>>
>> 2. gro is disabled:
>>    9.66%  [kernel]  [k] udp4_lib_lookup2
>>    9.47%  [kernel]  [k] __memcpy
>>    4.75%  [kernel]  [k] fib_table_lookup
>>    4.71%  [kernel]  [k] __netif_receive_skb_core
>>    3.90%  [kernel]  [k] virtnet_poll
>>
>> So if there's no udp tunnel(such as vxlan) configured, we can skip
>> the udp gro processing.
>
>I tested something similar some time ago, but I measured a much smaller
>gain. Also the topmost perf offenders looks quite different from what I
>see here, can you please share more details about the test case?

My test case is very simple: two VMs are connected via ovs + dpdk. Inside
the VM, rps is enabled. One VM runs "iperf -s -u &", and the other VM runs
"iperf -c 1.1.1.2 -P 12 -u -b 10Gbps -l 40 -t 36000". On the iperf server
side, I use the sar tool to watch the rx rate.

>> +DEFINE_STATIC_KEY_FALSE(udp_gro_needed);
>> +EXPORT_SYMBOL_GPL(udp_gro_needed);
>> +
>
>I think that adding a new static key is not required, as we can
>probably reuse 'udp_encap_needed' and 'udpv6_encap_needed'. The latter
>choice allows earlier branching (in
>udp4_gro_receive()/udp6_gro_receive() instead of udp_gro_receive().

Yes, we could reuse udpX_encap_needed; that is what I wanted to do in my
first attempt. But I found that some udp tunnels don't support gro receive
(such as l2tp and udp_media). Also, udpX_encap_needed is never disabled
once it has been enabled, at least for now. So in the end I chose to add a
new udp_gro_needed key, which seems a little redundant. :(
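
To make the difference between the two keys concrete, here is a small userspace model. It is not kernel code: plain variables stand in for the static keys, and all the *_model names are made up for illustration. It also assumes that udp_gro_needed is turned off again when the last GRO-capable tunnel goes away, which the quoted hunk does not show.

```c
/* Userspace model only: plain variables stand in for kernel static keys. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for udpX_encap_needed: set once, never cleared. */
static bool encap_needed_model;

/* Hypothetical stand-in for udp_gro_needed: counts open tunnels that
 * provide a gro_receive callback (assumed behaviour, see above). */
static int gro_needed_model;

/* Models opening a UDP tunnel socket. has_gro_cb is false for tunnels
 * without gro receive support (l2tp, udp_media in the discussion). */
static void tunnel_open_model(bool has_gro_cb)
{
	encap_needed_model = true;
	if (has_gro_cb)
		gro_needed_model++;
}

/* Models closing a tunnel socket. Only the dedicated counter drops;
 * the encap flag deliberately stays set. */
static void tunnel_close_model(bool has_gro_cb)
{
	if (has_gro_cb) {
		assert(gro_needed_model > 0);
		gro_needed_model--;
	}
}

/* Gate on the dedicated key: skip UDP GRO when no GRO-capable tunnel is open. */
static bool can_skip_gro_dedicated(void)
{
	return gro_needed_model == 0;
}

/* Gate on the reused encap flag: skip UDP GRO only if no tunnel was ever opened. */
static bool can_skip_gro_reused(void)
{
	return !encap_needed_model;
}
```

In this model, opening only an l2tp-like tunnel (has_gro_cb == false) already makes can_skip_gro_reused() return false, and it stays false after every tunnel is closed, while can_skip_gro_dedicated() goes back to true.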