Hi Thomas,

>As you are doing optimizations, it's important to know the performance gain.
>It could help to mitigate future reworks.
>So please, could you provide some benchmarking numbers in the commit log?
Some performance data is below.
Also, I forgot to mention that the new code path can be switched on/off by setting the ENABLE_MULTI_BUFFER_OPTIMIZE macro to 1/0.
Do I need to resubmit the whole patch series, or just a cover letter, or ...?
Konstantin

SUT: dual-socket board, IVB 2.8 GHz, with 4 ports on 4 NICs (all at socket 0) connected to the traffic generator.
2x1GB pages, kernel: 3.11.3-201.fc19.x86_64, gcc 4.8.2.
64B packets, using the packet flooding method.
All 4 ports are managed by one logical core.
Optimised scalar PMD RX/TX was used.

                         DIFF % (NEW-OLD)
IPV4-CONT-BURST:         +23%
IPV6-CONT-BURST:         +13%
IPV4/IPV6-CONT-BURST:    +8%
IPV4-4STREAMSX8:         +7%
IPV4-4STREAMSX1:         -2%

Test cases description:
IPV4-CONT-BURST - IPV4 packets; all packets from one input port are destined for the same output port.
IPV6-CONT-BURST - IPV6 packets; all packets from one input port are destined for the same output port.
IPV4/IPV6-CONT-BURST - mix of the first 2 with interleave=1 (e.g: IPV4,IPV6,IPV4,IPV6, ...)
IPV4-4STREAMSX1 - 4 streams of IPV4 packets, where all packets from the same stream are destined for the same output port
(e.g: IPV4_DST_P0, IPV4_DST_P1, IPV4_DST_P2, IPV4_DST_P3, IPV4_DST_P0, ...)
IPV4-4STREAMSX8 - same as above, but packets for each stream come in groups of 8
(e.g: IPV4_DST_P0 X 8, IPV4_DST_P1 X 8, IPV4_DST_P2 X 8, IPV4_DST_P3 X 8, IPV4_DST_P0 X 8, ...)
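
To make the difference between the 4STREAMSX1 and 4STREAMSX8 patterns concrete, here is a minimal sketch of how the destination port sequence could be derived from the packet index. It is only an illustration of the test case descriptions above, not the traffic generator's actual configuration; NB_STREAMS and stream_dst_port() are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one stream per output port, 4 ports as in the SUT above. */
#define NB_STREAMS 4

/*
 * Return the output port targeted by the i-th packet of the sequence,
 * when packets of each stream arrive in groups of `group` packets.
 * group == 1 gives the IPV4-4STREAMSX1 pattern (P0, P1, P2, P3, P0, ...),
 * group == 8 gives the IPV4-4STREAMSX8 pattern (P0 x 8, P1 x 8, ...).
 */
static unsigned int
stream_dst_port(size_t i, size_t group)
{
	assert(group > 0);
	return (unsigned int)((i / group) % NB_STREAMS);
}
```

With group set to 1 every consecutive packet goes to a different port, while with group set to 8 each run of 8 consecutive packets shares one destination port.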