> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Monday, October 12, 2020 5:57 PM
> To: Liu, Yong <yong....@intel.com>; Xia, Chenbo <chenbo....@intel.com>;
> Wang, Zhihong <zhihong.w...@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
>
> Hi Marvin,
>
> On 10/12/20 11:10 AM, Liu, Yong wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >> Sent: Monday, October 12, 2020 4:22 PM
> >> To: Liu, Yong <yong....@intel.com>; Xia, Chenbo <chenbo....@intel.com>;
> >> Wang, Zhihong <zhihong.w...@intel.com>
> >> Cc: dev@dpdk.org
> >> Subject: Re: [PATCH v3 0/5] vhost add vectorized data path
> >>
> >> Hi Marvin,
> >>
> >> On 10/9/20 10:14 AM, Marvin Liu wrote:
> >>> Packed ring format was introduced in virtio spec 1.1. All descriptors
> >>> are compacted into one single ring when the packed ring format is on,
> >>> so it is straightforward that ring operations can be accelerated by
> >>> utilizing SIMD instructions.
> >>>
> >>> This patch set introduces a vectorized data path in the vhost library.
> >>> If the vectorized option is on, operations like descriptor check,
> >>> descriptor writeback, and address translation are accelerated by SIMD
> >>> instructions. On a Skylake server, it brings a 6% performance gain in
> >>> the loopback case and around a 4% performance gain in the PvP case.
> >>
> >> IMHO, a 4% gain on PVP is not a significant gain if we compare it to the
> >> added complexity. Moreover, I guess this is a 4% gain with testpmd-based
> >> PVP? If that is the case, it may be even lower with an OVS-DPDK PVP
> >> benchmark; I will try to do a benchmark this week.
> >>
> >
> > Maxime,
> > I have observed around a 3% gain with OVS-DPDK in the first version, but
> > the number is not reliable as the data path has been changed.
> > I will try again after fixing the OVS integration issue with the latest DPDK.
>
> Thanks for the information.
>
> Also, wouldn't using AVX512 lower the CPU frequency?
> If so, could it have an impact on the workload running on the other
> CPUs?
>
All AVX512 instructions used in vhost are lightweight ones, so the frequency
won't be affected. Theoretically, system performance won't be affected if
only lightweight instructions are used. Thanks.

> Thanks,
> Maxime
>
> >> Thanks,
> >> Maxime
> >>
> >>> A vhost application can choose whether to use vectorized acceleration,
> >>> just like the external buffer feature. If the platform or ring format
> >>> does not support the vectorized functions, vhost will fall back to the
> >>> default batch functions, so there is no impact on the current data path.
> >>>
> >>> v3:
> >>> * rename vectorized datapath file
> >>> * eliminate the impact when avx512 disabled
> >>> * dynamically allocate memory regions structure
> >>> * remove unlikely hint for in_order
> >>>
> >>> v2:
> >>> * add vIOMMU support
> >>> * add dequeue offloading
> >>> * rebase code
> >>>
> >>> Marvin Liu (5):
> >>>   vhost: add vectorized data path
> >>>   vhost: reuse packed ring functions
> >>>   vhost: prepare memory regions addresses
> >>>   vhost: add packed ring vectorized dequeue
> >>>   vhost: add packed ring vectorized enqueue
> >>>
> >>>  doc/guides/nics/vhost.rst           |   5 +
> >>>  doc/guides/prog_guide/vhost_lib.rst |  12 +
> >>>  drivers/net/vhost/rte_eth_vhost.c   |  17 +-
> >>>  lib/librte_vhost/meson.build        |  16 ++
> >>>  lib/librte_vhost/rte_vhost.h        |   1 +
> >>>  lib/librte_vhost/socket.c           |   5 +
> >>>  lib/librte_vhost/vhost.c            |  11 +
> >>>  lib/librte_vhost/vhost.h            | 239 +++++++++++++++++++
> >>>  lib/librte_vhost/vhost_user.c       |  26 +++
> >>>  lib/librte_vhost/virtio_net.c       | 258 ++-----------------
> >>>  lib/librte_vhost/virtio_net_avx.c   | 344 ++++++++++++++++++++++++++++
> >>>  11 files changed, 718 insertions(+), 216 deletions(-)
> >>>  create mode 100644 lib/librte_vhost/virtio_net_avx.c
> >>>
> >
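
For readers who want a concrete picture of the batched descriptor check the
cover letter describes, below is a minimal sketch, assuming the virtio 1.1
packed descriptor layout. The struct, macros, and helper name
(desc_batch_avail) are illustrative placeholders, not the actual code from
this patch set. It checks the AVAIL/USED flags of four packed-ring
descriptors with a single 64-byte AVX512 load and a masked 16-bit compare
(build with -mavx512f -mavx512bw):

/* Sketch only: batch-check the AVAIL/USED flags of four virtio 1.1
 * packed-ring descriptors (4 x 16 bytes = one 512-bit load).
 */
#include <stdint.h>
#include <stdbool.h>
#include <immintrin.h>

struct vring_packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

#define DESC_AVAIL (1u << 7)
#define DESC_USED  (1u << 15)

static inline bool
desc_batch_avail(const struct vring_packed_desc *descs, bool wrap)
{
	/* Flags the device expects for this wrap-counter iteration:
	 * avail bit equal to the wrap counter, used bit different.
	 */
	uint16_t expect = wrap ? DESC_AVAIL : DESC_USED;

	/* The flags field is the top 16 bits of each descriptor's second
	 * 64-bit word, i.e. 16-bit lanes 7, 15, 23 and 31 of the vector.
	 */
	const __mmask32 flag_lanes = 0x80808080;

	/* Load four descriptors at once and keep only the AVAIL/USED bits. */
	__m512i bits = _mm512_set1_epi64(
			(uint64_t)(DESC_AVAIL | DESC_USED) << 48);
	__m512i v = _mm512_and_si512(
			_mm512_loadu_si512((const void *)descs), bits);
	__m512i want = _mm512_set1_epi64((uint64_t)expect << 48);

	/* Compare only the four flag lanes; all of them must match. */
	return _mm512_mask_cmpeq_epi16_mask(flag_lanes, v, want) == flag_lanes;
}

The runtime fallback mentioned in the cover letter could be keyed off DPDK's
CPU feature detection, e.g. rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F),
so that platforms without AVX512 keep using the existing batch functions.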