Hi, Yuanhan:

> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yuanhan Liu
> Sent: Wednesday, October 12, 2016 11:22 AM
> To: Michael S. Tsirkin <m...@redhat.com>; Thomas Monjalon
> <thomas.monja...@6wind.com>
> Cc: Wang, Zhihong <zhihong.w...@intel.com>; Maxime Coquelin
> <maxime.coque...@redhat.com>; Stephen Hemminger
> <step...@networkplumber.org>; d...@dpdk.org; qemu-de...@nongnu.org
> Subject: Re: [dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout
> feature
>
> On Tue, Oct 11, 2016 at 02:57:49PM +0800, Yuanhan Liu wrote:
> > > > > > > There was an example: the vhost enqueue optimization patchset
> > > > > > > from Zhihong [0] uses memset, and it introduces more than
> > > > > > > 15% drop (IIRC)
>
> Though it doesn't matter now, I verified it yesterday (with and
> without memset), and the drop could be up to 30+%.
>
> This is to let you know that it could behave badly if memset is not
> inlined.
>
> > > > > > > on my Ivybridge server: it has no such issue on his server though.
> > > > > > >
> > > > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > > > > >
> > > > > > > --yliu
> > > > > >
> > > > > > I'd say that's weird. what's your config? any chance you are
> > > > > > using an old compiler?
> > > > >
> > > > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more.
> > > > > IIRC, he said the memset is not well optimized for Ivybridge server.
> > > >
> > > > The dst is remote in that case. It's fine on Haswell but has
> > > > complication in Ivy Bridge which (wasn't supposed to but) causes
> > > > serious frontend issue.
> > > >
> > > > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
> > >
> > >
> > > So try something like this then:
> >
> > Yes, I saw memset is inlined when this diff is applied.
>
> I have another concern though: it's a trick that could let gcc do the
> inline, but I am not quite sure whether that's true with other compilers
> (i.e. clang, icc, or even older gcc).
>
> For this case, I think I still prefer some trick like
>
>     *(struct ..*) = {0, }
>
> Or even, we could introduce rte_memset(). IIRC, that has been
> proposed somehow before?
>
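
To make the two options above concrete, here is a minimal sketch of
clearing a small header with memset() versus a struct assignment. The
struct name and fields are hypothetical, chosen only for illustration,
and are not taken from the patch set. The quoted `{0, }` form is written
here as a C99 compound literal, which is my assumption about what was
meant. Compilers usually turn the assignment into plain stores for a
struct this small, but they are not required to, so the generated code
should be checked for each compiler of interest.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only.
 * 12 bytes, no padding, so both variants clear exactly the same bytes. */
struct demo_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Variant 1: library call. Whether this is expanded inline depends on
 * the compiler, its version, and the optimization flags in use. */
static inline void clear_with_memset(struct demo_hdr *h)
{
    memset(h, 0, sizeof(*h));
}

/* Variant 2: assignment from a zero-initialized compound literal (C99).
 * All named members become zero; for small structs this is typically
 * emitted as a few stores rather than a function call. */
static inline void clear_with_assign(struct demo_hdr *h)
{
    *h = (struct demo_hdr){0};
}
```

Comparing the disassembly of both functions under gcc, clang and icc
would show whether the assignment form avoids the call on each of them.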
I'm working on a prototype of rte_memset. It already shows some
performance improvement. For small sizes, I'm optimizing it further.

--Zhiyong

> --yliu