Hi, Yuanhan:

> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Yuanhan Liu
> Sent: Wednesday, October 12, 2016 11:22 AM
> To: Michael S. Tsirkin <m...@redhat.com>; Thomas Monjalon
> <thomas.monja...@6wind.com>
> Cc: Wang, Zhihong <zhihong.w...@intel.com>; Maxime Coquelin
> <maxime.coque...@redhat.com>; Stephen Hemminger
> <step...@networkplumber.org>; d...@dpdk.org; qemu-
> de...@nongnu.org
> Subject: Re: [dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout
> feature
> 
> On Tue, Oct 11, 2016 at 02:57:49PM +0800, Yuanhan Liu wrote:
> > > > > > > There was an example: the vhost enqueue optimization patchset
> > > > > > > from Zhihong [0] uses memset, and it introduces more than
> > > > > > > 15% drop (IIRC)
> 
> It doesn't matter much now, but I verified it yesterday (with and
> without memset): the drop can be more than 30%.
> 
> This is to let you know that it can behave badly if memset is not
> inlined.
> 
> > > > > > > on my Ivybridge server: it has no such issue on his server though.
> > > > > > >
> > > > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > > > > >
> > > > > > >   --yliu
> > > > > >
> > > > > > I'd say that's weird. what's your config? any chance you are
> > > > > > using an old compiler?
> > > > >
> > > > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more.
> > > > > IIRC, he said the memset is not well optimized for Ivybridge server.
> > > >
> > > > The dst is remote in that case. It's fine on Haswell but has a
> > > > complication on Ivy Bridge which (it wasn't supposed to, but)
> > > > causes a serious frontend issue.
> > > >
> > > > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
> > >
> > >
> > > So try something like this then:
> >
> > Yes, I saw memset is inlined when this diff is applied.
> 
> I have another concern though: this is a trick that lets gcc do the
> inlining, and I am not quite sure whether it holds for other compilers
> (e.g. clang, icc, or even older gcc).
> 
> For this case, I think I still prefer some trick like
>     *(struct ..*) = {0, }
> 
> Or we could even introduce rte_memset(). IIRC, that has been
> proposed before?
> 

I'm working on a prototype of rte_memset. It already shows some
performance improvement. For small sizes, I'm optimizing it further.
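
To make the comparison concrete, here is a minimal sketch of the two ways
of zeroing a small header discussed above. The struct and the function
names are hypothetical, for illustration only; they are not taken from the
vhost code or from any rte_memset prototype. Whether memset gets inlined,
and whether the assignment becomes plain stores, still depends on the
compiler and its flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 12-byte header with no padding, for illustration only. */
struct hdr_sketch {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Library call: the compiler may or may not expand it inline. */
static inline void clear_with_memset(struct hdr_sketch *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
}

/*
 * Struct assignment from a zero-initialized compound literal (C99).
 * All members are set to zero; for a small fixed-size struct this is
 * typically emitted as a few direct stores, though that is not guaranteed.
 */
static inline void clear_with_assign(struct hdr_sketch *h)
{
    assert(h != NULL);
    *h = (struct hdr_sketch){0};
}
```

Both helpers leave every member zero, so comparing the generated assembly
(for example with objdump -d) on each target compiler is one way to check
which form avoids the out-of-line call.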

--Zhiyong

>       --yliu
