On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > > 
> > > 
> > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > Yes but two points.
> > 
> > 1. why is this memset expensive?
> 
> I don't have the exact answer, but just some rough thoughts:
> 
> It's an external clib function: there is a call stack and the
> IP register will bounch back and forth.

for memset 0?  gcc 5.3.1 on fedora happily inlines it.

> BTW, It's kind of an
> overkill to use that for resetting 14 bytes structure.
> 
> Some trick like
>     *(struct virtio_net_hdr *)hdr = {0, };
> 
> Or even 
>     hdr->xxx = 0;
>     hdr->yyy = 0;
> 
> should behaviour better.
> 
> There was an example: the vhost enqueue optmization patchset from
> Zhihong [0] uses memset, and it introduces more than 15% drop (IIRC)
> on my Ivybridge server: it has no such issue on his server though.
> 
> [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> 
>       --yliu

I'd say that's weird. what's your config? any chance you
are using an old compiler?


> > Is the test completely skipping looking
> >    at the packet otherwise?
> > 
> > 2. As long as we are doing this, see
> >     Alignment vs. Networking
> >     ========================
> > in Documentation/unaligned-memory-access.txt
> > 
> > 
> > > From the micro-benchmarks results, we can expect +10% compared to
> > > indirect descriptors, and + 5% compared to using 2 descs in the
> > > virtqueue.
> > > Also, it should have the same benefits as indirect descriptors for 0%
> > > pkt loss (as we can fill 2x more packets in the virtqueue).
> > > 
> > > What do you think?
> > > 
> > > Thanks,
> > > Maxime

Reply via email to