On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > >
> > >
> > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > Yes, but two points.
> >
> > 1. Why is this memset expensive?
>
> I don't have the exact answer, just some rough thoughts:
>
> It's an external libc function: there is a call stack and the
> IP register will bounce back and forth.

For memset to 0? gcc 5.3.1 on Fedora happily inlines it.

> BTW, it's kind of overkill to use that for resetting a 14-byte
> structure.
>
> Some trick like
>
>     *(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){ 0 };
>
> or even
>
>     hdr->xxx = 0;
>     hdr->yyy = 0;
>
> should behave better.
>
> There was an example: the vhost enqueue optimization patchset from
> Zhihong [0] uses memset, and it introduces a drop of more than 15%
> (IIRC) on my Ivy Bridge server; it has no such issue on his server,
> though.
>
> [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
>
> 	--yliu

I'd say that's weird. What's your config? Any chance you are using an
old compiler?

> > Is the test completely skipping looking
> > at the packet otherwise?
> >
> > 2. As long as we are doing this, see
> > 	Alignment vs. Networking
> > 	========================
> > in Documentation/unaligned-memory-access.txt
> >
> >
> > > From the micro-benchmark results, we can expect +10% compared to
> > > indirect descriptors, and +5% compared to using 2 descs in the
> > > virtqueue.
> > > Also, it should have the same benefits as indirect descriptors for 0%
> > > packet loss (as we can fill 2x more packets in the virtqueue).
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Maxime
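Going back to the memset question above, here is a minimal sketch of the
three zeroing variants being compared. The struct layout follows the legacy
virtio_net_hdr from the virtio spec (without the mergeable-rx num_buffers
field), and the helper names are made up for illustration; this is not the
code from the patchset itself.

#include <stdint.h>
#include <string.h>

/* Legacy virtio_net_hdr layout (no num_buffers field). */
struct virtio_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Variant 1: libc memset. With a small, compile-time-constant size,
 * gcc at -O2 typically expands this inline into a few stores, but
 * that depends on the compiler version and flags. */
static inline void hdr_reset_memset(struct virtio_net_hdr *hdr)
{
	memset(hdr, 0, sizeof(*hdr));
}

/* Variant 2: whole-struct assignment from a zeroed compound literal. */
static inline void hdr_reset_assign(struct virtio_net_hdr *hdr)
{
	*hdr = (struct virtio_net_hdr){ 0 };
}

/* Variant 3: explicit per-field stores, as suggested in the thread. */
static inline void hdr_reset_fields(struct virtio_net_hdr *hdr)
{
	hdr->flags       = 0;
	hdr->gso_type    = 0;
	hdr->hdr_len     = 0;
	hdr->gso_size    = 0;
	hdr->csum_start  = 0;
	hdr->csum_offset = 0;
}

On a recent gcc all three should compile down to the same handful of stores;
inspecting the generated assembly for the actual vhost build (e.g. gcc -S or
objdump) is the quickest way to see whether a real memset call is being
emitted.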