On Sun, Jun 12, 2016 at 02:29:42PM +0000, Jianfeng Tan wrote:
> When DPDK is compiled with clang, the line below in virtio_rxtx.c can
> be optimized into four "VMOVAPS ymm, m256" instructions:
>   memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf));
> 
> This instruction requires the memory address to be 32-byte aligned;
> otherwise it causes a segfault. Although only tested with Clang 3.6.0,
> this can be reproduced with any compiler that optimizes aggressively,
> i.e., one that turns a memset of known length into VMOVAPS.
> 
> The fact that struct rte_mbuf is cache line aligned only guarantees
> that fake_mbuf is aligned relative to the start address of struct
> virtnet_rx. Unfortunately, that start address is not necessarily
> aligned, because it is obtained by:
>   rxvq = (struct virtnet_rx *)RTE_PTR_ADD(vq, sz_vq);
> 
> When sz_vq is not aligned, rxvq does not get an aligned address, and
> so rxvq->fake_mbuf (addr of rxvq + cache line size) is not aligned
> either.
> 
> The fix is simple: make sz_vq 32-byte aligned. Here we make it cache
> line aligned to allow for future optimization.
> 
> Fixes: a900472aedef ("virtio: split virtio Rx/Tx queue")
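
The alignment argument in the quoted message can be sketched in plain C.
This is only an illustration under stated assumptions, not the code from
the patch: align_ceil(), member_is_aligned(), struct demo_rx and the
64-byte CACHE_LINE value are hypothetical stand-ins (DPDK itself provides
helpers such as RTE_ALIGN_CEIL and RTE_CACHE_LINE_SIZE for this purpose).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for this sketch. */
#define CACHE_LINE 64

/* Hypothetical stand-in for struct virtnet_rx: the member of interest
 * sits one cache line after the start of the struct. */
struct demo_rx {
    char hdr[CACHE_LINE];
    char fake_mbuf[128];
};

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t align_ceil(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Check whether a member at offset member_off, inside a struct placed
 * sz_vq bytes after base, lands on an align-byte boundary. Only integer
 * address arithmetic is used, so no misaligned pointer is ever formed. */
static int member_is_aligned(uintptr_t base, size_t sz_vq,
                             size_t member_off, size_t align)
{
    return ((base + sz_vq + member_off) & (align - 1)) == 0;
}
```

With a cache line aligned base such as 0x1000 and an unaligned sz_vq of
100, the member ends up 4 bytes past a 32-byte boundary; rounding sz_vq
up with align_ceil(100, CACHE_LINE) gives 128 and the member is aligned.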

Folded this fix (and the other fix) into the above commit, so that we
could have a clean working tree.

Thanks for the good catch!

        --yliu
