When DPDK is compiled with clang, the line below in virtio_rxtx.c may be
optimized into four "VMOVAPS ymm, m256" instructions:

    memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf));

This instruction requires the memory address to be 32-byte aligned;
otherwise it causes a segfault. Although this was only tested with
Clang 3.6.0, it can be reproduced with any compiler that optimizes
aggressively, i.e. one that turns a memset of known length into VMOVAPS.

The fact that struct rte_mbuf is cache line aligned only guarantees that
fake_mbuf is aligned relative to the start address of struct virtnet_rx.
Unfortunately, this start address is not necessarily aligned, because it
is derived as follows:

    rxvq = (struct virtnet_rx *)RTE_PTR_ADD(vq, sz_vq);

When sz_vq is not aligned, rxvq does not get an aligned address, and so
rxvq->fake_mbuf (addr of rxvq + cache line size) is not at an aligned
address either.

The fix is simple: make sz_vq 32-byte aligned. Here we make it cache
line aligned for future optimization.

Fixes: a900472aedef ("virtio: split virtio Rx/Tx queue")

Signed-off-by: Jianfeng Tan <jianfeng.tan at intel.com>
---
 drivers/net/virtio/virtio_ethdev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index a995520..ad0f5a6 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -337,7 +337,10 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 
 	snprintf(vq_name, sizeof(vq_name), "port%d_%s%d",
 		 dev->data->port_id, queue_names[queue_type], queue_idx);
-	sz_vq = sizeof(*vq) + vq_size * sizeof(struct vq_desc_extra);
+
+	sz_vq = RTE_ALIGN_CEIL(sizeof(*vq) +
+				vq_size * sizeof(struct vq_desc_extra),
+				RTE_CACHE_LINE_SIZE);
 	if (queue_type == VTNET_RQ) {
 		sz_q = sz_vq + sizeof(*rxvq);
 	} else if (queue_type == VTNET_TQ) {
-- 
2.1.4
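
The alignment arithmetic behind the fix can be sketched in isolation.
This is only an illustration, not the driver code: align_ceil() is a
hypothetical stand-in for RTE_ALIGN_CEIL (power-of-two alignment only),
offset_is_aligned() is a hypothetical helper, and the 64-byte
CACHE_LINE_SIZE is an assumed value (RTE_CACHE_LINE_SIZE depends on the
platform).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; RTE_CACHE_LINE_SIZE is platform dependent. */
#define CACHE_LINE_SIZE 64u

/*
 * Round size up to the next multiple of align.
 * Hypothetical stand-in for RTE_ALIGN_CEIL; align must be a power of two.
 */
static size_t align_ceil(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Return 1 if (base + offset) is a multiple of align, else 0.
 * Models the address computed by RTE_PTR_ADD(vq, sz_vq).
 */
static int offset_is_aligned(uintptr_t base, size_t offset, size_t align)
{
	return ((base + offset) & (align - 1)) == 0;
}
```

For example, with a cache line aligned base and a hypothetical sz_vq of
200, base + 200 is not 32-byte aligned, whereas base + align_ceil(200,
CACHE_LINE_SIZE), i.e. base + 256, is.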