From: LiuGang <[email protected]> In x86 environments, writing the Rx descriptor status byte with a single pci_dma_write() works correctly. However, on aarch64 with glibc 2.24+, the memcpy/memmove implementation uses a branchless sequence that copies the same byte three times when count == 1. This results in three consecutive STRB operations. Ref: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/memcpy.S;hb=refs/heads/release/2.24/master#l124
When the guest uses an e1000 NIC and DPDK, this behavior can cause packet reception to stop or packets to be dropped repeatedly. Under normal operation, DPDK clears the DD flag after processing a packet and then updates RDT. But due to the triple STRB from memcpy/memmove, the DD flag may be cleared and then immediately set again. This causes DPDK to consume the same packet twice, making RDT be incremented by one extra. As a result, RDH == RDT, and QEMU mistakenly considers the receive queue full, stopping packet reception. The issue can be reproduced on aarch64 using the adapted test program: https://github.com/cdkey/e1000_poc Replace pci_dma_write() with address_space_stb() to write the DD status byte directly, avoiding the problematic memcpy/memmove sequence and resolving the issue. Fixes: 034d00d48581("e1000: set RX descriptor status in a separate operation") Signed-off-by: Liu Gang <[email protected]> Signed-off-by: Ding Hui <[email protected]> --- hw/net/e1000.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index 202ad40401..f7be035e0f 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -869,6 +869,14 @@ e1000_receiver_overrun(E1000State *s, size_t size) set_ics(s, 0, E1000_ICS_RXO); } +static inline void +e1000_dma_write_byte(PCIDevice *d, dma_addr_t addr, uint8_t val) +{ + AddressSpace *as = pci_get_address_space(d); + dma_barrier(as, DMA_DIRECTION_FROM_DEVICE); + address_space_stb(as, addr, val, MEMTXATTRS_UNSPECIFIED, NULL); +} + static ssize_t e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt) { @@ -980,8 +988,8 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt) } pci_dma_write(d, base, &desc, sizeof(desc)); desc.status |= (vlan_status | E1000_RXD_STAT_DD); - pci_dma_write(d, base + offsetof(struct e1000_rx_desc, status), - &desc.status, sizeof(desc.status)); + e1000_dma_write_byte(d, base + offsetof(struct e1000_rx_desc, status), + desc.status); if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN]) s->mac_reg[RDH] = 0; -- 2.36.0.windows.1
