On Thu, 26 Mar 2026 10:28:28 +0800 Junlong Wang <[email protected]> wrote:
> This patch optimizes the ZXDH PMD's receive and transmit path for better
> performance through several improvements:
>
> - Add simple TX/RX burst functions (zxdh_xmit_pkts_simple and
>   zxdh_recv_single_pkts) for single-segment packet scenarios.
> - Remove RX software ring (sw_ring) to reduce memory allocation and
>   copy.
> - Optimize descriptor management with prefetching and simplified
>   cleanup.
> - Reorganize structure fields for better cache locality.
>
> These changes reduce CPU cycles and memory bandwidth consumption,
> resulting in improved packet processing throughput.
>
> Signed-off-by: Junlong Wang <[email protected]>

I spotted some of these in review; an AI-assisted pass found several more.

Errors:

1. zxdh_rxtx.c, pkt_padding(): The return value is never checked by the
   caller submit_to_backend_simple(). If rte_pktmbuf_prepend() fails and
   pkt_padding() returns -1, the descriptor is still written with the
   mbuf's iova and data_len, submitting a corrupt packet to the device.
   The return value must be checked and the packet skipped on failure.

2. zxdh_rxtx.c, zxdh_recv_single_pkts(): When zxdh_init_mbuf() fails, the
   loop does "break" instead of continuing or freeing the remaining
   mbufs. The mbufs at rcv_pkts[i+1] through rcv_pkts[num-1] were already
   dequeued from the virtqueue by zxdh_dequeue_burst_rx_packed() but are
   never freed, leaking them.

3. zxdh_rxtx.c, refill_desc_unwrap(): Descriptors are written with a
   plain store "start_dp[idx].flags = flags" instead of using
   zxdh_queue_store_flags_packed(). The original
   zxdh_enqueue_recv_refill_packed() uses the store-barrier version to
   ensure addr/len are visible before the flags. Without the barrier, the
   device could see the available flag before the descriptor data is
   committed.
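To make the ordering requirement in error 3 concrete, here is a minimal stand-alone sketch of the publish pattern: fill in addr/len/id, issue a write barrier, then store the flags. The struct layout and names below are illustrative mocks, not the driver's actual definitions, and io_wmb() is a compiler-barrier stand-in for rte_io_wmb():

```c
#include <assert.h>
#include <stdint.h>

/* Mock of a packed-ring descriptor; field names loosely mirror a
 * virtio packed descriptor (illustrative, not the zxdh definition). */
struct vring_packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/* Stand-in for rte_io_wmb(): a compiler barrier here; the real macro
 * emits the architecture's store fence. */
#define io_wmb() __asm__ volatile("" ::: "memory")

/* Publish the flags only after the data stores are ordered, so the
 * device cannot observe AVAIL before addr/len/id are committed. */
static void store_flags_packed(struct vring_packed_desc *dp, uint16_t flags)
{
	io_wmb();          /* order data stores before the flags store */
	dp->flags = flags; /* device may consume the descriptor now */
}

static void refill_one(struct vring_packed_desc *dp, uint64_t iova,
		       uint32_t len, uint16_t id, uint16_t avail_flags)
{
	dp->addr = iova;
	dp->len = len;
	dp->id = id;
	store_flags_packed(dp, avail_flags); /* not a plain flags store */
}
```

A single rte_io_wmb() after the whole refill loop cannot substitute for this: it only orders stores issued before that point, and every flags store earlier in the loop has already been made visible by then.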
   The rte_io_wmb() at the end of refill_que_descs() comes after all the
   flags have already been written, so it does not help.

4. zxdh_rxtx.c, zxdh_xmit_pkts_prepare(): The removal of
   rte_net_intel_cksum_prepare() means packets requesting checksum
   offload will not have their pseudo-headers prepared. If the HW
   expects a pseudo-header, transmitted checksums will be incorrect.

5. zxdh_queue.h, zxdh_queue_enable_intr(): This function checks
   "if (event_flags_shadow == DISABLE)" and then sets it to DISABLE
   again, so it never actually enables interrupts. This is a
   pre-existing bug, but the patch touches the function and should fix
   it.

6. zxdh_ethdev.c, zxdh_init_queue(): The hdr_mz NULL-check logic is
   contradictory. Lines 158-162 check "if (hdr_mz == NULL)" and goto
   fail_q_alloc, but line 169 then checks "if (hdr_mz)" before assigning
   zxdh_net_hdr_mem. If the first check fires, the second is
   unreachable; if it doesn't fire, the second is always true. Pick one
   guard and use it consistently.

Warnings:

1. zxdh_rxtx.c, zxdh_xmit_pkts_simple(): stats.bytes is never
   incremented. The packed path uses zxdh_update_packet_stats(), but the
   simple path only counts packets and idle, so the good_bytes xstat
   will always read zero on the simple TX path.

2. zxdh_rxtx.c, zxdh_recv_single_pkts(): Same issue -- stats.bytes is
   never incremented, so good_bytes will always be zero on the
   single-packet receive path.

3. zxdh_rxtx.c, zxdh_init_mbuf(): rte_pktmbuf_dump(stdout, rxm, 40)
   should not be in production code; it writes to stdout unconditionally
   on the error path. Use PMD_RX_LOG or remove it.

4. zxdh_ethdev.c, zxdh_dev_free_mbufs(): Changed from rte_pktmbuf_free()
   to rte_pktmbuf_free_seg(). If any mbufs in the TX queue are
   multi-segment (the packed path handles multi-seg via
   zxdh_xmit_enqueue_append), only the first segment will be freed,
   leaking the rest.

5. This patch is large (~800 lines, 8 files) and combines multiple
   independent changes: structure reorganization, new fast-path
   functions, sw_ring removal, descriptor management, removal of
   rte_net_intel_cksum_prepare, and MTU validation. Splitting it into
   separate patches would make review and bisection easier.
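On the free()-vs-free_seg() point about mbuf chains: freeing only the head segment leaks every chained segment behind it. A minimal mock of the distinction (this is not DPDK's struct rte_mbuf, just an illustration of the chain walk that rte_pktmbuf_free() performs and rte_pktmbuf_free_seg() does not):

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal mock of a chained mbuf (illustrative only). */
struct mbuf {
	struct mbuf *next;
};

static int freed; /* count of freed segments, for demonstration */

/* Analogue of rte_pktmbuf_free_seg(): frees exactly one segment. */
static void free_seg(struct mbuf *m)
{
	free(m);
	freed++;
}

/* Analogue of rte_pktmbuf_free(): walks and frees the whole chain. */
static void free_chain(struct mbuf *m)
{
	while (m != NULL) {
		struct mbuf *next = m->next;
		free_seg(m);
		m = next;
	}
}
```

Calling only free_seg() on a three-segment chain releases one segment and leaks two, which is exactly the hazard in zxdh_dev_free_mbufs() when the TX queue holds multi-segment packets.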

