On 5/11/2024 6:26 AM, Frank Du wrote: > The current calculation assumes that the mbufs are contiguous. However, > this assumption is incorrect when the memory spans across a huge page. > Correct to directly read the size from the mempool memory chunks. > > Signed-off-by: Frank Du <frank...@intel.com> > > --- > v2: > * Add virtual contiguous detect for for multiple memhdrs. > --- > drivers/net/af_xdp/rte_eth_af_xdp.c | 34 ++++++++++++++++++++++++----- > 1 file changed, 28 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c > b/drivers/net/af_xdp/rte_eth_af_xdp.c > index 268a130c49..7456108d6d 100644 > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c > @@ -1039,16 +1039,35 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused, > } > > #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG) > -static inline uintptr_t get_base_addr(struct rte_mempool *mp, uint64_t > *align) > +static inline uintptr_t get_memhdr_info(struct rte_mempool *mp, uint64_t > *align, size_t *len) > { > - struct rte_mempool_memhdr *memhdr; > + struct rte_mempool_memhdr *memhdr, *next; > uintptr_t memhdr_addr, aligned_addr; > + size_t memhdr_len = 0; > > + /* get the mempool base addr and align */ > memhdr = STAILQ_FIRST(&mp->mem_list); > memhdr_addr = (uintptr_t)memhdr->addr; > aligned_addr = memhdr_addr & ~(getpagesize() - 1); > *align = memhdr_addr - aligned_addr; >
I am aware this is not part of this patch, but as note, can't we use 'RTE_ALIGN_FLOOR' to calculate aligned address. > + memhdr_len += memhdr->len; > + > + /* check if virtual contiguous memory for multiple memhdrs */ > + next = STAILQ_NEXT(memhdr, next); > + while (next != NULL) { > + if ((uintptr_t)next->addr != (uintptr_t)memhdr->addr + > memhdr->len) { > + AF_XDP_LOG(ERR, "memory chunks not virtual contiguous, " > + "next: %p, cur: %p(len: %" PRId64 " > )\n", > + next->addr, memhdr->addr, memhdr->len); > + return 0; > + } > Isn't there a mempool flag that can help us figure out mempool is not IOVA contiguous? Isn't it sufficient on its own? > + /* virtual contiguous */ > + memhdr = next; > + memhdr_len += memhdr->len; > + next = STAILQ_NEXT(memhdr, next); > + } > > + *len = memhdr_len; > return aligned_addr; > } > This function goes too much details of the mempool object, and any change in mempool details has potential to break this code. @Andrew, @Morten, do you think does it make sense to have 'rte_mempool_info_get()' kind of function, that provides at least address and length of the mempool, and used here? This helps to hide internal details and complexity of the mempool for users. > > @@ -1125,6 +1144,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals > *internals, > void *base_addr = NULL; > struct rte_mempool *mb_pool = rxq->mb_pool; > uint64_t umem_size, align = 0; > + size_t len = 0; > > if (internals->shared_umem) { > if (get_shared_umem(rxq, internals->if_name, &umem) < 0) > @@ -1156,10 +1176,12 @@ xsk_umem_info *xdp_umem_configure(struct > pmd_internals *internals, > } > > umem->mb_pool = mb_pool; > - base_addr = (void *)get_base_addr(mb_pool, &align); > - umem_size = (uint64_t)mb_pool->populated_size * > - (uint64_t)usr_config.frame_size + > - align; > + base_addr = (void *)get_memhdr_info(mb_pool, &align, &len); > Is this calculation correct if mempool is not already aligned to page size? Like in an example page size is '0x1000', and "memhdr_addr = 0x000a1080" returned aligned address is '0x000a1000', "base_addr = 0x000a1000" Any access between '0x000a1000' & '0x000a1080' is invalid. Is this expected? > + if (!base_addr) { > + AF_XDP_LOG(ERR, "Failed to parse memhdr info from > pool\n"); > Log message is not accurate, it is not parsing memhdr info failed, but mempool was not satisfying expectation. > + goto err; > + } > + umem_size = (uint64_t)len + align; > > ret = xsk_umem__create(&umem->umem, base_addr, umem_size, > &rxq->fq, &rxq->cq, &usr_config);