Re: [PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem

2024-10-07 Thread John Romein

Dear Stephen, The problem has not been solved, but I found a workaround. According to the documentation (https://doc.dpdk.org/guides/prog_guide/gpudev.html, sec 11.3), rte_extmem_register should be invoked with GPU_PAGE_SIZE as an argument. If GPU_PAGE_SIZE is set to 2 MB instead of 64 kB, r

Re: [PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem

2023-11-02 Thread John Romein

Dear Slava, Thank you for looking at the patch. With the original code, I saw that the application spent literally hours in this function during program startup when tens of gigabytes of GPU memory were registered. This was due to qsort being invoked for every newly added item (to keep the list
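
The cost pattern described in this message can be illustrated with a minimal sketch. It is not the driver code or the patch itself: `struct range`, `build_resort_each_time`, `build_sort_once`, and the `sort_calls` counter are hypothetical names introduced here only for illustration. The first builder calls `qsort` after every appended item, as the message describes for the original code; the second appends everything first and sorts a single time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical address range; an assumption, not the driver's struct mlx5_range. */
struct range {
    uintptr_t start;
    uintptr_t end;
};

/* Counts qsort invocations so the two strategies can be compared. */
static size_t sort_calls;

/* Order ranges by start address. */
static int range_cmp(const void *a, const void *b)
{
    const struct range *x = a;
    const struct range *y = b;

    if (x->start < y->start)
        return -1;
    if (x->start > y->start)
        return 1;
    return 0;
}

/* Slow pattern: re-sort the whole list after each appended item (n sorts). */
static void build_resort_each_time(struct range *dst, const struct range *src,
                                   size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        qsort(dst, i + 1, sizeof(*dst), range_cmp);
        sort_calls++;
    }
}

/* Alternative: append every item first, then sort a single time. */
static void build_sort_once(struct range *dst, const struct range *src,
                            size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    if (n == 0)
        return;
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
    qsort(dst, n, sizeof(*dst), range_cmp);
    sort_calls++;
}
```

Both builders leave `dst` in the same sorted order. For n items the first performs n sorts over a growing list and the second performs one, which is why the difference becomes dramatic once the number of registered chunks is very large.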

RE: [PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem

2023-11-01 Thread Slava Ovsiienko

Hi, Thank you for this optimization patch. My concern is this line: > + heap = malloc(mp->size * sizeof(struct mlx5_range)); The pool size can be huge, and it might cause a large memory allocation (on the host CPU side). What is the reason causing "hours" of registering? Reallocs per each pool e