Ping.
Request the reviewers to go through the patch and let us know if you have any 
queries with respect to it.

-
Praveen Kumar Kannoju. 


-----Original Message-----
From: Praveen Kannoju [mailto:[email protected]] 
Sent: 16 March 2021 06:39 PM
To: [email protected]; [email protected]; [email protected]; 
[email protected]; [email protected]
Cc: Rajesh Sivaramasubramaniom <[email protected]>; Rama 
Nichanamatlu <[email protected]>; Aruna Ramakrishna 
<[email protected]>; Jeffery Yoder <[email protected]>; 
Praveen Kannoju <[email protected]>
Subject: [PATCH v2] IB/mlx5: Reduce max order of memory allocated for xlt update

To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly allocation 
can sometimes take very long to return (a few seconds), especially if the 
system is fragmented and does not have any free chunks for orders >= 3. This 
causes the calling application to hang for a long time. To avoid these long 
latency spikes, limit max order of allocation to order 3, and reuse that buffer 
to populate_xlt() for that MR. This will increase the latency slightly (in the 
order of microseconds) for each
mlx5_ib_update_xlt() call, especially for larger MRs (since were making 
multiple calls to populate_xlt()), but its a small price to pay to avoid the 
large latency spikes with higher order allocations. The flag __GFP_NORETRY is 
used while fetching the free pages to ensure that there are no long compaction 
stalls when the system's memory is in fragmented condition.

Signed-off-by: Praveen Kumar Kannoju <[email protected]>
---
 drivers/infiniband/hw/mlx5/mr.c | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c 
index db05b0e..dac19f0 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1004,9 +1004,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd 
*pd,
        return mr;
 }
 
-#define MLX5_MAX_UMR_CHUNK ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - \
-                           MLX5_UMR_MTT_ALIGNMENT)
-#define MLX5_SPARE_UMR_CHUNK 0x10000
+#define MLX5_SPARE_UMR_CHUNK 0x8000
 
 /*
  * Allocate a temporary buffer to hold the per-page information to transfer to 
@@ -1028,30 +1026,16 @@ static void *mlx5_ib_alloc_xlt(size_t *nents, size_t 
ent_size, gfp_t gfp_mask)
         */
        might_sleep();
 
-       gfp_mask |= __GFP_ZERO;
+       gfp_mask |= __GFP_ZERO | __GFP_NORETRY;
 
-       /*
-        * If the system already has a suitable high order page then just use
-        * that, but don't try hard to create one. This max is about 1M, so a
-        * free x86 huge page will satisfy it.
-        */
        size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
-                    MLX5_MAX_UMR_CHUNK);
+                    MLX5_SPARE_UMR_CHUNK);
        *nents = size / ent_size;
        res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
                                       get_order(size));
        if (res)
                return res;
 
-       if (size > MLX5_SPARE_UMR_CHUNK) {
-               size = MLX5_SPARE_UMR_CHUNK;
-               *nents = get_order(size) / ent_size;
-               res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
-                                              get_order(size));
-               if (res)
-                       return res;
-       }
-
        *nents = PAGE_SIZE / ent_size;
        res = (void *)__get_free_page(gfp_mask);
        if (res)
--
1.8.3.1

Reply via email to