Haggai,

Thanks for review! I will add the message you suggested and re-post.

thanks,
wengang

在 2015年07月06日 14:18, Haggai Eran 写道:
On 24/06/2015 07:54, Wengang Wang wrote:
There lacks a dropping on rds_ib_device.refcount in case rds_ib_alloc_fmr
failed(mr pool running out). this lead to the refcount overflow.

A complain in line 117(see following) is seen. From vmcore:
s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
That is the evidence the mr pool is used up. so rds_ib_alloc_fmr is very likely
to return ERR_PTR(-EAGAIN).

115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
116 {
117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
118         if (atomic_dec_and_test(&rds_ibdev->refcount))
119                 queue_work(rds_wq, &rds_ibdev->free_work);
120 }

fix is to drop refcount when rds_ib_alloc_fmr failed.

Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
---
  net/rds/ib_rdma.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index 273b8bf..657ba9f 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -759,8 +759,10 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long 
nents,
        }
ibmr = rds_ib_alloc_fmr(rds_ibdev);
-       if (IS_ERR(ibmr))
+       if (IS_ERR(ibmr)) {
+               rds_ib_dev_put(rds_ibdev);
                return ibmr;
+       }
ret = rds_ib_map_fmr(rds_ibdev, ibmr, sg, nents);
        if (ret == 0)

It seems like the function indeed is missing a put on the rds_ibdev in
that case.

Reviewed-by: Haggai Eran <hagg...@mellanox.com>

You may also want to add:
Fixes: 3e0249f9c05c ("RDS/IB: add refcount tracking to struct
rds_ib_device")

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to