Re: mlx5 + SRP: max_qp_sz mismatch
Just got back from a few days in Denver; I'll give it a try ASAP. We also have a ton of ConnectX-3's and a few ConnectX-2's, so I'll give it a quick try on those too just for fun. And if anyone ever needs to test something against one of these (and the test isn't prohibitively difficult to set up), I would be happy to give it a try.

Thanks,
Mark

On Thu, Aug 28, 2014 at 9:58 AM, Bart Van Assche bvanass...@acm.org wrote:
> On 08/27/14 13:28, Eli Cohen wrote:
>> On 08/26/14 18:10, Sagi Grimberg wrote:
>>> Since I don't know how the true send queue size can be computed from
>>> the device capabilities at the moment, I can suggest a fix to srpt to
>>> retry with srp_sq_size/2 (and so on until it succeeds...)
>>
>> The device capabilities provide the maximum number of send work
>> requests that the device supports, but the actual number of work
>> requests that can be supported in a specific case depends on other
>> characteristics of the work requests. For example, in the case of
>> Connect-IB, the actual number depends on the number of s/g entries, the
>> transport type, etc. This is in compliance with the IB spec:
>>
>> 11.2.1.2 QUERY HCA
>> Description: Returns the attributes for the specified HCA. The maximum
>> values defined in this section are guaranteed not-to-exceed values. It
>> is possible for an implementation to allocate some HCA resources from
>> the same space. In that case, the maximum values returned are not
>> guaranteed for all of those resources simultaneously.
>>
>> So, a well written application should try smaller values if it fails
>> with ENOMEM.
>
> Hello Mark,
>
> It would help if you could test the patch below. Sorry but I don't have
> access to a Connect-IB setup myself.
>
> Thanks,
> Bart.
>
> Reported-by: Mark Lehrer leh...@gmail.com
> Signed-off-by: Bart Van Assche bvanass...@acm.org
> ---
>  drivers/infiniband/ulp/srpt/ib_srpt.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
> index fe09f27..3ffaf4e 100644
> --- a/drivers/infiniband/ulp/srpt/ib_srpt.c
> +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
> @@ -2091,6 +2091,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
>  	if (!qp_init)
>  		goto out;
> +retry:
>  	ch->cq = ib_create_cq(sdev->device, srpt_completion, NULL, ch,
>  			      ch->rq_size + srp_sq_size, 0);
>  	if (IS_ERR(ch->cq)) {
> @@ -2114,6 +2115,13 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
>  	ch->qp = ib_create_qp(sdev->pd, qp_init);
>  	if (IS_ERR(ch->qp)) {
>  		ret = PTR_ERR(ch->qp);
> +		if (ret == -ENOMEM) {
> +			srp_sq_size /= 2;
> +			if (srp_sq_size >= MIN_SRPT_SQ_SIZE) {
> +				ib_destroy_cq(ch->cq);
> +				goto retry;
> +			}
> +		}
>  		printk(KERN_ERR "failed to create_qp ret= %d\n", ret);
>  		goto err_destroy_cq;
>  	}
> --
> 1.8.4.5
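Eli's advice above (treat the QUERY HCA maxima as not-to-exceed values, and shrink the request when QP creation fails with ENOMEM) is what Bart's patch implements inside srpt_create_ch_ib(). As a standalone illustration of the same fallback pattern, a minimal sketch against the kernel verbs API of that era might look like this; the helper name create_qp_with_fallback and the min_send_wr lower bound are invented for illustration and are not part of the patch:

#include <linux/err.h>
#include <rdma/ib_verbs.h>

/*
 * Illustrative helper (not from the patch): halve the requested send
 * queue depth until ib_create_qp() stops returning -ENOMEM or a lower
 * bound is reached.
 */
static struct ib_qp *create_qp_with_fallback(struct ib_pd *pd,
                                             struct ib_qp_init_attr *attr,
                                             u32 min_send_wr)
{
        struct ib_qp *qp;

        for (;;) {
                qp = ib_create_qp(pd, attr);
                if (!IS_ERR(qp) || PTR_ERR(qp) != -ENOMEM)
                        return qp;
                if (attr->cap.max_send_wr / 2 < min_send_wr)
                        return qp;      /* give up, propagate -ENOMEM */
                /*
                 * max_qp_wr is a not-to-exceed value; with many s/g
                 * entries per WQE the usable depth can be much lower.
                 */
                attr->cap.max_send_wr /= 2;
        }
}

Note that in ib_srpt the completion queue is also sized from srp_sq_size, which is why the patch destroys and recreates the CQ before retrying rather than only shrinking the QP request.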
Re: mlx5 + SRP: max_qp_sz mismatch
> From what I see srp_sq_size is controlled via configfs. Can you set it
> to 2048 just for the sake of confirmation this is indeed the issue?

Now that we have confirmed that this works, what is the proper way to fix the problem? Is the core issue with mlx5 or srp?

Thanks,
Mark

On Tue, Aug 19, 2014 at 11:01 AM, Mark Lehrer leh...@gmail.com wrote:
>> From what I see srp_sq_size is controlled via configfs. Can you set it
>> to 2048 just for the sake of confirmation this is indeed the issue?
>
> Yes! This setting allowed the two machines to establish an SRP session.
> I'll try some I/O tests to see how well it works.
>
> Thanks,
> Mark
>
> On Tue, Aug 19, 2014 at 8:50 AM, Sagi Grimberg sa...@dev.mellanox.co.il wrote:
>> On 8/19/2014 2:20 AM, Mark Lehrer wrote:
>>> I have a client machine that is trying to establish an SRP connection,
>>> and it is failing due to an ENOMEM memory allocation error. I traced it
>>> down to the max_qp_sz field -- the mlx5 driver limits this to 16384,
>>> but the request wants 32768. I spent some time trying to figure out how
>>> this limit is set, but it isn't quite obvious. Is there a driver
>>> parameter I can set, or a hard-coded limit somewhere?
>>>
>>> I'm using Ubuntu 14.04 and targetcli on the target side, Windows 2008r2
>>> and WinOFED on the client side.
>>
>> Hi Mark,
>>
>> I think the issue here is that the SRP target asks for srp_sq_size
>> (default 4096) to allocate room for send WRs, but it also asks for
>> SRPT_DEF_SG_PER_WQE (16) to allocate room for max_send_sge, which
>> generally makes the work queue entries bigger as they come inline. That
>> probably exceeds the maximum send queue size mlx5 supports...
>>
>> It is strange that the mlx5 driver is not able to fit the same send
>> queue lengths as mlx4... Probably its work queue entries are slightly
>> bigger (I can check that - and I will). I just wonder how the ULP
>> requesting the space reservations in the send queue can know that, and
>> whether it should know that at all...
>>
>> From what I see srp_sq_size is controlled via configfs. Can you set it
>> to 2048 just for the sake of confirmation this is indeed the issue?
>>
>> Thanks,
>> Sagi.
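To make Sagi's explanation concrete, the send queue sizing that the SRP target requests looks roughly like the sketch below. This is a simplified illustration rather than the actual srpt_create_ch_ib() body; the helper name is made up, and the values shown (4096 and 16) are the defaults discussed in this thread:

#include <rdma/ib_verbs.h>

/* Simplified sketch of the QP attributes the SRP target asks for. */
static void srpt_qp_cap_sketch(struct ib_qp_init_attr *qp_init,
                               struct ib_cq *cq, int srp_sq_size)
{
        qp_init->send_cq = cq;
        qp_init->recv_cq = cq;
        qp_init->qp_type = IB_QPT_RC;
        qp_init->cap.max_send_wr  = srp_sq_size; /* default 4096 */
        qp_init->cap.max_send_sge = 16;          /* SRPT_DEF_SG_PER_WQE */
        /*
         * Per the explanation above, reserving room for 16 s/g entries
         * makes each send WQE larger, so a 4096-entry send queue can
         * exceed what mlx5 advertises as max_qp_sz (16384 here), and
         * ib_create_qp() then fails with -ENOMEM.
         */
}

Halving srp_sq_size (the configfs knob mentioned above) shrinks max_send_wr and brings the request back under the device limit, which is why setting it to 2048 allowed the SRP session to be established.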
Re: mlx5 + SRP: max_qp_sz mismatch
> From what I see srp_sq_size is controlled via configfs. Can you set it
> to 2048 just for the sake of confirmation this is indeed the issue?

Yes! This setting allowed the two machines to establish an SRP session. I'll try some I/O tests to see how well it works.

Thanks,
Mark

On Tue, Aug 19, 2014 at 8:50 AM, Sagi Grimberg sa...@dev.mellanox.co.il wrote:
> On 8/19/2014 2:20 AM, Mark Lehrer wrote:
>> I have a client machine that is trying to establish an SRP connection,
>> and it is failing due to an ENOMEM memory allocation error. I traced it
>> down to the max_qp_sz field -- the mlx5 driver limits this to 16384,
>> but the request wants 32768. I spent some time trying to figure out how
>> this limit is set, but it isn't quite obvious. Is there a driver
>> parameter I can set, or a hard-coded limit somewhere?
>>
>> I'm using Ubuntu 14.04 and targetcli on the target side, Windows 2008r2
>> and WinOFED on the client side.
>
> Hi Mark,
>
> I think the issue here is that the SRP target asks for srp_sq_size
> (default 4096) to allocate room for send WRs, but it also asks for
> SRPT_DEF_SG_PER_WQE (16) to allocate room for max_send_sge, which
> generally makes the work queue entries bigger as they come inline. That
> probably exceeds the maximum send queue size mlx5 supports...
>
> It is strange that the mlx5 driver is not able to fit the same send
> queue lengths as mlx4... Probably its work queue entries are slightly
> bigger (I can check that - and I will). I just wonder how the ULP
> requesting the space reservations in the send queue can know that, and
> whether it should know that at all...
>
> From what I see srp_sq_size is controlled via configfs. Can you set it
> to 2048 just for the sake of confirmation this is indeed the issue?
>
> Thanks,
> Sagi.
mlx5 + SRP: max_qp_sz mismatch
I have a client machine that is trying to establish an SRP connection, and it is failing due to an ENOMEM memory allocation error. I traced it down to the max_qp_sz field -- the mlx5 driver limits this to 16384, but the request wants 32768. I spent some time trying to figure out how this limit is set, but it isn't quite obvious. Is there a driver parameter I can set, or a hard-coded limit somewhere?

I'm using Ubuntu 14.04 and targetcli on the target side, Windows 2008r2 and WinOFED on the client side.

Thanks,
Mark
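For reference, the limit in question is advertised through the device attributes. A minimal sketch against the kernel verbs API of that era (ib_query_device() was the query interface at the time; the function name here is made up for illustration) might look like this:

#include <linux/kernel.h>
#include <rdma/ib_verbs.h>

/*
 * Print the advertised per-QP work request limit. On mlx5 this value is
 * derived from the device's max_qp_sz capability, and per the rest of
 * the thread it is a not-to-exceed value rather than a guarantee.
 */
static void print_qp_limit(struct ib_device *dev)
{
        struct ib_device_attr attr;

        if (ib_query_device(dev, &attr) == 0)
                pr_info("%s: max_qp_wr = %d\n", dev->name, attr.max_qp_wr);
}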
Re: Need help with strange mlx4_core error
On Feb 28, 2014, Vasiliy Tolstov wrote:
> I updated the firmware to 2.7.200 (that is the latest from the Supermicro
> FTP site). Now I'm trying to test it. Which type of HCA do you have?

The firmware I downloaded from ftp.supermicro.com was 2.9.1000, and once I upgraded to this version from 2.7.200 I was able to use mlx4_en.

lspci shows my card as:
Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]

Mark
nfs-rdma performance
Awesome work on nfs-rdma in the later kernels! I had been having panic problems for a while, and now things appear to be quite reliable.

Now that things are more reliable, I would like to help work on speed issues. On this same hardware, with SMB Direct and the standard StorageReview 8k 70/30 test, I get combined read/write performance of around 2.5 GB/sec. With nfs-rdma it is pushing about 850 MB/sec. This is simply an unacceptable difference.

I'm using the standard settings -- connected mode, a 65520-byte MTU, async on the NFS server side, lots of nfsd's, and nfsvers=3 with large buffers. Does anyone have any tuning suggestions and/or places to start looking for bottlenecks?

Thanks,
Mark
Re: nfs-rdma performance
I am using ConnectX-3 HCAs and Dell R720 servers.

On Thu, Jun 12, 2014 at 2:00 PM, Steve Wise sw...@opengridcomputing.com wrote:
> On 6/12/2014 2:54 PM, Mark Lehrer wrote:
>> Awesome work on nfs-rdma in the later kernels! I had been having panic
>> problems for a while, and now things appear to be quite reliable.
>>
>> Now that things are more reliable, I would like to help work on speed
>> issues. On this same hardware, with SMB Direct and the standard
>> StorageReview 8k 70/30 test, I get combined read/write performance of
>> around 2.5 GB/sec. With nfs-rdma it is pushing about 850 MB/sec. This is
>> simply an unacceptable difference.
>>
>> I'm using the standard settings -- connected mode, a 65520-byte MTU,
>> async on the NFS server side, lots of nfsd's, and nfsvers=3 with large
>> buffers. Does anyone have any tuning suggestions and/or places to start
>> looking for bottlenecks?
>
> What RDMA device?
>
> Steve.