Re: mlx5 + SRP: max_qp_sz mismatch

2014-09-02 Thread Mark Lehrer
Just got back from a few days in Denver, I'll give it a try ASAP.

We also have a ton of ConnectX-3's and a few ConnectX-2's.  I'll give
it a quick try on those too just for fun.  And if anyone ever needs to
test something against one of these (and the test isn't prohibitively
difficult to set up), I would be happy to give it a try.

Thanks,
Mark


On Thu, Aug 28, 2014 at 9:58 AM, Bart Van Assche <bvanass...@acm.org> wrote:
 On 08/27/14 13:28, Eli Cohen wrote:
 On 08/26/14 18:10, Sagi Grimberg wrote:

 Since I don't know how the true send queue size can be computed from the
 device capabilities at the moment, I can suggest a fix to srpt to
 retry with srp_sq_size/2 (and so on until it succeeds...)

 The device capabilities provide the maximum number of send work
 requests that the device supports, but the actual number of work
 requests that can be supported in a specific case depends on other
 characteristics of the work requests. For example, in the case of
 Connect-IB, the actual number depends on the number of s/g entries,
 the transport type, etc. This is in compliance with the IB spec:

 11.2.1.2 QUERY HCA
 Description:
 Returns the attributes for the specified HCA.
 The maximum values defined in this section are guaranteed
 not-to-exceed values. It is possible for an implementation to allocate
 some HCA resources from the same space. In that case, the maximum
 values returned are not guaranteed for all of those resources
 simultaneously.

 So, a well-written application should try smaller values if it fails
 with ENOMEM.
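
As a rough illustration of the shrink-and-retry approach described above, a QP-creation helper along these lines would halve max_send_wr on -ENOMEM until it drops below a caller-supplied floor. The helper name and the floor parameter are assumptions made for this sketch, not code from the thread or from ib_srpt:

#include <linux/err.h>
#include <rdma/ib_verbs.h>

/*
 * Sketch only: shrink the send queue and retry when the device returns
 * -ENOMEM.  The helper name and the min_send_wr floor are illustrative.
 */
static struct ib_qp *create_qp_shrink_on_enomem(struct ib_pd *pd,
						struct ib_qp_init_attr *attr,
						u32 min_send_wr)
{
	struct ib_qp *qp;

	for (;;) {
		qp = ib_create_qp(pd, attr);
		if (!IS_ERR(qp) || PTR_ERR(qp) != -ENOMEM)
			return qp;	/* success, or an error other than -ENOMEM */

		/* The requested depth was too big for this device; halve it. */
		if (attr->cap.max_send_wr / 2 < min_send_wr)
			return qp;	/* refuse to go below the caller's floor */
		attr->cap.max_send_wr /= 2;
	}
}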

 Hello Mark,

 It would help if you could test the patch below. Sorry, but I don't
 have access to a Connect-IB setup myself.

 Thanks,

 Bart.

 Reported-by: Mark Lehrer <leh...@gmail.com>
 Signed-off-by: Bart Van Assche <bvanass...@acm.org>
 ---
  drivers/infiniband/ulp/srpt/ib_srpt.c | 8 ++++++++
  1 file changed, 8 insertions(+)

 diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
 index fe09f27..3ffaf4e 100644
 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c
 +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
 @@ -2091,6 +2091,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 	if (!qp_init)
 		goto out;

 +retry:
 	ch->cq = ib_create_cq(sdev->device, srpt_completion, NULL, ch,
 			      ch->rq_size + srp_sq_size, 0);
 	if (IS_ERR(ch->cq)) {
 @@ -2114,6 +2115,13 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch)
 	ch->qp = ib_create_qp(sdev->pd, qp_init);
 	if (IS_ERR(ch->qp)) {
 		ret = PTR_ERR(ch->qp);
 +		if (ret == -ENOMEM) {
 +			srp_sq_size /= 2;
 +			if (srp_sq_size >= MIN_SRPT_SQ_SIZE) {
 +				ib_destroy_cq(ch->cq);
 +				goto retry;
 +			}
 +		}
 		printk(KERN_ERR "failed to create_qp ret= %d\n", ret);
 		goto err_destroy_cq;
 	}
 --
 1.8.4.5




Re: mlx5 + SRP: max_qp_sz mismatch

2014-08-20 Thread Mark Lehrer
 From what I see srp_sq_size is controlled via
 configfs. Can you set it to 2048 just for the sake of
 confirming that this is indeed the issue?

Now that we have confirmed that this works, what is the proper way to
fix the problem?  Is the core issue with mlx5 or srp?

Thanks,
Mark


On Tue, Aug 19, 2014 at 11:01 AM, Mark Lehrer <leh...@gmail.com> wrote:
 From what I see srp_sq_size is controlled via
 configfs. Can you set it to 2048 just for the sake of
 confirming that this is indeed the issue?

 Yes!  This setting allowed the two machines to establish an SRP session.

 I'll try some I/O tests to see how well it works.

 Thanks,
 Mark


 On Tue, Aug 19, 2014 at 8:50 AM, Sagi Grimberg <sa...@dev.mellanox.co.il> wrote:
 On 8/19/2014 2:20 AM, Mark Lehrer wrote:

 I have a client machine that is trying to establish an SRP connection,
 and it is failing due to an ENOMEM memory allocation error.  I traced
 it down to the max_qp_sz field -- the mlx5 driver limits this to 16384
 but the request wants 32768.

 I spent some time trying to figure out how this limit is set, but it
 isn't quite obvious.  Is there a driver parameter I can set, or a
 hard-coded limit somewhere?

 I'm using Ubuntu 14.04 and targetcli on the target side, Windows
 2008 R2 and WinOFED on the client side.


 Hi Mark,

 I think the issue here is that the SRP target asks for srp_sq_size (default
 4096) to allocate room for send WRs, but it also asks for
 SRPT_DEF_SG_PER_WQE (16) to allocate room for max_send_sge, which
 generally makes the work queue entries bigger, as the s/g entries are
 carried inline in the WQE. That probably exceeds the maximum send queue
 size mlx5 supports...

 It is strange that the mlx5 driver is not able to fit the same send queue
 lengths as mlx4... Probably its work queue entries are slightly bigger (I
 can check that - and I will).

 I just wonder how the ULP requesting the space reservations in the
 send queue can know that, and whether it should know that at all...

 From what I see srp_sq_size is controlled via configfs. Can you set
 it to 2048 just for the sake of confirming that this is indeed the issue?

 Thanks,
 Sagi.
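
For reference, the kind of capability check being discussed could look roughly like the sketch below: clamp the requested send queue depth to the device's advertised max_qp_wr before building the QP attributes. The function name is hypothetical, and, as Eli points out elsewhere in the thread, max_qp_wr is only a not-to-exceed value, so ib_create_qp() can still fail with -ENOMEM once the 16 inline s/g entries per WQE (SRPT_DEF_SG_PER_WQE) are factored in:

#include <linux/kernel.h>
#include <rdma/ib_verbs.h>

/*
 * Sketch only: clamp the requested send queue depth to what the device
 * advertises.  max_qp_wr is a not-to-exceed value, so QP creation may
 * still fail with -ENOMEM for large max_send_sge values.
 */
static int srpt_sketch_sq_size(struct ib_device *dev, int requested_sq_size)
{
	struct ib_device_attr attr;

	if (ib_query_device(dev, &attr))
		return requested_sq_size;	/* query failed; keep the request */

	return min(requested_sq_size, attr.max_qp_wr);
}

The resulting value would feed ib_qp_init_attr.cap.max_send_wr, alongside max_send_sge = SRPT_DEF_SG_PER_WQE (16), which is exactly the combination Sagi suspects is blowing past the mlx5 limit.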


Re: mlx5 + SRP: max_qp_sz mismatch

2014-08-19 Thread Mark Lehrer
 From what I see srp_sq_size is controlled via
 configfs. Can you set it to 2048 just for the sake of
 confirming that this is indeed the issue?

Yes!  This setting allowed the two machines to establish an SRP session.

I'll try some I/O tests to see how well it works.

Thanks,
Mark


On Tue, Aug 19, 2014 at 8:50 AM, Sagi Grimberg <sa...@dev.mellanox.co.il> wrote:
 On 8/19/2014 2:20 AM, Mark Lehrer wrote:

 I have a client machine that is trying to establish an SRP connection,
 and it is failing due to an ENOMEM memory allocation error.  I traced
 it down to the max_qp_sz field -- the mlx5 driver limits this to 16384
 but the request wants 32768.

 I spent some time trying to figure out how this limit is set, but it
 isn't quite obvious.  Is there a driver parameter I can set, or a
 hard-coded limit somewhere?

 I'm using Ubuntu 14.04 and targetcli on the target side, Windows
 2008 R2 and WinOFED on the client side.


 Hi Mark,

 I think the issue here is that the SRP target asks for srp_sq_size (default
 4096) to allocate room for send WRs, but it also asks for
 SRPT_DEF_SG_PER_WQE (16) to allocate room for max_send_sge, which
 generally makes the work queue entries bigger, as the s/g entries are
 carried inline in the WQE. That probably exceeds the maximum send queue
 size mlx5 supports...

 It is strange that the mlx5 driver is not able to fit the same send queue
 lengths as mlx4... Probably its work queue entries are slightly bigger (I
 can check that - and I will).

 I just wonder how the ULP requesting the space reservations in the
 send queue can know that, and whether it should know that at all...

 From what I see srp_sq_size is controlled via configfs. Can you set
 it to 2048 just for the sake of confirming that this is indeed the issue?

 Thanks,
 Sagi.


mlx5 + SRP: max_qp_sz mismatch

2014-08-18 Thread Mark Lehrer
I have a client machine that is trying to establish an SRP connection,
and it is failing due to an ENOMEM memory allocation error.  I traced
it down to the max_qp_sz field -- the mlx5 driver limits this to 16384
but the request wants 32768.

I spent some time trying to figure out how this limit is set, but it
isn't quite obvious.  Is there a driver parameter I can set, or a
hard-coded limit somewhere?

I'm using Ubuntu 14.04 and targetcli on the target side, Windows
2008 R2 and WinOFED on the client side.

Thanks,
Mark
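
To see what a given HCA actually advertises, a small userspace program using libibverbs (assuming the library and its headers are installed; link with -libverbs) can print max_qp_wr and max_sge for the first device found, for example:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Print the advertised QP limits of the first RDMA device found. */
int main(void)
{
	struct ibv_device **list = ibv_get_device_list(NULL);
	struct ibv_context *ctx;
	struct ibv_device_attr attr;

	if (!list || !list[0])
		return 1;
	ctx = ibv_open_device(list[0]);
	if (!ctx || ibv_query_device(ctx, &attr))
		return 1;

	printf("%s: max_qp_wr=%d max_sge=%d\n",
	       ibv_get_device_name(list[0]), attr.max_qp_wr, attr.max_sge);

	ibv_close_device(ctx);
	ibv_free_device_list(list);
	return 0;
}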


Re: Need help with strange mlx4_core error

2014-06-26 Thread Mark Lehrer
On Feb 28, 2014, Vasiliy Tolstov wrote:

 I updated the firmware to 2.7.200 (that is the latest from the
 Supermicro FTP site). Now I'm trying to test it.

Which type of HCA do you have?  The firmware I downloaded from
ftp.supermicro.com was 2.9.1000, and once I upgraded to this version
from 2.7.200 I was able to use mlx4_en.

lspci shows my card as: Mellanox Technologies MT26428 [ConnectX VPI
PCIe 2.0 5GT/s - IB QDR / 10GigE]


Mark


nfs-rdma performance

2014-06-12 Thread Mark Lehrer
Awesome work on nfs-rdma in the later kernels!  I had been having
panic problems for a while, and now things appear to be quite reliable.

Now that things are more reliable, I would like to help work on speed
issues.  On this same hardware with SMB Direct and the standard
storage review 8k 70/30 test, I get combined read & write performance
of around 2.5GB/sec.  With nfs-rdma it is pushing about 850MB/sec.
This is simply an unacceptable difference.

I'm using the standard settings -- connected mode, 65520 byte MTU,
nfs-server-side async, lots of nfsd's, and nfsver=3 with large
buffers.  Does anyone have any tuning suggestions and/or places to
start looking for bottlenecks?

Thanks,
Mark


Re: nfs-rdma performance

2014-06-12 Thread Mark Lehrer
I am using ConnectX-3 HCA's and Dell R720 servers.

On Thu, Jun 12, 2014 at 2:00 PM, Steve Wise <sw...@opengridcomputing.com> wrote:
 On 6/12/2014 2:54 PM, Mark Lehrer wrote:

 Awesome work on nfs-rdma in the later kernels!  I had been having
 panic problems for a while, and now things appear to be quite reliable.

 Now that things are more reliable, I would like to help work on speed
 issues.  On this same hardware with SMB Direct and the standard
 storage review 8k 70/30 test, I get combined read & write performance
 of around 2.5GB/sec.  With nfs-rdma it is pushing about 850MB/sec.
 This is simply an unacceptable difference.

 I'm using the standard settings -- connected mode, 65520 byte MTU,
 nfs-server-side async, lots of nfsd's, and nfsver=3 with large
 buffers.  Does anyone have any tuning suggestions and/or places to
 start looking for bottlenecks?


 What RDMA device?

 Steve.