RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic

2014-05-19 Thread Devesh Sharma
While testing with the ocrdma driver, I am seeing what looks like a
server-side SQ full condition. The log follows; I have yet to identify why it
is happening. Once this is reported, the client side crashes for some reason.
My kdump is not working properly, so I am not able to analyze the situation
properly.

May 19 23:47:02 neo01-el64 kernel: svcrdma: RDMA_WRITE rmr=8008b12, 
to=45a2d790c, xdr_off=0, write_len=68, vec-sge=88086cb4a0c8, vec-count=2
May 19 23:47:02 neo01-el64 kernel: svcrdma: send_reply returns 0
May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000 waiting for 
data (to = 360)
May 19 23:47:02 neo01-el64 kernel: svc: transport 88087dfa2400 served by 
daemon 88086409a000
May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000, pool 0, 
transport 88087dfa2400, inuse=18
May 19 23:47:02 neo01-el64 kernel: svcrdma: rqstp=88086409a000
May 19 23:47:02 neo01-el64 kernel: svcrdma: processing ctxt=880866754540 on 
xprt=88087dfa2400, rqstp=88086409a000, status=0
May 19 23:47:02 neo01-el64 kernel: svcrdma: failed to post SQ WR rc=-22, 
sc_sq_count=0, sc_sq_depth=128
May 19 23:47:02 neo01-el64 kernel: svcrdma: Error -22 posting RDMA_READ
May 19 23:47:02 neo01-el64 kernel: svc: got len=0
May 19 23:47:02 neo01-el64 kernel: svc: transport 88087dfa2400 served by 
daemon 88086e782000
May 19 23:47:02 neo01-el64 kernel: svc: transport 88087dfa2400 busy, not 
enqueued
May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000 waiting for 
data (to = 360)
May 19 23:47:02 neo01-el64 kernel: svc_recv: found XPT_CLOSE
May 19 23:47:02 neo01-el64 kernel: svc: svc_delete_xprt(88087dfa2400)
May 19 23:47:02 neo01-el64 kernel: svc: svc_rdma_detach(88087dfa2400)

 -----Original Message-----
 From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
 ow...@vger.kernel.org] On Behalf Of Steve Wise
 Sent: Wednesday, May 07, 2014 2:39 AM
 To: J. Bruce Fields
 Cc: linux-...@vger.kernel.org; linux-rdma@vger.kernel.org;
 t...@opengridcomputing.com
 Subject: Re: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
 
 On 5/6/2014 2:27 PM, J. Bruce Fields wrote:
  On Tue, May 06, 2014 at 12:46:21PM -0500, Steve Wise wrote:
  This patch series refactors the NFSRDMA server marshalling logic to
  remove the intermediary map structures.  It also fixes an existing
  bug where the NFSRDMA server was not minding the device fast register
  page list length limitations.
 
  I've also made a git repo available with these patches on top of 3.15-rc4:
 
  git://git.openfabrics.org/~swise/linux svcrdma-refactor
 
  Changes since V1:
 
  - fixed regression for devices that don't support FRMRs (see
 rdma_read_chunk_lcl())
 
  - split patch up for closer review.  However I request it be squashed
 before merging as they are not bisectable, and I think these changes
 should all be a single commit anyway.
  If it's not split up in a way that's bisectable, then yes, just don't
  bother.
 
 I didn't see a good way to split it up, have it bisectable, and not have all
 the big stuff in one patch.  I think it's a little more reviewable in these 3
 patches, but when I post V3, I'll put it back as an uber patch.
 Hopefully folks can have a look at these 3 patches ignoring the bisect
 issue.  Having said that, the rdma read logic really is better
 reviewed by looking at the code after applying the patches.  That's why I
 published a git branch.
 
 Thanks!
 
 Steve.
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-rdma in the
 body of a message to majord...@vger.kernel.org More majordomo info at
 http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic

2014-05-19 Thread Steve Wise


 -----Original Message-----
 From: linux-nfs-ow...@vger.kernel.org 
 [mailto:linux-nfs-ow...@vger.kernel.org] On Behalf
 Of Devesh Sharma
 Sent: Monday, May 19, 2014 2:07 PM
 To: Steve Wise; J. Bruce Fields
 Cc: linux-...@vger.kernel.org; linux-rdma@vger.kernel.org; 
 t...@opengridcomputing.com
 Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
 
 While testing with the ocrdma driver, I am seeing what looks like a
 server-side SQ full condition. The log follows; I have yet to identify why it
 is happening. Once this is reported, the client side crashes for some reason.
 My kdump is not working properly, so I am not able to analyze the situation
 properly.
 
 May 19 23:47:02 neo01-el64 kernel: svcrdma: RDMA_WRITE rmr=8008b12,
 to=45a2d790c, xdr_off=0, write_len=68, vec-sge=88086cb4a0c8, vec-count=2
 May 19 23:47:02 neo01-el64 kernel: svcrdma: send_reply returns 0
 May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000 waiting for
 data (to = 360)
 May 19 23:47:02 neo01-el64 kernel: svc: transport 88087dfa2400 served by
 daemon 88086409a000
 May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000, pool 0,
 transport 88087dfa2400, inuse=18
 May 19 23:47:02 neo01-el64 kernel: svcrdma: rqstp=88086409a000
 May 19 23:47:02 neo01-el64 kernel: svcrdma: processing ctxt=880866754540 on
 xprt=88087dfa2400, rqstp=88086409a000, status=0
 May 19 23:47:02 neo01-el64 kernel: svcrdma: failed to post SQ WR rc=-22,
 sc_sq_count=0, sc_sq_depth=128
 May 19 23:47:02 neo01-el64 kernel: svcrdma: Error -22 posting RDMA_READ

Hey Devesh,

Looking at ocrdma_post_send(), -22 (-EINVAL) is returned when the QP is not
in RTS.  If the SQ is full, -ENOMEM is returned.  So I think the send error is
a downstream error because the connection got knocked down.  You should try
to figure out what kicked the QP out of RTS.
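
As a rough illustration of the distinction drawn above, the following C sketch
maps the return code of a failed send-queue post to a likely cause. It is not
taken from svcrdma or ocrdma: classify_post_rc() and sq_looks_full() are
hypothetical helpers, the errno mapping is only the reading of
ocrdma_post_send() given in this thread, and treating sc_sq_count as the
number of outstanding send WRs is an assumption based on the log field names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical verdicts; none of these names exist in the kernel. */
enum sq_post_verdict {
	SQ_POST_OK,          /* rc == 0 */
	SQ_POST_QP_NOT_RTS,  /* -EINVAL: QP not in RTS (per this thread) */
	SQ_POST_SQ_FULL,     /* -ENOMEM: send queue full (per this thread) */
	SQ_POST_OTHER        /* any other error code */
};

/* Map a post-send return code to the cause described in the thread. */
static enum sq_post_verdict classify_post_rc(int rc)
{
	switch (rc) {
	case 0:
		return SQ_POST_OK;
	case -EINVAL:
		return SQ_POST_QP_NOT_RTS;
	case -ENOMEM:
		return SQ_POST_SQ_FULL;
	default:
		return SQ_POST_OTHER;
	}
}

/*
 * Cross-check against the counters printed in the log line.
 * Assumption: sq_count is the number of send WRs currently outstanding.
 * Returns 1 if the queue looks full, 0 otherwise.
 */
static int sq_looks_full(int sq_count, int sq_depth)
{
	assert(sq_depth > 0);
	return sq_count >= sq_depth;
}
```

With the values from the log (rc=-22, sc_sq_count=0, sc_sq_depth=128),
classify_post_rc() yields SQ_POST_QP_NOT_RTS and sq_looks_full() returns 0,
which fits the explanation that the QP left RTS rather than the SQ filling up.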


Steve.



RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic

2014-05-19 Thread Devesh Sharma
Hi Steve,

 -----Original Message-----
 From: Steve Wise [mailto:sw...@opengridcomputing.com]
 Sent: Tuesday, May 20, 2014 12:44 AM
 To: Devesh Sharma; 'J. Bruce Fields'
 Cc: linux-...@vger.kernel.org; linux-rdma@vger.kernel.org;
 t...@opengridcomputing.com
 Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
 
 
 
  -----Original Message-----
  From: linux-nfs-ow...@vger.kernel.org
  [mailto:linux-nfs-ow...@vger.kernel.org] On Behalf Of Devesh Sharma
  Sent: Monday, May 19, 2014 2:07 PM
  To: Steve Wise; J. Bruce Fields
  Cc: linux-...@vger.kernel.org; linux-rdma@vger.kernel.org;
  t...@opengridcomputing.com
  Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
 
  While testing with the ocrdma driver, I am seeing what looks like a
  server-side SQ full condition. The log follows; I have yet to identify why
  it is happening. Once this is reported, the client side crashes for some
  reason. My kdump is not working properly, so I am not able to analyze the
  situation properly.
 
  May 19 23:47:02 neo01-el64 kernel: svcrdma: RDMA_WRITE rmr=8008b12,
  to=45a2d790c, xdr_off=0, write_len=68, vec-sge=88086cb4a0c8, vec-count=2
  May 19 23:47:02 neo01-el64 kernel: svcrdma: send_reply returns 0
  May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000 waiting for
  data (to = 360)
  May 19 23:47:02 neo01-el64 kernel: svc: transport 88087dfa2400 served by
  daemon 88086409a000
  May 19 23:47:02 neo01-el64 kernel: svc: server 88086409a000, pool 0,
  transport 88087dfa2400, inuse=18
  May 19 23:47:02 neo01-el64 kernel: svcrdma: rqstp=88086409a000
  May 19 23:47:02 neo01-el64 kernel: svcrdma: processing ctxt=880866754540 on
  xprt=88087dfa2400, rqstp=88086409a000, status=0
  May 19 23:47:02 neo01-el64 kernel: svcrdma: failed to post SQ WR rc=-22,
  sc_sq_count=0, sc_sq_depth=128
  May 19 23:47:02 neo01-el64 kernel: svcrdma: Error -22 posting RDMA_READ
 
 Hey Devesh,
 
 Looking at ocrdma_post_send(), -22 (-EINVAL) is returned when the QP is not
 in RTS.  If the SQ is full, -ENOMEM is returned.  So I think the send error is
 a downstream error because the connection got knocked down.  You should
 try to figure out what kicked the QP out of RTS.

Oh wow! I completely missed that. Let me go through the logs once again and
update you.

 
 
 Steve.



Re: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic

2014-05-06 Thread J. Bruce Fields
On Tue, May 06, 2014 at 12:46:21PM -0500, Steve Wise wrote:
 This patch series refactors the NFSRDMA server marshalling logic to
 remove the intermediary map structures.  It also fixes an existing bug
 where the NFSRDMA server was not minding the device fast register page
 list length limitations.
 
 I've also made a git repo available with these patches on top of 3.15-rc4:
 
 git://git.openfabrics.org/~swise/linux svcrdma-refactor
 
 Changes since V1:
 
 - fixed regression for devices that don't support FRMRs (see
   rdma_read_chunk_lcl())
 
 - split patch up for closer review.  However I request it be squashed
   before merging as they are not bisectable, and I think these changes
   should all be a single commit anyway.

If it's not split up in a way that's bisectable, then yes, just don't
bother.

--b.

 
 Please review, and test if you can.
 
 Signed-off-by: Tom Tucker t...@opengridcomputing.com
 Signed-off-by: Steve Wise sw...@opengridcomputing.com
 
 ---
 
 Tom Tucker (3):
   svcrdma: Sendto changes
   svcrdma: Recvfrom changes
   svcrdma: Transport and header file changes
 
 
  include/linux/sunrpc/svc_rdma.h          |    3
  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c  |  633
  net/sunrpc/xprtrdma/svc_rdma_sendto.c    |  230
  net/sunrpc/xprtrdma/svc_rdma_transport.c |   62
  4 files changed, 318 insertions(+), 610 deletions(-)
 
 -- 
 
 Steve / Tom


Re: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic

2014-05-06 Thread Steve Wise

On 5/6/2014 2:27 PM, J. Bruce Fields wrote:

On Tue, May 06, 2014 at 12:46:21PM -0500, Steve Wise wrote:

This patch series refactors the NFSRDMA server marshalling logic to
remove the intermediary map structures.  It also fixes an existing bug
where the NFSRDMA server was not minding the device fast register page
list length limitations.
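
To make the page-list limitation concrete, here is a small C sketch of the
arithmetic involved: if a device accepts at most max_pages entries in one
fast-register page list, a larger mapping has to be split across several
registrations. The helper names and parameters below are hypothetical and are
not taken from the patches, whose code is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of fast-register operations needed to cover
 * npages pages when one page list may hold at most max_pages entries.
 */
static size_t frmr_chunks_needed(size_t npages, size_t max_pages)
{
	assert(max_pages > 0);
	return (npages + max_pages - 1) / max_pages;	/* round up */
}

/*
 * Hypothetical helper: number of pages placed in chunk number idx
 * (0-based). Returns 0 when idx is past the last chunk.
 */
static size_t frmr_chunk_len(size_t npages, size_t max_pages, size_t idx)
{
	size_t start;
	size_t left;

	assert(max_pages > 0);
	start = idx * max_pages;
	if (start >= npages)
		return 0;
	left = npages - start;
	return left < max_pages ? left : max_pages;
}
```

For instance, with a limit of 64 pages per list, a 257-page mapping needs five
registrations under this scheme: four of 64 pages and a final one of 1 page.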

I've also made a git repo available with these patches on top of 3.15-rc4:

git://git.openfabrics.org/~swise/linux svcrdma-refactor

Changes since V1:

- fixed regression for devices that don't support FRMRs (see
   rdma_read_chunk_lcl())

- split patch up for closer review.  However I request it be squashed
   before merging as they are not bisectable, and I think these changes
   should all be a single commit anyway.

If it's not split up in a way that's bisectable, then yes, just don't
bother.


I didn't see a good way to split it up, have it bisectable, and not have
all the big stuff in one patch.  I think it's a little more reviewable in
these 3 patches, but when I post V3, I'll put it back as an uber patch.
Hopefully folks can have a look at these 3 patches ignoring the bisect
issue.  Having said that, the rdma read logic really is better
reviewed by looking at the code after applying the patches.  That's why I
published a git branch.


Thanks!

Steve.
