Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Bart Van Assche
On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
 I'd actually think that the decline caused by exhausting credit on
 the initiator side is far worse than ib_post_recv.. If ib_post_recv
 really is noticable then maybe just force loading the RQ after XX
 unposted are outstanding.

The performance penalty of exhausting credit on the initiator side is
indeed far worse than that of a single ib_post_recv() call at the
target side. But in a realistic setup such credit exhaustion happens
at a rate of about once an hour, while ib_post_recv() is called once
for every request. It is easy to set up an SRP I/O test where
ib_post_recv() is called more than 250.000 times per second.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Roland Dreier

  The impact of the ib_post_recv() call is small but measurable. I'll
  publish measurement results soon.

Interesting.  If there is overhead you could amortize it by keeping some
number of spare buffers posting, and batching your calls to
ib_post_recv() (eg post a list of 8 buffers every 8 commands, and keep 7
spare buffers posted).

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Jason Gunthorpe
On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote:
 On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe
 jguntho...@obsidianresearch.com wrote:
  [ ... ]
 
  Also, I couldn't tell for sure from a cursory examination of the
  patch, but the initiator must be designed to not stall processing the
  RQ dependent on the SQ or the credit level, when using a credit scheme
  like this. Or you will deadlock.
 
  For instance it isn't clear to me how the control flow around
  srp_process_cred_req is ment to work - it tries to send a reply, but
  if it can't due to srp_get_txp_iu failing it just gives up forever?
 
 For a standards-conforming SRP target, srp_get_txp_iu() will never
 fail. A quote from section 5.5.2 of the SRP r16a document:

What guarantees that that the send completion for the reply is
processed before a receive completion for the next request?

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Bart Van Assche
On Wed, Jan 13, 2010 at 7:16 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:

 On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote:
  On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe
  jguntho...@obsidianresearch.com wrote:
   [ ... ]
  
   Also, I couldn't tell for sure from a cursory examination of the
   patch, but the initiator must be designed to not stall processing the
   RQ dependent on the SQ or the credit level, when using a credit scheme
   like this. Or you will deadlock.
  
   For instance it isn't clear to me how the control flow around
   srp_process_cred_req is ment to work - it tries to send a reply, but
   if it can't due to srp_get_txp_iu failing it just gives up forever?
 
  For a standards-conforming SRP target, srp_get_txp_iu() will never
  fail. A quote from section 5.5.2 of the SRP r16a document:

 What guarantees that that the send completion for the reply is
 processed before a receive completion for the next request?

Did you notice that I increased the number of receive slots reserved
for target-initiated requests from one to two ?

+       /* Number of receive slots reserved for receiving requests. */
+       SRP_RXR_SIZE            = 2,

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-13 Thread Jason Gunthorpe
On Wed, Jan 13, 2010 at 07:57:26PM +0100, Bart Van Assche wrote:
 On Wed, Jan 13, 2010 at 7:16 PM, Jason Gunthorpe
 jguntho...@obsidianresearch.com wrote:
 
  On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote:
   On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe
   jguntho...@obsidianresearch.com wrote:
[ ... ]
   
Also, I couldn't tell for sure from a cursory examination of the
patch, but the initiator must be designed to not stall processing the
RQ dependent on the SQ or the credit level, when using a credit scheme
like this. Or you will deadlock.
   
For instance it isn't clear to me how the control flow around
srp_process_cred_req is ment to work - it tries to send a reply, but
if it can't due to srp_get_txp_iu failing it just gives up forever?
  
   For a standards-conforming SRP target, srp_get_txp_iu() will never
   fail. A quote from section 5.5.2 of the SRP r16a document:
 
  What guarantees that that the send completion for the reply is
  processed before a receive completion for the next request?
 
 Did you notice that I increased the number of receive slots reserved
 for target-initiated requests from one to two ?

Yes, but I don't see how more receive slots helps exhausting send
slots.

The IBA doesn't guarentee ordering between completions originating
from different work queues. Getting a receive completion does not
guarantee that a related send completion has been received. Even if
the work queues are mapped to the same CQ.

Since srp_get_txp_iu depends on txp_head/tail which is incremented by
send completions, but srp_get_txp_iu is called on the receive
completion path, it is possible to call srp_get_txp_iu multiple times
before the send completions are generated - even though the target is
only generating new request sends in response to reply receives.

The two main causes are ack coalescing algorithms and lost ack
packets.

That said, with SRP_TXP_SIZE set to 2, I think it would be really hard
to break this in real life, with current hardware.

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Roland Dreier

  Note: I do not know of any SRP targets that implement approach (2). As
  far as I know all SRP targets use approach (1).

But no other SRP targets send SRP_CRED_REQ -- and that includes all
native IB storage arrays that I know of.  So there must be some other
way around this (and having separate buffers for command receives and
response sends does seem like a sensible way to design a target anyway).
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Bart Van Assche
On Tue, Jan 12, 2010 at 6:13 PM, Roland Dreier rdre...@cisco.com wrote:

   Note: I do not know of any SRP targets that implement approach (2). As
   far as I know all SRP targets use approach (1).

 But no other SRP targets send SRP_CRED_REQ -- and that includes all
 native IB storage arrays that I know of.  So there must be some other
 way around this (and having separate buffers for command receives and
 response sends does seem like a sensible way to design a target anyway).

Having separate buffer sets for command receives and response sends
might seem a sensible way to design a target. But calling
ib_post_recv() before sending a response back to the initiator is not
acceptable because this increases communication latency. When a target
uses separate buffer sets, sends back SRP_RSP before responses before
ib_post_recv() is called, and when it processes SRP_CMD requests in
parallel, it still can happen that a sequence of RL - 2 SRP_RSP
responses is sent back to the initiator all with the REQUEST LIMIT
DELTA field set to zero. So even with separate buffer sets there is a
need for SRP_CRED_REQ support in the initiator.

I'm surprised by all this opposition against adding SRP_CRED_REQ
support in the initiator, a feature defined in the SRP standard ?

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Roland Dreier

  Having separate buffer sets for command receives and response sends
  might seem a sensible way to design a target. But calling
  ib_post_recv() before sending a response back to the initiator is not
  acceptable because this increases communication latency. When a target
  uses separate buffer sets, sends back SRP_RSP before responses before
  ib_post_recv() is called, and when it processes SRP_CMD requests in
  parallel, it still can happen that a sequence of RL - 2 SRP_RSP
  responses is sent back to the initiator all with the REQUEST LIMIT
  DELTA field set to zero. So even with separate buffer sets there is a
  need for SRP_CRED_REQ support in the initiator.

I doubt you could benchmark the overhead of calling ib_post_recv() in
the full SRP protocol.  Really, I bet it's less than 100 nanoseconds to
form the work request and call ib_post_recv().  Maybe I'm wrong but I
really expect the overhead of actually doing IO (even to a ramdisk
buffer or something) is going to be much higher than posting a receive.

Or maybe you did benchmark it and I'm wrong?

  I'm surprised by all this opposition against adding SRP_CRED_REQ
  support in the initiator, a feature defined in the SRP standard ?

As I said I agree we should implement support for SRP_CRED_REQ.  However
I think it would be better if the SCST target could be made to work
reliably with pre-2.6.34 initiators as well.

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Jason Gunthorpe
On Tue, Jan 12, 2010 at 02:57:35PM -0800, Roland Dreier wrote:

 I doubt you could benchmark the overhead of calling ib_post_recv() in
 the full SRP protocol.  Really, I bet it's less than 100 nanoseconds to
 form the work request and call ib_post_recv().  Maybe I'm wrong but I
 really expect the overhead of actually doing IO (even to a ramdisk
 buffer or something) is going to be much higher than posting a receive.
 
 Or maybe you did benchmark it and I'm wrong?

I'd actually think that the decline caused by exhausting credit on
the initiator side is far worse than ib_post_recv.. If ib_post_recv
really is noticable then maybe just force loading the RQ after XX
unposted are outstanding.

I also was wondering if the SRP spec had anything to say about this,
creating this kind of RQ-SQ dependency on either side should only be
done if the protocol explicitly allows it for one side only -
otherwise you risk deadlocking when used with an initiator that felt
it too could create that kind of dependency.

Also, I couldn't tell for sure from a cursory examination of the
patch, but the initiator must be designed to not stall processing the
RQ dependent on the SQ or the credit level, when using a credit scheme
like this. Or you will deadlock.

For instance it isn't clear to me how the control flow around
srp_process_cred_req is ment to work - it tries to send a reply, but
if it can't due to srp_get_txp_iu failing it just gives up forever?

Jason
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Bart Van Assche
On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
 [ ... ]

 Also, I couldn't tell for sure from a cursory examination of the
 patch, but the initiator must be designed to not stall processing the
 RQ dependent on the SQ or the credit level, when using a credit scheme
 like this. Or you will deadlock.

 For instance it isn't clear to me how the control flow around
 srp_process_cred_req is ment to work - it tries to send a reply, but
 if it can't due to srp_get_txp_iu failing it just gives up forever?

For a standards-conforming SRP target, srp_get_txp_iu() will never
fail. A quote from section 5.5.2 of the SRP r16a document:

SRP target ports shall limit themselves to one outstanding SRP request
(see table 7) per RDMA channel. Upon
sending an SRP request, an SRP target port shall not send another SRP
request on the same RDMA channel
until after it receives the SRP response (see table 8) for the
previous SRP request.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-12 Thread Bart Van Assche
On Tue, Jan 12, 2010 at 11:57 PM, Roland Dreier rdre...@cisco.com wrote:

   Having separate buffer sets for command receives and response sends
   might seem a sensible way to design a target. But calling
   ib_post_recv() before sending a response back to the initiator is not
   acceptable because this increases communication latency. When a target
   uses separate buffer sets, sends back SRP_RSP before responses before
   ib_post_recv() is called, and when it processes SRP_CMD requests in
   parallel, it still can happen that a sequence of RL - 2 SRP_RSP
   responses is sent back to the initiator all with the REQUEST LIMIT
   DELTA field set to zero. So even with separate buffer sets there is a
   need for SRP_CRED_REQ support in the initiator.

 I doubt you could benchmark the overhead of calling ib_post_recv() in
 the full SRP protocol.  Really, I bet it's less than 100 nanoseconds to
 form the work request and call ib_post_recv().  Maybe I'm wrong but I
 really expect the overhead of actually doing IO (even to a ramdisk
 buffer or something) is going to be much higher than posting a receive.

 Or maybe you did benchmark it and I'm wrong?

The impact of the ib_post_recv() call is small but measurable. I'll
publish measurement results soon.

   I'm surprised by all this opposition against adding SRP_CRED_REQ
   support in the initiator, a feature defined in the SRP standard ?

 As I said I agree we should implement support for SRP_CRED_REQ.  However
 I think it would be better if the SCST target could be made to work
 reliably with pre-2.6.34 initiators as well.

OK, I will first evaluate which alternatives there are for SCST-SRPT
such that it does not rely on initiator-side support for SRP_CRED_REQ.
This will not only help with pre-2.6.34 SRP initiators but also with
other SRP initiators than the one available in the vanilla Linux
kernel (e.g. OFED or WinOF).

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-10 Thread Bart Van Assche
On Sun, Jan 10, 2010 at 7:41 AM, Roland Dreier rdre...@cisco.com wrote:

   The patch I posted is really fixing the original bug. The problem was
   that neither the SRP target nor the SRP initiator had support for
   SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both
   software components in order to fix this bug.

 There's no way for the target to return credits through responses?  I
 agree that we should implement the full SRP spec in the initiator but it
 seems unfortunate to force both an initiator and target upgrade to fix
 what really appears to be a target bug.  This means anyone running a
 pre-2.6.34 kernel won't be able to use the SCST SRP target reliably.

Please let me explain why the SCST SRP target behaves as observed, why
this behavior is not specific to SCST, and which workaround is
available for pre-2.6.34 SRP initiator users.

As known an SRP target passes the so-called req_lim value to the SRP
initiator via the REQUEST LIMIT DELTA field of the SRP_LOGIN_RSP
information unit. Let's call this value RL. As specified in the SRP
r16a document, an initiator may never send more than RL - 2 unanswered
SRP_CMD information units to an SRP target.

When an SRP_CMD request is being processed by an SRP target, the SRP
target can e.g. process this request using one of the following
strategies:
1. Using the buffer in which the SRP_CMD request was received to build
the response. In this case once the response has been built the target
will call ib_post_send() and will wait until the send completion has
been received before it will declare that buffer again available for
receiving by calling ib_post_recv().
2. Using separate sets of buffers for receiving SRP_CMD requests and
sending back SRP_RSP responses. In this case it is possible for the
target to re-enable receiving for the buffer in which the SRP_CMD
request was received before the SRP_RSP response is sent back.

Regarding approach (2): with this approach the value of the REQUEST
LIMIT DELTA field in the SRP_RSP information unit will always equal
one. With this approach it will never be necessary that the targets
sends an SRP_CRED_REQ information unit to the initiator.

Regarding approach (1): since for each SRP_RSP response sent back by
the target ib_post_recv() is called in the target after
ib_post_send(), at least for the first SRP_RSP response the REQUEST
LIMIT DELTA field will be equal to zero. And for a target that is able
to process all received SRP_CMD information units in parallel, it can
happen that the SRP target sends a contiguous series of (RL - 2)
SRP_RSP information units to the initiator with the REQUEST LIMIT
DELTA field equal to zero. As a consequence, the value of the req_lim
variable in the SRP initiator will be equal to 2 and the initiator
won't send any further SRP_CMD requests to the target. A scenario for
how to get the SRP initiator into this state can be found in
http://bugzilla.kernel.org/show_bug.cgi?id=14235.

The only way to get out of this deadlock is that the target send an
SRP_CRED_REQ information unit to the initiator with a non-zero REQUEST
LIMIT DELTA field, and that the SRP initiator processes this
SRP_CRED_REQ information unit.

Because of this possible SRP initiator lockup SCST-SRPT users have
been recommended to disable parallel processing of information units
in this SRP target (by specifying the ib_srpt kernel parameter
thread=1).

My conclusion is that the SRP initiator lockup explained in
http://bugzilla.kernel.org/show_bug.cgi?id=14235 is not specific to
SCST-SRPT but that this lockup can be triggered by any SRP target that
processes SRP_CMD requests in parallel.

So as far as I can see the choices we have are:
* Document that SRP_CRED_REQ support is missing in the Linux SRP
initiator and hence that command processing in SRP targets must be
complicated by making sure that never (RL - 2) contiguous SRP_RSP
information units are sent to the SRP initiator with the REQUEST LIMIT
DELTA field equal to zero.
* Add support for the SRP_CRED_REQ information unit in the Linux SRP initiator.

Note: I do not know of any SRP targets that implement approach (2). As
far as I know all SRP targets use approach (1).

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-09 Thread Roland Dreier

  The patch I posted is really fixing the original bug. The problem was
  that neither the SRP target nor the SRP initiator had support for
  SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both
  software components in order to fix this bug.

There's no way for the target to return credits through responses?  I
agree that we should implement the full SRP spec in the initiator but it
seems unfortunate to force both an initiator and target upgrade to fix
what really appears to be a target bug.  This means anyone running a
pre-2.6.34 kernel won't be able to use the SCST SRP target reliably.

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-07 Thread Bart Van Assche
On Thu, Jan 7, 2010 at 8:57 AM, Roland Dreier rdre...@cisco.com wrote:

   It was only recently that I had the time to analyze this issue in
   depth and that I added SRP_CRED_REQ support in SCST (checked in that
   support about one week ago). Before SRP_CRED_REQ support was added in
   SCST, the SRP initiator could not log the Unhandled SRP opcode
   message because it did not receive any unhandled SRP opcodes.

 So it sounds like this is fixing something that was not the original
 bug?  Because SCST wasn't sending SRP_CRED_REQ when the issue of the
 initiator hanging was originally seen, if I understand correctly.

The patch I posted is really fixing the original bug. The problem was
that neither the SRP target nor the SRP initiator had support for
SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both
software components in order to fix this bug.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-07 Thread Bart Van Assche
On Thu, Jan 7, 2010 at 9:03 AM, Bart Van Assche
bart.vanass...@gmail.com wrote:
 On Thu, Jan 7, 2010 at 8:57 AM, Roland Dreier rdre...@cisco.com wrote:

   It was only recently that I had the time to analyze this issue in
   depth and that I added SRP_CRED_REQ support in SCST (checked in that
   support about one week ago). Before SRP_CRED_REQ support was added in
   SCST, the SRP initiator could not log the Unhandled SRP opcode
   message because it did not receive any unhandled SRP opcodes.

 So it sounds like this is fixing something that was not the original
 bug?  Because SCST wasn't sending SRP_CRED_REQ when the issue of the
 initiator hanging was originally seen, if I understand correctly.

 The patch I posted is really fixing the original bug. The problem was
 that neither the SRP target nor the SRP initiator had support for
 SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both
 software components in order to fix this bug.

However, I have to admit that the patch description should have
described what happened more in detail. I'll address that in the
second version of my patch.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-07 Thread David Dillow
On Thu, 2010-01-07 at 08:59 +0100, Bart Van Assche wrote:
 On Wed, Jan 6, 2010 at 10:40 PM, Roland Dreier rdre...@cisco.com wrote:
 
Is that regular kernel coding practice, to run away with the work
someone else did and to claim authorship ? As far as I know this is
considered as impolite.
 
  I don't see with proper credit for Bart as claiming authorship.  And
  yes, I think proposing a better way of doing things is definitely
  regular kernel coding practice.  It's more than polite -- it's going way
  beyond helpful review comments and actually helping revise the patches.
 
 You are welcome to have a look at the descriptions of the three
 patches posted by David Dillow. I was not credited in any way in these
 descriptions. Not for analyzing the problem, not for the design of a
 solution and not for the work I did in implementing a solution. That
 is why I wrote that David Dillow was claiming authorship of the work I
 did.

And here I thought I covered it by stating in the cover letter that they
were not signed off, needed credit for you, and shouldn't go upstream
until you ACK'd them, presumably with that fixed. I thought people would
put two and two together, but I suppose not. I'll try to be more
explicit next time.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-06 Thread Roland Dreier

  When the SRP initiator is communicating with an SRP target under load it can
  happen that the SRP initiator locks up. The communication pattern that causes
  the lockup is as follows:
  * SRP initiator sends (req_lim - 2) SRP_CMD requests to the target.
  * The REQUEST LIMIT DELTA field of each SRP_RSP response is zero.
  * The target sends an SRP_CRED_REQ information unit with non-zero REQUEST
LIMIT DELTA.
  
  The above communication pattern brings the initiator in the following state:
  * srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY.
  * The per-session variable zero_req_lim keeps increasing.
  The initiator never leaves this state because it ignores SRP_CRED_REQ
  information units.

This is all a bit obfuscated.  The problem is that the initiator runs
out of credits and stops sending commands; because we don't process
SRP_CRED_REQ messages from the target, we never get more credits.

I'm wondering why this took so long to come up?  Does SCST send
SRP_CRED_REQ only under unusual circumstances?  Also I'm wondering why
the Unhandled SRP opcode message didn't show up in the kernel log and
help debug this?

Some specific comments:

  +/* Similar to is_power_of_2(), but can be evaluated at compile time. */
  +#define IS_POWER_OF_2(n) ((n) != 0  (((n)  ((n) - 1)) == 0))

I don't think this level of ugliness is really required -- we can just
document carefully at the definition that we need things to be powers of 2.

  +for (i = 0; i  ARRAY_SIZE(target-txp_ring); ++i)
  +srp_free_iu(target-srp_host, target-txp_ring[i]);

Not sure I understand why we need two TX rings -- why can't we just have
one bigger TX ring that handles both requests and responses?

  + * Obtain an information unit for sending a request to the target.

I think there's a bit of overcommenting in a few places.  Does it really
help anyone to repeat that what get_tx_iu does is get an information
unit for sending?

  -return target-tx_ring[target-tx_head  SRP_SQ_SIZE];
  +return target-tx_ring[target-tx_head
  +(ARRAY_SIZE(target-tx_ring) - 1)];

is this an improvement?

  + * Send a request to the target.
  +static int __srp_post_send_req(struct srp_target_port *target,

same -- does the comment add anything?

  +/* Completion queue. */
  +SRP_CQ_SIZE = SRP_SQ_SIZE + SRP_TXP_SIZE + SRP_RQ_SIZE,

etc... all these comments are mostly taking up vertical space and
visually jarring, without adding much info.

  +/*
  + * SRP_CRED_REQ information unit, as defined in section 6.10 of the
  + * T10 SRP r16a document.
  + */
  +struct srp_cred_req {
  +u8  opcode;
  +u8  sol_not;
  +u8  reserved[2];
  +__be32  req_lim_delta;
  +u64 tag;
  +} __attribute__((packed));

again... does it help anyone to say struct srp_cred_req corresponds to
SRP_CRED_REQ?  scsi/srp.h already refers to the SRP document, and I
would think anyone looking up SRP_CRED_REQ could find it faster by
looking for the string itself rather than by section number.

The existing comments in the file are actually useful -- they explain
why some structures need to be packed, and as far as I can tell neither
srp_cred_req nor srp_cred_rsp need to be packed.

 - R.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-06 Thread Roland Dreier

  I agree that we should add support for SRP_CRED_REQ, but I'm not
  thrilled with the mix of changes in the patch, as well as the general
  aesthetics of the result. How about something like the following series
  -- posted as a follow up to this message -- with proper credit for Bart?
  I'll sign off on them once we're happy with a direction and Bart acks.

Thanks very much for doing this!  I agree that splitting up this way
makes the patches much easier to review, and is the right way to do this.

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-06 Thread Roland Dreier

  It was only recently that I had the time to analyze this issue in
  depth and that I added SRP_CRED_REQ support in SCST (checked in that
  support about one week ago). Before SRP_CRED_REQ support was added in
  SCST, the SRP initiator could not log the Unhandled SRP opcode
  message because it did not receive any unhandled SRP opcodes.

So it sounds like this is fixing something that was not the original
bug?  Because SCST wasn't sending SRP_CRED_REQ when the issue of the
initiator hanging was originally seen, if I understand correctly.

Do you understand the root cause of the original bug?

 - R.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-06 Thread Bart Van Assche
On Wed, Jan 6, 2010 at 10:40 PM, Roland Dreier rdre...@cisco.com wrote:

   Is that regular kernel coding practice, to run away with the work
   someone else did and to claim authorship ? As far as I know this is
   considered as impolite.

 I don't see with proper credit for Bart as claiming authorship.  And
 yes, I think proposing a better way of doing things is definitely
 regular kernel coding practice.  It's more than polite -- it's going way
 beyond helpful review comments and actually helping revise the patches.

You are welcome to have a look at the descriptions of the three
patches posted by David Dillow. I was not credited in any way in these
descriptions. Not for analyzing the problem, not for the design of a
solution and not for the work I did in implementing a solution. That
is why I wrote that David Dillow was claiming authorship of the work I
did.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-04 Thread David Dillow
On Mon, 2010-01-04 at 08:13 +0100, Bart Van Assche wrote:
 On Mon, Jan 4, 2010 at 2:34 AM, David Dillow d...@thedillows.org wrote:
  I agree that we should add support for SRP_CRED_REQ, but I'm not
  thrilled with the mix of changes in the patch, as well as the general
  aesthetics of the result. How about something like the following series
  -- posted as a follow up to this message -- with proper credit for Bart?
  I'll sign off on them once we're happy with a direction and Bart acks.
 
  Also, these are all compile tested only, so they need some testing. I
  don't have anything that uses these messages, so some help would be
  appreciated.
 
 Is that regular kernel coding practice, to run away with the work
 someone else did and to claim authorship ? As far as I know this is
 considered as impolite.

At no time did I claim authorship. If you re-read what I wrote, I
explicitly listed proper credit to you as an outstanding issue.

I didn't like the way the code looked after your patch, and the three
separate changes mashed up into one patch. So I took time out of my
afternoon to break them up to better fit my interpretation of kernel
coding practice. I don't really care who's name they go in under.

Do you have any technical comments?

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-03 Thread David Dillow
On Sat, 2010-01-02 at 13:19 +0100, Bart Van Assche wrote:
 This patch fixes this issue by adding support for SRP_CRED_REQ information
 units in the SRP initiator. Additionally, this patch makes the per-session
 variable req_lim visible through sysfs, which makes observing the initiator
 state easier.

I agree that we should add support for SRP_CRED_REQ, but I'm not
thrilled with the mix of changes in the patch, as well as the general
aesthetics of the result. How about something like the following series
-- posted as a follow up to this message -- with proper credit for Bart?
I'll sign off on them once we're happy with a direction and Bart acks.

Also, these are all compile tested only, so they need some testing. I
don't have anything that uses these messages, so some help would be
appreciated.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-03 Thread Bart Van Assche
On Mon, Jan 4, 2010 at 2:34 AM, David Dillow d...@thedillows.org wrote:

 On Sat, 2010-01-02 at 13:19 +0100, Bart Van Assche wrote:
  This patch fixes this issue by adding support for SRP_CRED_REQ information
  units in the SRP initiator. Additionally, this patch makes the per-session
  variable req_lim visible through sysfs, which makes observing the initiator
  state easier.

 I agree that we should add support for SRP_CRED_REQ, but I'm not
 thrilled with the mix of changes in the patch, as well as the general
 aesthetics of the result. How about something like the following series
 -- posted as a follow up to this message -- with proper credit for Bart?
 I'll sign off on them once we're happy with a direction and Bart acks.

 Also, these are all compile tested only, so they need some testing. I
 don't have anything that uses these messages, so some help would be
 appreciated.

Is that regular kernel coding practice, to run away with the work
someone else did and to claim authorship ? As far as I know this is
considered as impolite.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] IB/srp: Fix initiator lockup

2010-01-02 Thread Bart Van Assche
When the SRP initiator is communicating with an SRP target under load it can
happen that the SRP initiator locks up. The communication pattern that causes
the lockup is as follows:
* SRP initiator sends (req_lim - 2) SRP_CMD requests to the target.
* The REQUEST LIMIT DELTA field of each SRP_RSP response is zero.
* The target sends an SRP_CRED_REQ information unit with non-zero REQUEST
  LIMIT DELTA.

The above communication pattern brings the initiator in the following state:
* srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY.
* The per-session variable zero_req_lim keeps increasing.
The initiator never leaves this state because it ignores SRP_CRED_REQ
information units.

This patch fixes this issue by adding support for SRP_CRED_REQ information
units in the SRP initiator. Additionally, this patch makes the per-session
variable req_lim visible through sysfs, which makes observing the initiator
state easier.

See also http://bugzilla.kernel.org/show_bug.cgi?id=14235

Signed-off-by: Bart Van Assche bart.vanass...@gmail.com
Reported-by: Chris Worley worl...@gmail.com
Cc: Roland Dreier rola...@cisco.com

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c
b/drivers/infiniband/ulp/srp/ib_srp.c
index 54c8fe2..2006a0a 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2005 Cisco Systems.  All rights reserved.
+ * Copyright (c) 2009 Bart Van Assche bart.vanass...@gmail.com.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -53,6 +54,8 @@
 #define PFXDRV_NAME : 
 #define DRV_VERSION0.2
 #define DRV_RELDATENovember 1, 2005
+/* Similar to is_power_of_2(), but can be evaluated at compile time. */
+#define IS_POWER_OF_2(n) ((n) != 0  (((n)  ((n) - 1)) == 0))

 MODULE_AUTHOR(Roland Dreier);
 MODULE_DESCRIPTION(InfiniBand SCSI RDMA Protocol initiator 
@@ -82,6 +85,11 @@ static void srp_add_one(struct ib_device *device);
 static void srp_remove_one(struct ib_device *device);
 static void srp_completion(struct ib_cq *cq, void *target_ptr);
 static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event);
+static void srp_process_cred_req(struct srp_target_port *target,
+struct srp_cred_req *req);
+static void srp_process_aer_req(struct srp_target_port *target,
+   struct srp_cred_req *req);
+static int srp_post_recv(struct srp_target_port *target);

 static struct scsi_transport_template *ib_srp_transport_template;

@@ -237,7 +245,7 @@ static int srp_create_target_ib(struct
srp_target_port *target)
ib_req_notify_cq(target-cq, IB_CQ_NEXT_COMP);

init_attr-event_handler   = srp_qp_event;
-   init_attr-cap.max_send_wr = SRP_SQ_SIZE;
+   init_attr-cap.max_send_wr = SRP_SQ_SIZE + SRP_TXP_SIZE;
init_attr-cap.max_recv_wr = SRP_RQ_SIZE;
init_attr-cap.max_recv_sge= 1;
init_attr-cap.max_send_sge= 1;
@@ -272,10 +280,12 @@ static void srp_free_target_ib(struct
srp_target_port *target)
ib_destroy_qp(target-qp);
ib_destroy_cq(target-cq);

-   for (i = 0; i  SRP_RQ_SIZE; ++i)
+   for (i = 0; i  ARRAY_SIZE(target-rx_ring); ++i)
srp_free_iu(target-srp_host, target-rx_ring[i]);
-   for (i = 0; i  SRP_SQ_SIZE + 1; ++i)
+   for (i = 0; i  ARRAY_SIZE(target-tx_ring); ++i)
srp_free_iu(target-srp_host, target-tx_ring[i]);
+   for (i = 0; i  ARRAY_SIZE(target-txp_ring); ++i)
+   srp_free_iu(target-srp_host, target-txp_ring[i]);
 }

 static void srp_path_rec_completion(int status,
@@ -888,6 +898,14 @@ static void srp_handle_recv(struct
srp_target_port *target, struct ib_wc *wc)
 PFX Got target logout request\n);
break;

+   case SRP_CRED_REQ:
+   srp_process_cred_req(target, iu-buf);
+   break;
+
+   case SRP_AER_REQ:
+   srp_process_aer_req(target, iu-buf);
+   break;
+
default:
shost_printk(KERN_WARNING, target-scsi_host,
 PFX Unhandled SRP opcode 0x%02x\n, opcode);
@@ -908,7 +926,11 @@ static void srp_completion(struct ib_cq *cq, void
*target_ptr)
if (wc.status) {
shost_printk(KERN_ERR, target-scsi_host,
 PFX failed %s status %d\n,
-wc.wr_id  SRP_OP_RECV ? receive : 
send,
+wc.wr_id  SRP_OP_RECV
+? receiving
+: wc.wr_id  SRP_OP_TXP
+? sending response
+: sending request,
 wc.status);
target-qp_in_error = 1;

Re: [PATCH] IB/srp: Fix initiator lockup

2010-01-02 Thread Bart Van Assche
On Sat, Jan 2, 2010 at 5:05 PM, Chris Worley worl...@gmail.com wrote:
 Would this fix effect the Window's SRP initiator too?

I'm not familiar with the WinOF codebase. But at first sight it looks
like the WinOF SRP initiator doesn't support processing SRP_CRED_REQ
information units either, which means that the WinOF SRP initiator is
vulnerable to the same initiator lockup issue as the Linux SRP
initiator.

Bart.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html