Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: I'd actually think that the decline caused by exhausting credit on the initiator side is far worse than ib_post_recv.. If ib_post_recv really is noticable then maybe just force loading the RQ after XX unposted are outstanding. The performance penalty of exhausting credit on the initiator side is indeed far worse than that of a single ib_post_recv() call at the target side. But in a realistic setup such credit exhaustion happens at a rate of about once an hour, while ib_post_recv() is called once for every request. It is easy to set up an SRP I/O test where ib_post_recv() is called more than 250.000 times per second. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
The impact of the ib_post_recv() call is small but measurable. I'll publish measurement results soon. Interesting. If there is overhead you could amortize it by keeping some number of spare buffers posting, and batching your calls to ib_post_recv() (eg post a list of 8 buffers every 8 commands, and keep 7 spare buffers posted). - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall processing the RQ dependent on the SQ or the credit level, when using a credit scheme like this. Or you will deadlock. For instance it isn't clear to me how the control flow around srp_process_cred_req is ment to work - it tries to send a reply, but if it can't due to srp_get_txp_iu failing it just gives up forever? For a standards-conforming SRP target, srp_get_txp_iu() will never fail. A quote from section 5.5.2 of the SRP r16a document: What guarantees that that the send completion for the reply is processed before a receive completion for the next request? Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 13, 2010 at 7:16 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall processing the RQ dependent on the SQ or the credit level, when using a credit scheme like this. Or you will deadlock. For instance it isn't clear to me how the control flow around srp_process_cred_req is ment to work - it tries to send a reply, but if it can't due to srp_get_txp_iu failing it just gives up forever? For a standards-conforming SRP target, srp_get_txp_iu() will never fail. A quote from section 5.5.2 of the SRP r16a document: What guarantees that that the send completion for the reply is processed before a receive completion for the next request? Did you notice that I increased the number of receive slots reserved for target-initiated requests from one to two ? + /* Number of receive slots reserved for receiving requests. */ + SRP_RXR_SIZE = 2, Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 13, 2010 at 07:57:26PM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 7:16 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Wed, Jan 13, 2010 at 08:23:27AM +0100, Bart Van Assche wrote: On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall processing the RQ dependent on the SQ or the credit level, when using a credit scheme like this. Or you will deadlock. For instance it isn't clear to me how the control flow around srp_process_cred_req is ment to work - it tries to send a reply, but if it can't due to srp_get_txp_iu failing it just gives up forever? For a standards-conforming SRP target, srp_get_txp_iu() will never fail. A quote from section 5.5.2 of the SRP r16a document: What guarantees that that the send completion for the reply is processed before a receive completion for the next request? Did you notice that I increased the number of receive slots reserved for target-initiated requests from one to two ? Yes, but I don't see how more receive slots helps exhausting send slots. The IBA doesn't guarentee ordering between completions originating from different work queues. Getting a receive completion does not guarantee that a related send completion has been received. Even if the work queues are mapped to the same CQ. Since srp_get_txp_iu depends on txp_head/tail which is incremented by send completions, but srp_get_txp_iu is called on the receive completion path, it is possible to call srp_get_txp_iu multiple times before the send completions are generated - even though the target is only generating new request sends in response to reply receives. The two main causes are ack coalescing algorithms and lost ack packets. That said, with SRP_TXP_SIZE set to 2, I think it would be really hard to break this in real life, with current hardware. Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
Note: I do not know of any SRP targets that implement approach (2). As far as I know all SRP targets use approach (1). But no other SRP targets send SRP_CRED_REQ -- and that includes all native IB storage arrays that I know of. So there must be some other way around this (and having separate buffers for command receives and response sends does seem like a sensible way to design a target anyway). -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Tue, Jan 12, 2010 at 6:13 PM, Roland Dreier rdre...@cisco.com wrote: Note: I do not know of any SRP targets that implement approach (2). As far as I know all SRP targets use approach (1). But no other SRP targets send SRP_CRED_REQ -- and that includes all native IB storage arrays that I know of. So there must be some other way around this (and having separate buffers for command receives and response sends does seem like a sensible way to design a target anyway). Having separate buffer sets for command receives and response sends might seem a sensible way to design a target. But calling ib_post_recv() before sending a response back to the initiator is not acceptable because this increases communication latency. When a target uses separate buffer sets, sends back SRP_RSP before responses before ib_post_recv() is called, and when it processes SRP_CMD requests in parallel, it still can happen that a sequence of RL - 2 SRP_RSP responses is sent back to the initiator all with the REQUEST LIMIT DELTA field set to zero. So even with separate buffer sets there is a need for SRP_CRED_REQ support in the initiator. I'm surprised by all this opposition against adding SRP_CRED_REQ support in the initiator, a feature defined in the SRP standard ? Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
Having separate buffer sets for command receives and response sends might seem a sensible way to design a target. But calling ib_post_recv() before sending a response back to the initiator is not acceptable because this increases communication latency. When a target uses separate buffer sets, sends back SRP_RSP before responses before ib_post_recv() is called, and when it processes SRP_CMD requests in parallel, it still can happen that a sequence of RL - 2 SRP_RSP responses is sent back to the initiator all with the REQUEST LIMIT DELTA field set to zero. So even with separate buffer sets there is a need for SRP_CRED_REQ support in the initiator. I doubt you could benchmark the overhead of calling ib_post_recv() in the full SRP protocol. Really, I bet it's less than 100 nanoseconds to form the work request and call ib_post_recv(). Maybe I'm wrong but I really expect the overhead of actually doing IO (even to a ramdisk buffer or something) is going to be much higher than posting a receive. Or maybe you did benchmark it and I'm wrong? I'm surprised by all this opposition against adding SRP_CRED_REQ support in the initiator, a feature defined in the SRP standard ? As I said I agree we should implement support for SRP_CRED_REQ. However I think it would be better if the SCST target could be made to work reliably with pre-2.6.34 initiators as well. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Tue, Jan 12, 2010 at 02:57:35PM -0800, Roland Dreier wrote: I doubt you could benchmark the overhead of calling ib_post_recv() in the full SRP protocol. Really, I bet it's less than 100 nanoseconds to form the work request and call ib_post_recv(). Maybe I'm wrong but I really expect the overhead of actually doing IO (even to a ramdisk buffer or something) is going to be much higher than posting a receive. Or maybe you did benchmark it and I'm wrong? I'd actually think that the decline caused by exhausting credit on the initiator side is far worse than ib_post_recv.. If ib_post_recv really is noticable then maybe just force loading the RQ after XX unposted are outstanding. I also was wondering if the SRP spec had anything to say about this, creating this kind of RQ-SQ dependency on either side should only be done if the protocol explicitly allows it for one side only - otherwise you risk deadlocking when used with an initiator that felt it too could create that kind of dependency. Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall processing the RQ dependent on the SQ or the credit level, when using a credit scheme like this. Or you will deadlock. For instance it isn't clear to me how the control flow around srp_process_cred_req is ment to work - it tries to send a reply, but if it can't due to srp_get_txp_iu failing it just gives up forever? Jason -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 13, 2010 at 12:24 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Also, I couldn't tell for sure from a cursory examination of the patch, but the initiator must be designed to not stall processing the RQ dependent on the SQ or the credit level, when using a credit scheme like this. Or you will deadlock. For instance it isn't clear to me how the control flow around srp_process_cred_req is ment to work - it tries to send a reply, but if it can't due to srp_get_txp_iu failing it just gives up forever? For a standards-conforming SRP target, srp_get_txp_iu() will never fail. A quote from section 5.5.2 of the SRP r16a document: SRP target ports shall limit themselves to one outstanding SRP request (see table 7) per RDMA channel. Upon sending an SRP request, an SRP target port shall not send another SRP request on the same RDMA channel until after it receives the SRP response (see table 8) for the previous SRP request. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Tue, Jan 12, 2010 at 11:57 PM, Roland Dreier rdre...@cisco.com wrote: Having separate buffer sets for command receives and response sends might seem a sensible way to design a target. But calling ib_post_recv() before sending a response back to the initiator is not acceptable because this increases communication latency. When a target uses separate buffer sets, sends back SRP_RSP before responses before ib_post_recv() is called, and when it processes SRP_CMD requests in parallel, it still can happen that a sequence of RL - 2 SRP_RSP responses is sent back to the initiator all with the REQUEST LIMIT DELTA field set to zero. So even with separate buffer sets there is a need for SRP_CRED_REQ support in the initiator. I doubt you could benchmark the overhead of calling ib_post_recv() in the full SRP protocol. Really, I bet it's less than 100 nanoseconds to form the work request and call ib_post_recv(). Maybe I'm wrong but I really expect the overhead of actually doing IO (even to a ramdisk buffer or something) is going to be much higher than posting a receive. Or maybe you did benchmark it and I'm wrong? The impact of the ib_post_recv() call is small but measurable. I'll publish measurement results soon. I'm surprised by all this opposition against adding SRP_CRED_REQ support in the initiator, a feature defined in the SRP standard ? As I said I agree we should implement support for SRP_CRED_REQ. However I think it would be better if the SCST target could be made to work reliably with pre-2.6.34 initiators as well. OK, I will first evaluate which alternatives there are for SCST-SRPT such that it does not rely on initiator-side support for SRP_CRED_REQ. This will not only help with pre-2.6.34 SRP initiators but also with other SRP initiators than the one available in the vanilla Linux kernel (e.g. OFED or WinOF). Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Sun, Jan 10, 2010 at 7:41 AM, Roland Dreier rdre...@cisco.com wrote: The patch I posted is really fixing the original bug. The problem was that neither the SRP target nor the SRP initiator had support for SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both software components in order to fix this bug. There's no way for the target to return credits through responses? I agree that we should implement the full SRP spec in the initiator but it seems unfortunate to force both an initiator and target upgrade to fix what really appears to be a target bug. This means anyone running a pre-2.6.34 kernel won't be able to use the SCST SRP target reliably. Please let me explain why the SCST SRP target behaves as observed, why this behavior is not specific to SCST, and which workaround is available for pre-2.6.34 SRP initiator users. As known an SRP target passes the so-called req_lim value to the SRP initiator via the REQUEST LIMIT DELTA field of the SRP_LOGIN_RSP information unit. Let's call this value RL. As specified in the SRP r16a document, an initiator may never send more than RL - 2 unanswered SRP_CMD information units to an SRP target. When an SRP_CMD request is being processed by an SRP target, the SRP target can e.g. process this request using one of the following strategies: 1. Using the buffer in which the SRP_CMD request was received to build the response. In this case once the response has been built the target will call ib_post_send() and will wait until the send completion has been received before it will declare that buffer again available for receiving by calling ib_post_recv(). 2. Using separate sets of buffers for receiving SRP_CMD requests and sending back SRP_RSP responses. In this case it is possible for the target to re-enable receiving for the buffer in which the SRP_CMD request was received before the SRP_RSP response is sent back. Regarding approach (2): with this approach the value of the REQUEST LIMIT DELTA field in the SRP_RSP information unit will always equal one. With this approach it will never be necessary that the targets sends an SRP_CRED_REQ information unit to the initiator. Regarding approach (1): since for each SRP_RSP response sent back by the target ib_post_recv() is called in the target after ib_post_send(), at least for the first SRP_RSP response the REQUEST LIMIT DELTA field will be equal to zero. And for a target that is able to process all received SRP_CMD information units in parallel, it can happen that the SRP target sends a contiguous series of (RL - 2) SRP_RSP information units to the initiator with the REQUEST LIMIT DELTA field equal to zero. As a consequence, the value of the req_lim variable in the SRP initiator will be equal to 2 and the initiator won't send any further SRP_CMD requests to the target. A scenario for how to get the SRP initiator into this state can be found in http://bugzilla.kernel.org/show_bug.cgi?id=14235. The only way to get out of this deadlock is that the target send an SRP_CRED_REQ information unit to the initiator with a non-zero REQUEST LIMIT DELTA field, and that the SRP initiator processes this SRP_CRED_REQ information unit. Because of this possible SRP initiator lockup SCST-SRPT users have been recommended to disable parallel processing of information units in this SRP target (by specifying the ib_srpt kernel parameter thread=1). My conclusion is that the SRP initiator lockup explained in http://bugzilla.kernel.org/show_bug.cgi?id=14235 is not specific to SCST-SRPT but that this lockup can be triggered by any SRP target that processes SRP_CMD requests in parallel. So as far as I can see the choices we have are: * Document that SRP_CRED_REQ support is missing in the Linux SRP initiator and hence that command processing in SRP targets must be complicated by making sure that never (RL - 2) contiguous SRP_RSP information units are sent to the SRP initiator with the REQUEST LIMIT DELTA field equal to zero. * Add support for the SRP_CRED_REQ information unit in the Linux SRP initiator. Note: I do not know of any SRP targets that implement approach (2). As far as I know all SRP targets use approach (1). Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
The patch I posted is really fixing the original bug. The problem was that neither the SRP target nor the SRP initiator had support for SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both software components in order to fix this bug. There's no way for the target to return credits through responses? I agree that we should implement the full SRP spec in the initiator but it seems unfortunate to force both an initiator and target upgrade to fix what really appears to be a target bug. This means anyone running a pre-2.6.34 kernel won't be able to use the SCST SRP target reliably. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Thu, Jan 7, 2010 at 8:57 AM, Roland Dreier rdre...@cisco.com wrote: It was only recently that I had the time to analyze this issue in depth and that I added SRP_CRED_REQ support in SCST (checked in that support about one week ago). Before SRP_CRED_REQ support was added in SCST, the SRP initiator could not log the Unhandled SRP opcode message because it did not receive any unhandled SRP opcodes. So it sounds like this is fixing something that was not the original bug? Because SCST wasn't sending SRP_CRED_REQ when the issue of the initiator hanging was originally seen, if I understand correctly. The patch I posted is really fixing the original bug. The problem was that neither the SRP target nor the SRP initiator had support for SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both software components in order to fix this bug. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Thu, Jan 7, 2010 at 9:03 AM, Bart Van Assche bart.vanass...@gmail.com wrote: On Thu, Jan 7, 2010 at 8:57 AM, Roland Dreier rdre...@cisco.com wrote: It was only recently that I had the time to analyze this issue in depth and that I added SRP_CRED_REQ support in SCST (checked in that support about one week ago). Before SRP_CRED_REQ support was added in SCST, the SRP initiator could not log the Unhandled SRP opcode message because it did not receive any unhandled SRP opcodes. So it sounds like this is fixing something that was not the original bug? Because SCST wasn't sending SRP_CRED_REQ when the issue of the initiator hanging was originally seen, if I understand correctly. The patch I posted is really fixing the original bug. The problem was that neither the SRP target nor the SRP initiator had support for SRP_CRED_REQ. Support for SRP_CRED_REQ has to be added to both software components in order to fix this bug. However, I have to admit that the patch description should have described what happened more in detail. I'll address that in the second version of my patch. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Thu, 2010-01-07 at 08:59 +0100, Bart Van Assche wrote: On Wed, Jan 6, 2010 at 10:40 PM, Roland Dreier rdre...@cisco.com wrote: Is that regular kernel coding practice, to run away with the work someone else did and to claim authorship ? As far as I know this is considered as impolite. I don't see with proper credit for Bart as claiming authorship. And yes, I think proposing a better way of doing things is definitely regular kernel coding practice. It's more than polite -- it's going way beyond helpful review comments and actually helping revise the patches. You are welcome to have a look at the descriptions of the three patches posted by David Dillow. I was not credited in any way in these descriptions. Not for analyzing the problem, not for the design of a solution and not for the work I did in implementing a solution. That is why I wrote that David Dillow was claiming authorship of the work I did. And here I thought I covered it by stating in the cover letter that they were not signed off, needed credit for you, and shouldn't go upstream until you ACK'd them, presumably with that fixed. I thought people would put two and two together, but I suppose not. I'll try to be more explicit next time. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
When the SRP initiator is communicating with an SRP target under load it can happen that the SRP initiator locks up. The communication pattern that causes the lockup is as follows: * SRP initiator sends (req_lim - 2) SRP_CMD requests to the target. * The REQUEST LIMIT DELTA field of each SRP_RSP response is zero. * The target sends an SRP_CRED_REQ information unit with non-zero REQUEST LIMIT DELTA. The above communication pattern brings the initiator in the following state: * srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY. * The per-session variable zero_req_lim keeps increasing. The initiator never leaves this state because it ignores SRP_CRED_REQ information units. This is all a bit obfuscated. The problem is that the initiator runs out of credits and stops sending commands; because we don't process SRP_CRED_REQ messages from the target, we never get more credits. I'm wondering why this took so long to come up? Does SCST send SRP_CRED_REQ only under unusual circumstances? Also I'm wondering why the Unhandled SRP opcode message didn't show up in the kernel log and help debug this? Some specific comments: +/* Similar to is_power_of_2(), but can be evaluated at compile time. */ +#define IS_POWER_OF_2(n) ((n) != 0 (((n) ((n) - 1)) == 0)) I don't think this level of ugliness is really required -- we can just document carefully at the definition that we need things to be powers of 2. +for (i = 0; i ARRAY_SIZE(target-txp_ring); ++i) +srp_free_iu(target-srp_host, target-txp_ring[i]); Not sure I understand why we need two TX rings -- why can't we just have one bigger TX ring that handles both requests and responses? + * Obtain an information unit for sending a request to the target. I think there's a bit of overcommenting in a few places. Does it really help anyone to repeat that what get_tx_iu does is get an information unit for sending? -return target-tx_ring[target-tx_head SRP_SQ_SIZE]; +return target-tx_ring[target-tx_head +(ARRAY_SIZE(target-tx_ring) - 1)]; is this an improvement? + * Send a request to the target. +static int __srp_post_send_req(struct srp_target_port *target, same -- does the comment add anything? +/* Completion queue. */ +SRP_CQ_SIZE = SRP_SQ_SIZE + SRP_TXP_SIZE + SRP_RQ_SIZE, etc... all these comments are mostly taking up vertical space and visually jarring, without adding much info. +/* + * SRP_CRED_REQ information unit, as defined in section 6.10 of the + * T10 SRP r16a document. + */ +struct srp_cred_req { +u8 opcode; +u8 sol_not; +u8 reserved[2]; +__be32 req_lim_delta; +u64 tag; +} __attribute__((packed)); again... does it help anyone to say struct srp_cred_req corresponds to SRP_CRED_REQ? scsi/srp.h already refers to the SRP document, and I would think anyone looking up SRP_CRED_REQ could find it faster by looking for the string itself rather than by section number. The existing comments in the file are actually useful -- they explain why some structures need to be packed, and as far as I can tell neither srp_cred_req nor srp_cred_rsp need to be packed. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
I agree that we should add support for SRP_CRED_REQ, but I'm not thrilled with the mix of changes in the patch, as well as the general aesthetics of the result. How about something like the following series -- posted as a follow up to this message -- with proper credit for Bart? I'll sign off on them once we're happy with a direction and Bart acks. Thanks very much for doing this! I agree that splitting up this way makes the patches much easier to review, and is the right way to do this. - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
It was only recently that I had the time to analyze this issue in depth and that I added SRP_CRED_REQ support in SCST (checked in that support about one week ago). Before SRP_CRED_REQ support was added in SCST, the SRP initiator could not log the Unhandled SRP opcode message because it did not receive any unhandled SRP opcodes. So it sounds like this is fixing something that was not the original bug? Because SCST wasn't sending SRP_CRED_REQ when the issue of the initiator hanging was originally seen, if I understand correctly. Do you understand the root cause of the original bug? - R. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Wed, Jan 6, 2010 at 10:40 PM, Roland Dreier rdre...@cisco.com wrote: Is that regular kernel coding practice, to run away with the work someone else did and to claim authorship ? As far as I know this is considered as impolite. I don't see with proper credit for Bart as claiming authorship. And yes, I think proposing a better way of doing things is definitely regular kernel coding practice. It's more than polite -- it's going way beyond helpful review comments and actually helping revise the patches. You are welcome to have a look at the descriptions of the three patches posted by David Dillow. I was not credited in any way in these descriptions. Not for analyzing the problem, not for the design of a solution and not for the work I did in implementing a solution. That is why I wrote that David Dillow was claiming authorship of the work I did. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Mon, 2010-01-04 at 08:13 +0100, Bart Van Assche wrote: On Mon, Jan 4, 2010 at 2:34 AM, David Dillow d...@thedillows.org wrote: I agree that we should add support for SRP_CRED_REQ, but I'm not thrilled with the mix of changes in the patch, as well as the general aesthetics of the result. How about something like the following series -- posted as a follow up to this message -- with proper credit for Bart? I'll sign off on them once we're happy with a direction and Bart acks. Also, these are all compile tested only, so they need some testing. I don't have anything that uses these messages, so some help would be appreciated. Is that regular kernel coding practice, to run away with the work someone else did and to claim authorship ? As far as I know this is considered as impolite. At no time did I claim authorship. If you re-read what I wrote, I explicitly listed proper credit to you as an outstanding issue. I didn't like the way the code looked after your patch, and the three separate changes mashed up into one patch. So I took time out of my afternoon to break them up to better fit my interpretation of kernel coding practice. I don't really care who's name they go in under. Do you have any technical comments? -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Sat, 2010-01-02 at 13:19 +0100, Bart Van Assche wrote: This patch fixes this issue by adding support for SRP_CRED_REQ information units in the SRP initiator. Additionally, this patch makes the per-session variable req_lim visible through sysfs, which makes observing the initiator state easier. I agree that we should add support for SRP_CRED_REQ, but I'm not thrilled with the mix of changes in the patch, as well as the general aesthetics of the result. How about something like the following series -- posted as a follow up to this message -- with proper credit for Bart? I'll sign off on them once we're happy with a direction and Bart acks. Also, these are all compile tested only, so they need some testing. I don't have anything that uses these messages, so some help would be appreciated. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Fix initiator lockup
On Mon, Jan 4, 2010 at 2:34 AM, David Dillow d...@thedillows.org wrote: On Sat, 2010-01-02 at 13:19 +0100, Bart Van Assche wrote: This patch fixes this issue by adding support for SRP_CRED_REQ information units in the SRP initiator. Additionally, this patch makes the per-session variable req_lim visible through sysfs, which makes observing the initiator state easier. I agree that we should add support for SRP_CRED_REQ, but I'm not thrilled with the mix of changes in the patch, as well as the general aesthetics of the result. How about something like the following series -- posted as a follow up to this message -- with proper credit for Bart? I'll sign off on them once we're happy with a direction and Bart acks. Also, these are all compile tested only, so they need some testing. I don't have anything that uses these messages, so some help would be appreciated. Is that regular kernel coding practice, to run away with the work someone else did and to claim authorship ? As far as I know this is considered as impolite. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] IB/srp: Fix initiator lockup
When the SRP initiator is communicating with an SRP target under load it can happen that the SRP initiator locks up. The communication pattern that causes the lockup is as follows: * SRP initiator sends (req_lim - 2) SRP_CMD requests to the target. * The REQUEST LIMIT DELTA field of each SRP_RSP response is zero. * The target sends an SRP_CRED_REQ information unit with non-zero REQUEST LIMIT DELTA. The above communication pattern brings the initiator in the following state: * srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY. * The per-session variable zero_req_lim keeps increasing. The initiator never leaves this state because it ignores SRP_CRED_REQ information units. This patch fixes this issue by adding support for SRP_CRED_REQ information units in the SRP initiator. Additionally, this patch makes the per-session variable req_lim visible through sysfs, which makes observing the initiator state easier. See also http://bugzilla.kernel.org/show_bug.cgi?id=14235 Signed-off-by: Bart Van Assche bart.vanass...@gmail.com Reported-by: Chris Worley worl...@gmail.com Cc: Roland Dreier rola...@cisco.com diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 54c8fe2..2006a0a 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2005 Cisco Systems. All rights reserved. + * Copyright (c) 2009 Bart Van Assche bart.vanass...@gmail.com. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -53,6 +54,8 @@ #define PFXDRV_NAME : #define DRV_VERSION0.2 #define DRV_RELDATENovember 1, 2005 +/* Similar to is_power_of_2(), but can be evaluated at compile time. */ +#define IS_POWER_OF_2(n) ((n) != 0 (((n) ((n) - 1)) == 0)) MODULE_AUTHOR(Roland Dreier); MODULE_DESCRIPTION(InfiniBand SCSI RDMA Protocol initiator @@ -82,6 +85,11 @@ static void srp_add_one(struct ib_device *device); static void srp_remove_one(struct ib_device *device); static void srp_completion(struct ib_cq *cq, void *target_ptr); static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event); +static void srp_process_cred_req(struct srp_target_port *target, +struct srp_cred_req *req); +static void srp_process_aer_req(struct srp_target_port *target, + struct srp_cred_req *req); +static int srp_post_recv(struct srp_target_port *target); static struct scsi_transport_template *ib_srp_transport_template; @@ -237,7 +245,7 @@ static int srp_create_target_ib(struct srp_target_port *target) ib_req_notify_cq(target-cq, IB_CQ_NEXT_COMP); init_attr-event_handler = srp_qp_event; - init_attr-cap.max_send_wr = SRP_SQ_SIZE; + init_attr-cap.max_send_wr = SRP_SQ_SIZE + SRP_TXP_SIZE; init_attr-cap.max_recv_wr = SRP_RQ_SIZE; init_attr-cap.max_recv_sge= 1; init_attr-cap.max_send_sge= 1; @@ -272,10 +280,12 @@ static void srp_free_target_ib(struct srp_target_port *target) ib_destroy_qp(target-qp); ib_destroy_cq(target-cq); - for (i = 0; i SRP_RQ_SIZE; ++i) + for (i = 0; i ARRAY_SIZE(target-rx_ring); ++i) srp_free_iu(target-srp_host, target-rx_ring[i]); - for (i = 0; i SRP_SQ_SIZE + 1; ++i) + for (i = 0; i ARRAY_SIZE(target-tx_ring); ++i) srp_free_iu(target-srp_host, target-tx_ring[i]); + for (i = 0; i ARRAY_SIZE(target-txp_ring); ++i) + srp_free_iu(target-srp_host, target-txp_ring[i]); } static void srp_path_rec_completion(int status, @@ -888,6 +898,14 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc) PFX Got target logout request\n); break; + case SRP_CRED_REQ: + srp_process_cred_req(target, iu-buf); + break; + + case SRP_AER_REQ: + srp_process_aer_req(target, iu-buf); + break; + default: shost_printk(KERN_WARNING, target-scsi_host, PFX Unhandled SRP opcode 0x%02x\n, opcode); @@ -908,7 +926,11 @@ static void srp_completion(struct ib_cq *cq, void *target_ptr) if (wc.status) { shost_printk(KERN_ERR, target-scsi_host, PFX failed %s status %d\n, -wc.wr_id SRP_OP_RECV ? receive : send, +wc.wr_id SRP_OP_RECV +? receiving +: wc.wr_id SRP_OP_TXP +? sending response +: sending request, wc.status); target-qp_in_error = 1;
Re: [PATCH] IB/srp: Fix initiator lockup
On Sat, Jan 2, 2010 at 5:05 PM, Chris Worley worl...@gmail.com wrote: Would this fix effect the Window's SRP initiator too? I'm not familiar with the WinOF codebase. But at first sight it looks like the WinOF SRP initiator doesn't support processing SRP_CRED_REQ information units either, which means that the WinOF SRP initiator is vulnerable to the same initiator lockup issue as the Linux SRP initiator. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html