Re: [PATCH for-next 05/20] RDMA/hns: Add command queue support for hip08 RoCE driver
On 2017/9/27 20:41, Doug Ledford wrote: On Wed, 2017-09-27 at 08:21 -0400, Doug Ledford wrote: And if you argee, after this patchset has been accepted we will send a following up patch : In hns_roce_cmq_send function, replace usleep_range(1000, 2000); with the following statement: udelay(1); And if so, we can avoid using usleep_range function in spin_lock_bh spin region, because it probally cause calltrace. Ok, I'm fine with that. I'll pull these in. OK, these are in my tree, thanks. Thanks a lot. Regards Wei Hu
Re: [PATCH for-next 05/20] RDMA/hns: Add command queue support for hip08 RoCE driver
On Wed, 2017-09-27 at 08:21 -0400, Doug Ledford wrote: > > And if you argee, after this patchset has been accepted we will > send a > > following up patch : > > In hns_roce_cmq_send function, replace > > usleep_range(1000, 2000); > > with the following statement: > > udelay(1); > > And if so, we can avoid using usleep_range function in > spin_lock_bh > > spin region, > > because it probally cause calltrace. > > Ok, I'm fine with that. I'll pull these in. OK, these are in my tree, thanks. -- Doug Ledford GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
Re: [PATCH for-next 05/20] RDMA/hns: Add command queue support for hip08 RoCE driver
On 9/26/2017 10:46 PM, Wei Hu (Xavier) wrote: > > > On 2017/9/27 0:18, Doug Ledford wrote: >> On 9/26/2017 9:13 AM, Wei Hu (Xavier) wrote: >>> >>> On 2017/9/26 1:36, Doug Ledford wrote: On Mon, 2017-09-25 at 20:18 +0300, Leon Romanovsky wrote: > On Mon, Sep 25, 2017 at 01:06:53PM -0400, Doug Ledford wrote: >> On Wed, 2017-08-30 at 17:23 +0800, Wei Hu (Xavier) wrote: >> >>> + /* >>> + * If the command is sync, wait for the firmware to >>> write >>> back, >>> + * if multi descriptors to be sent, use the first one to >>> check >>> + */ >>> + if ((desc->flag) & HNS_ROCE_CMD_FLAG_NO_INTR) { >>> + do { >>> + if (hns_roce_cmq_csq_done(hr_dev)) >>> + break; >>> + usleep_range(1000, 2000); >>> + timeout++; >>> + } while (timeout < priv->cmq.tx_timeout); >>> + } >> then we spin here for a maximum amount of time between 200 and >> 400ms, >> so 1/4 to 1/2 a second. All the time we are holding the bh lock on >> this CPU. That seems excessive to me. If we are going to spin >> that >> long, can you find a way to allocate/reserve your resources, send >> the >> command, then drop the bh lock while you spin, and retake it before >> you >> complete once the spinning is done? > They don't allocate anything in this loop, but checking the pointers > are > the same, see hns_roce_cmq_csq_done. I'm not sure I understand your intended implication of your comment. I wasn't concerned about them allocating anything, only that if the hardware is hung, then this loop will hang out for 1/4 to 1/2 a second and hold up all bottom half processing on this CPU in the meantime. That's the sort of things that provides poor overall system behavior. Now, since they are really only checking to see if the hardware has gotten around to their particular command, and their command is part of a ring structure, it's possible to record the original head command, and our new head command, and then release the spin_lock_bh around the entire do{ }while construct, and in hns_roce_cmd_csq_done() you could check that head is not in the range old_head:new_head. That would protect you in case something in the bottom half processing queued up some more commands and from one sleep to the next the head jumped from something other than the new_head to something past new_head, so that head == priv->cmq.csq.next_to_use ends up being perpetually false. But, that's just from a quick read of the code, I could easily be missing something here... >>> Hi, Doug >>> Driver issues the cmds in cmq, and firmware gets and processes >>> them. >>> The firmware process only one cmd at the same time, and it will >>> take >>> about serveral to 200 us in one cmd currently, so the driver need >>> not to use stream mode to issue cmd. >> I'm not sure I understand your response here. >> >> I get that the driver issues cmds in the cmq, and that the firmware gets >> them and processes them. >> >> I get that the firmware will only work on one command at a time and only >> move to the next one once the current one is complete. >> >> I get that commands take anywhere from a few usec to a couple hundred >> usec. >> >> I also get that because you are sleeping for somewhere in between 1000 >> and 2000 usecs, that the driver could easily finish a whole slew of >> commands. It could do 10 slow commands, or 100 or more fast commands. >> What this tells me is that the only reason your current implementation >> of hns_roce_cmq_csq_done() works at all is because you keep the device >> locked out from any other commands being put on the queue. As far as I >> can tell, that's the only way you can guarantee that at some point you >> will wake up and the head pointer will be exactly at csq->next_to_use. >> Otherwise, if you didn't block them out, then you could sleep with the >> head pointer before csq->next_to_use and wake up the next time with it >> already well past csq->next_to_use. Am I right about that? While you >> are waiting on this command queue, any other commands are blocked from >> being placed on the command queue? > Hi, Doug, > you are right. > And one "hns_x" ib device only has one command queue in hip08, > other commands will be blocked when waiting on the command queue. >> >> I don't understand what you mean by "so the driver need not to use >> stream mode to issue cmd". > Sorry, my expression error. > stream -> pipeline > > And if you argee, after this patchset has been accepted we will send a > following up patch : > In hns_roce_cmq_send function, replace > usleep_range(1000, 2000); > with the following statement: > udelay(1); > And if so, we can avoid using usleep_range function in spin_lock_bh > spin region, > because it prob
Re: [PATCH for-next 05/20] RDMA/hns: Add command queue support for hip08 RoCE driver
On 2017/9/27 0:18, Doug Ledford wrote: On 9/26/2017 9:13 AM, Wei Hu (Xavier) wrote: On 2017/9/26 1:36, Doug Ledford wrote: On Mon, 2017-09-25 at 20:18 +0300, Leon Romanovsky wrote: On Mon, Sep 25, 2017 at 01:06:53PM -0400, Doug Ledford wrote: On Wed, 2017-08-30 at 17:23 +0800, Wei Hu (Xavier) wrote: + /* + * If the command is sync, wait for the firmware to write back, + * if multi descriptors to be sent, use the first one to check + */ + if ((desc->flag) & HNS_ROCE_CMD_FLAG_NO_INTR) { + do { + if (hns_roce_cmq_csq_done(hr_dev)) + break; + usleep_range(1000, 2000); + timeout++; + } while (timeout < priv->cmq.tx_timeout); + } then we spin here for a maximum amount of time between 200 and 400ms, so 1/4 to 1/2 a second. All the time we are holding the bh lock on this CPU. That seems excessive to me. If we are going to spin that long, can you find a way to allocate/reserve your resources, send the command, then drop the bh lock while you spin, and retake it before you complete once the spinning is done? They don't allocate anything in this loop, but checking the pointers are the same, see hns_roce_cmq_csq_done. I'm not sure I understand your intended implication of your comment. I wasn't concerned about them allocating anything, only that if the hardware is hung, then this loop will hang out for 1/4 to 1/2 a second and hold up all bottom half processing on this CPU in the meantime. That's the sort of things that provides poor overall system behavior. Now, since they are really only checking to see if the hardware has gotten around to their particular command, and their command is part of a ring structure, it's possible to record the original head command, and our new head command, and then release the spin_lock_bh around the entire do{ }while construct, and in hns_roce_cmd_csq_done() you could check that head is not in the range old_head:new_head. That would protect you in case something in the bottom half processing queued up some more commands and from one sleep to the next the head jumped from something other than the new_head to something past new_head, so that head == priv->cmq.csq.next_to_use ends up being perpetually false. But, that's just from a quick read of the code, I could easily be missing something here... Hi, Doug Driver issues the cmds in cmq, and firmware gets and processes them. The firmware process only one cmd at the same time, and it will take about serveral to 200 us in one cmd currently, so the driver need not to use stream mode to issue cmd. I'm not sure I understand your response here. I get that the driver issues cmds in the cmq, and that the firmware gets them and processes them. I get that the firmware will only work on one command at a time and only move to the next one once the current one is complete. I get that commands take anywhere from a few usec to a couple hundred usec. I also get that because you are sleeping for somewhere in between 1000 and 2000 usecs, that the driver could easily finish a whole slew of commands. It could do 10 slow commands, or 100 or more fast commands. What this tells me is that the only reason your current implementation of hns_roce_cmq_csq_done() works at all is because you keep the device locked out from any other commands being put on the queue. As far as I can tell, that's the only way you can guarantee that at some point you will wake up and the head pointer will be exactly at csq->next_to_use. Otherwise, if you didn't block them out, then you could sleep with the head pointer before csq->next_to_use and wake up the next time with it already well past csq->next_to_use. Am I right about that? While you are waiting on this command queue, any other commands are blocked from being placed on the command queue? Hi, Doug, you are right. And one "hns_x" ib device only has one command queue in hip08, other commands will be blocked when waiting on the command queue. I don't understand what you mean by "so the driver need not to use stream mode to issue cmd". Sorry, my expression error. stream -> pipeline And if you argee, after this patchset has been accepted we will send a following up patch : In hns_roce_cmq_send function, replace usleep_range(1000, 2000); with the following statement: udelay(1); And if so, we can avoid using usleep_range function in spin_lock_bh spin region, because it probally cause calltrace. Best regards Wei Hu Regards Wei Hu +#define HNS_ROCE_CMQ_TX_TIMEOUT 200 or you could reduce the size of this define... -- Doug Ledford GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD -- To unsubscribe from this list: send the line "unsubscribe linux- rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html