Re: [PATCH 1/3] hpsa: remove unneeded loop
Hi James,

I've gotten mail that you have accepted patches 2/3 and 3/3 from this
series. This one has been accepted by Steve also, so please consider
taking also this 1/3.

Thanks, Tomas

On 08/01/2013 03:11 PM, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

Signed-off-by: Tomas Henzl <the...@redhat.com>
---
 drivers/scsi/hpsa.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 796482b..d7df01e 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -2662,15 +2662,12 @@ static struct CommandList *cmd_alloc(struct ctlr_info *h)
 	unsigned long flags;
 
 	spin_lock_irqsave(&h->lock, flags);
-	do {
-		i = find_first_zero_bit(h->cmd_pool_bits, h->nr_cmds);
-		if (i == h->nr_cmds) {
-			spin_unlock_irqrestore(&h->lock, flags);
-			return NULL;
-		}
-	} while (test_and_set_bit
-		 (i & (BITS_PER_LONG - 1),
-		  h->cmd_pool_bits + (i / BITS_PER_LONG)) != 0);
+	i = find_first_zero_bit(h->cmd_pool_bits, h->nr_cmds);
+	if (i == h->nr_cmds) {
+		spin_unlock_irqrestore(&h->lock, flags);
+		return NULL;
+	}
+	set_bit(i & (BITS_PER_LONG - 1), h->cmd_pool_bits + (i / BITS_PER_LONG));
 	h->nr_allocs++;
 	spin_unlock_irqrestore(&h->lock, flags);

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] hpsa: remove unneeded loop
On 08/06/2013 05:46 PM, scame...@beardog.cce.hp.com wrote:
On Fri, Aug 02, 2013 at 01:13:59PM +0200, Tomas Henzl wrote:
On 08/01/2013 06:18 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 05:39:36PM +0200, Tomas Henzl wrote:
On 08/01/2013 05:19 PM, scame...@beardog.cce.hp.com wrote:

[...]

Btw. on line 1284 - isn't it similar to patch 2/3?
^^^

Oh, missed this the first time around: the sop driver uses the
make_request_fn() interface, and it's not a stacked driver either, so
there is no limit to the number of bios the block layer can stuff in --
make_request_fn must succeed. If we get full, we just chain them
together using pointers already in the struct bio for that purpose, so
storing them in the driver requires no memory allocation on the driver's
part. So while it's somewhat similar, we already have to handle the case
of the block layer stuffing infinite bios into the driver, so getting
full is not terribly out of the ordinary in that driver.

OK.

That being said, I'm poking around other bits of code lying around here
looking for similar problems, so thanks again for that one.

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

Yes. But let's expect just one zero bit at the end of the list. The
find_first_zero_bit (ffzb) starts now, thread+1 zeroes a new bit at the
beginning, ffzb continues, thread+2 takes the zero bit at the end. The
result is that ffzb hasn't found a zero bit, even though at every moment
a zero bit was there. After that, the function returns -EBUSY:

	rc = (u16) find_first_zero_bit(qinfo->request_bits, qinfo->qdepth);
	if (rc >= qinfo->qdepth - 1)
		return (u16) -EBUSY;

Still, I think that this is almost impossible, and if it should happen,
a requeue is not so bad.

Oh, wow. Didn't think of that. Hmm, technically there is no guarantee
that any given thread would ever get a bit, if all the other threads
keep snatching them away just ahead of an unlucky thread. Could we,
instead of giving up, go back around and try again, on the theory that
some bits should be free in there someplace and the thread shouldn't be
infinitely unlucky?

In theory that also gives you no guarantee; it's likely that for a real
guarantee some kind of locking is needed, and the spinlock, which
already is there, gives you that. Otoh, a very high likelihood is
probably enough and may give better overall throughput; maybe some
statistics/testing is needed? I don't know how much faster it is without
the spinlock.

On thinking about this a bit more, it would be a shame if we closed the
hole allowing the "cmd_alloc returned NULL" message (the scsi_done() /
cmd_free() race) and then immediately opened up another different hole
that permitted the same problem to occur. So to be safe, I think we
should go with your patch as is -- leave the spin lock, but get rid of
the unnecessary loop.

Thank you. I was going to write something similar - that we could use my
patch as a temporary solution until a better lockless one is found.

tomash

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On Fri, Aug 02, 2013 at 01:13:59PM +0200, Tomas Henzl wrote:
On 08/01/2013 06:18 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 05:39:36PM +0200, Tomas Henzl wrote:
On 08/01/2013 05:19 PM, scame...@beardog.cce.hp.com wrote:

[...]

Btw. on line 1284 - isn't it similar to patch 2/3?
^^^

Oh, missed this the first time around: the sop driver uses the
make_request_fn() interface, and it's not a stacked driver either, so
there is no limit to the number of bios the block layer can stuff in --
make_request_fn must succeed. If we get full, we just chain them
together using pointers already in the struct bio for that purpose, so
storing them in the driver requires no memory allocation on the driver's
part. So while it's somewhat similar, we already have to handle the case
of the block layer stuffing infinite bios into the driver, so getting
full is not terribly out of the ordinary in that driver.

OK.

That being said, I'm poking around other bits of code lying around here
looking for similar problems, so thanks again for that one.

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

Yes. But let's expect just one zero bit at the end of the list. The
find_first_zero_bit (ffzb) starts now, thread+1 zeroes a new bit at the
beginning, ffzb continues, thread+2 takes the zero bit at the end. The
result is that ffzb hasn't found a zero bit, even though at every moment
a zero bit was there. After that, the function returns -EBUSY:

	rc = (u16) find_first_zero_bit(qinfo->request_bits, qinfo->qdepth);
	if (rc >= qinfo->qdepth - 1)
		return (u16) -EBUSY;

Still, I think that this is almost impossible, and if it should happen,
a requeue is not so bad.

Oh, wow. Didn't think of that. Hmm, technically there is no guarantee
that any given thread would ever get a bit, if all the other threads
keep snatching them away just ahead of an unlucky thread. Could we,
instead of giving up, go back around and try again, on the theory that
some bits should be free in there someplace and the thread shouldn't be
infinitely unlucky?

In theory that also gives you no guarantee; it's likely that for a real
guarantee some kind of locking is needed, and the spinlock, which
already is there, gives you that. Otoh, a very high likelihood is
probably enough and may give better overall throughput; maybe some
statistics/testing is needed? I don't know how much faster it is without
the spinlock.

On thinking about this a bit more, it would be a shame if we closed the
hole allowing the "cmd_alloc returned NULL" message (the scsi_done() /
cmd_free() race) and then immediately opened up another different hole
that permitted the same problem to occur. So to be safe, I think we
should go with your patch as is -- leave the spin lock, but get rid of
the unnecessary loop.

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On 08/01/2013 06:18 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 05:39:36PM +0200, Tomas Henzl wrote:
On 08/01/2013 05:19 PM, scame...@beardog.cce.hp.com wrote:

[...]

Btw. on line 1284 - isn't it similar to patch 2/3?
^^^

Oh, missed this the first time around: the sop driver uses the
make_request_fn() interface, and it's not a stacked driver either, so
there is no limit to the number of bios the block layer can stuff in --
make_request_fn must succeed. If we get full, we just chain them
together using pointers already in the struct bio for that purpose, so
storing them in the driver requires no memory allocation on the driver's
part. So while it's somewhat similar, we already have to handle the case
of the block layer stuffing infinite bios into the driver, so getting
full is not terribly out of the ordinary in that driver.

OK.

That being said, I'm poking around other bits of code lying around here
looking for similar problems, so thanks again for that one.

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

Yes. But let's expect just one zero bit at the end of the list. The
find_first_zero_bit (ffzb) starts now, thread+1 zeroes a new bit at the
beginning, ffzb continues, thread+2 takes the zero bit at the end. The
result is that ffzb hasn't found a zero bit, even though at every moment
a zero bit was there. After that, the function returns -EBUSY:

	rc = (u16) find_first_zero_bit(qinfo->request_bits, qinfo->qdepth);
	if (rc >= qinfo->qdepth - 1)
		return (u16) -EBUSY;

Still, I think that this is almost impossible, and if it should happen,
a requeue is not so bad.

Oh, wow. Didn't think of that. Hmm, technically there is no guarantee
that any given thread would ever get a bit, if all the other threads
keep snatching them away just ahead of an unlucky thread. Could we,
instead of giving up, go back around and try again, on the theory that
some bits should be free in there someplace and the thread shouldn't be
infinitely unlucky?

In theory that also gives you no guarantee; it's likely that for a real
guarantee some kind of locking is needed, and the spinlock, which
already is there, gives you that. Otoh, a very high likelihood is
probably enough and may give better overall throughput; maybe some
statistics/testing is needed? I don't know how much faster it is without
the spinlock.

tomash

[...]

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

Signed-off-by: Tomas Henzl <the...@redhat.com>

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On 08/01/2013 03:39 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

I was evaluating the same idea for a while too: a loop with just the
test_and_set_bit inside, maybe even a stored value to start the search
from a likely-empty bit from last time, to tune it a bit. But I know
almost nothing about the use pattern, so I decided on the least invasive
change to the existing code, to not make it worse.

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On Thu, Aug 01, 2013 at 04:05:20PM +0200, Tomas Henzl wrote:
On 08/01/2013 03:39 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

I was evaluating the same idea for a while too: a loop with just the
test_and_set_bit inside, maybe even a stored value to start the search
from a likely-empty bit from last time, to tune it a bit. But I know
almost nothing about the use pattern, so I decided on the least invasive
change to the existing code, to not make it worse.

Only reason I haven't done it is I'm loath to make such a change to the
main i/o path without testing it like crazy before unleashing it, and
it's never been a convenient time to slide such a change in around here
and get proper testing done (and there are other rather large changes
brewing). However, we have been using a similar scheme with the SCSI
over PCIe driver, here:
https://github.com/HPSmartStorage/scsi-over-pcie/blob/master/block/sop.c
in alloc_request() around line 1476, without problems, and nvme-core.c
contains similar code in alloc_cmdid(), so I am confident it's sound in
principle. I would want to beat on it though, in case it ends up
exposing a firmware bug or something (not that I think it will, but you
never know).

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On 08/01/2013 04:21 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 04:05:20PM +0200, Tomas Henzl wrote:
On 08/01/2013 03:39 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

I was evaluating the same idea for a while too: a loop with just the
test_and_set_bit inside, maybe even a stored value to start the search
from a likely-empty bit from last time, to tune it a bit. But I know
almost nothing about the use pattern, so I decided on the least invasive
change to the existing code, to not make it worse.

Only reason I haven't done it is I'm loath to make such a change to the
main i/o path without testing it like crazy before unleashing it, and
it's never been a convenient time to slide such a change in around here
and get proper testing done (and there are other rather large changes
brewing). However, we have been using a similar scheme with the SCSI
over PCIe driver, here:
https://github.com/HPSmartStorage/scsi-over-pcie/blob/master/block/sop.c
in alloc_request() around line 1476, without problems, and nvme-core.c
contains similar code in alloc_cmdid(), so I am confident it's sound in
principle. I would want to beat on it though, in case it ends up
exposing a firmware bug or something (not that I think it will, but you
never know).

I think the code is sound; maybe it could hypothetically return -EBUSY,
because the find_first_zero_bit is not atomic, but this possibility is
so low that it doesn't matter. Btw. on line 1284 - isn't it similar to
patch 2/3?

Back to this patch - we can take it as it is, because of the spinlock it
should be safe, or omit it and you can then post a spinlock-less patch.
I'll leave the decision to you.

tomash

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On Thu, Aug 01, 2013 at 04:59:45PM +0200, Tomas Henzl wrote:
On 08/01/2013 04:21 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 04:05:20PM +0200, Tomas Henzl wrote:
On 08/01/2013 03:39 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

I was evaluating the same idea for a while too: a loop with just the
test_and_set_bit inside, maybe even a stored value to start the search
from a likely-empty bit from last time, to tune it a bit. But I know
almost nothing about the use pattern, so I decided on the least invasive
change to the existing code, to not make it worse.

Only reason I haven't done it is I'm loath to make such a change to the
main i/o path without testing it like crazy before unleashing it, and
it's never been a convenient time to slide such a change in around here
and get proper testing done (and there are other rather large changes
brewing). However, we have been using a similar scheme with the SCSI
over PCIe driver, here:
https://github.com/HPSmartStorage/scsi-over-pcie/blob/master/block/sop.c
in alloc_request() around line 1476, without problems, and nvme-core.c
contains similar code in alloc_cmdid(), so I am confident it's sound in
principle. I would want to beat on it though, in case it ends up
exposing a firmware bug or something (not that I think it will, but you
never know).

I think the code is sound; maybe it could hypothetically return -EBUSY,
because the find_first_zero_bit is not atomic, but this possibility is
so low that it doesn't matter. Btw. on line 1284 - isn't it similar to
patch 2/3?

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

I don't think a thread can get stuck in there, never winning until all
the bits are used up, because there should always be enough bits for all
the commands we would ever try to send concurrently, so every thread
that gets in there should eventually get a bit. Or am I missing some
subtlety?

Back to this patch - we can take it as it is, because of the spinlock it
should be safe, or omit it and you can then post a spinlock-less patch.
I'll leave the decision to you.

I think I like the spin-lock-less variant better. But I want to test it
out here for awhile first.

-- steve
Re: [PATCH 1/3] hpsa: remove unneeded loop
On 08/01/2013 05:19 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 04:59:45PM +0200, Tomas Henzl wrote:
On 08/01/2013 04:21 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 04:05:20PM +0200, Tomas Henzl wrote:
On 08/01/2013 03:39 PM, scame...@beardog.cce.hp.com wrote:
On Thu, Aug 01, 2013 at 03:11:22PM +0200, Tomas Henzl wrote:

From: Tomas Henzl <the...@redhat.com>

The cmd_pool_bits is protected everywhere with a spinlock, we don't
need the test_and_set_bit, set_bit is enough and the loop can be
removed too.

[...]

Would it be better instead to just not use the spinlock for protecting
cmd_pool_bits? I have thought about doing this for awhile, but haven't
gotten around to it. I think the while loop is safe without the spin
lock. And then it is not needed in cmd_free either.

I was evaluating the same idea for a while too: a loop with just the
test_and_set_bit inside, maybe even a stored value to start the search
from a likely-empty bit from last time, to tune it a bit. But I know
almost nothing about the use pattern, so I decided on the least invasive
change to the existing code, to not make it worse.

Only reason I haven't done it is I'm loath to make such a change to the
main i/o path without testing it like crazy before unleashing it, and
it's never been a convenient time to slide such a change in around here
and get proper testing done (and there are other rather large changes
brewing). However, we have been using a similar scheme with the SCSI
over PCIe driver, here:
https://github.com/HPSmartStorage/scsi-over-pcie/blob/master/block/sop.c
in alloc_request() around line 1476, without problems, and nvme-core.c
contains similar code in alloc_cmdid(), so I am confident it's sound in
principle. I would want to beat on it though, in case it ends up
exposing a firmware bug or something (not that I think it will, but you
never know).

I think the code is sound; maybe it could hypothetically return -EBUSY,
because the find_first_zero_bit is not atomic, but this possibility is
so low that it doesn't matter. Btw. on line 1284 - isn't it similar to
patch 2/3?

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

Yes. But let's expect just one zero bit at the end of the list. The
find_first_zero_bit (ffzb) starts now, thread+1 zeroes a new bit at the
beginning, ffzb continues, thread+2 takes the zero bit at the end. The
result is that ffzb hasn't found a zero bit, even though at every moment
a zero bit was there. After that, the function returns -EBUSY:

	rc = (u16) find_first_zero_bit(qinfo->request_bits, qinfo->qdepth);
	if (rc >= qinfo->qdepth - 1)
		return (u16) -EBUSY;

Still, I think that this is almost impossible, and if it should happen,
a requeue is not so bad.

I don't think a thread can get stuck in there, never winning until all
the bits are used up, because there should always be enough bits for all
the commands we would ever try to send concurrently, so every thread
that gets in there should eventually get a bit. Or am I missing some
subtlety?

Back to this patch - we can take it as it is, because of the spinlock it
should be safe, or omit it and you can then post a spinlock-less patch.
I'll leave the decision to you.

I think I like the spin-lock-less variant better. But I want to test it
out here for awhile first.
Re: [PATCH 1/3] hpsa: remove unneeded loop
On Thu, Aug 01, 2013 at 05:39:36PM +0200, Tomas Henzl wrote:
On 08/01/2013 05:19 PM, scame...@beardog.cce.hp.com wrote:

[...]

Btw. on line 1284 - isn't it similar to patch 2/3?
^^^

Oh, missed this the first time around: the sop driver uses the
make_request_fn() interface, and it's not a stacked driver either, so
there is no limit to the number of bios the block layer can stuff in --
make_request_fn must succeed. If we get full, we just chain them
together using pointers already in the struct bio for that purpose, so
storing them in the driver requires no memory allocation on the driver's
part. So while it's somewhat similar, we already have to handle the case
of the block layer stuffing infinite bios into the driver, so getting
full is not terribly out of the ordinary in that driver.

That being said, I'm poking around other bits of code lying around here
looking for similar problems, so thanks again for that one.

find_first_zero_bit is not atomic, but the test_and_set_bit, which is
what counts, is atomic. That find_first_zero_bit is not atomic confused
me about this code for a long time, and is why the spin lock was there
in the first place. But if there's a race on the find_first_zero_bit and
it returns the same bit to multiple concurrent threads, only one thread
will win the test_and_set_bit, and the other threads will go back around
the loop to try again, and get a different bit.

Yes. But let's expect just one zero bit at the end of the list. The
find_first_zero_bit (ffzb) starts now, thread+1 zeroes a new bit at the
beginning, ffzb continues, thread+2 takes the zero bit at the end. The
result is that ffzb hasn't found a zero bit, even though at every moment
a zero bit was there. After that, the function returns -EBUSY:

	rc = (u16) find_first_zero_bit(qinfo->request_bits, qinfo->qdepth);
	if (rc >= qinfo->qdepth - 1)
		return (u16) -EBUSY;

Still, I think that this is almost impossible, and if it should happen,
a requeue is not so bad.

Oh, wow. Didn't think of that.

Hmm, technically there is no guarantee that any given thread would ever
get a bit, if all the other threads keep snatching them away just ahead
of an unlucky thread. Could we, instead of giving up, go back around and
try again, on the theory that some bits should be free in there
someplace and the thread shouldn't be infinitely unlucky?

[...]

-- steve