Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 10/6/20 8:24 AM, John Garry wrote:
> On 28/09/2020 17:11, Kashyap Desai wrote:
>>>
>>> John,
>>>
>>>> Have you had a chance to check these outstanding SCSI patches?
>>>>
>>>> scsi: megaraid_sas: Added support for shared host tagset for
>>>> cpuhotplug
>>>> scsi: scsi_debug: Support host tagset
>>>> scsi: hisi_sas: Switch v3 hw to MQ
>>>> scsi: core: Show nr_hw_queues in sysfs
>>>> scsi: Add host and host template flag 'host_tagset'
>>>
>>> These look good to me.
>>>
>>> Jens, feel free to merge.
>>
>> Hi Jens, Gentle ping. I am not able to find commits for above listed scsi
>> patches. I want to use your repo which has above mentioned patch for
>> patch submission. Martin has Acked the scsi patches.
>>
>>>
>>> Acked-by: Martin K. Petersen
>>>
>>> --
>>> Martin K. Petersen	Oracle Linux Engineering
>
> Hi Jens,
>
> Could you kindly pick up the following patches, to go along with the
> blk-mq changes:
>
> scsi: megaraid_sas: Added support for shared host tagset for
> cpuhotplug
> scsi: scsi_debug: Support host tagset
> scsi: hisi_sas: Switch v3 hw to MQ
> scsi: core: Show nr_hw_queues in sysfs
> scsi: Add host and host template flag 'host_tagset'

Sorry about the delay, picked up these 5 patches.

--
Jens Axboe
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 28/09/2020 17:11, Kashyap Desai wrote:
>>
>> John,
>>
>>> Have you had a chance to check these outstanding SCSI patches?
>>>
>>> scsi: megaraid_sas: Added support for shared host tagset for
>>> cpuhotplug
>>> scsi: scsi_debug: Support host tagset
>>> scsi: hisi_sas: Switch v3 hw to MQ
>>> scsi: core: Show nr_hw_queues in sysfs
>>> scsi: Add host and host template flag 'host_tagset'
>>
>> These look good to me.
>>
>> Jens, feel free to merge.
>
> Hi Jens, Gentle ping. I am not able to find commits for above listed scsi
> patches. I want to use your repo which has above mentioned patch for
> patch submission. Martin has Acked the scsi patches.
>
>>
>> Acked-by: Martin K. Petersen
>>
>> --
>> Martin K. Petersen	Oracle Linux Engineering

Hi Jens,

Could you kindly pick up the following patches, to go along with the
blk-mq changes:

scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
scsi: scsi_debug: Support host tagset
scsi: hisi_sas: Switch v3 hw to MQ
scsi: core: Show nr_hw_queues in sysfs
scsi: Add host and host template flag 'host_tagset'

Thanks,
John
RE: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
> > John,
> >
> > > Have you had a chance to check these outstanding SCSI patches?
> > >
> > > scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
> > > scsi: scsi_debug: Support host tagset
> > > scsi: hisi_sas: Switch v3 hw to MQ
> > > scsi: core: Show nr_hw_queues in sysfs
> > > scsi: Add host and host template flag 'host_tagset'
> >
> > These look good to me.
> >
> > Jens, feel free to merge.

Hi Jens, Gentle ping. I am not able to find commits for above listed scsi
patches. I want to use your repo which has above mentioned patch for
patch submission. Martin has Acked the scsi patches.

> >
> > Acked-by: Martin K. Petersen
> >
> > --
> > Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 21/09/2020 23:15, John Garry wrote:
> On 21/09/2020 22:35, don.br...@microchip.com wrote:
>>>> I'm waiting on the hpsa and smartpqi patches update, so please
>>>> kindly merge only those patches, above.
>>>> Thanks!
>>
>> John, the hpsa driver crashes, one or more patches to allow internal
>> commands from Hannes seem to be missing.
>>
>> I'll let you know exactly which ones soon.
>
> Hi Don,
>
> Right, that branch did not include Hannes' patches as they did not
> apply cleanly, but I think that you need the same patches as before. I
> can create a branch for you to test which does include those tomorrow -
> let me know.
>
> Alternatively I think that we could create a hpsa patch which does not
> rely on that series, like I mentioned here [0], but it would not be as
> clean.
>
> Cheers,
> John
>
> [0] https://lore.kernel.org/linux-scsi/dc0e72d8-7076-060c-3cd3-3d51ac7e6...@huawei.com/

JFYI, I put the reserved command v6 patchset and $subject patchset on
this branch:

https://github.com/hisilicon/kernel-dev/commits/private-topic-blk-mq-shared-tags-rfc-v8-resv-v6-baseline

I don't have HW to test hpsa or smartpqi.

Thanks,
John
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 21/09/2020 22:35, don.br...@microchip.com wrote:
>>> I'm waiting on the hpsa and smartpqi patches update, so please
>>> kindly merge only those patches, above.
>>> Thanks!
>
> John, the hpsa driver crashes, one or more patches to allow internal
> commands from Hannes seem to be missing.
>
> I'll let you know exactly which ones soon.

Hi Don,

Right, that branch did not include Hannes' patches as they did not apply
cleanly, but I think that you need the same patches as before. I can
create a branch for you to test which does include those tomorrow - let
me know.

Alternatively I think that we could create a hpsa patch which does not
rely on that series, like I mentioned here [0], but it would not be as
clean.

Cheers,
John

[0] https://lore.kernel.org/linux-scsi/dc0e72d8-7076-060c-3cd3-3d51ac7e6...@huawei.com/
RE: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
Subject: Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs

>>Hi Jens,
>>I'm waiting on the hpsa and smartpqi patches update, so please kindly
>>merge only those patches, above.
>>Thanks!

John, the hpsa driver crashes, one or more patches to allow internal
commands from Hannes seem to be missing.

I'll let you know exactly which ones soon.

Thanks,
Don
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
>> Have you had a chance to check these outstanding SCSI patches?
>>
>> scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
>> scsi: scsi_debug: Support host tagset
>> scsi: hisi_sas: Switch v3 hw to MQ
>> scsi: core: Show nr_hw_queues in sysfs
>> scsi: Add host and host template flag 'host_tagset'
>
> These look good to me.
>
> Jens, feel free to merge.

Hi Jens,

I'm waiting on the hpsa and smartpqi patches update, so please kindly
merge only those patches, above.

Thanks!

> Acked-by: Martin K. Petersen

Cheers Martin
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
John,

> Have you had a chance to check these outstanding SCSI patches?
>
> scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
> scsi: scsi_debug: Support host tagset
> scsi: hisi_sas: Switch v3 hw to MQ
> scsi: core: Show nr_hw_queues in sysfs
> scsi: Add host and host template flag 'host_tagset'

These look good to me.

Jens, feel free to merge.

Acked-by: Martin K. Petersen

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 04/09/2020 13:44, Martin K. Petersen wrote:
>> Martin/James may want more review of the SCSI core bits, though.
>
> I'll take a look later today.

Hi Martin,

Have you had a chance to check these outstanding SCSI patches?

scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
scsi: scsi_debug: Support host tagset
scsi: hisi_sas: Switch v3 hw to MQ
scsi: core: Show nr_hw_queues in sysfs
scsi: Add host and host template flag 'host_tagset'

Cheers,
John
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 8/19/20 5:20 PM, John Garry wrote:
> Hi all,
>
> Here is v8 of the patchset.
>
[...]

Now that Jens merged the block bits in his tree, wouldn't it be better
to re-send the SCSI bits only, thereby avoiding a potential merge error
later on?

Cheers,

Hannes
--
Dr. Hannes Reinecke		Kernel Storage Architect
h...@suse.de			+49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 08/09/2020 13:46, Hannes Reinecke wrote:
> Now that Jens merged the block bits in his tree, wouldn't it be better
> to re-send the SCSI bits only, thereby avoiding a potential merge error
> later on?

Anything which I resend would need to be against Jens' tree (and not
Martin's), assuming Jens will carry them also. So I am not sure how that
will help.

JFYI, I just tested against today's linux-next, and the SCSI parts (hpsa
and smartpqi omitted) still apply there without conflict.

Thanks,
John
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
John,

> Martin/James may want more review of the SCSI core bits, though.

I'll take a look later today.

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 03/09/2020 22:23, Jens Axboe wrote:
> On 8/19/20 9:20 AM, John Garry wrote:
>> Hi all,
>>
>> Here is v8 of the patchset.
>>
[...]
>> Comments (and testing) welcome, thanks!
>
> I applied 1-11, leaving the SCSI core bits and drivers to Martin. I can
> also carry them, just let me know.

Great, thanks!

So the SCSI parts depend on the block parts for building, so I guess it
makes sense if you could carry them also.

hpsa and smartpqi patches are pending for now, but the rest could be
picked up. Martin/James may want more review of the SCSI core bits,
though.

Thanks again,
John
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 8/19/20 9:20 AM, John Garry wrote:
> Hi all,
>
> Here is v8 of the patchset.
>
[...]
> Comments (and testing) welcome, thanks!

I applied 1-11, leaving the SCSI core bits and drivers to Martin. I can
also carry them, just let me know.

--
Jens Axboe
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
On 2020-08-19 11:20 a.m., John Garry wrote:
> Hi all,
>
> Here is v8 of the patchset.
>
[...]
> Comments (and testing) welcome, thanks!

I tested this v8 patchset on MKP's 5.10/scsi-queue branch together with
my rewritten sg driver on my laptop and a Ryzen 5 3600 machine.
Since I don't have the same hardware, I use the scsi_debug driver as the
target:

   modprobe scsi_debug dev_size_mb=1024 sector_size=512 add_host=7
            per_host_store=1 ndelay=1000 random=1 submit_queues=12

My test is a script which runs these three commands many times with
differing parameters:

   sg_mrq_dd iflag=random bs=512 of=/dev/sg8 thr=64 time=2
   time to transfer data was 0.312705 secs, 3433.72 MB/sec
   2097152+0 records in
   2097152+0 records out

   sg_mrq_dd bpt=256 thr=64 mrq=36 time=2 if=/dev/sg8 bs=512 of=/dev/sg9
   time to transfer data was 0.212090 secs, 5062.67 MB/sec
   2097152+0 records in
   2097152+0 records out

   sg_mrq_dd --verify if=/dev/sg8 of=/dev/sg9 bs=512 bpt=256 thr=64 mrq=36 time=2
   Doing verify/cmp rather than copy
   time to transfer data was 0.184563 secs, 5817.75 MB/sec
   2097152+0 records in
   2097152+0 records verified

The above is the output from the last section of my script run on the
Ryzen 5.

So the three steps are:
   1) produce random data on /dev/sg8
   2) copy /dev/sg8 to /dev/sg9
   3) verify /dev/sg8 and /dev/sg9 are the same.

The latter step is done with a sequence of READ(/dev/sg8) and
VERIFY(BYTCHK=1 on /dev/sg9). The "mrq" stands for multiple requests (in
one invocation); the bsg driver did that before its write(2) command was
removed.
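As a cross-check on the figures quoted above: each run moves 2097152
records of 512 bytes (1 GiB), so the reported MB/sec should simply be
bytes over elapsed seconds. A minimal sketch of that arithmetic follows.
The function name is made up for illustration, and reading "MB" as 10^6
bytes is an assumption inferred from the numbers lining up, not something
the sg_mrq_dd output states.

```python
# Recompute the sg_mrq_dd throughput figures from the record count and the
# elapsed times reported in the output above.
# Assumption: "MB" means 10**6 bytes (inferred, not stated in the output).

RECORDS = 2097152   # from "2097152+0 records in"
BLOCK_SIZE = 512    # from bs=512


def throughput_mb_per_s(records: int, block_size: int, seconds: float) -> float:
    """Bytes transferred divided by elapsed time, in units of 10**6 bytes/sec."""
    return records * block_size / seconds / 1e6


# (label, elapsed seconds) for the three steps of the test script.
RUNS = [
    ("random fill of /dev/sg8", 0.312705),
    ("copy /dev/sg8 to /dev/sg9", 0.212090),
    ("verify /dev/sg8 against /dev/sg9", 0.184563),
]

for label, secs in RUNS:
    rate = throughput_mb_per_s(RECORDS, BLOCK_SIZE, secs)
    print(f"{label}: {rate:.2f} MB/sec")
```

Under that assumption the recomputed rates agree with the three MB/sec
values printed by sg_mrq_dd to two decimal places.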
The SCSI devices on the Ryzen 5 machine are:

# lsscsi -gs
[2:0:0:0]  disk     IBM-207x HUSMM8020ASS20    J4B6  /dev/sda      /dev/sg0   200GB
[2:0:1:0]  disk     SEAGATE  ST200FM0073       0007  /dev/sdb      /dev/sg1   200GB
[2:0:2:0]  enclosu  Areca Te ARC-802801.37.69  0137  -             /dev/sg2   -
[3:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdc      /dev/sg3   1.07GB
[4:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdd      /dev/sg4   1.07GB
[5:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sde      /dev/sg5   1.07GB
[6:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdf      /dev/sg6   1.07GB
[7:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdg      /dev/sg7   1.07GB
[8:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdh      /dev/sg8   1.07GB
[9:0:0:0]  disk     Linux    scsi_debug        0190  /dev/sdi      /dev/sg9   1.07GB
[N:0:1:1]  disk     WDC WDS250G2B0C-00PXH0__1        /dev/nvme0n1  -          250GB

My script took 17m12 and the highest throughput (on a copy) was
7.5 GB/sec. Then I reloaded the scsi_debug module, this time with an
additional 'host_max_queue=128' parameter. The script run time was 5
seconds shorter and the maximum throughput was around 7.6 GB/sec.
[Average throughput is around 4 GB/sec.]

For comparison:

# time liburing/examples/io_uring-cp /dev/sdh /dev/sdi

real	0m1.542s
user	0m0.004s
sys	0m1.027s

Umm, that's less than 1 GB/sec. In its defence io_uring-cp is an
extremely simple, single threaded, proof-of-concept copy program, at
least compared to sg_mrq_dd. As used by sg_mrq_dd, the rewritten sg
driver bypasses moving 1 GB to and from the _user_ space while doing the
above copy and verify steps.

So:

Tested-by: Douglas Gilbert

> Differences to v7:
> - Add null_blk and scsi_debug support
> - Drop debugfs tags patch - it's too difficult to be the same between
> hostwide and non-hostwide, as
Re: [PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
Hi Jens,

I was wondering if you could kindly consider the block changes in this
series, since I have now dropped the RFC flag?

I guess patch 5/18 (using pointers to bitmaps) would be of first
concern. We did discuss this previously, and I think what we're doing
now could be considered satisfactory.

Thanks,
john

> Here is v8 of the patchset.
>
[...]
[PATCH v8 00/18] blk-mq/scsi: Provide hostwide shared tags for SCSI HBAs
Hi all,

Here is v8 of the patchset.

In this version of the series, we keep the shared sbitmap for driver tags,
and introduce changes to fix up the tag budgeting across request queues.
We also have a change to count requests per-hctx for when an elevator is
enabled, as an optimisation. I also dropped the debugfs changes - more on
that below.

Some performance figures:

Using 12x SAS SSDs on hisi_sas v3 hw. mq-deadline results are included,
but it is not always an appropriate scheduler to use.

Tag depth		4000 (default)		260**

Baseline (v5.9-rc1):
none sched:		2094K IOPS		513K
mq-deadline sched:	2145K IOPS		1336K

Final, host_tagset=0 in LLDD *, ***:
none sched:		2120K IOPS		550K
mq-deadline sched:	2121K IOPS		1309K

Final ***:
none sched:		2132K IOPS		1185
mq-deadline sched:	2145K IOPS		2097

* this is relevant as this is the performance in supporting but not
  enabling the feature
** depth=260 is relevant as some point where we are regularly waiting for
   tags to be available. Figures are a bit unstable here.
*** Included "[PATCH V4] scsi: core: only re-run queue in
    scsi_end_request() if device queue is busy"

A copy of the patches can be found here:
https://github.com/hisilicon/kernel-dev/tree/private-topic-blk-mq-shared-tags-v8

The hpsa patch depends on:
https://lore.kernel.org/linux-scsi/20200430131904.5847-1-h...@suse.de/

And the smartpqi patch is not to be accepted.

Comments (and testing) welcome, thanks!

Differences to v7:
- Add null_blk and scsi_debug support
- Drop debugfs tags patch - it's too difficult to be the same between
  hostwide and non-hostwide, as discussed:
  https://lore.kernel.org/linux-scsi/1591810159-240929-1-git-send-email-john.ga...@huawei.com/T/#mb3eb462d8be40273718505989abd12f8228c15fd
  And from commit 6bf0eb550452 ("sbitmap: Consider cleared bits in
  sbitmap_bitmap_show()"), I guess not many used this anyway...
- Add elevator per-hctx request count for optimisation
- Break up "blk-mq: rename blk_mq_update_tag_set_depth()" into 2x patches
- Pass flags to avoid per-hq queue tags init/free for hostwide tags
- Add Don's reviewed-tag and tested-by tags to appropriate patches
  - (@Don, please let me know if issue with how I did this)
- Add "scsi: core: Show nr_hw_queues in sysfs"
- Rework megaraid SAS patch to have module param (Kashyap)
- rebase

V7 is here for more info:
https://lore.kernel.org/linux-scsi/1591810159-240929-1-git-send-email-john.ga...@huawei.com/T/#t

Hannes Reinecke (5):
  blk-mq: Rename blk_mq_update_tag_set_depth()
  blk-mq: Free tags in blk_mq_init_tags() upon error
  scsi: Add host and host template flag 'host_tagset'
  hpsa: enable host_tagset and switch to MQ
  smartpqi: enable host tagset

John Garry (10):
  blk-mq: Pass flags for tag init/free
  blk-mq: Use pointers for blk_mq_tags bitmap tags
  blk-mq: Facilitate a shared sbitmap per tagset
  blk-mq: Relocate hctx_may_queue()
  blk-mq: Record nr_active_requests per queue for when using shared
    sbitmap
  blk-mq: Record active_queues_shared_sbitmap per tag_set for when using
    shared sbitmap
  null_blk: Support shared tag bitmap
  scsi: core: Show nr_hw_queues in sysfs
  scsi: hisi_sas: Switch v3 hw to MQ
  scsi: scsi_debug: Support host tagset

Kashyap Desai (2):
  blk-mq, elevator: Count requests per hctx to improve performance
  scsi: megaraid_sas: Added support for shared host tagset for
    cpuhotplug

Ming Lei (1):
  blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHARED

 block/bfq-iosched.c                         |   9 +-
 block/blk-core.c                            |   2 +
 block/blk-mq-debugfs.c                      |  10 +-
 block/blk-mq-sched.c                        |  13 +-
 block/blk-mq-tag.c                          | 149 ++--
 block/blk-mq-tag.h                          |  56 +++-
 block/blk-mq.c                              |  81 +++
 block/blk-mq.h                              |  76 +-
 block/kyber-iosched.c                       |   4 +-
 block/mq-deadline.c                         |   6 +
 drivers/block/null_blk_main.c               |   6 +
 drivers/block/rnbd/rnbd-clt.c               |   2 +-
 drivers/scsi/hisi_sas/hisi_sas.h            |   3 +-
 drivers/scsi/hisi_sas/hisi_sas_main.c       |  36 ++---
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c      |  87 +---
 drivers/scsi/hosts.c                        |   1 +
 drivers/scsi/hpsa.c                         |  44 +-
 drivers/scsi/hpsa.h                         |   1 -
 drivers/scsi/megaraid/megaraid_sas_base.c   |  39 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  29 ++--
 drivers/scsi/scsi_debug.c                   |  28 ++--
 drivers/scsi/scsi_lib.c                     |   2 +
 drivers/scsi/scsi_sysfs.c                   |  11 ++
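
To put the cover letter's default-depth (4000) performance figures in
relative terms, the sketch below computes the percentage change of each
"Final" configuration against the v5.9-rc1 baseline. The dictionary and
function names are made up for illustration. The depth=260 column is left
out on purpose: its last two entries are given without a unit, and the
cover letter itself calls those figures a bit unstable.

```python
# Relative change in IOPS at the default tag depth (4000), using the
# numbers from the cover letter's table. Values are in thousands of IOPS.
# The depth=260 column is deliberately omitted (units ambiguous in the table).

BASELINE = {"none": 2094, "mq-deadline": 2145}   # v5.9-rc1

RESULTS = {
    "Final, host_tagset=0": {"none": 2120, "mq-deadline": 2121},
    "Final": {"none": 2132, "mq-deadline": 2145},
}


def percent_change(new: float, old: float) -> float:
    """Signed change of `new` relative to `old`, in percent."""
    return (new - old) / old * 100.0


for config, figures in RESULTS.items():
    for sched, kiops in figures.items():
        delta = percent_change(kiops, BASELINE[sched])
        print(f"{config:22s} {sched:12s} {kiops}K IOPS ({delta:+.1f}% vs baseline)")
```

At the default depth, every configuration lands within about 2% of the
baseline in either direction.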