On 9/5/25 16:59, Daniel Wagner wrote:
Extend the capabilities of the generic CPU to hardware queue (hctx) mapping code, so it maps houskeeping CPUs and isolated CPUs to the hardware queues evenly.A hctx is only operational when there is at least one online housekeeping CPU assigned (aka active_hctx). Thus, check the final mapping that there is no hctx which has only offline housekeeing CPU and online isolated CPUs. Example mapping result: 16 online CPUs isolcpus=io_queue,2-3,6-7,12-13 Queue mapping: hctx0: default 0 2 hctx1: default 1 3 hctx2: default 4 6 hctx3: default 5 7 hctx4: default 8 12 hctx5: default 9 13 hctx6: default 10 hctx7: default 11 hctx8: default 14 hctx9: default 15 IRQ mapping: irq 42 affinity 0 effective 0 nvme0q0 irq 43 affinity 0 effective 0 nvme0q1 irq 44 affinity 1 effective 1 nvme0q2 irq 45 affinity 4 effective 4 nvme0q3 irq 46 affinity 5 effective 5 nvme0q4 irq 47 affinity 8 effective 8 nvme0q5 irq 48 affinity 9 effective 9 nvme0q6 irq 49 affinity 10 effective 10 nvme0q7 irq 50 affinity 11 effective 11 nvme0q8 irq 51 affinity 14 effective 14 nvme0q9 irq 52 affinity 15 effective 15 nvme0q10 A corner case is when the number of online CPUs and present CPUs differ and the driver asks for less queues than online CPUs, e.g. 8 online CPUs, 16 possible CPUs isolcpus=io_queue,2-3,6-7,12-13 virtio_blk.num_request_queues=2 Queue mapping: hctx0: default 0 1 2 3 4 5 6 7 8 12 13 hctx1: default 9 10 11 14 15 IRQ mapping irq 27 affinity 0 effective 0 virtio0-config irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0 irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1 Noteworthy is that for the normal/default configuration (!isoclpus) the mapping will change for systems which have non hyperthreading CPUs. The main assignment loop will completely rely that group_mask_cpus_evenly to do the right thing. The old code would distribute the CPUs linearly over the hardware context: queue mapping for /dev/nvme0n1 hctx0: default 0 8 hctx1: default 1 9 hctx2: default 2 10 hctx3: default 3 11 hctx4: default 4 12 hctx5: default 5 13 hctx6: default 6 14 hctx7: default 7 15 The assign each hardware context the map generated by the group_mask_cpus_evenly function: queue mapping for /dev/nvme0n1 hctx0: default 0 1 hctx1: default 2 3 hctx2: default 4 5 hctx3: default 6 7 hctx4: default 8 9 hctx5: default 10 11 hctx6: default 12 13 hctx7: default 14 15 In case of hyperthreading CPUs, the resulting map stays the same. Signed-off-by: Daniel Wagner <[email protected]> --- block/blk-mq-cpumap.c | 177 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 158 insertions(+), 19 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 8244ecf878358c0b8de84458dcd5100c2f360213..1e66882e4d5bd9f78d132f3a229a1577853f7a9f 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -17,12 +17,25 @@ #include "blk.h" #include "blk-mq.h"+static struct cpumask blk_hk_online_mask;+ static unsigned int blk_mq_num_queues(const struct cpumask *mask, unsigned int max_queues) { unsigned int num;- num = cpumask_weight(mask);+ if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) { + const struct cpumask *hk_mask; + struct cpumask avail_mask; + + hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE); + cpumask_and(&avail_mask, mask, hk_mask); + + num = cpumask_weight(&avail_mask); + } else { + num = cpumask_weight(mask); + } + return min_not_zero(num, max_queues); }@@ -31,9 +44,13 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,* * Returns an affinity mask that represents the queue-to-CPU mapping * requested by the block layer based on possible CPUs. + * This helper takes isolcpus settings into account. */ const struct cpumask *blk_mq_possible_queue_affinity(void) { + if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) + return housekeeping_cpumask(HK_TYPE_IO_QUEUE); + return cpu_possible_mask; } EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity); @@ -46,6 +63,12 @@ EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity); */ const struct cpumask *blk_mq_online_queue_affinity(void) { + if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) { + cpumask_and(&blk_hk_online_mask, cpu_online_mask, + housekeeping_cpumask(HK_TYPE_IO_QUEUE)); + return &blk_hk_online_mask;
Can you explain the use of 'blk_hk_online_mask'? Why is a static variable? To my untrained eye it's being recalculated every time one calls this function. And only the first invocation run on an empty mask, all subsequent ones see a populated mask. Is that the intention? Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect [email protected] +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
