On Thu, Jul 03, 2025 at 08:58:02AM +0200, Hannes Reinecke wrote:
> > +	for (queue = 0; queue < qmap->nr_queues; queue++) {
> > +		unsigned int idx = (qmap->queue_offset + queue) % nr_masks;
> > +
> > +		for_each_cpu(cpu, &hk_masks[idx]) {
> > +			qmap->mq_map[cpu] = idx;
> > +
> > +			if (cpu_online(cpu))
> > +				cpumask_set_cpu(qmap->mq_map[cpu], active_hctx);
>
> Why cpu_online? Up until this point it really didn't matter if the
> affinity mask was set to 'online' or 'possible' cpus, but here you
> require CPUs to be online...
This part tracks whether a hardware context has at least one online
housekeeping CPU. It is possible to create a configuration where we end
up with a hardware context whose housekeeping CPUs are all offline
while the isolated CPUs mapped to it are online. active_hctx records
which hardware contexts are usable, and that information is used in the
next step...

> > +		}
> > +	}
> > +
> > +	/* Map isolcpus to hardware context */
> > +	queue = cpumask_first(active_hctx);
> > +	for_each_cpu_andnot(cpu, cpu_possible_mask, mask) {
> > +		qmap->mq_map[cpu] = (qmap->queue_offset + queue) % nr_masks;
> > +		queue = cpumask_next_wrap(queue, active_hctx);
> > +	}
>
> Really? Doesn't this map _all_ cpus, and not just the isolcpus?

for_each_cpu_andnot iterates over all CPUs which are not housekeeping
CPUs (mask is the housekeeping mask), thus these are exactly the
isolated CPUs. Note the 'andnot' part. cpumask_first/cpumask_next_wrap
only return hardware contexts which have at least one online
housekeeping CPU. Yes, it is possible to make this a bit smarter, so
that we keep the grouping of the offline CPUs intact, though I am not
sure it is worth adding complexity for a corner case, at least not yet.

> > +fallback:
> > +	/*
> > +	 * Map all CPUs to the first hctx to ensure at least one online
> > +	 * housekeeping CPU is serving it.
> > +	 */
> > +	for_each_possible_cpu(cpu)
> > +		qmap->mq_map[cpu] = 0;
>
> I think you need to map all hctx, no?

The block layer filters out hctx which have no CPU assigned to them
when selecting a queue. This is really a failsafe mode; it just makes
sure the system boots.

> > +	/* Map housekeeping CPUs to a hctx */
> > +	for (queue = 0; queue < qmap->nr_queues; queue++) {
> > +		for_each_cpu(cpu, dev->bus->irq_get_affinity(dev, offset + queue)) {
> > +			qmap->mq_map[cpu] = qmap->queue_offset + queue;
> > +
> > +			cpumask_set_cpu(cpu, mask);
> > +			if (cpu_online(cpu))
> > +				cpumask_set_cpu(qmap->mq_map[cpu], active_hctx);
>
> Now that is really curious. You pick up the interrupt affinity from the
> 'bus', which, I assume, is the PCI bus. And this would imply that the
> bus can (or already is) programmed for this interrupt affinity.

Yes, this is the case. irq_create_affinity_masks uses
group_cpus_evenly/group_mask_cpus_evenly to spread the requested IRQs.
The number of IRQs can be higher than the number of requested queues
here. It's necessary to use the affinity masks created by
irq_create_affinity_masks as input, so that the queue map matches where
the interrupts actually fire.

> Which would imply that this is a usable interrupt affinity from the
> hardware perspective, irrespective on whether the cpu is online or
> not. So why the check to cpu_online()? Can't we simply take the
> existing affinity and rely on the hardware to do the right thing?

Again, this is tracking whether a hctx has an online housekeeping CPU.
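
To make this more concrete, below is a condensed, compile-untested
sketch of the whole flow, stitched together from the hunks quoted above
(cpumask allocation and error handling elided):

static void blk_mq_map_hk_sketch(struct blk_mq_queue_map *qmap,
				 struct device *dev, unsigned int offset)
{
	struct cpumask *active_hctx, *mask;	/* zeroed; allocation elided */
	unsigned int queue, cpu;

	/*
	 * Step 1: map housekeeping CPUs according to the IRQ affinity
	 * and record every hctx which has at least one *online*
	 * housekeeping CPU.
	 */
	for (queue = 0; queue < qmap->nr_queues; queue++) {
		const struct cpumask *m =
			dev->bus->irq_get_affinity(dev, offset + queue);

		for_each_cpu(cpu, m) {
			qmap->mq_map[cpu] = qmap->queue_offset + queue;
			cpumask_set_cpu(cpu, mask);
			if (cpu_online(cpu))
				cpumask_set_cpu(qmap->mq_map[cpu], active_hctx);
		}
	}

	/* No hctx with an online housekeeping CPU: failsafe mode. */
	if (cpumask_empty(active_hctx))
		goto fallback;

	/*
	 * Step 2: spread the isolated CPUs (everything not in the
	 * housekeeping mask) round-robin over the usable hctx only.
	 */
	queue = cpumask_first(active_hctx);
	for_each_cpu_andnot(cpu, cpu_possible_mask, mask) {
		qmap->mq_map[cpu] = queue;
		queue = cpumask_next_wrap(queue, active_hctx);
	}
	return;

fallback:
	/*
	 * Just make sure the system boots; the block layer never
	 * selects a hctx which has no CPU mapped to it.
	 */
	for_each_possible_cpu(cpu)
		qmap->mq_map[cpu] = 0;
}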
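
And on the input side, the relationship between IRQ vectors and queues
is roughly the following (the vector counts are purely illustrative):

	/* e.g. 1 admin vector plus 8 spread I/O vectors requested */
	struct irq_affinity affd = { .pre_vectors = 1 };
	struct irq_affinity_desc *masks;

	masks = irq_create_affinity_masks(1 + 8, &affd);
	/*
	 * masks[1..8] hold the spread I/O vector affinities; the queue
	 * mapping has to consume exactly these masks (here via
	 * dev->bus->irq_get_affinity()) so that mq_map agrees with
	 * where the interrupts actually fire. Since nvecs can be larger
	 * than qmap->nr_queues, the lookup is done per queue.
	 */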