In a virtual machine environment, the CPU topology may differ from that of a
normal physical server.
For example (machine with 4 cores, 2 threads per core):

normal physical server:
core-id   thread-0-id   thread-1-id
0         0             4
1         1             5
2         2             6
3         3             7

virtual machine:
core-id   thread-0-id   thread-1-id
0         0             1
1         2             3
2         4             5
3         6             7

If we attach a disk with multiple queues to a virtual machine,
blk_mq_map_queues can cause a serious imbalance between queues, which
hurts system IO performance.

Here is the qemu cmdline:
"-smp 8,sockets=1,cores=8,threads=2"
"-device virtio-blk-pci,drive=drive0,id=device0,num-queues=4,vectors=2"

When vectors is less than num-queues, virtio-blk falls back to
blk_mq_map_queues.

Before this patch:
[root@blk-mq ~]# cat /sys/block/vd*/mq/*/cpu_list
0, 4, 5, 8, 9, 12, 13
1
2, 6, 7, 10, 11, 14, 15
3

After this patch:
[root@blk-mq ~]# cat /sys/block/vd*/mq/*/cpu_list
0, 1, 8, 9
2, 3, 10, 11
4, 5, 12, 13
6, 7, 14, 15

Signed-off-by: luferry <lufe...@163.com>
---
 block/blk-mq-cpumap.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 03a534820271..2eb78ad4a49b 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -35,22 +35,37 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
 {
 	unsigned int *map = qmap->mq_map;
 	unsigned int nr_queues = qmap->nr_queues;
-	unsigned int cpu, first_sibling;
+	unsigned int cpu, first_sibling, core = 0;
+	bool core_map = false;
 
+	/*
+	 * If the number of cores is equal to or greater than nr_queues,
+	 * every queue is sure to get at least one core.
+	 */
+	for_each_possible_cpu(cpu) {
+		if (get_first_sibling(cpu) == cpu)
+			core++;
+	}
+	if (core >= nr_queues)
+		core_map = true;
+
+	core = 0;
 	for_each_possible_cpu(cpu) {
 		/*
-		 * First do sequential mapping between CPUs and queues.
-		 * In case we still have CPUs to map, and we have some number of
-		 * threads per cores then map sibling threads to the same queue for
+		 * If there are enough cores, just map cores to queues;
+		 * else do sequential mapping between CPUs and queues first.
+		 * For the other CPUs, if we have some number of threads per core,
+		 * then map sibling threads to the same queue for
 		 * performace optimizations.
 		 */
-		if (cpu < nr_queues) {
+		if (!core_map && cpu < nr_queues) {
 			map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
 		} else {
 			first_sibling = get_first_sibling(cpu);
-			if (first_sibling == cpu)
-				map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
-			else
+			if (first_sibling == cpu) {
+				map[cpu] = cpu_to_queue_index(qmap, nr_queues, core);
+				core++;
+			} else
 				map[cpu] = map[first_sibling];
 		}
 	}
-- 
2.14.1.40.g8e62ba1
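
For anyone who wants to check the two cpu_list listings without booting a
guest, the following is a rough userspace sketch of both mapping loops. It is
not kernel code and rests on several assumptions: a guest with 16 possible
CPUs whose sibling threads are numbered adjacently, as in the virtual machine
table in the commit message; 4 queues; first_sibling_vm() as a hypothetical
stand-in for get_first_sibling(); and queue_index() as a simplified stand-in
for cpu_to_queue_index() that ignores any queue offset. The names map_old(),
map_new(), cpus_on_queue() and print_cpu_lists() are made up for this
illustration.

```c
#include <stdio.h>

#define NR_CPUS   16	/* assumed guest: 8 cores x 2 threads */
#define NR_QUEUES 4	/* assumed num-queues=4 */

/* Assumed VM topology: siblings are adjacent, (0,1), (2,3), ... */
unsigned int first_sibling_vm(unsigned int cpu)
{
	return cpu & ~1u;
}

/* Simplified stand-in for cpu_to_queue_index(); no queue offset. */
unsigned int queue_index(unsigned int nr_queues, unsigned int idx)
{
	return idx % nr_queues;
}

/* The mapping loop as it looks before the patch. */
void map_old(unsigned int *map, unsigned int nr_cpus, unsigned int nr_queues)
{
	unsigned int cpu, first;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu < nr_queues) {
			map[cpu] = queue_index(nr_queues, cpu);
		} else {
			first = first_sibling_vm(cpu);
			if (first == cpu)
				map[cpu] = queue_index(nr_queues, cpu);
			else
				map[cpu] = map[first];
		}
	}
}

/* The mapping loop as proposed by the patch. */
void map_new(unsigned int *map, unsigned int nr_cpus, unsigned int nr_queues)
{
	unsigned int cpu, first, core = 0;
	int core_map;

	/* Count cores: a CPU that is its own first sibling starts a core. */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (first_sibling_vm(cpu) == cpu)
			core++;
	core_map = core >= nr_queues;

	core = 0;
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (!core_map && cpu < nr_queues) {
			map[cpu] = queue_index(nr_queues, cpu);
		} else {
			first = first_sibling_vm(cpu);
			if (first == cpu)
				map[cpu] = queue_index(nr_queues, core++);
			else
				map[cpu] = map[first];
		}
	}
}

/* Number of CPUs that were mapped to queue q. */
unsigned int cpus_on_queue(const unsigned int *map, unsigned int nr_cpus,
			   unsigned int q)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (map[cpu] == q)
			n++;
	return n;
}

/* Print one line per queue, in the style of the sysfs cpu_list files. */
void print_cpu_lists(const unsigned int *map, unsigned int nr_cpus,
		     unsigned int nr_queues)
{
	unsigned int q, cpu;

	for (q = 0; q < nr_queues; q++) {
		const char *sep = "";

		for (cpu = 0; cpu < nr_cpus; cpu++) {
			if (map[cpu] == q) {
				printf("%s%u", sep, cpu);
				sep = ", ";
			}
		}
		printf("\n");
	}
}
```

Under these assumptions, filling an array of NR_CPUS entries with map_old()
and passing it to print_cpu_lists() should give the "before" listing, and
map_new() the "after" listing, with four CPUs on every queue. Real guests may
enumerate siblings differently, so treat this only as a way to reason about
the loops.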