Hi Thomas,

On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote:
> The real problem is irqbalanced aggressively exhausting the vector space
> of a _whole_ socket to the point that there is not a single vector left
> for serial. That's the problem you want to fix.

I believe this warning also gets triggered even when there's _no_ vector
exhaustion.

This seems to happen when the IRQ's affinity mask is (wrongly) set to CPUs
on a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0).

  $ lscpu
  ...
  NUMA node0 CPU(s):   0-25,52-77
  NUMA node1 CPU(s):   26-51,78-103

  $ cat /sys/kernel/debug/tracing/trace
           ...
  irqbalance-1994    [017] d...    74.912799: irq_matrix_alloc: bit=33 cpu=26 online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217
  irqbalance-1994    [017] d...    74.912802: vector_alloc: irq=4 vector=33 reserved=0 ret=0
  irqbalance-1994    [017] d...    74.912804: vector_update: irq=4 vector=33 cpu=26 prev_vector=33 prev_cpu=7
  irqbalance-1994    [017] d...    74.912805: vector_config: irq=4 vector=33 cpu=26 apicdest=0x00000040
      <idle>-0       [007] d.h.    74.970733: vector_free_moved: irq=4 cpu=7 vector=33 is_managed=0
      <idle>-0       [007] d.h.    74.970738: irq_matrix_free: bit=33 cpu=7 online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217
           ...
    (agetty)-3004    [047] d...    81.731231: vector_deactivate: irq=4 is_managed=0 can_reserve=1 reserve=0
    (agetty)-3004    [047] d...    81.738035: vector_clear: irq=4 vector=33 cpu=26 prev_vector=0 prev_cpu=7
    (agetty)-3004    [047] d...    81.738040: irq_matrix_free: bit=33 cpu=26 online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, global_rsvd=341, total_alloc=215
    (agetty)-3004    [047] d...    81.738046: irq_matrix_reserve: online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215
    (agetty)-3004    [047] d...    81.766739: vector_reserve: irq=4 ret=0
    (agetty)-3004    [047] d...    81.766741: vector_config: irq=4 vector=239 cpu=0 apicdest=0x00000000
    (agetty)-3004    [047] d...    81.777152: vector_activate: irq=4 is_managed=0 can_reserve=1 reserve=0
    (agetty)-3004    [047] d...    81.777157: vector_alloc: irq=4 vector=0 reserved=1 ret=-22
    ----------------------------------------> irq_matrix_alloc() failed with
                                              -EINVAL because the cpumask
                                              passed in is empty: affmask is
                                              (ff,ffffc000,000fffff,fc000000),
                                              i.e. CPUs 26-51,78-103 (node1),
                                              while cpumask_of_node(node) is
                                              (00,00003fff,fff00000,03ffffff),
                                              i.e. CPUs 0-25,52-77 (node0),
                                              so their intersection is empty.

    (agetty)-3004    [047] d...    81.789349: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, global_rsvd=341, total_alloc=216
    (agetty)-3004    [047] d...    81.789351: vector_alloc: irq=4 vector=33 reserved=1 ret=0
    (agetty)-3004    [047] d...    81.789353: vector_update: irq=4 vector=33 cpu=1 prev_vector=0 prev_cpu=26
    (agetty)-3004    [047] d...    81.789355: vector_config: irq=4 vector=33 cpu=1 apicdest=0x00000002
    ----------------------------------------> "irq 4: Affinity broken due to
                                              vector space exhaustion."
                                              warning shows up

    (agetty)-3004    [047] d...    81.900783: irq_matrix_alloc: bit=33 cpu=26 online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, global_rsvd=341, total_alloc=217
    (agetty)-3004    [047] d...    82.053535: vector_alloc: irq=4 vector=33 reserved=0 ret=0
    (agetty)-3004    [047] d...    82.053536: vector_update: irq=4 vector=33 cpu=26 prev_vector=33 prev_cpu=1
    (agetty)-3004    [047] d...    82.053538: vector_config: irq=4 vector=33 cpu=26 apicdest=0x00000040

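As a quick sanity check of the two masks quoted in the trace annotation
above, here is a small Python sketch (not kernel code; the helper names
parse_cpumask() and cpu_ranges() are made up for illustration). It assumes
the masks are printed in the usual comma-separated 32-bit hex word format,
most significant word first:

```python
def parse_cpumask(text):
    """Parse comma-separated 32-bit hex words (most significant first) into an int."""
    value = 0
    for word in text.split(","):
        value = (value << 32) | int(word, 16)
    return value


def cpu_ranges(mask):
    """Return the set bits of mask as a list of inclusive (first, last) CPU ranges."""
    ranges = []
    cpu = 0
    while mask:
        if mask & 1:
            start = cpu
            # Consume the whole run of consecutive set bits.
            while mask & 1:
                mask >>= 1
                cpu += 1
            ranges.append((start, cpu - 1))
        else:
            mask >>= 1
            cpu += 1
    return ranges


# Mask values copied from the trace annotation.
affmask = parse_cpumask("ff,ffffc000,000fffff,fc000000")
node_mask = parse_cpumask("00,00003fff,fff00000,03ffffff")

print(cpu_ranges(affmask))              # [(26, 51), (78, 103)]
print(cpu_ranges(node_mask))            # [(0, 25), (52, 77)]
print(cpu_ranges(affmask & node_mask))  # []
print(bool((affmask >> 1) & 1))         # False: CPU 1 is not in affmask
```

The two masks decode to exactly the node1 and node0 CPU lists that lscpu
reports, so ANDing them leaves nothing to search. The last line shows that
cpu=1, where the vector ends up right before the warning, lies outside the
requested affinity mask.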

Shung-Hsi Yu
