On Thu, Dec 10, 2020 at 09:18:22PM +0100, Thomas Gleixner wrote:
> Prarit reported that depending on the affinity setting the
> 
>  ' irq $N: Affinity broken due to vector space exhaustion.'
> 
> message is showing up in dmesg, but the vector space on the CPUs in the
> affinity mask is definitely not exhausted.
> 
> Shung-Hsi provided traces and analysis which pinpoint the problem:
> 
> The ordering of trying to assign an interrupt vector in
> assign_irq_vector_any_locked() is simply wrong if the interrupt data has a
> valid node assigned. It does:
> 
>  1) Try the intersection of affinity mask and node mask
>  2) Try the node mask
>  3) Try the full affinity mask
>  4) Try the full online mask
> 
> Obviously #2 and #3 are in the wrong order as the requested affinity
> mask has to take precedence.
> 
> In the observed cases #1 failed because the affinity mask did not contain
> CPUs from node 0. That made it allocate a vector from node 0, thereby
> breaking affinity and emitting the misleading message.
> 
> Reverse the order of #2 and #3 so the full affinity mask without the node
> intersection is tried before affinity is actually broken.
> 
> If no node is assigned, only the full affinity mask is tried, followed by
> the full online mask if that fails.
> 
> Fixes: d6ffc6ac83b1 ("x86/vector: Respect affinity mask in irq descriptor")
> Reported-by: Shung-Hsi Yu <shung-hsi...@suse.com>
> Reported-by: Prarit Bhargava <pra...@redhat.com>
> Signed-off-by: Thomas Gleixner <t...@linutronix.de>
> Tested-by: Shung-Hsi Yu <shung-hsi...@suse.com>
> Cc: sta...@vger.kernel.org
> ---
>  arch/x86/kernel/apic/vector.c |   24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/apic/vector.c
> +++ b/arch/x86/kernel/apic/vector.c
> @@ -273,20 +273,24 @@ static int assign_irq_vector_any_locked(
>       const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
>       int node = irq_data_get_node(irqd);
>  
> -     if (node == NUMA_NO_NODE)
> -             goto all;
> -     /* Try the intersection of @affmsk and node mask */
> -     cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> -     if (!assign_vector_locked(irqd, vector_searchmask))
> -             return 0;
> -     /* Try the node mask */
> -     if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> -             return 0;
> -all:
> +     if (node != NUMA_NO_NODE) {
> +             /* Try the intersection of @affmsk and node mask */
> +             cpumask_and(vector_searchmask, cpumask_of_node(node), affmsk);
> +             if (!assign_vector_locked(irqd, vector_searchmask))
> +                     return 0;
> +     }
> +
>       /* Try the full affinity mask */
>       cpumask_and(vector_searchmask, affmsk, cpu_online_mask);
>       if (!assign_vector_locked(irqd, vector_searchmask))
>               return 0;
> +
> +     if (node != NUMA_NO_NODE) {
> +             /* Try the node mask */
> +             if (!assign_vector_locked(irqd, cpumask_of_node(node)))
> +                     return 0;
> +     }
> +
>       /* Try the full online mask */
>       return assign_vector_locked(irqd, cpu_online_mask);
>  }
> 
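
For anyone following along, the effect of the reordering can be modelled
in user space. The sketch below is not kernel code and makes several
assumptions: CPU masks are plain 64-bit bitmaps, model_assign() is a
hypothetical stand-in for assign_vector_locked() that succeeds (returns 0)
when the search mask contains an online CPU with a free vector, and the
names mask_t, enum step, order_old() and order_new() are invented for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU. Not the kernel's struct cpumask. */
typedef uint64_t mask_t;

/* Which search step produced a vector in the model. */
enum step {
	STEP_NODE_AND_AFFINITY = 1,	/* affinity mask intersected with node mask */
	STEP_AFFINITY,			/* full affinity mask */
	STEP_NODE,			/* node mask only: affinity is broken */
	STEP_ONLINE,			/* full online mask: affinity is broken */
	STEP_FAILED
};

/*
 * Stand-in for assign_vector_locked(): returns 0 when some CPU in @search
 * is online and still has a free vector, -1 otherwise.
 */
static int model_assign(mask_t search, mask_t online, mask_t has_free)
{
	return (search & online & has_free) ? 0 : -1;
}

/* Search order before the patch: the node mask is tried before the affinity mask. */
static enum step order_old(mask_t aff, mask_t node, int has_node,
			   mask_t online, mask_t has_free)
{
	assert(online != 0);
	if (has_node) {
		if (!model_assign(aff & node, online, has_free))
			return STEP_NODE_AND_AFFINITY;
		if (!model_assign(node, online, has_free))
			return STEP_NODE;
	}
	if (!model_assign(aff & online, online, has_free))
		return STEP_AFFINITY;
	if (!model_assign(online, online, has_free))
		return STEP_ONLINE;
	return STEP_FAILED;
}

/* Search order after the patch: the affinity mask is tried before the node mask. */
static enum step order_new(mask_t aff, mask_t node, int has_node,
			   mask_t online, mask_t has_free)
{
	assert(online != 0);
	if (has_node) {
		if (!model_assign(aff & node, online, has_free))
			return STEP_NODE_AND_AFFINITY;
	}
	if (!model_assign(aff & online, online, has_free))
		return STEP_AFFINITY;
	if (has_node) {
		if (!model_assign(node, online, has_free))
			return STEP_NODE;
	}
	if (!model_assign(online, online, has_free))
		return STEP_ONLINE;
	return STEP_FAILED;
}
```

With node 0 modelled as CPUs 0-3 (0x0F), an affinity mask of CPUs 4-5
(0x30) and free vectors everywhere, order_old() stops at STEP_NODE, which
corresponds to the misleading "Affinity broken" case from the report,
while order_new() stops at STEP_AFFINITY.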

Reviewed-by: Ming Lei <ming....@redhat.com>


Thanks,
Ming
