On Fri, Feb 28, 2020 at 11:16:55AM +0100, Jan Beulich wrote:
> On 28.02.2020 10:33, Roger Pau Monne wrote:
> > Current usage of the per-CPU scratch cpumask is dangerous since
> > there's no way to figure out if the mask is already being used except
> > for manual code inspection of all the callers and possible call paths.
> > 
> > This is unsafe and not reliable, so introduce a minimal get/put
> > infrastructure to prevent nested usage of the scratch mask and usage
> > in interrupt context.
> > 
> > Move the definition of scratch_cpumask to smp.c in order to place the
> > declaration and the accessors as close as possible.
> 
> You've changed one instance of "declaration", but not also the other.

Oh, sorry. Sadly you are not the only one with a cold this week :).

> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -196,7 +196,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >  {
> >      unsigned int cpu, old_vector, irq = desc->irq;
> >      unsigned int vector = desc->arch.vector;
> > -    cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
> > +    cpumask_t *tmp_mask = get_scratch_cpumask();
> >  
> >      BUG_ON(!valid_irq_vector(vector));
> >  
> > @@ -208,6 +208,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >          ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
> >          per_cpu(vector_irq, cpu)[vector] = ~irq;
> >      }
> > +    put_scratch_cpumask(tmp_mask);
> >  
> >      desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
> >      cpumask_clear(desc->arch.cpu_mask);
> > @@ -227,8 +228,9 @@ static void _clear_irq_vector(struct irq_desc *desc)
> >  
> >      /* If we were in motion, also clear desc->arch.old_vector */
> >      old_vector = desc->arch.old_vector;
> > -    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
> >  
> > +    cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
> > +    tmp_mask = get_scratch_cpumask();
> 
> Did you test this? It looks overwhelmingly likely that the two
> lines need to be the other way around.

Urg, yes, I've tested it but likely missed to trigger this case and
even worse failed to spot it on my own review. It's obviously wrong.

> > +    /*
> > +     * Due to reentrancy scratch cpumask cannot be used in IRQ, #MC or #NMI
> > +     * context.
> > +     */
> > +    BUG_ON(in_irq() || in_mce_handler() || in_nmi_handler());
> > +
> > +    if ( use && unlikely(this_cpu(scratch_cpumask_use)) )
> > +    {
> > +        printk("scratch CPU mask already in use by %ps (%p)\n",
> > +               this_cpu(scratch_cpumask_use), 
> > this_cpu(scratch_cpumask_use));
> 
> Why the raw %p as well? We don't do so elsewhere, I think. Yes,
> it's debugging code only, but I wonder anyway.

I use addr2line to find the offending line, and it's much easier to do
so if you have the address directly, rather than having to use nm in
order to figure out the address of the symbol and then add the offset.

Maybe I'm missing some other way to do this more easily?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to