Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2016-03-14 Thread Boris Ostrovsky

On 03/12/2016 04:19 AM, Thomas Gleixner wrote:

Boris,

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

On 07/14/2015 04:15 PM, Thomas Gleixner wrote:

The issue here is that all architectures need that protection and just
Xen does irq allocations in cpu_up.

So moving that protection into architecture code is not really an
option.


Otherwise we will need to have something like arch_post_cpu_up()
after the lock is released.

I'm not sure that this will work. You probably want to do this in the
cpu prepare stage, i.e. before calling __cpu_up().

For PV guests (the ones that use xen_cpu_up()) it will work either before or
after __cpu_up(). At least my (somewhat limited) testing didn't show any
problems so far.

However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
see that xen_smp_intr_init() needs to be called before native_cpu_up() but
xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
called after.

I think I can split xen_init_lock_cpu() so that the part that needs to be
called after will avoid going into irq core code. And then the rest will go
into arch_cpu_prepare().

I think we should revisit this for 4.3. For 4.2 we can do the trivial
variant and move the locking in native_cpu_up() and x86 only. x86 was
the only arch on which such wreckage has been seen in the wild, but we
should have that protection for all archs in the long run.

Patch below should fix the issue.

Thanks! Most of my tests passed, I had a couple of failures but I will need to
see whether they are related to this patch.

Did you ever get around to addressing that irq allocation from within cpu_up()?

I really want to generalize the protection instead of carrying that x86-only
hack forever.


Sorry, I completely forgot about this. Let me see how I can move the
allocations out from under the lock. I might just be able to put them in CPU
notifiers: most into CPU_UP_PREPARE, but the spinlock interrupt may need
to go into CPU_ONLINE.


-boris
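
A rough userspace sketch of the notifier split proposed above, offered only as an illustration and not as Xen code. It models allocations running in a prepare stage before the sparse irq lock is taken and in an online stage after it is released. Every `model_*` name, the stage enum, and the count of three allocations are assumptions; the real callbacks would hang off CPU_UP_PREPARE and CPU_ONLINE.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the hotplug stages named in the mail; not kernel definitions. */
enum model_stage { MODEL_UP_PREPARE, MODEL_BRINGUP, MODEL_ONLINE };

static bool model_sparse_lock_held; /* models sparse_irq_lock being held */
static int model_descs_allocated;   /* allocations made outside the lock */
static int model_alloc_under_lock;  /* allocations that would have deadlocked */

/* Models irq_alloc_descs(): it must never run while the lock is held. */
static void model_irq_alloc(void)
{
	if (model_sparse_lock_held)
		model_alloc_under_lock++;
	else
		model_descs_allocated++;
}

/* Hypothetical Xen callback: allocate only in stages outside the lock. */
static void model_xen_notify(enum model_stage stage)
{
	switch (stage) {
	case MODEL_UP_PREPARE:
		model_irq_alloc(); /* e.g. the timer interrupt */
		model_irq_alloc(); /* e.g. the SMP interrupts */
		break;
	case MODEL_ONLINE:
		model_irq_alloc(); /* e.g. the spinlock interrupt */
		break;
	case MODEL_BRINGUP:
		break;             /* nothing is allocated under the lock */
	}
}

/* Models the bringup sequence: prepare, locked arch bringup, online. */
static void model_cpu_up(void)
{
	assert(!model_sparse_lock_held);
	model_xen_notify(MODEL_UP_PREPARE);
	model_sparse_lock_held = true;   /* irq_lock_sparse() */
	model_xen_notify(MODEL_BRINGUP); /* stands in for the arch bringup */
	model_sparse_lock_held = false;  /* irq_unlock_sparse() */
	model_xen_notify(MODEL_ONLINE);
}
```

Calling model_cpu_up() once leaves all three model allocations counted outside the lock and none under it, which is the property the split is meant to provide.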


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2016-03-12 Thread Thomas Gleixner
Boris,

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> On 07/14/2015 04:15 PM, Thomas Gleixner wrote:
> > > > The issue here is that all architectures need that protection and just
> > > > Xen does irq allocations in cpu_up.
> > > > 
> > > > So moving that protection into architecture code is not really an
> > > > option.
> > > > 
> > > > > > > Otherwise we will need to have something like arch_post_cpu_up()
> > > > > > > after the lock is released.
> > > > I'm not sure that this will work. You probably want to do this in the
> > > > cpu prepare stage, i.e. before calling __cpu_up().
> > > For PV guests (the ones that use xen_cpu_up()) it will work either before or
> > > after __cpu_up(). At least my (somewhat limited) testing didn't show any
> > > problems so far.
> > > 
> > > However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
> > > see that xen_smp_intr_init() needs to be called before native_cpu_up() but
> > > xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
> > > called after.
> > > 
> > > I think I can split xen_init_lock_cpu() so that the part that needs to be
> > > called after will avoid going into irq core code. And then the rest will go
> > > into arch_cpu_prepare().
> > I think we should revisit this for 4.3. For 4.2 we can do the trivial
> > variant and move the locking in native_cpu_up() and x86 only. x86 was
> > the only arch on which such wreckage has been seen in the wild, but we
> > should have that protection for all archs in the long run.
> > 
> > Patch below should fix the issue.
> 
> Thanks! Most of my tests passed, I had a couple of failures but I will need to
> see whether they are related to this patch.

Did you ever get around to addressing that irq allocation from within cpu_up()?

I really want to generalize the protection instead of carrying that x86-only
hack forever.

Thanks,

tglx



Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Boris Ostrovsky

On 07/14/2015 04:15 PM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

On 07/14/2015 01:32 PM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

On 07/14/2015 11:44 AM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

Prevent allocation and freeing of interrupt descriptors across cpu
hotplug.

This breaks Xen guests that allocate interrupt descriptors in
.cpu_up().

And where exactly does XEN allocate those descriptors?

xen_cpu_up()
  xen_setup_timer()
  bind_virq_to_irqhandler()
  bind_virq_to_irq()
  xen_allocate_irq_dynamic()
  xen_allocate_irqs_dynamic()
  irq_alloc_descs()


There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()

Sigh.
   


Any chance this locking can be moved into arch code?

No.

The issue here is that all architectures need that protection and just
Xen does irq allocations in cpu_up.

So moving that protection into architecture code is not really an
option.


Otherwise we will need to have something like arch_post_cpu_up()
after the lock is released.

I'm not sure that this will work. You probably want to do this in the
cpu prepare stage, i.e. before calling __cpu_up().

For PV guests (the ones that use xen_cpu_up()) it will work either before or
after __cpu_up(). At least my (somewhat limited) testing didn't show any
problems so far.

However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
see that xen_smp_intr_init() needs to be called before native_cpu_up() but
xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
called after.

I think I can split xen_init_lock_cpu() so that the part that needs to be
called after will avoid going into irq core code. And then the rest will go
into arch_cpu_prepare().

I think we should revisit this for 4.3. For 4.2 we can do the trivial
variant and move the locking in native_cpu_up() and x86 only. x86 was
the only arch on which such wreckage has been seen in the wild, but we
should have that protection for all archs in the long run.

Patch below should fix the issue.



Thanks! Most of my tests passed, I had a couple of failures but I will 
need to see whether they are related to this patch.


-boris



Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Thomas Gleixner
On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> On 07/14/2015 01:32 PM, Thomas Gleixner wrote:
> > On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > > On 07/14/2015 11:44 AM, Thomas Gleixner wrote:
> > > > On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > > > > > > Prevent allocation and freeing of interrupt descriptors across cpu
> > > > > > hotplug.
> > > > > This breaks Xen guests that allocate interrupt descriptors in
> > > > > .cpu_up().
> > > > And where exactly does XEN allocate those descriptors?
> > > xen_cpu_up()
> > >  xen_setup_timer()
> > >  bind_virq_to_irqhandler()
> > >  bind_virq_to_irq()
> > >  xen_allocate_irq_dynamic()
> > >  xen_allocate_irqs_dynamic()
> > >  irq_alloc_descs()
> > > 
> > > 
> > > > There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()
> > Sigh.
> >   
> > > >
> > > > > Any chance this locking can be moved into arch code?
> > > > No.
> > The issue here is that all architectures need that protection and just
> > Xen does irq allocations in cpu_up.
> > 
> > So moving that protection into architecture code is not really an
> > option.
> > 
> > > > > Otherwise we will need to have something like arch_post_cpu_up()
> > > > > after the lock is released.
> > I'm not sure that this will work. You probably want to do this in the
> > cpu prepare stage, i.e. before calling __cpu_up().
> 
> For PV guests (the ones that use xen_cpu_up()) it will work either before or
> after __cpu_up(). At least my (somewhat limited) testing didn't show any
> problems so far.
> 
> However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there you will
> see that xen_smp_intr_init() needs to be called before native_cpu_up() but
> xen_init_lock_cpu() (which eventually calls irq_alloc_descs()) needs to be
> called after.
> 
> I think I can split xen_init_lock_cpu() so that the part that needs to be
> called after will avoid going into irq core code. And then the rest will go
> into arch_cpu_prepare().

I think we should revisit this for 4.3. For 4.2 we can do the trivial
variant and move the locking in native_cpu_up() and x86 only. x86 was
the only arch on which such wreckage has been seen in the wild, but we
should have that protection for all archs in the long run.

Patch below should fix the issue.

Thanks,

tglx
---
commit d4a969314077914a623f3e2c5120cd2ef31aba30
Author: Thomas Gleixner 
Date:   Tue Jul 14 22:03:57 2015 +0200

genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now

Boris reported that the sparse_irq protection around __cpu_up() in the
generic code causes a regression on Xen. Xen allocates interrupts and
some more in the xen_cpu_up() function, so it deadlocks on the
sparse_irq_lock.

There is no simple fix for this and we really should have the
protection for all architectures, but for now the only solution is to
move it to x86 where actual wreckage due to the lack of protection has
been observed.

Reported-by: Boris Ostrovsky 
Fixes: a89941816726 'hotplug: Prevent alloc/free of irq descriptors during cpu up/down'
Signed-off-by: Thomas Gleixner 
Cc: Peter Zijlstra 
Cc: xiao jin 
Cc: Joerg Roedel 
Cc: Borislav Petkov 
Cc: Yanmin Zhang 
Cc: xen-devel 

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d3010aa79daf..b1f3ed9c7a9e 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -992,8 +992,17 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 
 	common_cpu_up(cpu, tidle);
 
+	/*
+	 * We have to walk the irq descriptors to setup the vector
+	 * space for the cpu which comes online.  Prevent irq
+	 * alloc/free across the bringup.
+	 */
+	irq_lock_sparse();
+
 	err = do_boot_cpu(apicid, cpu, tidle);
+
 	if (err) {
+		irq_unlock_sparse();
 		pr_err("do_boot_cpu failed(%d) to wakeup CPU#%u\n", err, cpu);
 		return -EIO;
 	}
@@ -1011,6 +1020,8 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 		touch_nmi_watchdog();
 	}
 
+	irq_unlock_sparse();
+
 	return 0;
 }
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6a374544d495..5644ec5582b9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -527,18 +527,9 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen)
 		goto out_notify;
 	}
 
-	/*
-	 * Some architectures have to walk the irq descriptors to
-	 * setup the vector space for the cpu which comes online.
-	 * Prevent irq alloc/free across the bringup.
-	 */
-	irq_lock_sparse();
-
 	/* Arch-specific enabling code. */
 	ret = __cpu_up(cpu, idle);
 
-	irq_unlock_sparse();
-
 	if (ret != 0)
 		goto out_notify;
 	BUG_ON(!cpu_online(cpu));
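
The deadlock described in the commit message above can be reproduced in miniature in userspace. The sketch below is an illustration under stated assumptions, not kernel code: `model_sparse_lock` stands in for sparse_irq_lock, `model_irq_alloc_descs()` for irq_alloc_descs(), and `model_cpu_up_generic()` for the generic _cpu_up() from the original patch. An error-checking POSIX mutex is used so that the second lock attempt reports EDEADLK instead of hanging the way a kernel mutex would.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in for sparse_irq_lock. */
static pthread_mutex_t model_sparse_lock;

/* Initialise the lock as error-checking so a self re-lock fails loudly. */
static int model_lock_init(void)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&model_sparse_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Models irq_alloc_descs(): the allocator takes the lock itself. */
static int model_irq_alloc_descs(void)
{
	int ret = pthread_mutex_lock(&model_sparse_lock);

	if (ret)
		return ret; /* EDEADLK when the caller already holds the lock */
	pthread_mutex_unlock(&model_sparse_lock);
	return 0;
}

/* Models the generic code holding the lock across the arch bringup. */
static int model_cpu_up_generic(int (*arch_cpu_up)(void))
{
	int ret = pthread_mutex_lock(&model_sparse_lock);

	assert(ret == 0);
	ret = arch_cpu_up(); /* a Xen-like bringup allocates in here */
	pthread_mutex_unlock(&model_sparse_lock);
	return ret;
}
```

Passing model_irq_alloc_descs as the bringup callback makes the inner lock attempt fail with EDEADLK, while calling it outside model_cpu_up_generic() succeeds. On older C libraries the program may need to be built with -pthread.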


Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Boris Ostrovsky

On 07/14/2015 01:32 PM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

On 07/14/2015 11:44 AM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

Prevent allocation and freeing of interrupt descriptors across cpu
hotplug.

This breaks Xen guests that allocate interrupt descriptors in .cpu_up().

And where exactly does XEN allocate those descriptors?

xen_cpu_up()
 xen_setup_timer()
 bind_virq_to_irqhandler()
 bind_virq_to_irq()
 xen_allocate_irq_dynamic()
 xen_allocate_irqs_dynamic()
 irq_alloc_descs()


There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()

Sigh.
  
   

Any chance this locking can be moved into arch code?

No.

The issue here is that all architectures need that protection and just
Xen does irq allocations in cpu_up.

So moving that protection into architecture code is not really an
option.


Otherwise we will need to have something like arch_post_cpu_up()
after the lock is released.

I'm not sure that this will work. You probably want to do this in the
cpu prepare stage, i.e. before calling __cpu_up().




For PV guests (the ones that use xen_cpu_up()) it will work either 
before or after __cpu_up(). At least my (somewhat limited) testing 
didn't show any problems so far.


However, HVM CPUs use xen_hvm_cpu_up() and if you read comments there 
you will see that xen_smp_intr_init() needs to be called before 
native_cpu_up() but xen_init_lock_cpu() (which eventually calls 
irq_alloc_descs()) needs to be called after.


I think I can split xen_init_lock_cpu() so that the part that needs to 
be called after will avoid going into irq core code. And then the rest 
will go into arch_cpu_prepare().



-boris



Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Thomas Gleixner
On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> On 07/14/2015 11:44 AM, Thomas Gleixner wrote:
> > On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > > > Prevent allocation and freeing of interrupt descriptors across cpu
> > > > hotplug.
> > > 
> > > This breaks Xen guests that allocate interrupt descriptors in .cpu_up().
> > And where exactly does XEN allocate those descriptors?
> 
> xen_cpu_up()
> xen_setup_timer()
> bind_virq_to_irqhandler()
> bind_virq_to_irq()
> xen_allocate_irq_dynamic()
> xen_allocate_irqs_dynamic()
> irq_alloc_descs()
> 
> 
> There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()

Sigh.
 
> 
> >   
> > > Any chance this locking can be moved into arch code?
> > No.

The issue here is that all architectures need that protection and just
Xen does irq allocations in cpu_up.

So moving that protection into architecture code is not really an
option.

> > > Otherwise we will need to have something like arch_post_cpu_up()
> > > after the lock is released.

I'm not sure that this will work. You probably want to do this in the
cpu prepare stage, i.e. before calling __cpu_up().

I have to walk the dogs now. Will look into it later tonight.

> > > (The patch doesn't appear to have any side effects for the down path since
> > > Xen
> > > guests deallocate descriptors in __cpu_die()).
> >   Exact place please.
> 
> Whose place? Where descriptors are deallocated?
> 
> __cpu_die()
> xen_cpu_die()
> xen_teardown_timer()
> unbind_from_irqhandler()
> unbind_from_irq()
> __unbind_from_irq()
> xen_free_irq()
> irq_free_descs()
> free_desc()

Right, that's outside the lock held region.

Thanks,

tglx



Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Boris Ostrovsky

On 07/14/2015 11:44 AM, Thomas Gleixner wrote:

On Tue, 14 Jul 2015, Boris Ostrovsky wrote:

Prevent allocation and freeing of interrupt descriptors across cpu
hotplug.


This breaks Xen guests that allocate interrupt descriptors in .cpu_up().

And where exactly does XEN allocate those descriptors?


xen_cpu_up()
xen_setup_timer()
bind_virq_to_irqhandler()
bind_virq_to_irq()
xen_allocate_irq_dynamic()
xen_allocate_irqs_dynamic()
irq_alloc_descs()


There is also a similar path via xen_cpu_up() -> xen_smp_intr_init()


  

Any chance this locking can be moved into arch code?

No.


(The patch doesn't appear to have any side effects for the down path since Xen
guests deallocate descriptors in __cpu_die()).
  
Exact place please.


Whose place? Where descriptors are deallocated?

__cpu_die()
xen_cpu_die()
xen_teardown_timer()
unbind_from_irqhandler()
unbind_from_irq()
__unbind_from_irq()
xen_free_irq()
irq_free_descs()
free_desc()

-boris




Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Thomas Gleixner
On Tue, 14 Jul 2015, Boris Ostrovsky wrote:
> > Prevent allocation and freeing of interrupt descriptors across cpu
> > hotplug.
> 
> 
> This breaks Xen guests that allocate interrupt descriptors in .cpu_up().

And where exactly does XEN allocate those descriptors?
 
> Any chance this locking can be moved into arch code?

No.

> (The patch doesn't appear to have any side effects for the down path since Xen
> guests deallocate descriptors in __cpu_die()).
 
Exact place please.

Thanks,

tglx



Re: [Xen-devel] [patch 1/4] hotplug: Prevent alloc/free of irq descriptors during cpu up/down

2015-07-14 Thread Boris Ostrovsky

On 07/05/2015 01:12 PM, Thomas Gleixner wrote:

When a cpu goes up some architectures (e.g. x86) have to walk the irq
space to set up the vector space for the cpu. While this needs extra
protection at the architecture level we can avoid a few race
conditions by preventing the concurrent allocation/free of irq
descriptors and the associated data.

When a cpu goes down it moves the interrupts which are targeted to
this cpu away by reassigning the affinities. While this happens
interrupts can be allocated and freed, which opens a can of race
conditions in the code which reassigns the affinities because
interrupt descriptors might be freed underneath.

Example:

CPU1                            CPU2
cpu_up/down
 irq_desc = irq_to_desc(irq);
                                remove_from_radix_tree(desc);
 raw_spin_lock(&desc->lock);
                                free(desc);

We could protect the irq descriptors with RCU, but that would require
a full tree change of all accesses to interrupt descriptors. But
fortunately these kind of race conditions are rather limited to a few
things like cpu hotplug. The normal setup/teardown is very well
serialized. So the simpler and obvious solution is:

Prevent allocation and freeing of interrupt descriptors across cpu
hotplug.
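
The CPU1/CPU2 interleaving in the example above can be replayed deterministically in a small userspace model. This is only a sketch under assumptions: the `model_*` names are invented, a one-entry pointer stands in for the radix tree, a boolean stands in for the sparse irq lock, and nothing is really freed, so the stale access is recorded rather than executed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_desc {
	bool freed; /* set where the real code would free the descriptor */
};

static struct model_desc model_table[1];
static struct model_desc *model_slot; /* models the radix tree entry */
static bool model_sparse_locked;      /* models the sparse irq lock */

/* CPU2: remove and free the descriptor; refused while the lock is held. */
static bool model_free_desc(void)
{
	if (model_sparse_locked)
		return false; /* the real caller would block here */
	if (model_slot) {
		model_slot->freed = true;
		model_slot = NULL;
	}
	return true;
}

/*
 * CPU1: hotplug path. Looks the descriptor up, lets CPU2 interleave,
 * then reports whether the descriptor it is about to lock is stale.
 */
static bool model_hotplug_uses_stale_desc(bool take_lock)
{
	struct model_desc *desc;
	bool stale;

	model_table[0].freed = false;
	model_slot = &model_table[0];

	model_sparse_locked = take_lock; /* irq_lock_sparse() */
	desc = model_slot;               /* irq_to_desc(irq) */
	assert(desc != NULL);
	model_free_desc();               /* CPU2 runs in between */
	stale = desc->freed;             /* raw_spin_lock(&desc->lock) */
	model_sparse_locked = false;     /* irq_unlock_sparse() */
	return stale;
}
```

Without the lock the hotplug path ends up holding a descriptor that was freed underneath it; with the lock held the free is held off until the hotplug path is done.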



This breaks Xen guests that allocate interrupt descriptors in .cpu_up().

Any chance this locking can be moved into arch code? Otherwise we will 
need to have something like arch_post_cpu_up() after the lock is released.


(The patch doesn't appear to have any side effects for the down path 
since Xen guests deallocate descriptors in __cpu_die()).



-boris




Signed-off-by: Thomas Gleixner
---
 include/linux/irqdesc.h |    7 ++++++-
 kernel/cpu.c            |   21 ++++++++++++++++++++-
 kernel/irq/internals.h  |    4 ----
 3 files changed, 26 insertions(+), 6 deletions(-)

Index: tip/include/linux/irqdesc.h
===================================================================
--- tip.orig/include/linux/irqdesc.h
+++ tip/include/linux/irqdesc.h
@@ -90,7 +90,12 @@ struct irq_desc {
 	const char		*name;
 } ____cacheline_internodealigned_in_smp;
 
-#ifndef CONFIG_SPARSE_IRQ
+#ifdef CONFIG_SPARSE_IRQ
+extern void irq_lock_sparse(void);
+extern void irq_unlock_sparse(void);
+#else
+static inline void irq_lock_sparse(void) { }
+static inline void irq_unlock_sparse(void) { }
 extern struct irq_desc irq_desc[NR_IRQS];
 #endif
 
Index: tip/kernel/cpu.c
===================================================================
--- tip.orig/kernel/cpu.c
+++ tip/kernel/cpu.c
@@ -392,13 +392,19 @@ static int __ref _cpu_down(unsigned int
 	smpboot_park_threads(cpu);
 
 	/*
-	 * So now all preempt/rcu users must observe !cpu_active().
+	 * Prevent irq alloc/free while the dying cpu reorganizes the
+	 * interrupt affinities.
 	 */
+	irq_lock_sparse();
 
+	/*
+	 * So now all preempt/rcu users must observe !cpu_active().
+	 */
 	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
 	if (err) {
 		/* CPU didn't die: tell everyone.  Can't complain. */
 		cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
+		irq_unlock_sparse();
 		goto out_release;
 	}
 	BUG_ON(cpu_online(cpu));
@@ -415,6 +421,9 @@ static int __ref _cpu_down(unsigned int
 	smp_mb(); /* Read from cpu_dead_idle before __cpu_die(). */
 	per_cpu(cpu_dead_idle, cpu) = false;
 
+	/* Interrupts are moved away from the dying cpu, reenable alloc/free */
+	irq_unlock_sparse();
+
 	hotplug_cpu__broadcast_tick_pull(cpu);
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);
@@ -517,8 +526,18 @@ static int _cpu_up(unsigned int cpu, int
 		goto out_notify;
 	}
 
+	/*
+	 * Some architectures have to walk the irq descriptors to
+	 * setup the vector space for the cpu which comes online.
+	 * Prevent irq alloc/free across the bringup.
+	 */
+	irq_lock_sparse();
+
 	/* Arch-specific enabling code. */
 	ret = __cpu_up(cpu, idle);
+
+	irq_unlock_sparse();
+
 	if (ret != 0)
 		goto out_notify;
 	BUG_ON(!cpu_online(cpu));
Index: tip/kernel/irq/internals.h
===================================================================
--- tip.orig/kernel/irq/internals.h
+++ tip/kernel/irq/internals.h
@@ -76,12 +76,8 @@ extern void unmask_threaded_irq(struct i
 
 #ifdef CONFIG_SPARSE_IRQ
 static inline void irq_mark_irq(unsigned int irq) { }
-extern void irq_lock_sparse(void);
-extern void irq_unlock_sparse(void);
 #else
 extern void irq_mark_irq(unsigned int irq);
-static inline void irq_lock_sparse(void) { }
-static inline void irq_unlock_sparse(void) { }
 #endif
 
 extern void init_kstat_irqs(struct irq_desc *desc, int node, int nr);


--


