Re: Warning in irq_work_queue_on()

2015-09-05 Thread Paul E. McKenney
On Fri, Sep 04, 2015 at 05:11:54PM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 03, 2015 at 09:58:40AM +0200, Peter Zijlstra wrote:
> > On Thu, Sep 03, 2015 at 02:03:51AM +0200, Frederic Weisbecker wrote:
> > > On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> > > > On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > > > > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> > > > > 
> > > > > It happens in nohz full, but I'm not sure the guilty is nohz full.
> > > > > 
> > > > > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> > > > 
> > > > wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> > > > logic live?
> > > 
> > > Err, got confused with get_nohz_timer_target(). But yeah wake_up_nohz_cpu() is
> > > called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().
> > > 
> > > > 
> > > > > But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> > > > > and I suspect offline CPUs aren't supposed to be there, or it selects the current
> > > > > CPU. And if the CPU is offlined, it shouldn't be running some kthread...
> > > > 
> > > > Do no assume things like that.. always check with the active mask.
> > > 
> > > Hmm, so perhaps we need something like this (makes me realize that
> > > the is_housekeeping_cpu() passes the wrong argument, no issue in practice
> > > since nohz full aren't in the domain tree but I still need to fix that along).
> > > 
> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 0902e4d..2c10a69 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
> > >  
> > >   rcu_read_lock();
> > >   for_each_domain(cpu, sd) {
> > > - for_each_cpu(i, sched_domain_span(sd)) {
> > > + for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {
> > 
> > cpu_active_mask, we clear that when we start killing the cpu. online
> > only gets cleared once the cpu is actually dead.
> 
> So, after our discussion in IRC, I checked how domains are rebuild on hotplug
> ops and it appears that partition_sched_domain() is called on CPU_DOWN_PREPARE
> only. The CPU shouldn't be on the domain tree after that.
> 
> (Correct me if I'm wrong, I really am not an expert in the domain handling code.
> As you said that we can't guarantee that a CPU in the domain tree is in the cpu_online_mask,
> I'm likely wrong somewhere).
> 
> This is then followed by synchronize_sched(). Which means that after that, the
> new version of the CPU domains (with the offlining CPU excluded) is visible
> everywhere while the CPU is still in cpu_online_mask.
> 
> And finally stop machine runs and the CPU is cleared out of cpu_online_mask.
> So I'm probably missing something, otherwise we could find a CPU in the domain
> tree that is not in cpu_online_mask.

OK, I have to ask...  Should I be trying Frederic's patch?

At the current failure rate, I would need to run it for about
a year to reach any reasonable conclusion.  :-/

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Warning in irq_work_queue_on()

2015-09-04 Thread Frederic Weisbecker
On Thu, Sep 03, 2015 at 09:58:40AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 03, 2015 at 02:03:51AM +0200, Frederic Weisbecker wrote:
> > On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> > > On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > > > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> > > > 
> > > > It happens in nohz full, but I'm not sure the guilty is nohz full.
> > > > 
> > > > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> > > 
> > > wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> > > logic live?
> > 
> > Err, got confused with get_nohz_timer_target(). But yeah wake_up_nohz_cpu() is
> > called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().
> > 
> > > 
> > > > But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> > > > and I suspect offline CPUs aren't supposed to be there, or it selects the current
> > > > CPU. And if the CPU is offlined, it shouldn't be running some kthread...
> > > 
> > > Do no assume things like that.. always check with the active mask.
> > 
> > Hmm, so perhaps we need something like this (makes me realize that
> > the is_housekeeping_cpu() passes the wrong argument, no issue in practice
> > since nohz full aren't in the domain tree but I still need to fix that along).
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 0902e4d..2c10a69 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
> >  
> > rcu_read_lock();
> > for_each_domain(cpu, sd) {
> > -   for_each_cpu(i, sched_domain_span(sd)) {
> > +   for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {
> 
> cpu_active_mask, we clear that when we start killing the cpu. online
> only gets cleared once the cpu is actually dead.

So, after our discussion on IRC, I checked how domains are rebuilt on hotplug
ops and it appears that partition_sched_domains() is called on CPU_DOWN_PREPARE
only. The CPU shouldn't be on the domain tree after that.

(Correct me if I'm wrong, I really am not an expert in the domain handling code.
As you said that we can't guarantee that a CPU in the domain tree is in
cpu_online_mask, I'm likely wrong somewhere.)

This is then followed by synchronize_sched(), which means that after that, the
new version of the CPU domains (with the offlining CPU excluded) is visible
everywhere while the CPU is still in cpu_online_mask.

And finally stop machine runs and the CPU is cleared out of cpu_online_mask.
So I'm probably missing something, otherwise we could find a CPU in the domain
tree that is not in cpu_online_mask.
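
The ordering described above can be written down as a small user-space model.
This is only a sketch of the sequence as stated in this message (which its
author suspects is incomplete), not kernel code: struct cpu_state,
enum down_step, apply_step() and span_within_online() are hypothetical names,
and plain bitmasks stand in for the real cpumasks and domain tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU. Not the kernel's data structures. */
struct cpu_state {
	uint32_t domain_span;	/* CPUs reachable through the domain tree */
	uint32_t online;	/* stand-in for cpu_online_mask */
};

/* The three teardown steps, in the order the message describes them. */
enum down_step { DOWN_PREPARE, SYNC_SCHED, STOP_MACHINE };

static void apply_step(struct cpu_state *s, enum down_step step, int cpu)
{
	uint32_t bit = UINT32_C(1) << cpu;

	switch (step) {
	case DOWN_PREPARE:	/* domains rebuilt without the outgoing CPU */
		s->domain_span &= ~bit;
		break;
	case SYNC_SCHED:	/* old domains no longer visible; masks unchanged */
		break;
	case STOP_MACHINE:	/* CPU finally leaves the online mask */
		s->online &= ~bit;
		break;
	}
}

/* True when every CPU in the domain span is also online. */
static bool span_within_online(const struct cpu_state *s)
{
	return (s->domain_span & ~s->online) == 0;
}
```

In this model span_within_online() holds after every step, which is exactly
why the observed warning is puzzling: the model has no room for readers that
still see the old domains or for a target CPU that did not come from the
domain tree at all.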


Re: Warning in irq_work_queue_on()

2015-09-03 Thread Peter Zijlstra
On Thu, Sep 03, 2015 at 02:03:51AM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> > > 
> > > It happens in nohz full, but I'm not sure the guilty is nohz full.
> > > 
> > > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> > 
> > wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> > logic live?
> 
> Err, got confused with get_nohz_timer_target(). But yeah wake_up_nohz_cpu() is
> called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().
> 
> > 
> > > But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> > > and I suspect offline CPUs aren't supposed to be there, or it selects the current
> > > CPU. And if the CPU is offlined, it shouldn't be running some kthread...
> > 
> > Do no assume things like that.. always check with the active mask.
> 
> Hmm, so perhaps we need something like this (makes me realize that
> the is_housekeeping_cpu() passes the wrong argument, no issue in practice
> since nohz full aren't in the domain tree but I still need to fix that along).
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0902e4d..2c10a69 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
>  
>   rcu_read_lock();
>   for_each_domain(cpu, sd) {
> - for_each_cpu(i, sched_domain_span(sd)) {
> + for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {

Use cpu_active_mask; we clear that when we start killing the CPU.
cpu_online_mask only gets cleared once the CPU is actually dead.

>   if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
>   cpu = i;
>   goto unlock;
> 
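
To make the difference between the two masks concrete, here is a minimal
user-space sketch, assuming 32-bit bitmasks in place of real cpumasks.
pick_timer_target() is a hypothetical stand-in for the loop in
get_nohz_timer_target(); it leaves out the is_housekeeping_cpu() test and the
domain hierarchy and only shows how the choice of filter mask changes the
result.

```c
#include <stdint.h>

#define TOY_NR_CPUS 16	/* assumption: matches the 16-CPU test setup */

/*
 * Return the first CPU that is in the domain span, is still in the
 * filter mask, and is not idle. Fall back to the calling CPU if no
 * such CPU exists.
 */
static int pick_timer_target(int this_cpu, uint32_t span,
			     uint32_t filter, uint32_t idle)
{
	uint32_t candidates = span & filter & ~idle;

	for (int i = 0; i < TOY_NR_CPUS; i++) {
		if (candidates & (UINT32_C(1) << i))
			return i;
	}
	return this_cpu;
}
```

With span 0x000f (CPUs 0-3), CPU 0 idle, and CPU 1 already being torn down
(cleared from the active mask 0x000d but still set in the online mask
0x000f), filtering on the online mask picks CPU 1, while filtering on the
active mask skips it and picks CPU 2.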


Re: Warning in irq_work_queue_on()

2015-09-02 Thread Frederic Weisbecker
On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> > 
> > It happens in nohz full, but I'm not sure the guilty is nohz full.
> > 
> > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> 
> wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> logic live?

Err, got confused with get_nohz_timer_target(). But yeah wake_up_nohz_cpu() is
called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().

> 
> > But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> > and I suspect offline CPUs aren't supposed to be there, or it selects the current
> > CPU. And if the CPU is offlined, it shouldn't be running some kthread...
> 
> Do no assume things like that.. always check with the active mask.

Hmm, so perhaps we need something like this (which makes me realize that
is_housekeeping_cpu() is passed the wrong argument; no issue in practice
since nohz full CPUs aren't in the domain tree, but I still need to fix that
along the way).

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0902e4d..2c10a69 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {
 			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;



Re: Warning in irq_work_queue_on()

2015-09-02 Thread Peter Zijlstra
On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> 
> It happens in nohz full, but I'm not sure the guilty is nohz full.
> 
> The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.

wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
logic live?

> But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> and I suspect offline CPUs aren't supposed to be there, or it selects the current
> CPU. And if the CPU is offlined, it shouldn't be running some kthread...

Do not assume things like that; always check against the active mask.


Re: Warning in irq_work_queue_on()

2015-09-02 Thread Frederic Weisbecker
On Wed, Sep 02, 2015 at 03:44:05PM -0400, Tejun Heo wrote:
> (cc'ing peterz)
> 
> Ooh, this is from irq_work which doesn't have much to do with
> workqueue.  Peter?
> 
> On Mon, Aug 24, 2015 at 05:16:11PM -0700, Paul E. McKenney wrote:
> > Hello, Tejun,
> > 
> > As discussed last week, I am getting an occasional warning out of
> > irq_work_queue_on() WARN_ON_ONCE(cpu_is_offline(cpu)).  The repeat-by
> > seems to be a week or so of rcutorture runs on 16-CPU KVM instances
> > on x86.  So please see below on the off-chance that this is of use.
> > I have also attached a .config file.
> > 
> > Thoughts?
> > 
> > Thanx, Paul
> > 
> > 
> > 
> > [  875.702254] ------------[ cut here ]------------
> > [  875.703111] WARNING: CPU: 0 PID: 768 at /home/paulmck/public_git/bisect-linux-rcu/kernel/irq_work.c:69 irq_work_queue_on+0xd4/0x110()
> > [  875.703227] Modules linked in:
> > [  875.703227] CPU: 0 PID: 768 Comm: rcu_torture_rea Tainted: G        W       4.1.0-rc4+ #1
> > [  875.703227] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [  875.703227]  81baadd8 88001dc5fce8 81895418 00aa
> > [  875.703227]   88001dc5fd28 810517d5 00015bc0
> > [  875.703227]  0004 0004 88001fc8f980 88001fc8d500
> > [  875.703227] Call Trace:
> > [  875.703227]  [] dump_stack+0x45/0x57
> > [  875.703227]  [] warn_slowpath_common+0x85/0xc0
> > [  875.703227]  [] warn_slowpath_null+0x15/0x20
> > [  875.703227]  [] irq_work_queue_on+0xd4/0x110
> > [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50

It happens in nohz full, but I'm not sure nohz full is the culprit.

The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
But this shouldn't happen. Either it selects a CPU that is in the domain tree,
and I suspect offline CPUs aren't supposed to be there, or it selects the current
CPU. And if the CPU is offlined, it shouldn't be running some kthread...

> > [  875.703227]  [] wake_up_nohz_cpu+0xb4/0x100
> > [  875.703227]  [] internal_add_timer+0x86/0xa0
> > [  875.703227]  [] mod_timer+0xf1/0x1e0
> > [  875.703227]  [] rcu_torture_reader+0x2a4/0x2e0
> > [  875.703227]  [] ? rcu_torture_reader+0x2e0/0x2e0
> > [  875.703227]  [] ? rcutorture_trace_dump.part.10+0x20/0x20
> > [  875.703227]  [] kthread+0xcd/0xf0
> > [  875.703227]  [] ? kthread_create_on_node+0x180/0x180
> > [  875.703227]  [] ret_from_fork+0x42/0x70
> > [  875.703227]  [] ? kthread_create_on_node+0x180/0x180
> > [  875.703227] ---[ end trace 74175128740d0113 ]---


Re: Warning in irq_work_queue_on()

2015-09-02 Thread Tejun Heo
(cc'ing peterz)

Ooh, this is from irq_work which doesn't have much to do with
workqueue.  Peter?

On Mon, Aug 24, 2015 at 05:16:11PM -0700, Paul E. McKenney wrote:
> Hello, Tejun,
> 
> As discussed last week, I am getting an occasional warning out of
> irq_work_queue_on() WARN_ON_ONCE(cpu_is_offline(cpu)).  The repeat-by
> seems to be a week or so of rcutorture runs on 16-CPU KVM instances
> on x86.  So please see below on the off-chance that this is of use.
> I have also attached a .config file.
> 
> Thoughts?
> 
>   Thanx, Paul
> 
> 
> 
> [  875.702254] ------------[ cut here ]------------
> [  875.703111] WARNING: CPU: 0 PID: 768 at /home/paulmck/public_git/bisect-linux-rcu/kernel/irq_work.c:69 irq_work_queue_on+0xd4/0x110()
> [  875.703227] Modules linked in:
> [  875.703227] CPU: 0 PID: 768 Comm: rcu_torture_rea Tainted: G        W       4.1.0-rc4+ #1
> [  875.703227] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> [  875.703227]  81baadd8 88001dc5fce8 81895418 00aa
> [  875.703227]   88001dc5fd28 810517d5 00015bc0
> [  875.703227]  0004 0004 88001fc8f980 88001fc8d500
> [  875.703227] Call Trace:
> [  875.703227]  [] dump_stack+0x45/0x57
> [  875.703227]  [] warn_slowpath_common+0x85/0xc0
> [  875.703227]  [] warn_slowpath_null+0x15/0x20
> [  875.703227]  [] irq_work_queue_on+0xd4/0x110
> [  875.703227]  [] tick_nohz_full_kick_cpu+0x44/0x50
> [  875.703227]  [] wake_up_nohz_cpu+0xb4/0x100
> [  875.703227]  [] internal_add_timer+0x86/0xa0
> [  875.703227]  [] mod_timer+0xf1/0x1e0
> [  875.703227]  [] rcu_torture_reader+0x2a4/0x2e0
> [  875.703227]  [] ? rcu_torture_reader+0x2e0/0x2e0
> [  875.703227]  [] ? rcutorture_trace_dump.part.10+0x20/0x20
> [  875.703227]  [] kthread+0xcd/0xf0
> [  875.703227]  [] ? kthread_create_on_node+0x180/0x180
> [  875.703227]  [] ret_from_fork+0x42/0x70
> [  875.703227]  [] ? kthread_create_on_node+0x180/0x180
> [  875.703227] ---[ end trace 74175128740d0113 ]---

> #
> # Automatically generated file; DO NOT EDIT.
> # Linux/x86 4.1.0-rc4 Kernel Configuration
> #
> CONFIG_64BIT=y
> CONFIG_X86_64=y
> CONFIG_X86=y
> CONFIG_INSTRUCTION_DECODER=y
> CONFIG_PERF_EVENTS_INTEL_UNCORE=y
> CONFIG_OUTPUT_FORMAT="elf64-x86-64"
> CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_LATENCYTOP_SUPPORT=y
> CONFIG_MMU=y
> CONFIG_NEED_DMA_MAP_STATE=y
> CONFIG_NEED_SG_DMA_LENGTH=y
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_GENERIC_BUG=y
> CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
> CONFIG_GENERIC_HWEIGHT=y
> CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_GENERIC_CALIBRATE_DELAY=y
> CONFIG_ARCH_HAS_CPU_RELAX=y
> CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
> CONFIG_HAVE_SETUP_PER_CPU_AREA=y
> CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
> CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
> CONFIG_ARCH_HIBERNATION_POSSIBLE=y
> CONFIG_ARCH_SUSPEND_POSSIBLE=y
> CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
> CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
> CONFIG_ZONE_DMA32=y
> CONFIG_AUDIT_ARCH=y
> CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
> CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
> CONFIG_HAVE_INTEL_TXT=y
> CONFIG_X86_64_SMP=y
> CONFIG_X86_HT=y
> CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
> CONFIG_ARCH_SUPPORTS_UPROBES=y
> CONFIG_FIX_EARLYCON_MEM=y
> CONFIG_PGTABLE_LEVELS=4
> CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
> CONFIG_IRQ_WORK=y
> CONFIG_BUILDTIME_EXTABLE_SORT=y
> 
> #
> # General setup
> #
> CONFIG_INIT_ENV_ARG_LIMIT=32
> CONFIG_CROSS_COMPILE=""
> # CONFIG_COMPILE_TEST is not set
> CONFIG_LOCALVERSION=""
> # CONFIG_LOCALVERSION_AUTO is not set
> CONFIG_HAVE_KERNEL_GZIP=y
> CONFIG_HAVE_KERNEL_BZIP2=y
> CONFIG_HAVE_KERNEL_LZMA=y
> CONFIG_HAVE_KERNEL_XZ=y
> CONFIG_HAVE_KERNEL_LZO=y
> CONFIG_HAVE_KERNEL_LZ4=y
> CONFIG_KERNEL_GZIP=y
> # CONFIG_KERNEL_BZIP2 is not set
> # CONFIG_KERNEL_LZMA is not set
> # CONFIG_KERNEL_XZ is not set
> # CONFIG_KERNEL_LZO is not set
> # CONFIG_KERNEL_LZ4 is not set
> CONFIG_DEFAULT_HOSTNAME="(none)"
> CONFIG_SWAP=y
> CONFIG_SYSVIPC=y
> CONFIG_SYSVIPC_SYSCTL=y
> CONFIG_POSIX_MQUEUE=y
> CONFIG_POSIX_MQUEUE_SYSCTL=y
> CONFIG_CROSS_MEMORY_ATTACH=y
> CONFIG_FHANDLE=y
> CONFIG_USELIB=y
> CONFIG_AUDIT=y
> CONFIG_HAVE_ARCH_AUDITSYSCALL=y
> CONFIG_AUDITSYSCALL=y
> CONFIG_AUDIT_WATCH=y
> CONFIG_AUDIT_TREE=y
> 
> #
> # IRQ subsystem
> #
> CONFIG_GENERIC_IRQ_PROBE=y
> CONFIG_GENERIC_IRQ_SHOW=y
> CONFIG_GENERIC_IRQ_LEGACY_ALLOC_HWIRQ=y
> CONFIG_GENERIC_PENDING_IRQ=y
> CONFIG_IRQ_DOMAIN=y
> CONFIG_GENERIC_MSI_IRQ=y
> # CONFIG_IRQ_DOMAIN_DEBUG is not set
> CONFIG_IRQ_FORCED_THREADING=y
> CONFIG_SPARSE_IRQ=y
> CONFIG_C