On 06/05/2013 04:08 PM, Jiri Kosina wrote:
> On Wed, 5 Jun 2013, Michael Wang wrote:
>
>>> Just to not let this thread sleep -- I am seeing this as well, even with
>>> current Linus' tree (git HEAD == aa4f608).
>>
>> Have you tried this:
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c
>> b/
On Wed, 5 Jun 2013, Michael Wang wrote:
> > Just to not let this thread sleep -- I am seeing this as well, even with
> > current Linus' tree (git HEAD == aa4f608).
>
> Have you tried this:
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c
> b/drivers/cpufreq/cpufreq_governor.c
> index 443442d
Hi, Jiri
On 06/05/2013 05:20 AM, Jiri Kosina wrote:
[snip]
>
> Just to not let this thread sleep -- I am seeing this as well, even with
> current Linus' tree (git HEAD == aa4f608).
Have you tried this:
diff --git a/drivers/cpufreq/cpufreq_governor.c
b/drivers/cpufreq/cpufreq_governor.c
index 4
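The diff itself is cut off in the preview above. As a rough sketch of the kind of guard Michael is proposing (assuming the 3.10-era __gov_queue_work() helper and the cpu_dbs_common_info layout from cpufreq_governor.h; an illustration, not the actual patch), the check would bail out before queueing work on a CPU that has meanwhile gone offline:

/*
 * Sketch only, not the patch from the thread.  Field names follow the
 * 3.10-era cpufreq_governor.h layout and are an assumption.
 */
static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
                                    unsigned int delay)
{
        struct cpu_dbs_common_info *cdbs;

        /* Skip a CPU that went offline after it was read from policy->cpus. */
        if (!cpu_online(cpu))
                return;

        cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
        mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
}

Whether such a check is a real fix or merely hides the underlying race is what the rest of the thread debates.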
On Fri, 17 May 2013, Borislav Petkov wrote:
> commit f7ea0fd639c2c48d3c61b6eec75362be290c6874
> Author: Thomas Gleixner
> Date: Mon May 13 21:40:27 2013 +0200
>
> tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
>
> Now, when I halt the box, I see these splats originat
On 05/21/2013 03:21 PM, Borislav Petkov wrote:
> On Tue, May 21, 2013 at 10:20:51AM +0800, Michael Wang wrote:
>> This is not enough to prove that policy->cpus is wrong; the cpu could
>> be online when taken from policy->cpus, but offline when checked here,
>> since hotplug is able to happen during t
On Tue, May 21, 2013 at 10:20:51AM +0800, Michael Wang wrote:
> This is not enough to prove that policy->cpus is wrong; the cpu could
> be online when taken from policy->cpus, but offline when checked here,
> since hotplug can happen during that period.
Strictly speaking you're correct but I d
On 05/21/2013 10:20 AM, Michael Wang wrote:
[snip]
>
> If hotplug could not happen but we still got an offline cpu from
> policy->cpus, then we could say it's wrong; otherwise we proved nothing...
like this:
diff --git a/drivers/cpufreq/cpufreq_governor.c
b/drivers/cpufreq/cpufreq_governor.c
index
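That diff is also truncated. The idea, as I read it, is to hold the CPU-hotplug read lock around the check so that hotplug cannot race with it; any offline CPU still found in policy->cpus is then a genuine bug rather than a transient window. A sketch under that assumption (not the actual patch):

static void assert_policy_cpus_online(struct cpufreq_policy *policy)
{
        int cpu;

        /*
         * With get_online_cpus() held, hotplug cannot run concurrently,
         * so an offline CPU found here proves policy->cpus is stale.
         */
        get_online_cpus();
        for_each_cpu(cpu, policy->cpus)
                WARN(!cpu_online(cpu),
                     "offline CPU%d in policy->cpus despite hotplug lock\n",
                     cpu);
        put_online_cpus();
}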
On 05/20/2013 09:23 PM, Borislav Petkov wrote:
> On Mon, May 20, 2013 at 05:24:05PM +0800, Michael Wang wrote:
diff --git a/drivers/cpufreq/cpufreq_governor.c
b/drivers/cpufreq/cpufreq_governor.c
index 443442d..449be88 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/d
On Mon, May 20, 2013 at 07:13:08PM +0530, Viresh Kumar wrote:
> Hmm, so for sure there is some locking issue there. Have you tried my
> patch?
No, not yet. Pretty busy ATM. Btw, you could try reproducing it too, in
the meantime - simply enable
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
On 20 May 2013 18:53, Borislav Petkov wrote:
> I just confirmed that policy->cpus contains offlined cores with this:
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c
> b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad82d23..e8c25f71e9b6 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> ++
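The debug hunk is truncated; a hypothetical sketch of that kind of confirmation check (not Borislav's actual diff) is simply a walk over policy->cpus that complains about offline entries:

/* Hypothetical debug check, not the actual hunk from the thread. */
static void check_policy_cpus(struct cpufreq_policy *policy)
{
        int cpu;

        for_each_cpu(cpu, policy->cpus)
                if (!cpu_online(cpu))
                        pr_warn("cpufreq: offline CPU%d still in policy->cpus\n",
                                cpu);
}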
On Mon, May 20, 2013 at 05:24:05PM +0800, Michael Wang wrote:
> >> diff --git a/drivers/cpufreq/cpufreq_governor.c
> >> b/drivers/cpufreq/cpufreq_governor.c
> >> index 443442d..449be88 100644
> >> --- a/drivers/cpufreq/cpufreq_governor.c
> >> +++ b/drivers/cpufreq/cpufreq_governor.c
> >> @@ -26,6 +
On 20 May 2013 15:10, Viresh Kumar wrote:
> On 20 May 2013 15:01, Srivatsa S. Bhat
> wrote:
>> And Viresh, in the regular hotplug paths, the call to gov_cancel_work() is
>> supposed to kill any pending workqueue functions pertaining to offline CPUs
>> right?
>
> Yes.. It will cancel work for all
On 20 May 2013 15:01, Srivatsa S. Bhat wrote:
> And Viresh, in the regular hotplug paths, the call to gov_cancel_work() is
> supposed to kill any pending workqueue functions pertaining to offline CPUs
> right?
Yes. It will cancel the work for all cpus first and then start it again
for the online cpus.
On 05/20/2013 01:40 PM, Frederic Weisbecker wrote:
> 2013/5/20 Borislav Petkov :
>> On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
>>> I suppose the reason is that the cpu we passed to
>>> mod_delayed_work_on() has a chance to become offline before we
>>> disabled irq, what about che
On 05/20/2013 05:09 PM, Viresh Kumar wrote:
> On 20 May 2013 14:26, Michael Wang wrote:
>> On 05/20/2013 03:25 PM, Michael Wang wrote:
>>> Yeah, that's right. I guess the issue is that although policy->cpus is
>>> correct at a given time, after we get a cpu from it, it can still be
>>> changed, unl
On 20 May 2013 14:26, Michael Wang wrote:
> On 05/20/2013 03:25 PM, Michael Wang wrote:
>> Yeah, that's right. I guess the issue is that although policy->cpus is
>> correct at a given time, after we get a cpu from it, it can still be
>> changed, unless we disable preempt or irqs, or block hotplug before
On 05/20/2013 03:25 PM, Michael Wang wrote:
[snip]
>
> Yeah, that's right. I guess the issue is that although policy->cpus is
> correct at a given time, after we get a cpu from it, it can still be
> changed, unless we disable preempt or irqs, or block hotplug, before we use it...
>
> Like such issue cases:
>
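The concrete cases are cut off in this preview. The general pattern being described is that a CPU read from policy->cpus is only guaranteed to stay online while hotplug is held off (via the hotplug read lock, or by disabling preemption/irqs) across both the read and the use. A sketch of the locked variant, where queue_one() is a hypothetical stand-in for the per-CPU queueing helper:

static void gov_queue_work_locked(struct cpufreq_policy *policy,
                                  unsigned int delay)
{
        int cpu;

        /* Hold the hotplug read lock across read-and-use so the CPU
         * cannot go offline between the two. */
        get_online_cpus();
        for_each_cpu(cpu, policy->cpus)
                queue_one(cpu, delay);          /* hypothetical helper */
        put_online_cpus();
}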
2013/5/20 Borislav Petkov :
> On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
>> I suppose the reason is that the cpu we passed to
>> mod_delayed_work_on() has a chance to become offline before we
>> disable irqs; what about checking it before sending the resched IPI? Like:
>
> I think this is
Hello,
On Mon, May 20, 2013 at 08:47:27AM +0200, Borislav Petkov wrote:
> > So there are two questions here:
> > 1. Does gov_queue_work() want to queue the work on an offline cpu?
> > 2. Does mod_delayed_work_on() allow an offline cpu?
> >
> > I guess both should be false?
>
> Well, if we don't allow queu
Hi, Viresh
On 05/20/2013 03:12 PM, Viresh Kumar wrote:
> Hi Michael,
>
> I haven't followed this mail chain earlier and saw this mail only as I am
> added in cc now. I probably have answers to a few questions here:
Thanks for your quick response :)
>
> On 20 May 2013 12:36, Michael Wang wrote:
>>
Hi Michael,
I haven't followed this mail chain earlier and saw this mail only as I am
added in cc now. I probably have answers to a few questions here:
On 20 May 2013 12:36, Michael Wang wrote:
> On 05/20/2013 02:58 PM, Michael Wang wrote:
>> On 05/20/2013 02:47 PM, Borislav Petkov wrote:
>>> On M
On 05/20/2013 02:58 PM, Michael Wang wrote:
> On 05/20/2013 02:47 PM, Borislav Petkov wrote:
>> On Mon, May 20, 2013 at 02:23:37PM +0800, Michael Wang wrote:
>>> On 05/20/2013 12:50 PM, Borislav Petkov wrote:
On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
> I suppose the rea
On 05/20/2013 02:47 PM, Borislav Petkov wrote:
> On Mon, May 20, 2013 at 02:23:37PM +0800, Michael Wang wrote:
>> On 05/20/2013 12:50 PM, Borislav Petkov wrote:
>>> On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
I suppose the reason is that the cpu we passed to
mod_delayed_
On Mon, May 20, 2013 at 02:23:37PM +0800, Michael Wang wrote:
> On 05/20/2013 12:50 PM, Borislav Petkov wrote:
> > On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
> >> I suppose the reason is that the cpu we passed to
> >> mod_delayed_work_on() has a chance to become offline before we
On 05/20/2013 12:50 PM, Borislav Petkov wrote:
> On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
>> I suppose the reason is that the cpu we passed to
>> mod_delayed_work_on() has a chance to become offline before we
>> disable irqs; what about checking it before sending the resched IPI? Like:
>
On Mon, May 20, 2013 at 11:16:33AM +0800, Michael Wang wrote:
> I suppose the reason is that the cpu we passed to
> mod_delayed_work_on() has a chance to become offline before we
> disable irqs; what about checking it before sending the resched IPI? Like:
I think this is only addressing the symptoms - what
Hi, Borislav
On 05/17/2013 09:56 PM, Borislav Petkov wrote:
[snip]
> [ 51.737378] [] native_smp_send_reschedule+0x58/0x60
> [ 51.744013] [] wake_up_nohz_cpu+0x2d/0xa0
I suppose the reason is that the cpu we passed to mod_delayed_work_on()
has a chance to become offline before we disabled ir
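The suggested hunk is truncated above. As a sketch of the idea (not the actual patch), the check sits in the nohz wakeup path so that no reschedule IPI is sent to a CPU that has gone offline; the wake_up_nohz_cpu() shape below follows 3.10's kernel/sched/core.c:

/* Sketch only: bail out before the reschedule IPI if the target CPU
 * is (or has just gone) offline. */
void wake_up_nohz_cpu(int cpu)
{
        if (!cpu_online(cpu))
                return;

        if (!wake_up_full_nohz_cpu(cpu))
                wake_up_idle_cpu(cpu);
}

Borislav's reply above objects that this only addresses the symptoms.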
On Thu, May 16, 2013 at 12:43:58AM +0200, Borislav Petkov wrote:
> On Wed, May 15, 2013 at 11:45:28AM -0700, Paul E. McKenney wrote:
> > Does the following patch help?
>
> Hmm, I just tried on 3.10-rc1
>
> CONFIG_NO_HZ_FULL_ALL=y
>
> on the one hand and then
>
> CONFIG_NO_HZ_FULL=y
> # CONFIG_N
On Wed, May 15, 2013 at 11:45:28AM -0700, Paul E. McKenney wrote:
> Does the following patch help?
Hmm, I just tried on 3.10-rc1
CONFIG_NO_HZ_FULL_ALL=y
on the one hand and then
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
with "nohz_full=4-7 rcu_nocbs=4-7" on the cmdline and I don't
On Thu, May 09, 2013 at 02:58:59PM +0200, Borislav Petkov wrote:
> On Thu, May 09, 2013 at 02:50:40PM +0200, Borislav Petkov wrote:
> > Looks like we're sending a resched IPI to a cpu which is not online
> > yet in order to start the MCE polling timer. So the rcu* options are
> > kinda unlikely to
On Mon, 13 May 2013, Thomas Gleixner wrote:
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -650,6 +650,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct
> > > tick_sched *ts,
> > >
> > > ts->last_tick = hrtimer_get_expires(&ts->sched_time
On Mon, 13 May 2013, Jiri Kosina wrote:
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -650,6 +650,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct
> > tick_sched *ts,
> >
> > ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
> >
On Fri, 10 May 2013, Frederic Weisbecker wrote:
> The problem is that it doesn't catch issues with irqs that have been enabled
> before in start_secondary(), then re-disabled somewhow. Warning on offline
> CPU from the place
> that disables the tick should catch the issue.
>
> Jiri, could you t
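The warning patch itself does not survive in these previews. A sketch of the kind of check Frederic describes, placed where tick_nohz_stop_sched_tick() marks the tick as stopped (the context lines follow 3.10's kernel/time/tick-sched.c; the WARN line is the illustrative addition, not the literal patch):

                ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
                /* Sketch: complain if the tick is being stopped on a CPU
                 * that is not online. */
                WARN_ON_ONCE(!cpu_online(smp_processor_id()));
                ts->tick_stopped = 1;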
On Fri, May 10, 2013 at 06:23:40PM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2013 at 05:43:50PM +0200, Frederic Weisbecker wrote:
> > So either interrupts are spuriously enabled early, or ts->tick_stopped
> > is not correctly initialized.
>
> Hmm, it can't be interrupts disabled because add_
On Fri, May 10, 2013 at 05:43:50PM +0200, Frederic Weisbecker wrote:
> Right. But this is adding a timer locally, from CPU 1 to CPU 1, as
> indicated in the trace with the "1 1" line. So the only way for
> this IPI to be self-sent is if the tick is stopped locally (cf:
> wake_up_full_nohz_cpu()).
>
On Fri, May 10, 2013 at 05:21:02PM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2013 at 05:03:56PM +0200, Jiri Kosina wrote:
> > [ ... snip ... ]
> > Enabling non-boot CPUs ...
> > smpboot: Booting Node 0 Processor 1 APIC 0x1
> > CPU1 microcode updated early to revision 0x60f, date = 2010-09-
On Fri, 10 May 2013, Frederic Weisbecker wrote:
> In fact it would be nice to have DO_ONCE(something) and stuff whatever
> we want inside.
> All the printk_once() et. al could even be implemented using that.
Sounds nice, but if it's going to be used for something else than purely
debugging outpu
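Sketching what such a DO_ONCE() helper might look like, purely as an illustration of the idea (the kernel did not have one at the time, and this simplified version is not SMP-safe the way a real once-infrastructure would need to be):

/* Hypothetical helper: run the statement(s) at most once per call site,
 * the way printk_once() limits its printk.  Not atomic; illustration only. */
#define DO_ONCE(...)                                    \
do {                                                    \
        static bool __done;                             \
                                                        \
        if (!__done) {                                  \
                __done = true;                          \
                __VA_ARGS__;                            \
        }                                               \
} while (0)

/* A printk_once()-style user would then just be: */
#define my_printk_once(fmt, ...) DO_ONCE(printk(fmt, ##__VA_ARGS__))

Anything beyond debug output would need a more careful, race-free implementation than this.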
On Fri, May 10, 2013 at 11:45:56AM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2013 at 11:37:29AM +0200, Ingo Molnar wrote:
> > The pattern I use in such cases is:
> >
> > if (WARN_ONCE(!cpu_online(cpu))) {
> > printk("%d %d\n", cpu, smp_processor_id());
> > dump_st
On Fri, May 10, 2013 at 05:03:56PM +0200, Jiri Kosina wrote:
> [ ... snip ... ]
> Enabling non-boot CPUs ...
> smpboot: Booting Node 0 Processor 1 APIC 0x1
> CPU1 microcode updated early to revision 0x60f, date = 2010-09-29
> Disabled fast string operations
> 1 1
> CPU: 1 PID: 0 Comm: swapper
On Fri, 10 May 2013, Frederic Weisbecker wrote:
> Like Borislav said, it's due to the scheduler IPI sent to an offline
> target. Here this is because we enqueue a timer and we must ensure the
> target handles this timer by rescheduling its tick if necessary.
>
> But it's weird because the mce tim
On Fri, May 10, 2013 at 11:37:29AM +0200, Ingo Molnar wrote:
> The pattern I use in such cases is:
>
> if (WARN_ONCE(!cpu_online(cpu))) {
> printk("%d %d\n", cpu, smp_processor_id());
> dump_stack();
> }
Cool, and WARN_ONCE dumps stack already so:
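The rest of Borislav's mail is cut off. Since WARN_ONCE() already takes a condition plus a printk format and dumps the stack itself, the pattern presumably collapses to a single line; a sketch, not necessarily Borislav's exact hunk:

/* WARN_ONCE() prints the message and dumps the stack on its own, so the
 * explicit printk()/dump_stack() pair is unnecessary. */
WARN_ONCE(!cpu_online(cpu), "cpu %d woken from CPU%d\n",
          cpu, smp_processor_id());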
* Frederic Weisbecker wrote:
> 2013/5/10 Borislav Petkov :
> > On Fri, May 10, 2013 at 02:29:31AM +0200, Frederic Weisbecker wrote:
> >> @@ -616,8 +616,17 @@ static bool wake_up_full_nohz_cpu(int cpu)
> >> {
> >> if (tick_nohz_full_cpu(cpu)) {
> >> if (cpu != smp_processor_i
On Fri, May 10, 2013 at 11:26:39AM +0200, Frederic Weisbecker wrote:
> 2013/5/10 Borislav Petkov :
> > On Fri, May 10, 2013 at 02:29:31AM +0200, Frederic Weisbecker wrote:
> >> @@ -616,8 +616,17 @@ static bool wake_up_full_nohz_cpu(int cpu)
> >> {
> >> if (tick_nohz_full_cpu(cpu)) {
> >>
2013/5/10 Borislav Petkov :
> On Fri, May 10, 2013 at 02:29:31AM +0200, Frederic Weisbecker wrote:
>> @@ -616,8 +616,17 @@ static bool wake_up_full_nohz_cpu(int cpu)
>> {
>> if (tick_nohz_full_cpu(cpu)) {
>> if (cpu != smp_processor_id() ||
>> - tick_nohz_tick_s
On Fri, May 10, 2013 at 02:29:31AM +0200, Frederic Weisbecker wrote:
> @@ -616,8 +616,17 @@ static bool wake_up_full_nohz_cpu(int cpu)
> {
> if (tick_nohz_full_cpu(cpu)) {
> if (cpu != smp_processor_id() ||
> - tick_nohz_tick_stopped())
> + tick_
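Frederic's debug hunk is truncated in every copy of this mail; the visible context is wake_up_full_nohz_cpu() in kernel/sched/core.c. As a sketch of that kind of instrumentation (not the literal patch), the added lines would warn before a reschedule IPI goes out to an offline CPU:

static bool wake_up_full_nohz_cpu(int cpu)
{
        if (tick_nohz_full_cpu(cpu)) {
                if (cpu != smp_processor_id() ||
                    tick_nohz_tick_stopped()) {
                        /* Sketch: catch the offending caller in the act. */
                        if (WARN_ON_ONCE(!cpu_online(cpu)))
                                return true;    /* don't IPI an offline CPU */
                        smp_send_reschedule(cpu);
                }
                return true;
        }
        return false;
}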
On Thu, May 09, 2013 at 02:29:18PM +0200, Jiri Kosina wrote:
> Hi,
>
> I just got the warning below when resuming from hibernation with kernel
> that has NO_HZ_FULL_ALL=y. This is with topmost commit e0fd9affeb640.
>
>
> [ ... snip ... ]
> PM: Hibernation mode set to 'shutdown'
> PM: Marking
On Thu, May 09, 2013 at 02:50:40PM +0200, Borislav Petkov wrote:
> Looks like we're sending a resched IPI to a cpu which is not online
> yet in order to start the MCE polling timer. So the rcu* options are
> kinda unlikely to be related, AFAICT.
On second thought, they must be somehow indirectly
On Thu, May 09, 2013 at 02:29:18PM +0200, Jiri Kosina wrote:
> Hi,
>
> I just got the warning below when resuming from hibernation with kernel
> that has NO_HZ_FULL_ALL=y. This is with topmost commit e0fd9affeb640.
Did you boot with any of the NO_HZ_FULL options on the command line,
i.e. rcu_noc