On Mon, 2013-03-18 at 12:06 -0700, Tejun Heo wrote:
> Me neither. Unfortunately, I'm out of ideas at the moment.
> Hmm... last year, there was a similar issue, I think it was in AMD
> cpufreq, which was caused by work function doing
> set_cpus_allowed_ptr(), so the idle worker was on the correct
On Mon, Mar 18, 2013 at 02:57:30PM -0400, Steven Rostedt wrote:
> I like the theory, but it has one flaw. I agree that the update should
> be wrapped in preempt_disable() but since this bug happens on the same
> CPU, the state of the list will be the same when it was preempted to
> when it bugged.
On Mon, 2013-03-18 at 11:21 -0700, Tejun Heo wrote:
> I've been thinking about it and AFAICS the only way that BUG_ON()
> could trigger from preemption is if preemption happens while the
> idle_list head is becoming or stopping being empty.
> ie. pool->worklist is half updated so list_empty() isn'
On Mon, 2013-03-18 at 11:26 -0700, Tejun Heo wrote:
> > Hmm, the issue is that a "use to be" idle thread got migrated, and is
> > now being woken up by another worker. What can cause an established
> > worker to migrate without HOTPLUG being active?
>
> It doesn't. I think it's trying to wakeup
On Mon, Mar 18, 2013 at 01:08:07PM -0400, Steven Rostedt wrote:
> On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote:
>
> > Making gcwq locks disable preemption would be much safer / easier, but
> > if that's not desirable, anything touching gcwq->idle_list would be a
> > good place to start - wor
On Mon, Mar 18, 2013 at 02:23:56PM -0400, Steven Rostedt wrote:
> On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote:
> > Hello, Steven.
> >
> > On Mon, Mar 18, 2013 at 12:30:43PM -0400, Steven Rostedt wrote:
> > > If you happen to know the critical areas that require preemption to be
> > > disabl
On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote:
> Hello, Steven.
>
> On Mon, Mar 18, 2013 at 12:30:43PM -0400, Steven Rostedt wrote:
> > If you happen to know the critical areas that require preemption to be
> > disabled for real, we can encapsulate them with:
> >
> > preempt_disable_rt()
On Mon, 2013-03-18 at 09:43 -0700, Tejun Heo wrote:
> Making gcwq locks disable preemption would be much safer / easier, but
> if that's not desirable, anything touching gcwq->idle_list would be a
> good place to start - worker_enter_idle() and worker_leave_idle().
> Hmmm... ignoring CPU hotplug,
On Mon, Mar 18, 2013 at 12:41:23PM -0400, Steven Rostedt wrote:
> But, I'm worried about the loops that are done while holding this lock.
> Just looking at is_chained_work() that does for_each_busy_worker(), how
> big can that list be? If it's bound by # of CPUs then that may be fine,
> but if it c
Hello, Steven.
On Mon, Mar 18, 2013 at 12:30:43PM -0400, Steven Rostedt wrote:
> If you happen to know the critical areas that require preemption to be
> disabled for real, we can encapsulate them with:
>
> preempt_disable_rt();
>
> preempt_enable_rt();
>
> These are currently only
On Mon, 2013-03-18 at 09:27 -0700, Tejun Heo wrote:
> Does that mean that a task holding gcwq->lock may be preempted? If
> so, that sure could lead to weird problems. Maybe gcwq->lock should
> be marked non-preemptible somehow?
If the gcwq->lock is never held for a long time (really, more than
On Mon, 2013-03-18 at 12:27 -0400, Steven Rostedt wrote:
> IOW, what can happen in -rt here is:
>
> spin_lock_irq(&gcwq->lock);
> [...]
>
> -> preempt_schedule();
> schedule();
> try_to_wake_up_local();
>
> [...]
> sp
Hey, Steven.
On Mon, Mar 18, 2013 at 12:23:19PM -0400, Steven Rostedt wrote:
> > Maybe I'm confused but I can't really see how the above would be a
> > problem to workqueue in itself. Both rq->lock and gcwq->lock are
> > irq-safe, so spin_lock() not disabling preemption shouldn't be a
> > problem
On Mon, 2013-03-18 at 12:23 -0400, Steven Rostedt wrote:
> > Maybe I'm confused but I can't really see how the above would be a
> > problem to workqueue in itself. Both rq->lock and gcwq->lock are
> > irq-safe, so spin_lock() not disabling preemption shouldn't be a
> > problem. Are CPU hotplug o
On Mon, 2013-03-18 at 09:06 -0700, Tejun Heo wrote:
> Hello, Steven.
>
> On Mon, Mar 18, 2013 at 10:36:23AM -0400, Steven Rostedt wrote:
> > kernel BUG at kernel/sched/core.c:1731!
> > invalid opcode: [#1] PREEMPT SMP
> > CPU 5
> > Pid: 16637, comm: kworker/5:0 Not tainted 3.6.11-rt30.25.el
Hello, Steven.
On Mon, Mar 18, 2013 at 10:36:23AM -0400, Steven Rostedt wrote:
> kernel BUG at kernel/sched/core.c:1731!
> invalid opcode: [#1] PREEMPT SMP
> CPU 5
> Pid: 16637, comm: kworker/5:0 Not tainted 3.6.11-rt30.25.el6rt.x86_64 #1 HP
> ProLiant DL580 G7
...
> static void try_to_wak
Hi Tejun,
I'm debugging a crash on -rt that has the following:
kernel BUG at kernel/sched/core.c:1731!
invalid opcode: [#1] PREEMPT SMP
CPU 5
Pid: 16637, comm: kworker/5:0 Not tainted 3.6.11-rt30.25.el6rt.x86_64 #1 HP
ProLiant DL580 G7
RIP: 0010:[] [] __schedule+0x89a/0x8c0
RSP: 0018:fff