When we warn about a preempt_count leak; reset the preempt_count to
the known good value such that the problem does not ripple forward.

This is most important on x86 which has a per cpu preempt_count that is
not saved/restored (after this series). So if you schedule with an
invalid (!2*PREEMPT_DISABLE_OFFSET) preempt_count the next task is
messed up too.

Enforcing this invariant limits the borkage to just the one task.

Reviewed-by: Frederic Weisbecker <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
 kernel/exit.c       |    4 +++-
 kernel/sched/core.c |    4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -706,10 +706,12 @@ void do_exit(long code)
        smp_mb();
        raw_spin_unlock_wait(&tsk->pi_lock);
 
-       if (unlikely(in_atomic()))
+       if (unlikely(in_atomic())) {
                pr_info("note: %s[%d] exited with preempt_count %d\n",
                        current->comm, task_pid_nr(current),
                        preempt_count());
+               preempt_count_set(PREEMPT_ENABLED);
+       }
 
        /* sync mm's RSS info before statistics gathering */
        if (tsk->mm)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2968,8 +2968,10 @@ static inline void schedule_debug(struct
        BUG_ON(unlikely(task_stack_end_corrupted(prev)));
 #endif
 
-       if (unlikely(in_atomic_preempt_off()))
+       if (unlikely(in_atomic_preempt_off())) {
                __schedule_bug(prev);
+               preempt_count_set(PREEMPT_DISABLED);
+       }
        rcu_sleep_check();
 
        profile_hit(SCHED_PROFILING, __builtin_return_address(0));


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to