Hi,

Mike stumbled over a cute bug caused by the RT/DL balancing ops.
The exact scenario is __sched_setscheduler() changing a (runnable) task from FIFO to OTHER. In switched_from_rt(), where we do pull_rt_task(), we temporarily drop rq->lock. This gap allows regular cfs load-balancing to step in and migrate our task.

However, check_class_changed() will happily continue with switched_to_fair(), which assumes our task is still on the old rq, and makes the kernel go boom.

Instead of trying to patch this up and make things complicated, simply disallow these methods to drop rq->lock, extend the current post_schedule stuff into a balancing callback list, and use that.

This survives Mike's testcase for well over an hour on my ivb-ep. I've not yet tested it on anything bigger.
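To illustrate the callback-list idea, here is a rough user-space model. It is not the patch itself: every name in it (toy_rq, balance_cb, queue_balance_cb(), run_balance_cbs(), toy_switched_from_rt(), toy_demo()) is made up for illustration, and a plain bool stands in for rq->lock. The only point it tries to show is that the class-change hook queues the pull instead of doing it, so the lock is never dropped in the middle of the class change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rq;

/* One deferred balancing operation; func == NULL means "not queued". */
struct balance_cb {
	struct balance_cb *next;
	void (*func)(struct toy_rq *rq);
};

/* Hypothetical stand-in for a runqueue. */
struct toy_rq {
	bool locked;                /* models rq->lock */
	struct balance_cb *cb_head; /* pending balancing callbacks */
	int pulls;                  /* how many deferred pulls have run */
};

static void toy_lock(struct toy_rq *rq)
{
	assert(!rq->locked);
	rq->locked = true;
}

static void toy_unlock(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->locked = false;
}

/* Queue a balancing op. The caller holds the lock and keeps holding it. */
static void queue_balance_cb(struct toy_rq *rq, struct balance_cb *cb,
			     void (*func)(struct toy_rq *rq))
{
	assert(rq->locked);
	if (cb->func)
		return; /* already queued, do not add it twice */
	cb->func = func;
	cb->next = rq->cb_head;
	rq->cb_head = cb;
}

/* Run everything queued, at a point where the class change is complete. */
static void run_balance_cbs(struct toy_rq *rq)
{
	struct balance_cb *cb;

	toy_lock(rq);
	cb = rq->cb_head;
	rq->cb_head = NULL;
	while (cb) {
		struct balance_cb *next = cb->next;
		void (*func)(struct toy_rq *rq) = cb->func;

		cb->next = NULL;
		cb->func = NULL; /* allow the callback to be queued again */
		func(rq);
		cb = next;
	}
	toy_unlock(rq);
}

/* Stand-in for the deferred pull. */
static void toy_pull(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->pulls++;
}

static struct balance_cb toy_pull_cb;

/* Models the switched_from hook: defer the pull, never drop the lock. */
static void toy_switched_from_rt(struct toy_rq *rq)
{
	queue_balance_cb(rq, &toy_pull_cb, toy_pull);
}

/* Usage: the class change runs fully under the lock, the pull runs after. */
static int toy_demo(void)
{
	struct toy_rq rq = {0};

	toy_lock(&rq);
	toy_switched_from_rt(&rq);
	toy_switched_from_rt(&rq); /* second request is a no-op */
	assert(rq.locked && rq.pulls == 0);
	toy_unlock(&rq);

	run_balance_cbs(&rq);
	return rq.pulls;
}
```

In this model the switched_to side would run between toy_switched_from_rt() and toy_unlock(), with the lock held throughout, so nothing can move the task in between.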

