This patch reintroduces cond_resched_may_throttle(), which acts exactly like cond_resched() except that it does not forbid CFS-bandwidth-induced throttling. It will be used in the following patch.
=============================================================================
Author: Vladimir Davydov
Email: vdavy...@parallels.com
Subject: sched: introduce cond_resched_may_throttle
Date: Fri, 15 Mar 2013 18:53:12 +0400

Since cond_resched() is sometimes called under a semaphore, it was forbidden
to throttle tasks there in order to eliminate the possibility of the priority
inversion problem. However, it turned out that some tasks must be throttled
on cond_resched(), otherwise they won't have a chance to be throttled at all,
breaking the concept of CPU limits. The most notable (and currently the only
identified) example is VM hypervisors such as KVM or Balalaika.

To fix this problem, the patch introduces the new function
cond_resched_may_throttle(), which works just like the usual cond_resched()
except that it allows the scheduler to throttle the caller's task group. This
function must be used by those pieces of software that can only be throttled
on cond_resched() under certain conditions.

This function will be used by Balalaika; I'm going to send the corresponding
patch. Perhaps it's also worthwhile to use it in KVM, but there is no rush,
because I doubt anyone will use KVM, vzkernel, and our hacked CPU limits all
together, so it can wait.
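As an illustration of the intended call site, a long-running per-vcpu loop might yield with the throttling-capable variant instead of plain cond_resched(). This is a hypothetical kernel-context sketch, not compilable standalone and not part of the patch; vcpu_run_loop() and vcpu_enter_guest() are made-up names standing in for a real hypervisor main loop:

```c
/* Hypothetical sketch only: names are illustrative, not from the patch. */
static int vcpu_run_loop(struct vcpu *vcpu)
{
	for (;;) {
		int r = vcpu_enter_guest(vcpu);	/* runs guest code */
		if (r <= 0)
			return r;
		/*
		 * A vcpu thread may spend essentially all of its time in
		 * this loop and never sleep in a throttleable context.
		 * Yielding via the may-throttle variant gives CFS bandwidth
		 * control a chance to actually throttle it here.
		 */
		cond_resched_may_throttle();
	}
}
```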
https://jira.sw.ru/browse/PSBM-18888

Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
=============================================================================
Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
---
 include/linux/sched.h | 8 ++++++++
 kernel/sched/core.c   | 11 +++++++++++
 kernel/sched/fair.c   | 3 ++-
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 65fedc487c92..3bd2f50822b8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1245,6 +1245,7 @@ struct task_struct {
 	unsigned sched_interruptible_sleep:1;
 	unsigned woken_while_running:1;
+	unsigned may_throttle:1;

 	pid_t pid;
 	pid_t tgid;
@@ -2693,6 +2694,13 @@ extern int _cond_resched(void);
 	_cond_resched(); \
 })

+extern int __cond_resched_may_throttle(void);
+
+#define cond_resched_may_throttle() ({ \
+	__might_sleep(__FILE__, __LINE__, 0); \
+	__cond_resched_may_throttle(); \
+})
+
 extern int __cond_resched_lock(spinlock_t *lock);

 #ifdef CONFIG_PREEMPT_COUNT
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bbb6fc3251ad..77d330cd2b79 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4790,6 +4790,17 @@ int __sched _cond_resched(void)
 }
 EXPORT_SYMBOL(_cond_resched);

+int __sched __cond_resched_may_throttle(void)
+{
+	if (should_resched()) {
+		current->may_throttle = 1;
+		__cond_resched();
+		current->may_throttle = 0;
+		return 1;
+	}
+	return 0;
+}
+
 /*
  * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
  * call schedule, and on return reacquire the lock.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 167d0f66387b..49a93569b5e6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -965,7 +965,8 @@ static inline void update_entity_boost(struct sched_entity *se)
 				p->woken_while_running;
 			p->woken_while_running = 0;
 		} else
-			se->boosted = sched_feat(BOOST_PREEMPT);
+			se->boosted = sched_feat(BOOST_PREEMPT) &&
+				      !p->may_throttle;
 	}
 }
-- 
2.1.4

_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel