On 24.06.2014 01:05, bseg...@google.com wrote: > Kirill Tkhai <tk...@yandex.ru> writes: > >> On 23.06.2014 21:29, bseg...@google.com wrote: >>> Peter Zijlstra <pet...@infradead.org> writes: >>> >>>> On Tue, Jun 17, 2014 at 05:24:10PM +0400, Kirill Tkhai wrote: >>>>> @@ -3790,6 +3803,12 @@ static void __maybe_unused >>>>> unthrottle_offline_cfs_rqs(struct rq *rq) >>>>> cfs_rq->runtime_remaining = 1; >>>>> if (cfs_rq_throttled(cfs_rq)) >>>>> unthrottle_cfs_rq(cfs_rq); >>>>> + >>>>> + /* >>>>> + * Offline rq is schedulable till cpu is completely disabled >>>>> + * in take_cpu_down(), so we prevent new cfs throttling here. >>>>> + */ >>>>> + cfs_rq->runtime_enabled = 0; >>>> >>>> Does it make sense to clear this before calling unthrottle_cfs_rq()? >>>> Just to make sure they're in the right order.. >>> >>> I believe that order is irrelevant here - I do not believe that >>> unthrottle_cfs_rq(a) can cause a throttle_cfs_rq(a). In fact, I don't >>> see any code that will check it at all from unthrottle, although I might >>> be missing something. It _can_ cause a throttle_cfs_rq(parent_cfs_rq(a)), >>> but that should be fine as long as for_each_leaf_cfs_rq is sorted >>> correctly. >> >> I think this is correct. We may change the order just for the hope, that >> anybody will work on it in some way in the future, and this person could >> skip this opaque place. Ok, I don't know how is better :) >> >>> That said, migrate_tasks drops rq->lock, and I /think/ another cpu could >>> wake another task onto this cpu, which could then enqueue_throttle its >>> cfs_rq (which previously had no tasks and thus wasn't on >>> leaf_cfs_rq_list). You certainly could have tg_set_bandwidth come in and >>> turn runtime_enabled on. >> >> We mask cpu inactive on CPU_DOWN_PREPARE stage (in sched_cpu_inactive). >> Other cpu is not able to wake a task there after that. >> >> rq is masked offline in cpuset_cpu_inactive() (during the same stage). >> But priority of sched_cpu_inactive() is higher than priority of >> cpuset_cpu_inactive(). >> >> CPU_PRI_SCHED_INACTIVE = INT_MIN + 1, >> CPU_PRI_CPUSET_INACTIVE = INT_MIN, >> >> This guarantees that nobody could use dying_cpu even before we start >> unthrottling. Another cpu will see dying_cpu is inactive. >> >> So, it looks like we are free of this problem. > > Ah, ok, I haven't looked that hard at hotplug, and wasn't sure of the > ordering there. We still have the tg_set_cfs_bandwidth issue, because > that uses for_each_possible_cpu. However, with the addition of > rq_online_fair, that can probably be changed to for_each_active_cpu, and > then I think we would be fine.
Ok, now I see. Thanks, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/