On Mon, Apr 30, 2018 at 12:42 PM, Peter Zijlstra <[email protected]> wrote:
> On Mon, Apr 30, 2018 at 12:29:25PM -0700, Cong Wang wrote:
>> Currently, the sched_cfs_bandwidth_slice_us is a global setting which
>> affects all cgroups. Different groups may want different values based
>> on their own workload, one size doesn't fit all. The global pool filled
>> periodically is per cgroup too, they should have the right to distribute
>> their own quota to each local CPU with their own frequency.
>
> Why.. what happens? This doesn't really tell us anything.

We saw tasks in a container get throttled many times even though they
did not appear to be over-using the CPUs. I reduced
sched_cfs_bandwidth_slice_us from the default 5ms to 1ms, and that
solved the problem: no tasks were throttled after the change. This is
why I want to change it.

I don't think 1ms will be good for all containers, so to minimize the
impact I would like to keep the slice change within each container.
This is why I propose this patch rather than just `sysctl -w`.

Do you think otherwise?

BTW, people have reported a similar (if not the same) issue before:
https://gist.github.com/bobrik/2030ff040fad360327a5fab7a09c4ff1

Thanks!
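
To illustrate how the slice size could matter, here is a toy model of a single bandwidth period. It is only a sketch, not the kernel's algorithm and not a diagnosis of the case above: it assumes CPUs pull runtime from the global pool one after another in slice-sized chunks and never return unused runtime, and the function name, the 20ms quota, and the 1ms per-CPU demand are hypothetical values chosen for the example.

```python
def count_throttled(quota_us, slice_us, demands_us):
    """Toy model of one bandwidth period (hypothetical, not kernel code).

    quota_us   -- runtime in the cgroup's global pool for this period
    slice_us   -- amount a CPU pulls from the global pool per request
    demands_us -- runtime each CPU wants in this period, one entry per CPU

    Assumption: a CPU keeps any runtime it pulled but did not use, so
    unused runtime is stranded until the period ends.
    """
    pool = quota_us
    throttled = 0
    for need in demands_us:
        local = 0
        while local < need:
            if pool == 0:
                # Global pool is empty: this CPU cannot run its tasks.
                throttled += 1
                break
            grant = min(slice_us, pool)
            pool -= grant
            local += grant
    return throttled


# Hypothetical workload: 8 CPUs each want 1ms, well under the 20ms quota.
demands = [1000] * 8
print("5ms slice:", count_throttled(20000, 5000, demands), "CPUs throttled")
print("1ms slice:", count_throttled(20000, 1000, demands), "CPUs throttled")
```

In this model the total demand (8ms) is far below the quota, yet with a 5ms slice the first four CPUs take the whole pool and the rest are throttled, while a 1ms slice leaves enough for everyone. The real scheduler does more than this (for example around unused local runtime), so treat the numbers as illustrative only.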

