For multi-tenancy, there are currently two mechanisms for sharing the
system's CPUs: time-sharing (e.g. CFS) and dividing the system into
'rigid' containers with system calls such as sched_setaffinity. The
Linux kernel today has no mechanism for flexible workloads that need
access to the whole system while still maintaining a notion of
preferred CPUs.
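
For reference, the 'rigid' partitioning mentioned above can be sketched
from user space with the existing sched_setaffinity(2) call. This is only
an illustration of today's hard affinity, not part of this series; the
helper names (first_allowed_cpu, pin_to_cpu, allowed_cpu_count) are made
up for the example.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Return the lowest-numbered CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
		return -1;
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &mask))
			return cpu;
	return -1;
}

/* Hard-pin the calling thread to a single CPU; returns 0 on success. */
int pin_to_cpu(int cpu)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Number of CPUs in the calling thread's hard affinity mask, or -1. */
int allowed_cpu_count(void)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
		return -1;
	return CPU_COUNT(&mask);
}
```

Once pin_to_cpu() succeeds, the thread is never placed on any other CPU,
even when the rest of the system is idle. That rigidity is the limitation
described above.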

This patch introduces a new CPU mask, 'cpus_preferred', in struct
task_struct and gives applications a way to specify a set of CPUs they
would like to run on. The scheduler tries to honor the application's
request as best it can; however, if it finds no idle CPUs in the
preferred set, it runs the application anywhere in the system. This can
be used to design soft containers, which allow a tenant to use more
capacity than it is entitled to when others aren't fully using theirs.

The advantage of space-sharing the system, as opposed to time-sharing,
is that more cache locality is preserved while the soft containers are
in use. Since the preference is applied on every scheduling decision,
the application keeps running on its preferred CPUs as long as it does
not overuse its specified resources.

Designing soft containers still requires more user-space code; this
series provides what is needed from the kernel.

FAQs:

Q) What if I set "hard" affinity after I set a preference by using soft
   affinity?
A: Hard affinity will override any previous soft affinity.

Q) What if my application had already specified a "hard" affinity? Can
   I still provide a set of CPUs for soft affinity?
A: Yes, it will work as long as the new soft affinity is a subset of
   the "hard" affinity.

Q) Can I have mutually exclusive hard and soft affinities?
A: No, soft affinity is always a subset of hard affinity.

Note: Ignore the kernel/time/tick-sched.c change. It just fixes a build
error on Peter's tree.

Rohit Jain (2):
  sched: Introduce new flags to sched_setaffinity to support soft affinity.
  sched: Actual changes after adding SCHED_SOFT_AFFINITY to make it work
    with the scheduler

 arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 include/linux/init_task.h              |   1 +
 include/linux/sched.h                  |   4 +-
 include/linux/syscalls.h               |   3 +
 include/uapi/asm-generic/unistd.h      |   4 +-
 include/uapi/linux/sched.h             |   3 +
 kernel/compat.c                        |   2 +-
 kernel/sched/core.c                    | 167 ++++++++++++++++++++++++++++-----
 kernel/sched/cpudeadline.c             |   4 +-
 kernel/sched/cpupri.c                  |   4 +-
 kernel/sched/fair.c                    | 116 +++++++++++++++++------
 kernel/time/tick-sched.c               |   1 +
 12 files changed, 250 insertions(+), 60 deletions(-)

-- 
2.7.4