Re: [RFC PATCH 02/10] sched: Task placement for heterogeneous systems based on task load-tracking
Hi Viresh,

On Thu, Oct 04, 2012 at 07:02:03AM +0100, Viresh Kumar wrote:
Hi Morten, On 22 September 2012 00:02, morten.rasmus...@arm.com wrote:
From: Morten Rasmussen morten.rasmus...@arm.com

This patch introduces the basic SCHED_HMP infrastructure. Each class of cpus is represented by a hmp_domain and tasks will only be moved between these domains when their load profiles suggest it is beneficial.

SCHED_HMP relies heavily on the task load-tracking introduced in Paul Turner's fair group scheduling patch set: https://lkml.org/lkml/2012/8/23/267

SCHED_HMP requires that the platform implements arch_get_hmp_domains() which should set up the platform specific list of hmp_domains. It is also assumed that the platform disables SD_LOAD_BALANCE for the appropriate sched_domains.

An explanation of this requirement would be helpful here.

Yes. This is to prevent the load-balancer from moving tasks between hmp_domains. This will be done exclusively by SCHED_HMP instead, to implement a strict task migration policy and avoid changing the load-balancer behaviour. The load-balancer will take care of load-balancing within each hmp_domain.

Task placement takes place every time a task is to be inserted into a runqueue, based on its load history. The task placement decision is based on load thresholds. There are no restrictions on the number of hmp_domains; however, multiple (>2) have not been tested and the up/down migration policy is rather simple.

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 arch/arm/Kconfig      | 17 +
 include/linux/sched.h |  6 ++
 kernel/sched/fair.c   | 168 +
 kernel/sched/sched.h  |  6 ++
 4 files changed, 197 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f4a5d58..5b09684 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1554,6 +1554,23 @@ config SCHED_SMT
 	  MultiThreading at a cost of slightly increased overhead in some
 	  places. If unsure say N here.
+config DISABLE_CPU_SCHED_DOMAIN_BALANCE
+	bool "(EXPERIMENTAL) Disable CPU level scheduler load-balancing"
+	help
+	  Disables scheduler load-balancing at CPU sched domain level.

Shouldn't this depend on EXPERIMENTAL?

It should. The ongoing discussion about CONFIG_EXPERIMENTAL that Amit is referring to hasn't come to a conclusion yet.

+config SCHED_HMP
+	bool "(EXPERIMENTAL) Heterogeneous multiprocessor scheduling"

ditto.

+	depends on DISABLE_CPU_SCHED_DOMAIN_BALANCE && SCHED_MC && FAIR_GROUP_SCHED && !SCHED_AUTOGROUP
+	help
+	  Experimental scheduler optimizations for heterogeneous platforms.
+	  Attempts to introspectively select task affinity to optimize power
+	  and performance. Basic support for multiple (>2) cpu types is in place,
+	  but it has only been tested with two types of cpus.
+	  There is currently no support for migration of task groups, hence
+	  !SCHED_AUTOGROUP. Furthermore, normal load-balancing must be disabled
+	  between cpus of different type (DISABLE_CPU_SCHED_DOMAIN_BALANCE).
+
 config HAVE_ARM_SCU
 	bool
 	help

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81e4e82..df971a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1039,6 +1039,12 @@ unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 bool cpus_share_cache(int this_cpu, int that_cpu);
 
+#ifdef CONFIG_SCHED_HMP
+struct hmp_domain {
+	struct cpumask cpus;
+	struct list_head hmp_domains;

Probably need a better name here. domain_list?

Yes. hmp_domain_list would be better and would stick with the hmp_* naming convention.
+};
+#endif /* CONFIG_SCHED_HMP */
 #else /* CONFIG_SMP */
 struct sched_domain_attr;

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3e17dd5..d80de46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3077,6 +3077,125 @@ static int select_idle_sibling(struct task_struct *p, int target)
 	return target;
 }
 
+#ifdef CONFIG_SCHED_HMP
+/*
+ * Heterogeneous multiprocessor (HMP) optimizations
+ *
+ * The cpu types are distinguished using a list of hmp_domains
+ * which each represent one cpu type using a cpumask.
+ * The list is assumed ordered by compute capacity with the
+ * fastest domain first.
+ */
+DEFINE_PER_CPU(struct hmp_domain *, hmp_cpu_domain);
+
+extern void __init arch_get_hmp_domains(struct list_head *hmp_domains_list);
+
+/* Setup hmp_domains */
+static int __init hmp_cpu_mask_setup(void)

How should we interpret its return value? Can you mention what 0 and 1 mean here?

Returns 0 if domain setup failed, i.e. the domain list is empty, and 1 otherwise.

+{
+	char buf[64];
+	struct hmp_domain
Re: [RFC PATCH 02/10] sched: Task placement for heterogeneous systems based on task load-tracking
On 9 October 2012 21:26, Morten Rasmussen morten.rasmus...@arm.com wrote:
On Thu, Oct 04, 2012 at 07:02:03AM +0100, Viresh Kumar wrote:
On 22 September 2012 00:02, morten.rasmus...@arm.com wrote:

SCHED_HMP requires that the platform implements arch_get_hmp_domains() which should set up the platform specific list of hmp_domains. It is also assumed that the platform disables SD_LOAD_BALANCE for the appropriate sched_domains.

An explanation of this requirement would be helpful here.

Yes. This is to prevent the load-balancer from moving tasks between hmp_domains. This will be done exclusively by SCHED_HMP instead, to implement a strict task migration policy and avoid changing the load-balancer behaviour. The load-balancer will take care of load-balancing within each hmp_domain.

Honestly speaking, I understood this point only now; earlier it wasn't clear to me :) What would be ideal is to put this information in the comment just before we redefine the other SCHED_* domains where we disable balancing. And keep it in the commit log too.

+struct hmp_domain {
+	struct cpumask cpus;
+	struct list_head hmp_domains;

Probably need a better name here. domain_list?

Yes. hmp_domain_list would be better and stick with the hmp_* naming convention.

IMHO hmp_ would be better for global names, but names of variables enclosed within another hmp_* data type don't actually need the hmp_ prefix, as it is implicit. I.e.

struct hmp_domain {
	struct cpumask cpus;
	struct list_head domain_list;
};

would be better than

	struct list_head hmp_domain_list;

as the parent structure already has hmp_ in its name, so whatever is inside the struct is obviously HMP specific.

+/* Setup hmp_domains */
+static int __init hmp_cpu_mask_setup(void)

How should we interpret its return value? Can you mention what 0 and 1 mean here?

Returns 0 if domain setup failed, i.e. the domain list is empty, and 1 otherwise.

Helpful. Please mention this in the function comment in your next revision.
+{
+	char buf[64];
+	struct hmp_domain *domain;
+	struct list_head *pos;
+	int dc, cpu;

+	/* Print hmp_domains */
+	dc = 0;

Should be done during definition of dc. You missed this??

+		for_each_cpu_mask(cpu, domain->cpus) {
+			per_cpu(hmp_cpu_domain, cpu) = domain;
+		}

Should use hmp_cpu_domain(cpu) here. Also no need of {} for a single line loop. ??

+		dc++;

You aren't using it... Only for testing? Should we remove it from the mainline patchset and keep it locally?

I'm using it in the pr_debug line a little earlier. It is used for enumerating the hmp_domains.

My mistake :(

+/* Check if cpu is in fastest hmp_domain */
+static inline unsigned int hmp_cpu_is_fastest(int cpu)
+{
+	struct list_head *pos;
+
+	pos = &hmp_cpu_domain(cpu)->hmp_domains;
+	return pos == hmp_domains.next;

Better create list_is_first() for this.

I had the same thought, but I see that as a separate patch that should be submitted separately.

Correct. So better send it now, so that it is included before you send your next version. :)

--
viresh

___
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
Re: [RFC PATCH 02/10] sched: Task placement for heterogeneous systems based on task load-tracking
Hi Morten,

On 22 September 2012 00:02, morten.rasmus...@arm.com wrote:
From: Morten Rasmussen morten.rasmus...@arm.com

This patch introduces the basic SCHED_HMP infrastructure. Each class of cpus is represented by a hmp_domain and tasks will only be moved between these domains when their load profiles suggest it is beneficial.

SCHED_HMP relies heavily on the task load-tracking introduced in Paul Turner's fair group scheduling patch set: https://lkml.org/lkml/2012/8/23/267

SCHED_HMP requires that the platform implements arch_get_hmp_domains() which should set up the platform specific list of hmp_domains. It is also assumed that the platform disables SD_LOAD_BALANCE for the appropriate sched_domains.

An explanation of this requirement would be helpful here.

Task placement takes place every time a task is to be inserted into a runqueue, based on its load history. The task placement decision is based on load thresholds. There are no restrictions on the number of hmp_domains; however, multiple (>2) have not been tested and the up/down migration policy is rather simple.

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 arch/arm/Kconfig      | 17 +
 include/linux/sched.h |  6 ++
 kernel/sched/fair.c   | 168 +
 kernel/sched/sched.h  |  6 ++
 4 files changed, 197 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f4a5d58..5b09684 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1554,6 +1554,23 @@ config SCHED_SMT
 	  MultiThreading at a cost of slightly increased overhead in some
 	  places. If unsure say N here.
 
+config DISABLE_CPU_SCHED_DOMAIN_BALANCE
+	bool "(EXPERIMENTAL) Disable CPU level scheduler load-balancing"
+	help
+	  Disables scheduler load-balancing at CPU sched domain level.

Shouldn't this depend on EXPERIMENTAL?

+config SCHED_HMP
+	bool "(EXPERIMENTAL) Heterogeneous multiprocessor scheduling"

ditto.
+	depends on DISABLE_CPU_SCHED_DOMAIN_BALANCE && SCHED_MC && FAIR_GROUP_SCHED && !SCHED_AUTOGROUP
+	help
+	  Experimental scheduler optimizations for heterogeneous platforms.
+	  Attempts to introspectively select task affinity to optimize power
+	  and performance. Basic support for multiple (>2) cpu types is in place,
+	  but it has only been tested with two types of cpus.
+	  There is currently no support for migration of task groups, hence
+	  !SCHED_AUTOGROUP. Furthermore, normal load-balancing must be disabled
+	  between cpus of different type (DISABLE_CPU_SCHED_DOMAIN_BALANCE).
+
 config HAVE_ARM_SCU
 	bool
 	help

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81e4e82..df971a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1039,6 +1039,12 @@ unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 bool cpus_share_cache(int this_cpu, int that_cpu);
 
+#ifdef CONFIG_SCHED_HMP
+struct hmp_domain {
+	struct cpumask cpus;
+	struct list_head hmp_domains;

Probably need a better name here. domain_list?

+};
+#endif /* CONFIG_SCHED_HMP */
 #else /* CONFIG_SMP */
 struct sched_domain_attr;

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3e17dd5..d80de46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3077,6 +3077,125 @@ static int select_idle_sibling(struct task_struct *p, int target)
 	return target;
 }
 
+#ifdef CONFIG_SCHED_HMP
+/*
+ * Heterogeneous multiprocessor (HMP) optimizations
+ *
+ * The cpu types are distinguished using a list of hmp_domains
+ * which each represent one cpu type using a cpumask.
+ * The list is assumed ordered by compute capacity with the
+ * fastest domain first.
+ */
+DEFINE_PER_CPU(struct hmp_domain *, hmp_cpu_domain);
+
+extern void __init arch_get_hmp_domains(struct list_head *hmp_domains_list);
+
+/* Setup hmp_domains */
+static int __init hmp_cpu_mask_setup(void)

How should we interpret its return value? Can you mention what 0 and 1 mean here?
+{
+	char buf[64];
+	struct hmp_domain *domain;
+	struct list_head *pos;
+	int dc, cpu;
+
+	pr_debug("Initializing HMP scheduler:\n");
+
+	/* Initialize hmp_domains using platform code */
+	arch_get_hmp_domains(&hmp_domains);
+	if (list_empty(&hmp_domains)) {
+		pr_debug("HMP domain list is empty!\n");
+		return 0;
+	}
+
+	/* Print hmp_domains */
+	dc = 0;

Should be done during definition of dc.

+	list_for_each(pos, &hmp_domains) {
+		domain = list_entry(pos, struct hmp_domain, hmp_domains);
+		cpulist_scnprintf(buf, 64, &domain->cpus);
+		pr_debug("  HMP domain %d: %s\n", dc, buf);

Spaces before HMP are intentional?

+
+		for_each_cpu_mask(cpu, domain->cpus) {
+
[RFC PATCH 02/10] sched: Task placement for heterogeneous systems based on task load-tracking
From: Morten Rasmussen morten.rasmus...@arm.com

This patch introduces the basic SCHED_HMP infrastructure. Each class of cpus is represented by a hmp_domain and tasks will only be moved between these domains when their load profiles suggest it is beneficial.

SCHED_HMP relies heavily on the task load-tracking introduced in Paul Turner's fair group scheduling patch set: https://lkml.org/lkml/2012/8/23/267

SCHED_HMP requires that the platform implements arch_get_hmp_domains() which should set up the platform specific list of hmp_domains. It is also assumed that the platform disables SD_LOAD_BALANCE for the appropriate sched_domains.

Task placement takes place every time a task is to be inserted into a runqueue, based on its load history. The task placement decision is based on load thresholds. There are no restrictions on the number of hmp_domains; however, multiple (>2) have not been tested and the up/down migration policy is rather simple.

Signed-off-by: Morten Rasmussen morten.rasmus...@arm.com
---
 arch/arm/Kconfig      | 17 +
 include/linux/sched.h |  6 ++
 kernel/sched/fair.c   | 168 +
 kernel/sched/sched.h  |  6 ++
 4 files changed, 197 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f4a5d58..5b09684 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1554,6 +1554,23 @@ config SCHED_SMT
 	  MultiThreading at a cost of slightly increased overhead in some
 	  places. If unsure say N here.
 
+config DISABLE_CPU_SCHED_DOMAIN_BALANCE
+	bool "(EXPERIMENTAL) Disable CPU level scheduler load-balancing"
+	help
+	  Disables scheduler load-balancing at CPU sched domain level.
+
+config SCHED_HMP
+	bool "(EXPERIMENTAL) Heterogeneous multiprocessor scheduling"
+	depends on DISABLE_CPU_SCHED_DOMAIN_BALANCE && SCHED_MC && FAIR_GROUP_SCHED && !SCHED_AUTOGROUP
+	help
+	  Experimental scheduler optimizations for heterogeneous platforms.
+	  Attempts to introspectively select task affinity to optimize power
+	  and performance.
+	  Basic support for multiple (>2) cpu types is in place,
+	  but it has only been tested with two types of cpus.
+	  There is currently no support for migration of task groups, hence
+	  !SCHED_AUTOGROUP. Furthermore, normal load-balancing must be disabled
+	  between cpus of different type (DISABLE_CPU_SCHED_DOMAIN_BALANCE).
+
 config HAVE_ARM_SCU
 	bool
 	help

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81e4e82..df971a3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1039,6 +1039,12 @@ unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu);
 bool cpus_share_cache(int this_cpu, int that_cpu);
 
+#ifdef CONFIG_SCHED_HMP
+struct hmp_domain {
+	struct cpumask cpus;
+	struct list_head hmp_domains;
+};
+#endif /* CONFIG_SCHED_HMP */
 #else /* CONFIG_SMP */
 struct sched_domain_attr;

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3e17dd5..d80de46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3077,6 +3077,125 @@ static int select_idle_sibling(struct task_struct *p, int target)
 	return target;
 }
 
+#ifdef CONFIG_SCHED_HMP
+/*
+ * Heterogeneous multiprocessor (HMP) optimizations
+ *
+ * The cpu types are distinguished using a list of hmp_domains
+ * which each represent one cpu type using a cpumask.
+ * The list is assumed ordered by compute capacity with the
+ * fastest domain first.
+ */
+DEFINE_PER_CPU(struct hmp_domain *, hmp_cpu_domain);
+
+extern void __init arch_get_hmp_domains(struct list_head *hmp_domains_list);
+
+/* Setup hmp_domains */
+static int __init hmp_cpu_mask_setup(void)
+{
+	char buf[64];
+	struct hmp_domain *domain;
+	struct list_head *pos;
+	int dc, cpu;
+
+	pr_debug("Initializing HMP scheduler:\n");
+
+	/* Initialize hmp_domains using platform code */
+	arch_get_hmp_domains(&hmp_domains);
+	if (list_empty(&hmp_domains)) {
+		pr_debug("HMP domain list is empty!\n");
+		return 0;
+	}
+
+	/* Print hmp_domains */
+	dc = 0;
+	list_for_each(pos, &hmp_domains) {
+		domain = list_entry(pos, struct hmp_domain, hmp_domains);
+		cpulist_scnprintf(buf, 64, &domain->cpus);
+		pr_debug("  HMP domain %d: %s\n", dc, buf);
+
+		for_each_cpu_mask(cpu, domain->cpus) {
+			per_cpu(hmp_cpu_domain, cpu) = domain;
+		}
+		dc++;
+	}
+
+	return 1;
+}
+
+/*
+ * Migration thresholds should be in the range [0..1023]
+ * hmp_up_threshold: min. load required for migrating tasks to a faster cpu
+ * hmp_down_threshold: max. load allowed for tasks migrating to a slower cpu
+ * The default values (512, 256) offer good responsiveness, but may need
+ * tweaking to suit particular needs.
+ */
+unsigned int hmp_up_threshold = 512;
+unsigned