Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On 08-May 21:44, Peter Zijlstra wrote: > On Tue, May 07, 2019 at 12:13:47PM +0100, Patrick Bellasi wrote: > > On 17-Apr 15:26, Suren Baghdasaryan wrote: > > > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi > > > wrote: > > > > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void) > > > > #else /* CONFIG_UCLAMP_TASK */ > > > > static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) > > > > { } > > > > static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) > > > > { } > > > > +static inline int uclamp_validate(struct task_struct *p, > > > > + const struct sched_attr *attr) > > > > +{ > > > > + return -ENODEV; > > > > > > ENOSYS might be more appropriate? > > > > Yep, agree, thanks! > > No, -ENOSYS (see the comment) is special in that it indicates the whole > system call is unavailable; that is most certainly not the case! Yep, noted. Thanks. -- #include Patrick Bellasi
Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On 08-May 21:41, Peter Zijlstra wrote: > On Tue, Apr 02, 2019 at 11:41:42AM +0100, Patrick Bellasi wrote: > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void) > > #else /* CONFIG_UCLAMP_TASK */ > > static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { } > > static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { } > > +static inline int uclamp_validate(struct task_struct *p, > > + const struct sched_attr *attr) > > +{ > > + return -ENODEV; > > Does that maybe want to be -EOPNOTSUPP ? Suren propose ENOSYS for another similar case, i.e. !CONFIG_UCLAMP_TASK definitions. But EOPNOTSUPP seems more appropriate to me too. -- #include Patrick Bellasi
Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On Tue, May 07, 2019 at 12:13:47PM +0100, Patrick Bellasi wrote: > On 17-Apr 15:26, Suren Baghdasaryan wrote: > > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi > > wrote: > > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void) > > > #else /* CONFIG_UCLAMP_TASK */ > > > static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { > > > } > > > static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { > > > } > > > +static inline int uclamp_validate(struct task_struct *p, > > > + const struct sched_attr *attr) > > > +{ > > > + return -ENODEV; > > > > ENOSYS might be more appropriate? > > Yep, agree, thanks! No, -ENOSYS (see the comment) is special in that it indicates the whole system call is unavailable; that is most certainly not the case!
Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On Tue, Apr 02, 2019 at 11:41:42AM +0100, Patrick Bellasi wrote: > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void) > #else /* CONFIG_UCLAMP_TASK */ > static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { } > static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { } > +static inline int uclamp_validate(struct task_struct *p, > + const struct sched_attr *attr) > +{ > + return -ENODEV; Does that maybe want to be -EOPNOTSUPP ? > +}
Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On 17-Apr 15:26, Suren Baghdasaryan wrote: > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi > wrote: [...] > > Do not allow to change sched class specific params and non class > > specific params (i.e. clamp values) at the same time. This keeps things > > simple and still works for the most common cases since we are usually > > interested in just one of the two actions. > > Sorry, I can't find where you are checking to eliminate the > possibility of simultaneous changes to both sched class specific > params and non class specific params... Am I too tired or they are > indeed missing? No, you right... that limitation has been removed in v8 :) I'll remove the above paragraph in v9, thanks for spotting it. [...] > > +static int uclamp_validate(struct task_struct *p, > > + const struct sched_attr *attr) > > +{ > > + unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value; > > + unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value; > > + > > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) > > + lower_bound = attr->sched_util_min; > > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) > > + upper_bound = attr->sched_util_max; > > + > > + if (lower_bound > upper_bound) > > + return -EINVAL; > > + if (upper_bound > SCHED_CAPACITY_SCALE) > > + return -EINVAL; > > + > > + return 0; > > +} [...] > > static void uclamp_fork(struct task_struct *p) > > { > > unsigned int clamp_id; > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void) > > #else /* CONFIG_UCLAMP_TASK */ > > static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { } > > static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { } > > +static inline int uclamp_validate(struct task_struct *p, > > + const struct sched_attr *attr) > > +{ > > + return -ENODEV; > > ENOSYS might be more appropriate? Yep, agree, thanks! > > > +} > > +static void __setscheduler_uclamp(struct task_struct *p, > > + const struct sched_attr *attr) { } > > static inline void uclamp_fork(struct task_struct *p) { } > > static inline void init_uclamp(void) { } > > #endif /* CONFIG_UCLAMP_TASK */ > > @@ -4424,6 +4475,13 @@ static void __setscheduler_params(struct task_struct > > *p, > > static void __setscheduler(struct rq *rq, struct task_struct *p, > >const struct sched_attr *attr, bool keep_boost) > > { > > + /* > > +* If params can't change scheduling class changes aren't allowed > > +* either. > > +*/ > > + if (attr->sched_flags & SCHED_FLAG_KEEP_PARAMS) > > + return; > > + > > __setscheduler_params(p, attr); > > > > /* > > @@ -4561,6 +4619,13 @@ static int __sched_setscheduler(struct task_struct > > *p, > > return retval; > > } > > > > + /* Update task specific "requested" clamps */ > > + if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP) { > > + retval = uclamp_validate(p, attr); > > + if (retval) > > + return retval; > > + } > > + > > /* > > * Make sure no PI-waiters arrive (or leave) while we are > > * changing the priority of the task: [...] -- #include Patrick Bellasi
Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi wrote: > > The SCHED_DEADLINE scheduling class provides an advanced and formal > model to define tasks requirements that can translate into proper > decisions for both task placements and frequencies selections. Other > classes have a more simplified model based on the POSIX concept of > priorities. > > Such a simple priority based model however does not allow to exploit > most advanced features of the Linux scheduler like, for example, driving > frequencies selection via the schedutil cpufreq governor. However, also > for non SCHED_DEADLINE tasks, it's still interesting to define tasks > properties to support scheduler decisions. > > Utilization clamping exposes to user-space a new set of per-task > attributes the scheduler can use as hints about the expected/required > utilization for a task. This allows to implement a "proactive" per-task > frequency control policy, a more advanced policy than the current one > based just on "passive" measured task utilization. For example, it's > possible to boost interactive tasks (e.g. to get better performance) or > cap background tasks (e.g. to be more energy/thermal efficient). > > Introduce a new API to set utilization clamping values for a specified > task by extending sched_setattr(), a syscall which already allows to > define task specific properties for different scheduling classes. A new > pair of attributes allows to specify a minimum and maximum utilization > the scheduler can consider for a task. > > Do that by validating the required clamp values before and then applying > the required changes using _the_ same pattern already in use for > __setscheduler(). This ensures that the task is re-enqueued with the new > clamp values. > > Do not allow to change sched class specific params and non class > specific params (i.e. clamp values) at the same time. This keeps things > simple and still works for the most common cases since we are usually > interested in just one of the two actions. Sorry, I can't find where you are checking to eliminate the possibility of simultaneous changes to both sched class specific params and non class specific params... Am I too tired or they are indeed missing? > > Signed-off-by: Patrick Bellasi > Cc: Ingo Molnar > Cc: Peter Zijlstra > > --- > Changes in v8: > Others: > - using p->uclamp_req to track clamp values "requested" from userspace > --- > include/linux/sched.h| 9 > include/uapi/linux/sched.h | 12 - > include/uapi/linux/sched/types.h | 66 > kernel/sched/core.c | 87 +++- > 4 files changed, 162 insertions(+), 12 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index d8491954e2e1..c2b81a84985b 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -585,6 +585,7 @@ struct sched_dl_entity { > * @value: clamp value "assigned" to a se > * @bucket_id: bucket index corresponding to the "assigned" value > * @active:the se is currently refcounted in a rq's bucket > + * @user_defined: the requested clamp value comes from user-space > * > * The bucket_id is the index of the clamp bucket matching the clamp value > * which is pre-computed and stored to avoid expensive integer divisions from > @@ -594,11 +595,19 @@ struct sched_dl_entity { > * which can be different from the clamp value "requested" from user-space. > * This allows to know a task is refcounted in the rq's bucket corresponding > * to the "effective" bucket_id. > + * > + * The user_defined bit is set whenever a task has got a task-specific clamp > + * value requested from userspace, i.e. the system defaults apply to this > task > + * just as a restriction. This allows to relax default clamps when a less > + * restrictive task-specific value has been requested, thus allowing to > + * implement a "nice" semantic. For example, a task running with a 20% > + * default boost can still drop its own boosting to 0%. > */ > struct uclamp_se { > unsigned int value : bits_per(SCHED_CAPACITY_SCALE); > unsigned int bucket_id : bits_per(UCLAMP_BUCKETS); > unsigned int active : 1; > + unsigned int user_defined : 1; > }; > #endif /* CONFIG_UCLAMP_TASK */ > > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h > index 075c610adf45..d2c65617a4a4 100644 > --- a/include/uapi/linux/sched.h > +++ b/include/uapi/linux/sched.h > @@ -53,10 +53,20 @@ > #define SCHED_FLAG_RECLAIM 0x02 > #define SCHED_FLAG_DL_OVERRUN 0x04 > #define SCHED_FLAG_KEEP_POLICY 0x08 > +#define SCHED_FLAG_KEEP_PARAMS 0x10 > +#define SCHED_FLAG_UTIL_CLAMP_MIN 0x20 > +#define SCHED_FLAG_UTIL_CLAMP_MAX 0x40 > + > +#define SCHED_FLAG_KEEP_ALL(SCHED_FLAG_KEEP_POLICY | \ > +SCHED_FLAG_KEEP_PARAMS) > + >