Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-05-09 Thread Patrick Bellasi
On 08-May 21:44, Peter Zijlstra wrote:
> On Tue, May 07, 2019 at 12:13:47PM +0100, Patrick Bellasi wrote:
> > On 17-Apr 15:26, Suren Baghdasaryan wrote:
> > > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi  
> > > wrote:
> 
> > > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
> > > >  #else /* CONFIG_UCLAMP_TASK */
> > > >  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) 
> > > > { }
> > > >  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) 
> > > > { }
> > > > +static inline int uclamp_validate(struct task_struct *p,
> > > > + const struct sched_attr *attr)
> > > > +{
> > > > +   return -ENODEV;
> > > 
> > > ENOSYS might be more appropriate?
> > 
> > Yep, agree, thanks!
> 
> No, -ENOSYS (see the comment) is special in that it indicates the whole
> system call is unavailable; that is most certainly not the case!

Yep, noted. Thanks.

-- 
#include 

Patrick Bellasi


Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-05-09 Thread Patrick Bellasi
On 08-May 21:41, Peter Zijlstra wrote:
> On Tue, Apr 02, 2019 at 11:41:42AM +0100, Patrick Bellasi wrote:
> > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
> >  #else /* CONFIG_UCLAMP_TASK */
> >  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
> >  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
> > +static inline int uclamp_validate(struct task_struct *p,
> > + const struct sched_attr *attr)
> > +{
> > +   return -ENODEV;
> 
> Does that maybe want to be -EOPNOTSUPP ?

Suren propose ENOSYS for another similar case, i.e.
!CONFIG_UCLAMP_TASK definitions.

But EOPNOTSUPP seems more appropriate to me too.

-- 
#include 

Patrick Bellasi


Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-05-08 Thread Peter Zijlstra
On Tue, May 07, 2019 at 12:13:47PM +0100, Patrick Bellasi wrote:
> On 17-Apr 15:26, Suren Baghdasaryan wrote:
> > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi  
> > wrote:

> > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
> > >  #else /* CONFIG_UCLAMP_TASK */
> > >  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { 
> > > }
> > >  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { 
> > > }
> > > +static inline int uclamp_validate(struct task_struct *p,
> > > + const struct sched_attr *attr)
> > > +{
> > > +   return -ENODEV;
> > 
> > ENOSYS might be more appropriate?
> 
> Yep, agree, thanks!

No, -ENOSYS (see the comment) is special in that it indicates the whole
system call is unavailable; that is most certainly not the case!


Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-05-08 Thread Peter Zijlstra
On Tue, Apr 02, 2019 at 11:41:42AM +0100, Patrick Bellasi wrote:
> @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
>  #else /* CONFIG_UCLAMP_TASK */
>  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
>  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
> +static inline int uclamp_validate(struct task_struct *p,
> +   const struct sched_attr *attr)
> +{
> + return -ENODEV;

Does that maybe want to be -EOPNOTSUPP ?

> +}


Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-05-07 Thread Patrick Bellasi
On 17-Apr 15:26, Suren Baghdasaryan wrote:
> On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi  
> wrote:

[...]

> > Do not allow to change sched class specific params and non class
> > specific params (i.e. clamp values) at the same time.  This keeps things
> > simple and still works for the most common cases since we are usually
> > interested in just one of the two actions.
> 
> Sorry, I can't find where you are checking to eliminate the
> possibility of simultaneous changes to both sched class specific
> params and non class specific params... Am I too tired or they are
> indeed missing?

No, you right... that limitation has been removed in v8 :)

I'll remove the above paragraph in v9, thanks for spotting it.

[...]

> > +static int uclamp_validate(struct task_struct *p,
> > +  const struct sched_attr *attr)
> > +{
> > +   unsigned int lower_bound = p->uclamp_req[UCLAMP_MIN].value;
> > +   unsigned int upper_bound = p->uclamp_req[UCLAMP_MAX].value;
> > +
> > +   if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN)
> > +   lower_bound = attr->sched_util_min;
> > +   if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX)
> > +   upper_bound = attr->sched_util_max;
> > +
> > +   if (lower_bound > upper_bound)
> > +   return -EINVAL;
> > +   if (upper_bound > SCHED_CAPACITY_SCALE)
> > +   return -EINVAL;
> > +
> > +   return 0;
> > +}

[...]

> >  static void uclamp_fork(struct task_struct *p)
> >  {
> > unsigned int clamp_id;
> > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
> >  #else /* CONFIG_UCLAMP_TASK */
> >  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
> >  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
> > +static inline int uclamp_validate(struct task_struct *p,
> > + const struct sched_attr *attr)
> > +{
> > +   return -ENODEV;
> 
> ENOSYS might be more appropriate?

Yep, agree, thanks!

> 
> > +}
> > +static void __setscheduler_uclamp(struct task_struct *p,
> > + const struct sched_attr *attr) { }
> >  static inline void uclamp_fork(struct task_struct *p) { }
> >  static inline void init_uclamp(void) { }
> >  #endif /* CONFIG_UCLAMP_TASK */
> > @@ -4424,6 +4475,13 @@ static void __setscheduler_params(struct task_struct 
> > *p,
> >  static void __setscheduler(struct rq *rq, struct task_struct *p,
> >const struct sched_attr *attr, bool keep_boost)
> >  {
> > +   /*
> > +* If params can't change scheduling class changes aren't allowed
> > +* either.
> > +*/
> > +   if (attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)
> > +   return;
> > +
> > __setscheduler_params(p, attr);
> >
> > /*
> > @@ -4561,6 +4619,13 @@ static int __sched_setscheduler(struct task_struct 
> > *p,
> > return retval;
> > }
> >
> > +   /* Update task specific "requested" clamps */
> > +   if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP) {
> > +   retval = uclamp_validate(p, attr);
> > +   if (retval)
> > +   return retval;
> > +   }
> > +
> > /*
> >  * Make sure no PI-waiters arrive (or leave) while we are
> >  * changing the priority of the task:

[...]

-- 
#include 

Patrick Bellasi


Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-04-17 Thread Suren Baghdasaryan
On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi  wrote:
>
> The SCHED_DEADLINE scheduling class provides an advanced and formal
> model to define tasks requirements that can translate into proper
> decisions for both task placements and frequencies selections. Other
> classes have a more simplified model based on the POSIX concept of
> priorities.
>
> Such a simple priority based model however does not allow to exploit
> most advanced features of the Linux scheduler like, for example, driving
> frequencies selection via the schedutil cpufreq governor. However, also
> for non SCHED_DEADLINE tasks, it's still interesting to define tasks
> properties to support scheduler decisions.
>
> Utilization clamping exposes to user-space a new set of per-task
> attributes the scheduler can use as hints about the expected/required
> utilization for a task. This allows to implement a "proactive" per-task
> frequency control policy, a more advanced policy than the current one
> based just on "passive" measured task utilization. For example, it's
> possible to boost interactive tasks (e.g. to get better performance) or
> cap background tasks (e.g. to be more energy/thermal efficient).
>
> Introduce a new API to set utilization clamping values for a specified
> task by extending sched_setattr(), a syscall which already allows to
> define task specific properties for different scheduling classes. A new
> pair of attributes allows to specify a minimum and maximum utilization
> the scheduler can consider for a task.
>
> Do that by validating the required clamp values before and then applying
> the required changes using _the_ same pattern already in use for
> __setscheduler(). This ensures that the task is re-enqueued with the new
> clamp values.
>
> Do not allow to change sched class specific params and non class
> specific params (i.e. clamp values) at the same time.  This keeps things
> simple and still works for the most common cases since we are usually
> interested in just one of the two actions.

Sorry, I can't find where you are checking to eliminate the
possibility of simultaneous changes to both sched class specific
params and non class specific params... Am I too tired or they are
indeed missing?

>
> Signed-off-by: Patrick Bellasi 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
>
> ---
> Changes in v8:
>  Others:
>  - using p->uclamp_req to track clamp values "requested" from userspace
> ---
>  include/linux/sched.h|  9 
>  include/uapi/linux/sched.h   | 12 -
>  include/uapi/linux/sched/types.h | 66 
>  kernel/sched/core.c  | 87 +++-
>  4 files changed, 162 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d8491954e2e1..c2b81a84985b 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -585,6 +585,7 @@ struct sched_dl_entity {
>   * @value: clamp value "assigned" to a se
>   * @bucket_id: bucket index corresponding to the "assigned" value
>   * @active:the se is currently refcounted in a rq's bucket
> + * @user_defined:  the requested clamp value comes from user-space
>   *
>   * The bucket_id is the index of the clamp bucket matching the clamp value
>   * which is pre-computed and stored to avoid expensive integer divisions from
> @@ -594,11 +595,19 @@ struct sched_dl_entity {
>   * which can be different from the clamp value "requested" from user-space.
>   * This allows to know a task is refcounted in the rq's bucket corresponding
>   * to the "effective" bucket_id.
> + *
> + * The user_defined bit is set whenever a task has got a task-specific clamp
> + * value requested from userspace, i.e. the system defaults apply to this 
> task
> + * just as a restriction. This allows to relax default clamps when a less
> + * restrictive task-specific value has been requested, thus allowing to
> + * implement a "nice" semantic. For example, a task running with a 20%
> + * default boost can still drop its own boosting to 0%.
>   */
>  struct uclamp_se {
> unsigned int value  : bits_per(SCHED_CAPACITY_SCALE);
> unsigned int bucket_id  : bits_per(UCLAMP_BUCKETS);
> unsigned int active : 1;
> +   unsigned int user_defined   : 1;
>  };
>  #endif /* CONFIG_UCLAMP_TASK */
>
> diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> index 075c610adf45..d2c65617a4a4 100644
> --- a/include/uapi/linux/sched.h
> +++ b/include/uapi/linux/sched.h
> @@ -53,10 +53,20 @@
>  #define SCHED_FLAG_RECLAIM 0x02
>  #define SCHED_FLAG_DL_OVERRUN  0x04
>  #define SCHED_FLAG_KEEP_POLICY 0x08
> +#define SCHED_FLAG_KEEP_PARAMS 0x10
> +#define SCHED_FLAG_UTIL_CLAMP_MIN  0x20
> +#define SCHED_FLAG_UTIL_CLAMP_MAX  0x40
> +
> +#define SCHED_FLAG_KEEP_ALL(SCHED_FLAG_KEEP_POLICY | \
> +SCHED_FLAG_KEEP_PARAMS)
> +
> 

[PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping

2019-04-02 Thread Patrick Bellasi
The SCHED_DEADLINE scheduling class provides an advanced and formal
model to define tasks requirements that can translate into proper
decisions for both task placements and frequencies selections. Other
classes have a more simplified model based on the POSIX concept of
priorities.

Such a simple priority based model however does not allow to exploit
most advanced features of the Linux scheduler like, for example, driving
frequencies selection via the schedutil cpufreq governor. However, also
for non SCHED_DEADLINE tasks, it's still interesting to define tasks
properties to support scheduler decisions.

Utilization clamping exposes to user-space a new set of per-task
attributes the scheduler can use as hints about the expected/required
utilization for a task. This allows to implement a "proactive" per-task
frequency control policy, a more advanced policy than the current one
based just on "passive" measured task utilization. For example, it's
possible to boost interactive tasks (e.g. to get better performance) or
cap background tasks (e.g. to be more energy/thermal efficient).

Introduce a new API to set utilization clamping values for a specified
task by extending sched_setattr(), a syscall which already allows to
define task specific properties for different scheduling classes. A new
pair of attributes allows to specify a minimum and maximum utilization
the scheduler can consider for a task.

Do that by validating the required clamp values before and then applying
the required changes using _the_ same pattern already in use for
__setscheduler(). This ensures that the task is re-enqueued with the new
clamp values.

Do not allow to change sched class specific params and non class
specific params (i.e. clamp values) at the same time.  This keeps things
simple and still works for the most common cases since we are usually
interested in just one of the two actions.

Signed-off-by: Patrick Bellasi 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 

---
Changes in v8:
 Others:
 - using p->uclamp_req to track clamp values "requested" from userspace
---
 include/linux/sched.h|  9 
 include/uapi/linux/sched.h   | 12 -
 include/uapi/linux/sched/types.h | 66 
 kernel/sched/core.c  | 87 +++-
 4 files changed, 162 insertions(+), 12 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d8491954e2e1..c2b81a84985b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -585,6 +585,7 @@ struct sched_dl_entity {
  * @value: clamp value "assigned" to a se
  * @bucket_id: bucket index corresponding to the "assigned" value
  * @active:the se is currently refcounted in a rq's bucket
+ * @user_defined:  the requested clamp value comes from user-space
  *
  * The bucket_id is the index of the clamp bucket matching the clamp value
  * which is pre-computed and stored to avoid expensive integer divisions from
@@ -594,11 +595,19 @@ struct sched_dl_entity {
  * which can be different from the clamp value "requested" from user-space.
  * This allows to know a task is refcounted in the rq's bucket corresponding
  * to the "effective" bucket_id.
+ *
+ * The user_defined bit is set whenever a task has got a task-specific clamp
+ * value requested from userspace, i.e. the system defaults apply to this task
+ * just as a restriction. This allows to relax default clamps when a less
+ * restrictive task-specific value has been requested, thus allowing to
+ * implement a "nice" semantic. For example, a task running with a 20%
+ * default boost can still drop its own boosting to 0%.
  */
 struct uclamp_se {
unsigned int value  : bits_per(SCHED_CAPACITY_SCALE);
unsigned int bucket_id  : bits_per(UCLAMP_BUCKETS);
unsigned int active : 1;
+   unsigned int user_defined   : 1;
 };
 #endif /* CONFIG_UCLAMP_TASK */
 
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 075c610adf45..d2c65617a4a4 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -53,10 +53,20 @@
 #define SCHED_FLAG_RECLAIM 0x02
 #define SCHED_FLAG_DL_OVERRUN  0x04
 #define SCHED_FLAG_KEEP_POLICY 0x08
+#define SCHED_FLAG_KEEP_PARAMS 0x10
+#define SCHED_FLAG_UTIL_CLAMP_MIN  0x20
+#define SCHED_FLAG_UTIL_CLAMP_MAX  0x40
+
+#define SCHED_FLAG_KEEP_ALL(SCHED_FLAG_KEEP_POLICY | \
+SCHED_FLAG_KEEP_PARAMS)
+
+#define SCHED_FLAG_UTIL_CLAMP  (SCHED_FLAG_UTIL_CLAMP_MIN | \
+SCHED_FLAG_UTIL_CLAMP_MAX)
 
 #define SCHED_FLAG_ALL (SCHED_FLAG_RESET_ON_FORK   | \
 SCHED_FLAG_RECLAIM | \
 SCHED_FLAG_DL_OVERRUN  | \
-SCHED_FLAG_KEEP_POLICY)
+SCHED_FLAG_KEEP_ALL| \
+