On 12/05/20 14:39, Dietmar Eggemann wrote:
> On 11/05/2020 10:01, Juri Lelli wrote:
> > On 06/05/20 17:09, Dietmar Eggemann wrote:
> >> On 06/05/2020 14:37, Juri Lelli wrote:
> >>> On 06/05/20 12:54, Dietmar Eggemann wrote:
> >>>> On 27/04/2020 10:37, Dietmar Eggemann wrote:
>
> [...]
>
> >>> to say that we actually want to check new tasks bw requirement against
> >>> the available bandwidth of the particular CPU they happen to be running
> >>> (and will continue to run) when setscheduler is called.
> >>
> >> By 'available bandwidth of the particular CPU' you refer to
> >> '\Sum_{cpu_rq(i)->rd->span} CPU capacity', right?
> >
> > No. I was referring to the single CPU capacity. The capacity of the CPU
> > where a task is running when setscheduler is called for it (and DL AC
> > performed). See below, maybe more clear why I wondered about this case..
>
> OK, got it! I was just confused since I don't think that this patch
> introduced the issue.
>
> Before the patch 'int cpus = dl_bw_cpus(task_cpu(p))' was used which
> returns the number of cpus on the (default) rd (n). So for a single CPU
> (1024) we use n*1024.
>
> I wonder if a fix for that should be part of this patch-set?

Not really, I guess. As you said, the issue was there already. We can
fix both situations with a subsequent patch. I only noticed the problem
while reviewing this set, but it is not this set's job to fix it.

While you are changing this part, it might be good to add a comment
(XXX fix this, or something) about the issue, so that we don't forget.

> [...]
>
> >> ...
> >> [ 144.920102] __dl_bw_capacity CPU3 rd->span=3-5 return 1338
> >> [ 144.925607] sched_dl_overflow: [bash 1999] task_cpu(p)=3 cap=1338
> >> cpus_ptr=3-5
> >
> > So, here you are checking new task bw against 1338 which is 3*L
> > capacity. However, since load balance is disabled at this point for 3-5,
> > once admitted the task will only be able to run on CPU 3. Now, if more
> > tasks on CPU 3 are admitted the same way (up to 1338) I believe they
> > will start to experience deadline misses because only 446 will be
> > actually available to them until load balance is enabled below and they
> > are then free to migrate to CPUs 4 and 5.
> >
> > Does it makes sense?
>
> Yes, it does.
>
> So my first idea was to only consider the CPU (i.e. its CPU capacity) in
> case we detect 'cpu_rq(cpu)->rd == def_root_domain'?
>
> In case I re-enable load-balancing on cpuset '/', we can't make a task
> in cpuset 'B' DL since we hit this in __sched_setscheduler():
>
>                 /*
>                  * Don't allow tasks with an affinity mask smaller than
>                  * the entire root_domain to become SCHED_DEADLINE.
>                  ...
>                  */
>                 if (!cpumask_subset(span, p->cpus_ptr) || ...
>
> root@juno:~# echo 1 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance
> root@juno:~# echo $$ > /sys/fs/cgroup/cpuset/B/tasks
> root@juno:~# chrt -d --sched-runtime 8000 --sched-period 16000 -p 0 $$
> chrt: failed to set pid 2316's policy: Operation not permitted
>
> So this task has to leave 'B' first I assume.

Right, because the span is back to contain all cpus (load balancing enabled at root level), but tasks in 'B' still have affinity set to a subset of them.
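
To make the span vs. affinity relation concrete, here is a rough userspace
sketch of just the subset test quoted above. It is not kernel code: the
toy_* names and the 64-bit mask type are made up for illustration, and the
scenario (a 6-CPU system, a task whose affinity is CPUs 3-5 as in the log
above) is an assumption. Only the affinity half of the condition is
modelled; the part elided with '...' in the quote is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: bit i set means CPU i is in the mask. */
typedef uint64_t toy_cpumask;

/* Hypothetical helper: mask covering CPUs first..last, inclusive. */
static inline toy_cpumask toy_mask_range(unsigned int first, unsigned int last)
{
	toy_cpumask m = 0;

	assert(first <= last && last < 64);
	for (unsigned int cpu = first; cpu <= last; cpu++)
		m |= (toy_cpumask)1 << cpu;
	return m;
}

/* Mirrors the meaning of cpumask_subset(a, b): every CPU in a is also in b. */
static inline bool toy_subset(toy_cpumask a, toy_cpumask b)
{
	return (a & ~b) == 0;
}

/*
 * Affinity part of the check only: the request is refused when the
 * root domain span is not fully contained in the task's affinity mask.
 */
static inline bool toy_dl_affinity_rejected(toy_cpumask rd_span,
					    toy_cpumask affinity)
{
	return !toy_subset(rd_span, affinity);
}
```

With affinity 3-5, a span of 3-5 passes
(toy_dl_affinity_rejected(toy_mask_range(3, 5), toy_mask_range(3, 5)) is
false), while a span of 0-5 is refused
(toy_dl_affinity_rejected(toy_mask_range(0, 5), toy_mask_range(3, 5)) is
true), which matches the chrt failure above once load balancing is
re-enabled at root level.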