On Tue, Aug 6, 2019 at 11:24 AM Aaron Lu <aaron...@linux.alibaba.com> wrote:
>
> On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote:
> > On 8/2/19 8:37 AM, Julien Desfossez wrote:
> > > We tested both Aaron's and Tim's patches and here are our results.
> > >
> > > Test setup:
> > > - 2 1-thread sysbench, one running the cpu benchmark, the other one
> > >   the mem benchmark
> > > - both started at the same time
> > > - both are pinned on the same core (2 hardware threads)
> > > - 10 30-seconds runs
> > > - test script: https://paste.debian.net/plainh/834cf45c
> > > - only showing the CPU events/sec (higher is better)
> > > - tested 4 tag configurations:
> > >   - no tag
> > >   - sysbench mem untagged, sysbench cpu tagged
> > >   - sysbench mem tagged, sysbench cpu untagged
> > >   - both tagged with a different tag
> > > - "Alone" is the sysbench CPU running alone on the core, no tag
> > > - "nosmt" is both sysbench pinned on the same hardware thread, no tag
> > > - "Tim's full patchset + sched" is an experiment with Tim's patchset
> > >   combined with Aaron's "hack patch" to get rid of the remaining deep
> > >   idle cases
> > > - In all test cases, both tasks can run simultaneously (which was not
> > >   the case without those patches), but the standard deviation is a
> > >   pretty good indicator of the fairness/consistency.
> >
> > Thanks for testing the patches and giving such detailed data.
>
> Thanks Julien.
>
> > I came to realize that for my scheme, the accumulated deficit of
> > forced idle could be wiped out in one execution of a task on the
> > forced idle cpu, with the update of the min_vruntime, even if the
> > execution time could be far less than the accumulated deficit.
> > That's probably one reason my scheme didn't achieve fairness.
>
> I've been thinking about whether we should consider core wide tenant
> fairness.
>
> Let's say there are 3 tasks on the 2 threads' runqueues of the same
> core: 2 tasks (e.g. A1, A2) belong to tenant A and the 3rd, B1, belongs
> to another tenant B. Assume A1 and B1 are queued on the same thread and
> A2 on the other thread. When we decide priority between A1 and B1,
> shall we also consider A2's vruntime? i.e. shall we consider A1 and A2
> as a whole since they belong to the same tenant? I tend to think we
> should make fairness per core per tenant, instead of per thread (cpu)
> per task (sched entity). What do you guys think?
>
I have also been thinking about a way to make fairness per cookie per
core; is this what you want to propose? A toy sketch of what I have in
mind is below the quote.

Thanks,
-Aubrey

> Implementation of the idea is a mess to me, as I feel I'm duplicating
> the existing per cpu per sched_entity enqueue/update vruntime/dequeue
> logic for the per core per tenant stuff.
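To make the comparison concrete, here is a rough userspace-only toy
model of the per-core per-cookie idea. It is not kernel code; all of the
names, numbers and helpers in it are made up for illustration. It only
shows the pick side: the task whose cookie (tenant) has accumulated less
core-wide vruntime wins, falling back to the per-task vruntime when both
tasks belong to the same cookie.

/*
 * Toy model (not kernel code) of per-core per-tenant fairness:
 * when choosing between two runnable tasks, compare the total vruntime
 * already consumed by each task's cookie (tenant) across the whole
 * core, not the task's own vruntime.
 */
#include <stdio.h>

struct task {
	const char *name;
	unsigned long cookie;		/* tenant id */
	unsigned long long vruntime;	/* per-task vruntime */
};

/* Sum the vruntime of every task on the core that shares @cookie. */
static unsigned long long cookie_vruntime(struct task *core, int nr,
					  unsigned long cookie)
{
	unsigned long long sum = 0;
	int i;

	for (i = 0; i < nr; i++)
		if (core[i].cookie == cookie)
			sum += core[i].vruntime;
	return sum;
}

/* Prefer the task whose tenant has received less core-wide service. */
static struct task *pick(struct task *core, int nr,
			 struct task *a, struct task *b)
{
	unsigned long long va = cookie_vruntime(core, nr, a->cookie);
	unsigned long long vb = cookie_vruntime(core, nr, b->cookie);

	if (va != vb)
		return va < vb ? a : b;
	/* Same tenant: fall back to the per-task vruntime. */
	return a->vruntime <= b->vruntime ? a : b;
}

int main(void)
{
	/* A1, A2 belong to tenant A (cookie 1); B1 to tenant B (cookie 2). */
	struct task core[] = {
		{ "A1", 1, 100 },
		{ "A2", 1, 150 },
		{ "B1", 2, 180 },
	};

	printf("pick: %s\n", pick(core, 3, &core[0], &core[2])->name);
	return 0;
}

With a per-task comparison, A1 (vruntime 100) would beat B1 (180); with
the core-wide per-cookie sum, tenant A has already received 250 so B1 is
picked. The hard part you mention still stands: where to hook the
per-cookie accounting into the existing enqueue/dequeue/update paths
without duplicating the per-cpu sched_entity logic.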