On Sun, Oct 13, 2019 at 08:44:32AM -0400, Vineeth Remanan Pillai wrote:
> On Fri, Oct 11, 2019 at 11:55 PM Aaron Lu <aaron...@linux.alibaba.com> wrote:
> >
> > I don't think we need to do the normalization afterwards and it appears
> > we are on the same page regarding core wide vruntime.
Should be "we are not on the same page..." [...] > > The weird thing about my patch is, the min_vruntime is often increased, > > it doesn't point to the smallest value as in a traditional cfs_rq. This > > probabaly can be changed to follow the tradition, I don't quite remember > > why I did this, will need to check this some time later. > > Yeah, I noticed this. In my patch, I had already accounted for this and > changed > to min() instead of max() which is more logical that min_vruntime should be > the > minimum of both the run queue. I now remembered why I used max(). Assume rq1 and rq2's min_vruntime are both at 2000 and the core wide min_vruntime is also 2000. Also assume both runqueues are empty at the moment. Then task t1 is queued to rq1 and runs for a long time while rq2 keeps empty. rq1's min_vruntime will be incremented all the time while the core wide min_vruntime stays at 2000 if min() is used. Then when another task gets queued to rq2, it will get really large unfair boost by using a much smaller min_vruntime as its base. To fix this, either max() is used as is done in my patch, or adjust rq2's min_vruntime to be the same as rq1's on each update_core_cfs_min_vruntime() when rq2 is found empty and then use min() to get the core wide min_vruntime. Looks not worth the trouble to use min().