On Wed, Jun 26, 2019 at 03:47:17PM -0700, subhra mazumdar wrote:
> The soft affinity CPUs present in the cpumask cpus_preferred is used by the
> scheduler in two levels of search. First is in determining wake affine
> which choses the LLC domain and secondly while searching for idle CPUs in
> LLC domain. In the first level it uses cpus_preferred to prune out the
> search space. In the second level it first searches the cpus_preferred and
> then cpus_allowed. Using affinity_unequal flag it breaks early to avoid
> any overhead in the scheduler fast path when soft affinity is not used.
> This only changes the wake up path of the scheduler, the idle balancing
> is unchanged; together they achieve the "softness" of scheduling.
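For readers following along, the second-level search in the quoted description can be modelled roughly as below. This is a user-space sketch, not the patch's code: the 64-bit `cpumask_t` typedef, `first_idle_in()` and `select_idle_cpu_soft()` are hypothetical names, and the `preferred == allowed` comparison only stands in for the affinity_unequal flag mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_t;

/* Lowest-numbered CPU that is both idle and in mask, or -1 if none. */
static int first_idle_in(cpumask_t idle, cpumask_t mask)
{
    cpumask_t hit = idle & mask;

    for (int cpu = 0; cpu < 64; cpu++)
        if (hit & ((cpumask_t)1 << cpu))
            return cpu;
    return -1;
}

/*
 * Two-pass idle search: preferred CPUs first, then the rest of the
 * allowed set. When the two masks are equal (soft affinity unused)
 * the second pass is skipped, mirroring the early break described
 * in the quoted text.
 */
static int select_idle_cpu_soft(cpumask_t idle, cpumask_t preferred,
                                cpumask_t allowed)
{
    int cpu;

    /* Assumption: the preferred set is a subset of the allowed set. */
    assert((preferred & ~allowed) == 0);

    cpu = first_idle_in(idle, preferred);
    if (cpu >= 0 || preferred == allowed)
        return cpu;

    return first_idle_in(idle, allowed & ~preferred);
}
```

With CPUs 0-3 allowed and CPUs 0-1 preferred, an idle set of {2, 3} makes the first pass fail and the second pass return CPU 2; an idle set of {1, 3} returns CPU 1 from the first pass.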

I really dislike this implementation.

I thought the idea was to remain work conserving (in so far as we are that anyway), so changing select_idle_sibling() doesn't make sense to me. If there is an idle CPU, we use it.

Same for newidle, which you already retained.

This then leaves regular balancing, and for that we can fudge with can_migrate_task() and nr_balance_failed or something.

And I also really don't want a second utilization tipping point; we already have the overloaded thing.

I also still dislike how you never looked into the numa balancer, which already has preferred_nid stuff.
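
One possible reading of the can_migrate_task() / nr_balance_failed suggestion, as a user-space sketch only: regular balancing refuses to move a task off its preferred set until balancing has failed a few times, then gives in. Everything here is an assumption made for illustration, including `struct task_model`, `can_migrate_soft()` and the `SOFT_AFFINITY_TRIES` threshold; none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_t;

/* Hypothetical stand-in for the per-task affinity masks. */
struct task_model {
    cpumask_t allowed;   /* hard affinity */
    cpumask_t preferred; /* soft affinity, subset of allowed */
};

/* Hypothetical threshold: failed balance attempts tolerated before
 * soft affinity is overridden. */
#define SOFT_AFFINITY_TRIES 2

/*
 * Decide whether regular load balancing may move p to dst_cpu.
 * Hard affinity is never violated. A destination outside the
 * preferred set is accepted only once balancing has failed more
 * than SOFT_AFFINITY_TRIES times.
 */
static bool can_migrate_soft(const struct task_model *p, int dst_cpu,
                             unsigned int nr_balance_failed)
{
    cpumask_t dst;

    assert(dst_cpu >= 0 && dst_cpu < 64);
    dst = (cpumask_t)1 << dst_cpu;

    if (!(p->allowed & dst))
        return false;
    if (p->preferred & dst)
        return true;

    return nr_balance_failed > SOFT_AFFINITY_TRIES;
}
```

In this model the wakeup and newidle paths stay untouched, so an idle CPU is always used; only the periodic balancer is biased toward the preferred set.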