On Wed, 2016-05-11 at 09:23 +0800, Yuyang Du wrote:
> > Yeah, just like everything else, it'll cuts both ways (why you can't
> > win the sched game). If I can believe tbench, at tasks=cpus, reducing
> > lag increased utilization and reduced latency a wee bit, as did the
> > reserve thing once a booboo got fixed up.
>
> Ok, so you have a secret IDLE_RESERVE? Good luck and show it, ;)

Nothing sexy, just cmpxchg(), with the obvious test/set/clear spots.

	cmpxchg(&cpu_rq(cpu)->idle_latch, cpu, nr_cpu_ids)

> > Depends on the goal. For both, load lagging reality means the high
> > frequency component is squelched, meaning less migration cost, but also
> > higher latency due to stacking. It's a tradeoff where Chris' "latency
> > is everything" benchmark, and _maybe_ the real world load it's based
> > upon is on Peter's end of the rob Peter to pay Paul transaction. The
> > benchmark says it definitely is, the real world load may have already
> > been fixed up by the select_idle_sibling() rewrite.
>
> Obviously, load avgs are good at balancing in a larger scale in a timeframe,
> so they should be used in comparing/balancing sd's not cpus. However, this
> is not the case currently: avgs are mixed with idle cpu/core selection, so
> I think better job can be done before and after select_idle_sibling().
>
> For example, I don't know what the complex wake_affine() is really doing for
> what. Am i missing something, you think?

wake_affine() just says no to keep us from pulling the whole load to one
cache, starting massive tug-o-war with LB and nuking throughput. Everybody
wants hot data, but they can't all have it and scale.

	-Mike
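
For anyone wanting to poke at the cmpxchg() latch idea mentioned above, here
is a rough userspace sketch using C11 atomics. It is not the actual patch.
The assumptions: the latch holds the CPU's own id while that CPU is idle and
unclaimed, and nr_cpu_ids (an invalid id) otherwise, which is one reading of
the cmpxchg() arguments shown. NR_CPU_IDS, the idle_latch[] array and all
idle_latch_*() helpers are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPU_IDS 4	/* stand-in for the kernel's nr_cpu_ids */

/*
 * Hypothetical per-CPU latch (assumption, not the real rq field):
 * holds its own CPU id while idle and unclaimed, NR_CPU_IDS otherwise.
 */
_Atomic int idle_latch[NR_CPU_IDS];

/* Start with every latch closed (no CPU advertised as idle). */
void idle_latch_init(void)
{
	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		atomic_store(&idle_latch[cpu], NR_CPU_IDS);
}

/* "set" spot: the CPU goes idle and advertises itself. */
void idle_latch_set(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPU_IDS);
	atomic_store(&idle_latch[cpu], cpu);
}

/* "test" spot: cheap read, may be stale by the time the caller acts. */
bool idle_latch_test(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPU_IDS);
	return atomic_load(&idle_latch[cpu]) == cpu;
}

/*
 * Claim spot: the userspace analogue of
 *	cmpxchg(&cpu_rq(cpu)->idle_latch, cpu, nr_cpu_ids)
 * Only one waker can swap cpu -> NR_CPU_IDS, so only one wins the CPU.
 */
bool idle_latch_claim(int cpu)
{
	int expected = cpu;

	assert(cpu >= 0 && cpu < NR_CPU_IDS);
	return atomic_compare_exchange_strong(&idle_latch[cpu], &expected,
					      NR_CPU_IDS);
}

/* "clear" spot: the CPU leaves idle on its own, nobody claimed it. */
void idle_latch_clear(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPU_IDS);
	atomic_store(&idle_latch[cpu], NR_CPU_IDS);
}
```

The point of the compare-and-exchange is that two wakers racing for the same
idle CPU cannot both succeed: the loser sees the latch already closed and has
to look elsewhere instead of stacking.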