On Mon, 2018-02-05 at 11:10 +0000, Mel Gorman wrote:
> On Fri, Feb 02, 2018 at 12:01:37PM -0800, Srinivas Pandruvada wrote:
> > > Sure, but the lack of detection when tasks are low utilisation
> > > but still latency/throughput sensitive is problematic. Users
> > > shouldn't have to know they need to disable HWP or set performance
> > > governor out of the box. It's only going to get worse as sockets
> > > get larger.
> >
> > I am not saying that we shouldn't do anything. Can you give me some
> > workloads which you care the most?
> >
>
> The proprietary workloads I'm aware of are useless to the discussion
> as they cannot be trivially reproduced and are typically only
> available under NDA. However, hints can be gotten by looking at the
> number of cases where recommended tunings limits C-states, set the
> performance governor, alter intel_pstate setpoint (if not HWP) etc.
>
> For the purposes of illustration, dbench at low thread counts does
> a reasonable job even though it's not that interesting a workload in
> general. With ext4 in particular, the journalling thread interactions
> bounce tasks around the machine and the short sleeps for IO both
> combine to have relatively low utilisation on individual CPUs. It's
> less pronounced on xfs as it bounces less due to using kworkers
> instead of kthreads.
>
> > > > There are totally different way HWP is handled in client and
> > > > servers. If you set desired all heuristics they collected will
> > > > be dumped, so they suggest don't set desired when you are in
> > > > autonomous mode. If we really want a boost set the EPP. We know
> > > > that EPP makes lots of measurable difference.
> > > >
> > >
> > > Sure boosting EPP makes a difference -- it's essentially what the
> > > performance governor does and I know that can be done by a user
> > > but it's still basically a cop-out. Default performance for low
> > > utilisation or lightly loaded machines is poor. Maybe it should be
> > > set based on the ACPI preferred profile but that information is
> > > not always available. It would be nice if *some* sort of hint
> > > about new migrations or tasks waking from IO would be desirable.
> >
> > EPP is a range not a single value. So you don't need to make EPP=0
> > as a performance governor. PeterZ gave me some scheduler change to
> > experiment, which can be used as hint to play with EPP.
> >
>
> I know EPP is a range, default from bios usually appear to be 6 or 7
> but I didn't do much experimentation to see if there is another value
> that works better. Even if there is, the default may need to change
> as not many people even know what EPP is or how it should be tuned.

I think you are talking about EPB, not EPP, because of the range you
mentioned here. EPP is a value from 0 to 255 and is part of the
HWP_REQUEST MSR. EPB with HWP is used only on Broadwell servers. I think
you are using Skylake here.
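
To make the EPB/EPP distinction concrete, here is a rough sketch of how
the two fields differ in width and location. The field positions follow
my reading of the Intel SDM (EPB in bits 3:0 of IA32_ENERGY_PERF_BIAS,
EPP in bits 31:24 of IA32_HWP_REQUEST). The helper names and the sample
raw value are made up for illustration; this is not how intel_pstate is
written.

```python
# Illustrative only: helper names and the sample value are hypothetical.
# Bit positions are taken from the Intel SDM descriptions of these MSRs.
IA32_ENERGY_PERF_BIAS = 0x1B0  # EPB lives in bits 3:0   (range 0..15)
IA32_HWP_REQUEST = 0x774       # EPP lives in bits 31:24 (range 0..255)


def decode_epb(raw: int) -> int:
    """Return the 4-bit EPB hint from a raw IA32_ENERGY_PERF_BIAS value."""
    return raw & 0xF


def decode_hwp_request(raw: int) -> dict:
    """Split a raw IA32_HWP_REQUEST value into its low four byte fields."""
    return {
        "min_perf": raw & 0xFF,
        "max_perf": (raw >> 8) & 0xFF,
        "desired_perf": (raw >> 16) & 0xFF,  # 0 leaves the choice to HWP
        "epp": (raw >> 24) & 0xFF,
    }


def with_epp(raw: int, epp: int) -> int:
    """Return raw with only the EPP byte replaced; other fields untouched."""
    if not 0 <= epp <= 255:
        raise ValueError("EPP must be in the range 0..255")
    return (raw & ~(0xFF << 24)) | (epp << 24)


if __name__ == "__main__":
    sample = 0x80002B08  # hypothetical register contents
    print(decode_hwp_request(sample))
    print(hex(with_epp(sample, 0)))
```

A value of 6 or 7 fits the 0..15 EPB scale, which is why I read those
numbers as EPB. Note that with_epp() leaves the desired field alone, in
line with the advice not to set desired in autonomous mode.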

Thanks,
Srinivas