On 15.11.2021 15:33, Andrew Cooper wrote:
> On 15/11/2021 10:53, Jan Beulich wrote:
>> On 12.11.2021 19:51, Andrew Cooper wrote:
>>> On 10/11/2021 09:19, Jane Malalane wrote:
>>>> Before, user would change turbo status but this had no effect: boolean
>>>> was set but policy wasn't reevaluated.  Policy must be reevaluated so
>>>> that CPU frequency is chosen according to the turbo status and the
>>>> current governor.
>>>>
>>>> Therefore, add __cpufreq_governor() in cpufreq_update_turbo().
>>>>
>>>> Reported-by: <edvin.to...@citrix.com>
>>>> Signed-off-by: <jane.malal...@citrix.com>
>>>> ---
>>>> CC: Jan Beulich <jbeul...@suse.com>
>>>> CC: Ian Jackson <i...@xenproject.org>
>>>> ---
>>>>
>>>> Release rationale:
>>>> Not taking this patch means that turbo status is misleading.
>>>>
>>>> Taking this patch is low-risk as it only uses a function that already
>>>> exists and is already used for setting the chosen scaling governor.
>>>> Essentially, this change is equivalent to running 'xenpm
>>>> en/disable-turbo-mode' and, subsequently, running 'xenpm
>>>> set-scaling-governor <current governor>'.
>>>> ---
>>>>  xen/drivers/cpufreq/utility.c | 6 +++++-
>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
>>>> index b93895d4dd..5f200ff3ee 100644
>>>> --- a/xen/drivers/cpufreq/utility.c
>>>> +++ b/xen/drivers/cpufreq/utility.c
>>>> @@ -417,10 +417,14 @@ int cpufreq_update_turbo(int cpuid, int new_state)
>>>>      {
>>>>          ret = cpufreq_driver.update(cpuid, policy);
>>>>          if (ret)
>>>> +        {
>>>>              policy->turbo = curr_state;
>>>> +            return ret;
>>>> +        }
>>>>      }
>>>>  
>>>> -    return ret;
>>>> +    /* Reevaluate current CPU policy. */
>>>> +    return __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
>>>>  }
>>> So, having looked through the manual, what the cpufreq_driver does for
>>> turbo on Intel is bogus according to the SDM.
>>>
>>> There is a non-architectrual dance with IA32_MISC_ENABLE bit 38 (per
>>> package) for firmware to configure turbo, but it manifests as another
>>> dynamic CPUID bit (which I think we handle correctly).  It has the same
>>> semantics as XD_DISABLE and CPUID_LIMIT so we might want to consider
>>> adding it to the set of bits we clear during boot.
>> This is quite confusing in the docs - at least one of the tables calls
>> the bit "IDA Disable", while other entries at least also refer to the
>> effect of disabling IDA. I'm afraid I have trouble connecting turbo
>> mode and IDA disabling, unless both are two different names of the
>> same thing. Maybe they really are, except that SDM vol 2 uses yet
>> another slightly different term for CPUID[6].EAX[1]: "Intel Turbo Boost
>> Technology".
> 
> SDM Vol3 14.3 uses IDA and turbo interchangeably in some cases.  It
> reads as if IDA is the general technology name, and turbo is a sub-mode
> for "doing it automatically without software involvement".
> 
> On CPUs which support IDA, the CPUID bit is !MISC_ENABLE[38], with
> further adds to the confusion of which is which.
> 
>>> However, the correct way to turn off turbo is to set
>>> IA32_PERF_CTL.TURBO_DISENGAGE bit, which is per logical processor.  As
>>> such, it *should* behave far more like the AMD CPB path.
>> I'm afraid public documentation has no mention of a bit of this name.
>> Considering the above I wonder whether you mean "IDA engage" (bit 32),
>> albeit this doesn't seem very likely when you're taking about a
>> "disengage" bit.
> 
> It is that bit.  Despite the name given in Vol4 Table 2-12, it is "set
> to disable".
> 
> Vol3 Figure 14-2 explicitly calls it the "IDA/Turbo disengage" bit and
> the surrounding text makes it clear that it is disable bit, not an
> enable bit.  Also, that's how the Linux intel_pstate driver uses it.

Okay, then I agree with the proposal you've made.

Jan

>>  If it is, we'd still need to cope with it being
>> unavailable: While as per the doc it exists from Merom onwards, i.e.
>> just far enough back for all 64-bit capable processors to be covered,
>> at least there it is attributed "Mobile only".
> 
> Honestly, the number of errors in those tables are alarming.  The MSR is
> has been around longer than turbo, and I'm told that the interface has
> never changed since its introduction.
> 
> Looking across Linux, there's a mess too.
> 
> acpi-cpufreq and energy_perf_policy use the MISC_ENABLE bit (which is
> package wide)
> intel_ips driver uses PERF_CTL (which is logical processor)
> intel_pstate uses MISC_ENABLE || max_pstate == turbo_pstate
> 
> I'm certain I've seen "one pstate being special" WRT turbo before, but I
> can't locate anything about this in the SDM, which possibly means it is
> model specific.
> 
> ~Andrew
> 


Reply via email to