Re: [PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-08 Thread Roger Pau Monné
On Tue, Feb 08, 2022 at 03:28:23PM +0100, Jan Beulich wrote:
> On 08.02.2022 15:20, Roger Pau Monné wrote:
> > On Tue, Feb 08, 2022 at 11:51:03AM +0100, Jan Beulich wrote:
> >> On 08.02.2022 09:54, Roger Pau Monné wrote:
> >>> On Fri, Feb 04, 2022 at 02:56:43PM +0100, Jan Beulich wrote:
>  --- a/xen/arch/x86/cpu/intel.c
>  +++ b/xen/arch/x86/cpu/intel.c
>  @@ -435,6 +435,26 @@ static void intel_log_freq(const struct
>   if ( c->x86 == 6 )
>   switch ( c->x86_model )
>   {
>  +static const unsigned short core_factors[] =
>  +{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
>  +
>  +case 0x0e: /* Core */
>  +case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
>  +/*
>  + * PLATFORM_INFO, while not documented for these, appears to
>  + * exist in at least some cases, but what it holds doesn't
>  + * match the scheme used by newer CPUs.  At a guess, the min
>  + * and max fields look to be reversed, while the scaling
>  + * factor is encoded in FSB_FREQ.
>  + */
>  +if ( min_ratio > max_ratio )
>  +SWAP(min_ratio, max_ratio);
>  +if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
>  + (msrval &= 7) >= ARRAY_SIZE(core_factors) )
>  +return;
>  +factor = core_factors[msrval];
>  +break;
>  +
>   case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
>   case 0x25: case 0x2c: case 0x2f: /* Westmere */
>   factor = 13333;
> >>>
> >>> Seeing that the MSR is present on non documented models and has
> >>> unknown behavior we might want to further sanity check that min < max
> >>> before printing anything?
> >>
> >> But I'm already swapping the two in the opposite case?
> > 
> > You are only doing the swapping for Core/Core2.
> > 
> > What I mean is that given the possible availability of
> > MSR_INTEL_PLATFORM_INFO on undocumented platforms and the different
> > semantics we should unconditionally check that the frequencies we are
> > going to print are sane, and one easy check would be that min < max
> > before printing.
> 
> Oh, I see. Yes, I did consider this, but decided against because it
> would hide cases where we're not in line with reality. I might not
> have spotted the issue here if we would have had such a check in
> place already (maybe the too low number would have caught my
> attention, but the  ...  range logged was far more
> obviously wrong). (In any event, if such a change was to be made, I
> think it should be a separate patch.)

My suggestion was to avoid printing both (max and min) if min > max,
as there's obviously something wrong there. Maybe we could print
unconditionally for debug builds, or print an error message otherwise
to note that PLATFORM_INFO is present but the values calculated don't
make sense?

In any case, this is just for informational purposes, so I don't
really want to delay you anymore with this. If you think both options
above are not worth it, feel free to take my Ack for this one:

Acked-by: Roger Pau Monné 

Thanks, Roger.
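
The suggestion above (skip the range when min > max and say so instead) could be sketched roughly as follows. This is only an illustration, not code from the patch: log_freq_range() and to_mhz() are hypothetical names, printf() stands in for the hypervisor's logging, and the rounding formula and the 10 kHz unit of "factor" are assumptions inferred from the magnitude of the core_factors[] entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: convert a bus ratio into MHz.  'factor' is assumed
 * to be the bus clock in units of 10 kHz (e.g. 33333 for a 333.33 MHz bus).
 */
static unsigned int to_mhz(unsigned int factor, unsigned int ratio)
{
    /* The PLATFORM_INFO ratio fields are 8 bits wide. */
    assert(ratio <= 0xff);

    return (factor * ratio + 50) / 100;
}

/*
 * Hypothetical variant of the final logging step: print the range only if
 * it is plausible, otherwise note that the values look inconsistent.
 * Returns true when a range was printed.
 */
static bool log_freq_range(unsigned int cpu, unsigned int factor,
                           unsigned int min_ratio, unsigned int max_ratio)
{
    if ( min_ratio > max_ratio )
    {
        printf("CPU%u: PLATFORM_INFO present, but min ratio %u > max ratio %u\n",
               cpu, min_ratio, max_ratio);
        return false;
    }

    printf("CPU%u: %u ... %u MHz\n", cpu,
           to_mhz(factor, min_ratio), to_mhz(factor, max_ratio));
    return true;
}
```

With these assumptions, log_freq_range(0, 33333, 6, 7) prints a range, while log_freq_range(0, 10000, 7, 6) prints the inconsistency note. The trade-off raised earlier in the thread still applies: such a check makes the log less obviously wrong, which can also make a misinterpreted MSR harder to notice.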



Re: [PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-08 Thread Jan Beulich
On 08.02.2022 15:20, Roger Pau Monné wrote:
> On Tue, Feb 08, 2022 at 11:51:03AM +0100, Jan Beulich wrote:
>> On 08.02.2022 09:54, Roger Pau Monné wrote:
>>> On Fri, Feb 04, 2022 at 02:56:43PM +0100, Jan Beulich wrote:
 --- a/xen/arch/x86/cpu/intel.c
 +++ b/xen/arch/x86/cpu/intel.c
 @@ -435,6 +435,26 @@ static void intel_log_freq(const struct
  if ( c->x86 == 6 )
  switch ( c->x86_model )
  {
 +static const unsigned short core_factors[] =
 +{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
 +
 +case 0x0e: /* Core */
 +case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
 +/*
 + * PLATFORM_INFO, while not documented for these, appears to
 + * exist in at least some cases, but what it holds doesn't
 + * match the scheme used by newer CPUs.  At a guess, the min
 + * and max fields look to be reversed, while the scaling
 + * factor is encoded in FSB_FREQ.
 + */
 +if ( min_ratio > max_ratio )
 +SWAP(min_ratio, max_ratio);
 +if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
 + (msrval &= 7) >= ARRAY_SIZE(core_factors) )
 +return;
 +factor = core_factors[msrval];
 +break;
 +
  case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
  case 0x25: case 0x2c: case 0x2f: /* Westmere */
  factor = 13333;
>>>
>>> Seeing that the MSR is present on non documented models and has
>>> unknown behavior we might want to further sanity check that min < max
>>> before printing anything?
>>
>> But I'm already swapping the two in the opposite case?
> 
> You are only doing the swapping for Core/Core2.
> 
> What I mean is that given the possible availability of
> MSR_INTEL_PLATFORM_INFO on undocumented platforms and the different
> semantics we should unconditionally check that the frequencies we are
> going to print are sane, and one easy check would be that min < max
> before printing.

Oh, I see. Yes, I did consider this, but decided against because it
would hide cases where we're not in line with reality. I might not
have spotted the issue here if we would have had such a check in
place already (maybe the too low number would have caught my
attention, but the  ...  range logged was far more
obviously wrong). (In any event, if such a change was to be made, I
think it should be a separate patch.)

Jan




Re: [PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-08 Thread Roger Pau Monné
On Tue, Feb 08, 2022 at 11:51:03AM +0100, Jan Beulich wrote:
> On 08.02.2022 09:54, Roger Pau Monné wrote:
> > On Fri, Feb 04, 2022 at 02:56:43PM +0100, Jan Beulich wrote:
> >> Models 0F and 17 don't have PLATFORM_INFO documented. While it exists on
> >> at least model 0F, the information there doesn't match the scheme used
> >> on newer models (I'm observing a range of 700 ... 600 MHz reported on a
> >> Xeon E5345).
> > 
> > Maybe it would be best to limit ourselves to the models that have the
> > MSR documented in the SDM?
> 
> Well, yes, that's what I wasn't sure about: The information is used only
> for logging, so it's not the end of the world if we display something
> strange. We'd want to address such anomalies (like the one I did observe
> here) of course. But I wonder whether being entirely silent is really
> better.

OK, let's add the quirk for Core/Core2 then.

> >> --- a/xen/arch/x86/cpu/intel.c
> >> +++ b/xen/arch/x86/cpu/intel.c
> >> @@ -435,6 +435,26 @@ static void intel_log_freq(const struct
> >>  if ( c->x86 == 6 )
> >>  switch ( c->x86_model )
> >>  {
> >> +static const unsigned short core_factors[] =
> >> +{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
> >> +
> >> +case 0x0e: /* Core */
> >> +case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
> >> +/*
> >> + * PLATFORM_INFO, while not documented for these, appears to
> >> + * exist in at least some cases, but what it holds doesn't
> >> + * match the scheme used by newer CPUs.  At a guess, the min
> >> + * and max fields look to be reversed, while the scaling
> >> + * factor is encoded in FSB_FREQ.
> >> + */
> >> +if ( min_ratio > max_ratio )
> >> +SWAP(min_ratio, max_ratio);
> >> +if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
> >> + (msrval &= 7) >= ARRAY_SIZE(core_factors) )
> >> +return;
> >> +factor = core_factors[msrval];
> >> +break;
> >> +
> >>  case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
> >>  case 0x25: case 0x2c: case 0x2f: /* Westmere */
> >>  factor = 13333;
> > 
> > Seeing that the MSR is present on non documented models and has
> > unknown behavior we might want to further sanity check that min < max
> > before printing anything?
> 
> But I'm already swapping the two in the opposite case?

You are only doing the swapping for Core/Core2.

What I mean is that given the possible availability of
MSR_INTEL_PLATFORM_INFO on undocumented platforms and the different
semantics we should unconditionally check that the frequencies we are
going to print are sane, and one easy check would be that min < max
before printing.

Thanks, Roger.



Re: [PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-08 Thread Jan Beulich
On 08.02.2022 09:54, Roger Pau Monné wrote:
> On Fri, Feb 04, 2022 at 02:56:43PM +0100, Jan Beulich wrote:
>> Models 0F and 17 don't have PLATFORM_INFO documented. While it exists on
>> at least model 0F, the information there doesn't match the scheme used
>> on newer models (I'm observing a range of 700 ... 600 MHz reported on a
>> Xeon E5345).
> 
> Maybe it would be best to limit ourselves to the models that have the
> MSR documented in the SDM?

Well, yes, that's what I wasn't sure about: The information is used only
for logging, so it's not the end of the world if we display something
strange. We'd want to address such anomalies (like the one I did observe
here) of course. But I wonder whether being entirely silent is really
better.

>> --- a/xen/arch/x86/cpu/intel.c
>> +++ b/xen/arch/x86/cpu/intel.c
>> @@ -435,6 +435,26 @@ static void intel_log_freq(const struct
>>  if ( c->x86 == 6 )
>>  switch ( c->x86_model )
>>  {
>> +static const unsigned short core_factors[] =
>> +{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
>> +
>> +case 0x0e: /* Core */
>> +case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
>> +/*
>> + * PLATFORM_INFO, while not documented for these, appears to
>> + * exist in at least some cases, but what it holds doesn't
>> + * match the scheme used by newer CPUs.  At a guess, the min
>> + * and max fields look to be reversed, while the scaling
>> + * factor is encoded in FSB_FREQ.
>> + */
>> +if ( min_ratio > max_ratio )
>> +SWAP(min_ratio, max_ratio);
>> +if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
>> + (msrval &= 7) >= ARRAY_SIZE(core_factors) )
>> +return;
>> +factor = core_factors[msrval];
>> +break;
>> +
>>  case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
>>  case 0x25: case 0x2c: case 0x2f: /* Westmere */
>>  factor = 13333;
> 
> Seeing that the MSR is present on non documented models and has
> unknown behavior we might want to further sanity check that min < max
> before printing anything?

But I'm already swapping the two in the opposite case?

Jan




Re: [PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-08 Thread Roger Pau Monné
On Fri, Feb 04, 2022 at 02:56:43PM +0100, Jan Beulich wrote:
> Models 0F and 17 don't have PLATFORM_INFO documented. While it exists on
> at least model 0F, the information there doesn't match the scheme used
> on newer models (I'm observing a range of 700 ... 600 MHz reported on a
> Xeon E5345).

Maybe it would be best to limit ourselves to the models that have the
MSR documented in the SDM?

> 
> Sadly the Enhanced Intel Core instance of the table entry is not self-
> consistent: The numeric description of the low 3 bits doesn't match the
> subsequent more textual description in some of the cases; I'm using the
> former here.
> 
> Include the older Core model 0E as well as the two other Core2 models,
> none of which have respective MSR tables in the SDM.
> 
> Fixes: f6b6517cd5db ("x86: retrieve and log CPU frequency information")
> Signed-off-by: Jan Beulich 
> ---
> While the SDM table for the two models lists FSB_FREQ, I'm afraid its
> information is of little use here: If anything it could serve as a
> reference for the frequency determined by calibrate_APIC_clock().
> ---
> RFC: I may want to rebase over Roger's addition of intel-family.h, but
>  first of all I wanted to see whether going this route is deemed
>  acceptable at all.
> 
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -435,6 +435,26 @@ static void intel_log_freq(const struct
>  if ( c->x86 == 6 )
>  switch ( c->x86_model )
>  {
> +static const unsigned short core_factors[] =
> +{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
> +
> +case 0x0e: /* Core */
> +case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
> +/*
> + * PLATFORM_INFO, while not documented for these, appears to
> + * exist in at least some cases, but what it holds doesn't
> + * match the scheme used by newer CPUs.  At a guess, the min
> + * and max fields look to be reversed, while the scaling
> + * factor is encoded in FSB_FREQ.
> + */
> +if ( min_ratio > max_ratio )
> +SWAP(min_ratio, max_ratio);
> +if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
> + (msrval &= 7) >= ARRAY_SIZE(core_factors) )
> +return;
> +factor = core_factors[msrval];
> +break;
> +
>  case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
>  case 0x25: case 0x2c: case 0x2f: /* Westmere */
>  factor = 13333;

Seeing that the MSR is present on non documented models and has
unknown behavior we might want to further sanity check that min < max
before printing anything?

Thanks, Roger.



[PATCH] x86/Intel: don't log bogus frequency range on Core/Core2 processors

2022-02-04 Thread Jan Beulich
Models 0F and 17 don't have PLATFORM_INFO documented. While it exists on
at least model 0F, the information there doesn't match the scheme used
on newer models (I'm observing a range of 700 ... 600 MHz reported on a
Xeon E5345).

Sadly the Enhanced Intel Core instance of the table entry is not self-
consistent: The numeric description of the low 3 bits doesn't match the
subsequent more textual description in some of the cases; I'm using the
former here.

Include the older Core model 0E as well as the two other Core2 models,
none of which have respective MSR tables in the SDM.

Fixes: f6b6517cd5db ("x86: retrieve and log CPU frequency information")
Signed-off-by: Jan Beulich 
---
While the SDM table for the two models lists FSB_FREQ, I'm afraid its
information is of little use here: If anything it could serve as a
reference for the frequency determined by calibrate_APIC_clock().
---
RFC: I may want to rebase over Roger's addition of intel-family.h, but
 first of all I wanted to see whether going this route is deemed
 acceptable at all.

--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -435,6 +435,26 @@ static void intel_log_freq(const struct
 if ( c->x86 == 6 )
 switch ( c->x86_model )
 {
+static const unsigned short core_factors[] =
+{ 26667, 13333, 20000, 16667, 33333, 10000, 40000 };
+
+case 0x0e: /* Core */
+case 0x0f: case 0x16: case 0x17: case 0x1d: /* Core2 */
+/*
+ * PLATFORM_INFO, while not documented for these, appears to
+ * exist in at least some cases, but what it holds doesn't
+ * match the scheme used by newer CPUs.  At a guess, the min
+ * and max fields look to be reversed, while the scaling
+ * factor is encoded in FSB_FREQ.
+ */
+if ( min_ratio > max_ratio )
+SWAP(min_ratio, max_ratio);
+if ( rdmsr_safe(MSR_FSB_FREQ, msrval) ||
+ (msrval &= 7) >= ARRAY_SIZE(core_factors) )
+return;
+factor = core_factors[msrval];
+break;
+
 case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
 case 0x25: case 0x2c: case 0x2f: /* Westmere */
 factor = 13333;
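
For illustration only, the arithmetic behind the new Core/Core2 case can be modelled in a small stand-alone program fragment. This is a sketch under stated assumptions, not the hypervisor code: core_freq_range() and struct freq_range are hypothetical, the raw MSR values are passed in rather than read with rdmsr_safe(), the table entries are the bus clocks (in units of 10 kHz) that I would expect from the SDM's FSB_FREQ encodings, and the (factor * ratio + 50) / 100 rounding is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Assumed bus clock per FSB_FREQ[2:0] encoding, in units of 10 kHz
 * (26667 = 266.67 MHz).  Encoding 7 has no entry and is rejected below.
 */
static const unsigned short core_factors[] =
    { 26667, 13333, 20000, 16667, 33333, 10000, 40000 };

/* Hypothetical result type for this sketch. */
struct freq_range {
    unsigned int min_mhz, max_mhz;
};

/*
 * Hypothetical model of the Core/Core2 quirk: swap the ratios if they look
 * reversed, pick the scaling factor from FSB_FREQ, and convert to MHz.
 * Returns false when the FSB_FREQ encoding is not covered by the table.
 */
static bool core_freq_range(uint64_t fsb_freq, unsigned int min_ratio,
                            unsigned int max_ratio, struct freq_range *out)
{
    unsigned int idx = fsb_freq & 7;   /* low 3 bits select the bus clock */
    unsigned int factor;

    assert(out != NULL);

    if ( min_ratio > max_ratio )
    {
        unsigned int tmp = min_ratio;

        min_ratio = max_ratio;
        max_ratio = tmp;
    }

    if ( idx >= ARRAY_SIZE(core_factors) )
        return false;

    factor = core_factors[idx];
    out->min_mhz = (factor * min_ratio + 50) / 100;
    out->max_mhz = (factor * max_ratio + 50) / 100;

    return true;
}
```

As a usage example, take raw ratios of 7 and 6 (inferred from the 700 ... 600 MHz observation, which is an assumption) and FSB_FREQ encoding 4 (a 333 MHz bus, as I would expect for a Xeon E5345). Then core_freq_range(4, 7, 6, &r) yields 2000 ... 2333 MHz, whereas encoding 7 makes the function return false, mirroring the early return in the patch.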