[PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-01-16 Thread Peter Jones
On my laptop running at 2.4GHz, if I run a VM where tsc calibration
using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
to do so (as measured with the stopwatch on my phone), with a tsc delta
of 0x1cd1c85300, or around 125 billion cycles.
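
As a rough cross-check of these figures (a sketch, not part of the patch):
dividing the TSC delta by the nominal clock rate gives the elapsed time.  The
helper name tsc_delta_to_ms is made up for illustration, and it assumes the
TSC ticks at a constant rate equal to the nominal CPU clock, which a VM guest
may not guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a TSC delta to whole milliseconds, given the TSC rate in Hz.
   Hypothetical helper; assumes a constant-rate TSC. */
static uint64_t
tsc_delta_to_ms (uint64_t tsc_delta, uint64_t tsc_hz)
{
  assert (tsc_hz >= 1000);
  /* Divide the rate first so that large deltas cannot overflow. */
  return tsc_delta / (tsc_hz / 1000);
}
```

With tsc_delta = 0x1cd1c85300 and tsc_hz = 2400000000 this comes out a little
over 51.5 seconds, in line with the stopwatch reading.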

If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
million cycles, or more or less instantly.

Additionally, reading the pmtimer was returning 0xffffffff anyway,
and that's obviously an invalid return.  I've added a check for that and
for 0, so we don't bother waiting for the test if what we're seeing is dead
pins with no response at all.
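
The dead-pin check described above can be sketched in isolation as follows.
This is only an illustration of the idea, not the patch itself: the names
MAX_BAD_READS, is_dead_read and note_read are hypothetical, plain stdint types
stand in for the grub_* typedefs, and the all-ones value is assumed to be a
full 32-bit read.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BAD_READS 10  /* hypothetical name; the patch hard-codes 10 */

/* A read of all ones or all zeros looks like dead pins, not a counter. */
static int
is_dead_read (uint32_t raw)
{
  return raw == 0xffffffffu || raw == 0;
}

/* Feed one raw read.  Returns 1 once MAX_BAD_READS dead reads have been
   seen in a row; any live read resets the streak. */
static int
note_read (int *bad_reads, uint32_t raw)
{
  assert (bad_reads != 0);
  if (is_dead_read (raw))
    return ++*bad_reads >= MAX_BAD_READS;
  *bad_reads = 0;
  return 0;
}
```

A caller would keep one int counter per calibration attempt, pass every raw
port read through note_read, and bail out of the wait loop when it returns 1.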

Signed-off-by: Peter Jones 
---
 grub-core/kern/i386/tsc_pmtimer.c | 43 ++-
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
index c9c36169978..609402b8376 100644
--- a/grub-core/kern/i386/tsc_pmtimer.c
+++ b/grub-core/kern/i386/tsc_pmtimer.c
@@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
   grub_uint64_t start_tsc;
   grub_uint64_t end_tsc;
   int num_iter = 0;
+  int bad_reads = 0;
 
-  start = grub_inl (pmtimer) & 0xffffff;
+  start = grub_inl (pmtimer) & 0x3fff;
   last = start;
   end = start + num_pm_ticks;
   start_tsc = grub_get_tsc ();
   while (1)
 {
-  cur = grub_inl (pmtimer) & 0xffffff;
+  cur = grub_inl (pmtimer);
+
+  /* If we get 10 reads in a row that are obviously dead pins, there's no
+reason to do this thousands of times.
+   */
+  if (cur == 0xffffffff || cur == 0)
+   {
+ bad_reads++;
+ grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur, bad_reads);
+
+ if (bad_reads == 10)
+   return 0;
+   }
+  else if (bad_reads)
+   bad_reads = 0;
+
+  cur &= 0x3fff;
+
   if (cur < last)
-   cur |= 0x1000000;
+   cur |= 0x4000;
   num_iter++;
   if (cur >= end)
{
  end_tsc = grub_get_tsc ();
+ grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n",
+   end_tsc - start_tsc);
  return end_tsc - start_tsc;
}
-  /* Check for broken PM timer.
-50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
-if after this time we still don't have 1 ms on pmtimer, then
-pmtimer is broken.
+  /* Check for broken PM timer.  5000 TSCs is between 5us (10GHz) and
+200us (250 MHz).  If after this time we still don't have 1us on
+pmtimer, then pmtimer is broken.
*/
-  if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) {
-   return 0;
-  }
+  end_tsc = grub_get_tsc();
+  if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)
+   {
+ grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n",
+   end_tsc - start_tsc);
+ return 0;
+   }
 }
 }
 
-- 
2.15.0


___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


[PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-02-19 Thread Peter Jones
On my laptop running at 2.4GHz, if I run a VM where tsc calibration
using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
to do so (as measured with the stopwatch on my phone), with a tsc delta
of 0x1cd1c85300, or around 125 billion cycles.

If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
to wait for 5-200us, it decides it's broken in ~0x2626aa0 TSCs, aka ~2.4
million cycles, or more or less instantly.

Additionally, reading the pmtimer was returning 0xffffffff anyway,
and that's obviously an invalid return.  I've added a check for that and
for 0, so we don't bother waiting for the test if what we're seeing is dead
pins with no response at all.

If "debug" includes "pmtimer", you will see one of the following
three outcomes.  If pmtimer gives all 0 or all 1 bits, you will see:

kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 1
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 2
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 3
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 4
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 5
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 6
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 7
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 8
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 9
kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 10
kern/i386/tsc_pmtimer.c:78: timer is broken; giving up.

This outcome was tested using qemu+kvm with UEFI (OVMF) firmware and
these options: -machine pc-q35-2.10 -cpu Broadwell-noTSX

If pmtimer gives any other bit patterns but is not actually marching
forward fast enough to use for clock calibration, you will see:

kern/i386/tsc_pmtimer.c:121: pmtimer delta is 0x0 (1904 iterations)
kern/i386/tsc_pmtimer.c:124: tsc delta is implausible: 0x2626aa0

This outcome was tested using grub compiled with GRUB_PMTIMER_IGNORE_BAD_READS
defined (so as not to trip the bad read test) using qemu+kvm with UEFI
(OVMF) firmware, and these options: -machine pc-q35-2.10 -cpu Broadwell-noTSX

If pmtimer actually works, you'll see something like:

kern/i386/tsc_pmtimer.c:121: pmtimer delta is 0x0 (1904 iterations)
kern/i386/tsc_pmtimer.c:124: tsc delta is implausible: 0x2626aa0

This outcome was tested using qemu+kvm with UEFI (OVMF) firmware, and
these options: -machine pc-i440fx-2.4 -cpu Broadwell-noTSX

I've also tested this outcome on a real Intel Xeon E3-1275v3 on an Intel
Server Board S1200V3RPS using the SDV.RP.B8 "Release" build here:
https://firmware.intel.com/sites/default/files/UEFIDevKit_S1200RP_vB8.zip

Signed-off-by: Peter Jones 
---
 grub-core/kern/i386/tsc_pmtimer.c | 109 +++---
 1 file changed, 89 insertions(+), 20 deletions(-)

diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
index c9c36169978..ca15c3aacd7 100644
--- a/grub-core/kern/i386/tsc_pmtimer.c
+++ b/grub-core/kern/i386/tsc_pmtimer.c
@@ -28,40 +28,101 @@
 #include 
 #include 
 
+/*
+ * Define GRUB_PMTIMER_IGNORE_BAD_READS if you're trying to test a timer that's
+ * present but doesn't keep time well.
+ */
+// #define GRUB_PMTIMER_IGNORE_BAD_READS
+
 grub_uint64_t
 grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
 grub_uint16_t num_pm_ticks)
 {
   grub_uint32_t start;
-  grub_uint32_t last;
-  grub_uint32_t cur, end;
+  grub_uint64_t cur, end;
   grub_uint64_t start_tsc;
   grub_uint64_t end_tsc;
-  int num_iter = 0;
+  unsigned int num_iter = 0;
+#ifndef GRUB_PMTIMER_IGNORE_BAD_READS
+  int bad_reads = 0;
+#endif
 
-  start = grub_inl (pmtimer) & 0xffffff;
-  last = start;
+  /*
+   * Some timers are 24-bit and some are 32-bit, but it doesn't make much
+   * difference to us.  Caring which one we have isn't really worth it since
+   * the low-order digits will give us enough data to calibrate TSC.  So just
+   * mask the top-order byte off.
+   */
+  cur = start = grub_inl (pmtimer) & 0xffffffUL;
   end = start + num_pm_ticks;
   start_tsc = grub_get_tsc ();
   while (1)
 {
-  cur = grub_inl (pmtimer) & 0xffffff;
-  if (cur < last)
-   cur |= 0x1000000;
-  num_iter++;
+  cur &= 0xffffffffff000000ULL;
+  cur |= grub_inl (pmtimer) & 0xffffffUL;
+
+  end_tsc = grub_get_tsc();
+
+#ifndef GRUB_PMTIMER_IGNORE_BAD_READS
+  /*
+   * If we get 10 reads in a row that are obviously dead pins, there's no
+   * reason to do this thousands of times.
+   */
+  if (cur == 0xffffffUL || cur == 0)
+   {
+ bad_reads++;
+ grub_dprintf ("pmtimer",
+   "pmtimer: 0x%"PRIxGRUB_UINT64_T" bad_reads: %d\n",
+   cur, bad_reads);
+ grub_dprintf ("pmtimer", "timer is broken; giving up.\n");
+
+ if (bad_reads == 10)
+   return 0;
+   }
+#endif
+
+  if (cur < start)
+   cur += 0x1000000;
+
   if (cur >= end)
{
- end_tsc = g

Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-01-18 Thread Daniel Kiper
On Tue, Jan 16, 2018 at 01:16:17PM -0500, Peter Jones wrote:
> On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> to do so (as measured with the stopwatch on my phone), with a tsc delta
> of 0x1cd1c85300, or around 125 billion cycles.
>
> If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> to wait for 5-200us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> million cycles, or more or less instantly.
>
> Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> and that's obviously an invalid return.  I've added a check for that and
> 0 so we don't bother waiting for the test if what we're seeing is dead
> pins with no response at all.
>
> Signed-off-by: Peter Jones 
> ---
>  grub-core/kern/i386/tsc_pmtimer.c | 43 ++-
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
> index c9c36169978..609402b8376 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -38,30 +38,53 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
>grub_uint64_t start_tsc;
>grub_uint64_t end_tsc;
>int num_iter = 0;
> +  int bad_reads = 0;
>
> -  start = grub_inl (pmtimer) & 0xffffff;
> +  start = grub_inl (pmtimer) & 0x3fff;

I am not sure why you are changing this to 0x3fff...

>last = start;
>end = start + num_pm_ticks;
>start_tsc = grub_get_tsc ();
>while (1)
>  {
> -  cur = grub_inl (pmtimer) & 0xffffff;

What about 24-bit timers? I would leave this here...

> +  cur = grub_inl (pmtimer);
> +
> +  /* If we get 10 reads in a row that are obviously dead pins, there's no
> +  reason to do this thousands of times.
> +   */
> +  if (cur == 0xffffffff || cur == 0)

...and here I would check for 0xffffff and 0.

> + {
> +   bad_reads++;
> +   grub_dprintf ("pmtimer", "cur: 0x%08x bad_reads: %d\n", cur, bad_reads);
> +
> +   if (bad_reads == 10)
> + return 0;
> + }
> +  else if (bad_reads)
> + bad_reads = 0;

Do we really need to reset this?

> +  cur &= 0x3fff;
> +
>if (cur < last)
> - cur |= 0x1000000;
> + cur |= 0x4000;
>num_iter++;
>if (cur >= end)
>   {
> end_tsc = grub_get_tsc ();
> +   grub_dprintf ("pmtimer", "tsc delta is 0x%016lx\n",
> + end_tsc - start_tsc);
> return end_tsc - start_tsc;
>   }
> -  /* Check for broken PM timer.
> -  50000000 TSCs is between 5 ms (10GHz) and 200 ms (250 MHz)
> -  if after this time we still don't have 1 ms on pmtimer, then
> -  pmtimer is broken.
> +  /* Check for broken PM timer.  5000 TSCs is between 5us (10GHz) and
 ^^^ 500ns?

> +  200us (250 MHz).  If after this time we still don't have 1us on
 ^ 20us?

> +  pmtimer, then pmtimer is broken.
> */
> -  if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) {
> - return 0;
> -  }
> +  end_tsc = grub_get_tsc();
> +  if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)

Why 0x3fff here which means, AIUI, 4000 iterations?
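
For reference, the arithmetic behind the 500ns and 20us figures suggested
above for 5000 TSCs can be sketched like this.  The helper name tscs_to_ns is
hypothetical, and the sketch assumes a constant-rate TSC expressed in whole
MHz.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds taken by `tscs' cycles when the TSC runs at `mhz' MHz.
   Hypothetical helper for checking the numbers in the comment. */
static uint64_t
tscs_to_ns (uint64_t tscs, uint64_t mhz)
{
  assert (mhz > 0);
  /* cycles / (MHz * 1e6) seconds == cycles * 1000 / MHz nanoseconds. */
  return tscs * 1000 / mhz;
}
```

tscs_to_ns (5000, 10000) gives 500 (ns) for a 10GHz clock, and
tscs_to_ns (5000, 250) gives 20000 (ns, i.e. 20us) for a 250MHz clock.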

Daniel



Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-01-19 Thread Peter Jones
.  So I wound up dividing
these by 10k instead of 1k.  Woops.  Will fix in v2.

> 
> > +pmtimer, then pmtimer is broken.
> > */
> > -  if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) {
> > -   return 0;
> > -  }
> > +  end_tsc = grub_get_tsc();
> > +  if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)
> 
> Why 0x3fff here which means, AIUI, 4000 iterations?

Er... 16384, surely?  But basically I just wanted to scale down the
limit by approximately the same order as I scaled down the TSC delta,
and I wanted to leave the code basically working the same way (not for
any good reason), so I still wanted a number with only low-order bits
set.  TSC is scaled down by 1000, so I scaled the PM tick iteration down
by 1024.  I could make it twice as wide of a window instead, if you're
worried about that particular race, but I don't think it's going to
matter.

If our loop takes 1 TSC tick per iteration, which is quite implausibly
fast, and we're on a 10GHz machine, then we should normally see pmtimer
change by 3580 in ~2800 iterations.  If it takes 16k iterations (again,
with perfectly executing code that takes no time), the pmtimer is
running at 625KHz instead of 3.58MHz.  If we're on a 250MHz machine,
that number is ~70 iterations.  If we get to 16000 iterations without
seeing pmtimer change by 3580 then pmtimer is either broken or we're on
a 60GHz+ machine.

That said the logic here did not match the comment and it still doesn't;
it should be comparing cur and start, not the number of iterations.  On
the next version of the patch I'm going to do something like:

if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) ||
(cur == start && end_tsc - start_tsc > 5000))

So it'll have a fairly widely ranged total limit but also actually check if
pmtimer results have changed at all.
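
The give-up condition proposed above can be pulled out into a standalone
predicate for illustration.  This is only a sketch: the function name
pmtimer_looks_broken and its parameter list are hypothetical, plain stdint
types stand in for the grub_* typedefs, and the iteration counter is passed in
already incremented rather than bumped inside the condition.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when calibration should give up.  Hypothetical helper
   mirroring the two-part condition sketched in the mail. */
static int
pmtimer_looks_broken (unsigned int num_iter, uint16_t num_pm_ticks,
                      uint32_t cur, uint32_t start, uint64_t tsc_delta)
{
  assert (num_pm_ticks > 0);
  /* Hard cap: at most eight loop iterations per requested PM tick. */
  if (num_iter > ((uint32_t) num_pm_ticks << 3))
    return 1;
  /* The counter has not moved at all after more than 5000 TSCs. */
  return cur == start && tsc_delta > 5000;
}
```

Splitting the test this way keeps the overall iteration limit separate from
the "timer never advanced" case, so each can be tuned on its own.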

Tentative patch below, but I haven't had a chance to test it yet:

From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001
From: Peter Jones 
Date: Tue, 7 Nov 2017 17:12:17 -0500
Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

On my laptop running at 2.4GHz, if I run a VM where tsc calibration
using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
to do so (as measured with the stopwatch on my phone), with a tsc delta
of 0x1cd1c85300, or around 125 billion cycles.

If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
million cycles, or more or less instantly.

Additionally, reading the pmtimer was returning 0xffffffff anyway,
and that's obviously an invalid return.  I've added a check for that and
for 0, so we don't bother waiting for the test if what we're seeing is dead
pins with no response at all.
---
 grub-core/kern/i386/tsc_pmtimer.c | 56 ---
 1 file changed, 46 insertions(+), 10 deletions(-)

diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
index c9c36169978..b09c00316df 100644
--- a/grub-core/kern/i386/tsc_pmtimer.c
+++ b/grub-core/kern/i386/tsc_pmtimer.c
@@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
   grub_uint32_t cur, end;
   grub_uint64_t start_tsc;
   grub_uint64_t end_tsc;
-  int num_iter = 0;
+  unsigned int num_iter = 0;
+  int bad_reads = 0;
 
+  /* Some timers are 24-bit and some are 32-bit, but it doesn't make much
+   * difference to us.  Caring which one we have isn't really worth it since
+   * the low-order digits will give us enough data to calibrate TSC.  So just
+   * mask the top-order byte off.
+   */
   start = grub_inl (pmtimer) & 0xffffff;
   last = start;
   end = start + num_pm_ticks;
@@ -46,22 +52,52 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
   while (1)
 {
   cur = grub_inl (pmtimer) & 0xffffff;
+
+  /* If we get 10 reads in a row that are obviously dead pins, there's no
+reason to do this thousands of times.
+   */
+  if (cur == 0xffffff || cur == 0)
+   {
+ bad_reads++;
+ grub_dprintf ("pmtimer", "pmtimer value: 0x%08x bad_reads: %d\n",
+   cur, bad_reads);
+
+ if (bad_reads == 10)
+   return 0;
+   }
+  else if (bad_reads)
+   bad_reads = 0;
+
+  end_tsc = grub_get_tsc();
+
   if (cur < last)
cur |= 0x1000000;
-  num_iter++;
   if (cur >= end)
{
- end_tsc = grub_get_tsc ();
+ grub_dprintf ("pmtimer", "tsc delta is 0x%016llx\n",
+   end_tsc - start_tsc);
  return end_tsc - start_tsc;
}
-  /* Check for broken PM timer.
-   

Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-01-29 Thread Daniel Kiper
 broken PM timer.  5000 TSCs is between 5us (10GHz) and
> >  ^^^ 500ns?
> >
> > > +  200us (250 MHz).  If after this time we still don't have 1us on
> >  ^ 20us?
>
> You know, I stared at this for a surprising amount of time when writing
> it trying to figure out why these numbers seemed weird.  While I have
> the diff in front of me rather than the code as it is now, I see that
> it's because I was starting with the comment that was previously there,
> which had 50M instead of 5M for the TSC count.  So I wound up dividing
> these by 10k instead of 1k.  Woops.  Will fix in v2.
>
> >
> > > +  pmtimer, then pmtimer is broken.
> > > */
> > > -  if ((num_iter & 0xffffff) == 0 && grub_get_tsc () - start_tsc > 5000000) {
> > > - return 0;
> > > -  }
> > > +  end_tsc = grub_get_tsc();
> > > +  if ((num_iter & 0x3fff) == 0 && end_tsc - start_tsc > 5000)
> >
> > Why 0x3fff here which means, AIUI, 4000 iterations?
>
> Er... 16384, surely?  But basically I just wanted to scale down the

Yep, or I simply forgot 0x before 4000...

> limit by approximately the same order as I scaled down the TSC delta,
> and I wanted to leave the code basically working the same way (not for
> any good reason), so I still wanted a number with only low-order bits
> set.  TSC is scaled down by 1000, so I scaled the PM tick iteration down
> by 1024.  I could make it twice as wide of a window instead, if you're
> worried about that particular race, but I don't think it's going to
> matter.
>
> If our loop takes 1 TSC tick per iteration, which is quite implausibly
> fast, and we're on a 10GHz machine, then we should normally see pmtimer
> change by 3580 in ~2800 iterations.  If it takes 16k iterations (again,

I think that number of iterations is wrong here. Let's see...

f_pm = n_pm / t_pm, n_pm = 3580, f_pm = 3.58MHz => t_pm = 3580 / 3.58MHz
t_pm = 0.001s = 1ms => num_iter = 1ms * 10GHz = 10,000,000

Then below calculations are wrong too...
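
The calculation above can be written out as a small check.  This is a sketch
under the same simplifying assumptions as the discussion: one loop iteration
per TSC tick, and the PM timer rate rounded to 3.58MHz.  The names PM_TIMER_HZ
and tscs_for_pm_ticks are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded as in the mail; the ACPI PM timer nominally runs at 3.579545 MHz. */
#define PM_TIMER_HZ 3580000ULL

/* TSC ticks that elapse while a healthy PM timer advances by `pm_ticks',
   on a CPU whose TSC runs at `cpu_hz'.  Hypothetical helper. */
static uint64_t
tscs_for_pm_ticks (uint64_t pm_ticks, uint64_t cpu_hz)
{
  assert (cpu_hz > 0);
  return pm_ticks * cpu_hz / PM_TIMER_HZ;
}
```

For 3580 PM ticks this yields 10,000,000 at 10GHz and 250,000 at 250MHz, so
any iteration limit has to sit well above those values if the loop really ran
at one iteration per TSC tick.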

> with perfectly executing code that takes no time), the pmtimer is
> running at 625KHz instead of 3.58MHz.  If we're on a 250MHz machine,
> that number is ~70 iterations.  If we get to 16000 iterations without
> seeing pmtimer change by 3580 then pmtimer is either broken or we're on
> a 60GHz+ machine.
>
> That said the logic here did not match the comment and it still doesn't;
> it should be comparing cur and start, not the number of iterations.  On
> the next version of the patch I'm going to do something like:
>
> if ((++num_iter > (grub_uint32_t)num_pm_ticks << 3UL) ||

Taking into account above I am not sure why you multiply by 8 here...

> (cur == start && end_tsc - start_tsc > 5000))
>
> So it'll have a fairly widely ranged total limit but also actually check if
> pmtimer results have changed at all.
>
> Tentative patch below, but I haven't had a chance to test it yet:

In general I am not against the idea itself but I think that some
calculations should be fixed here and there.

> From 97e59bbe000961db5f823c4df074835d1165bd9d Mon Sep 17 00:00:00 2001
> From: Peter Jones 
> Date: Tue, 7 Nov 2017 17:12:17 -0500
> Subject: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.
>
> On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> to do so (as measured with the stopwatch on my phone), with a tsc delta
> of 0x1cd1c85300, or around 125 billion cycles.
>
> If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> to wait for 500ns-20us, it decides it's broken in ~0x7998f9e TSCs, aka ~2
> million cycles, or more or less instantly.
>
> Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> and that's obviously an invalid return.  I've added a check for that and
> 0 so we don't bother waiting for the test if what we're seeing is dead
> pins with no response at all.
> ---
>  grub-core/kern/i386/tsc_pmtimer.c | 56 ---
>  1 file changed, 46 insertions(+), 10 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
> index c9c36169978..b09c00316df 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -37,8 +37,14 @@ grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
>grub_uint32_t cur, end;
>grub_uint64_t start_tsc;
>grub_uint64_t end_tsc;
> -  

Re: [PATCH] Make pmtimer tsc calibration not take 51 seconds to fail.

2018-02-20 Thread Daniel Kiper
On Mon, Feb 19, 2018 at 05:37:56PM -0500, Peter Jones wrote:
> On my laptop running at 2.4GHz, if I run a VM where tsc calibration
> using pmtimer will fail presuming a broken pmtimer, it takes ~51 seconds
> to do so (as measured with the stopwatch on my phone), with a tsc delta
> of 0x1cd1c85300, or around 125 billion cycles.
>
> If instead of trying to wait for 5-200ms to show up on the pmtimer, we try
> to wait for 5-200us, it decides it's broken in ~0x2626aa0 TSCs, aka ~2.4
> million cycles, or more or less instantly.
>
> Additionally, this reading the pmtimer was returning 0xffffffff anyway,
> and that's obviously an invalid return.  I've added a check for that and
> 0 so we don't bother waiting for the test if what we're seeing is dead
> pins with no response at all.
>
> If "debug" includes "pmtimer", you will see one of the following
> three outcomes.  If pmtimer gives all 0 or all 1 bits, you will see:
>
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 1
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 2
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 3
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 4
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 5
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 6
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 7
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 8
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 9
> kern/i386/tsc_pmtimer.c:77: pmtimer: 0xff bad_reads: 10
> kern/i386/tsc_pmtimer.c:78: timer is broken; giving up.

OK.

> This outcome was tested using qemu+kvm with UEFI (OVMF) firmware and
> these options: -machine pc-q35-2.10 -cpu Broadwell-noTSX
>
> If pmtimer gives any other bit patterns but is not actually marching
> forward fast enough to use for clock calibration, you will see:
>
> kern/i386/tsc_pmtimer.c:121: pmtimer delta is 0x0 (1904 iterations)
> kern/i386/tsc_pmtimer.c:124: tsc delta is implausible: 0x2626aa0

OK.

> This outcome was tested using grub compiled with GRUB_PMTIMER_IGNORE_BAD_READS
> defined (so as not to trip the bad read test) using qemu+kvm with UEFI
> (OVMF) firmware, and these options: -machine pc-q35-2.10 -cpu Broadwell-noTSX
>
> If pmtimer actually works, you'll see something like:
>
> kern/i386/tsc_pmtimer.c:121: pmtimer delta is 0x0 (1904 iterations)
> kern/i386/tsc_pmtimer.c:124: tsc delta is implausible: 0x2626aa0

Hmmm... Same as above?

> This outcome was tested using qemu+kvm with UEFI (OVMF) firmware, and
> these options: -machine pc-i440fx-2.4 -cpu Broadwell-noTSX
>
> I've also tested this outcome on a real Intel Xeon E3-1275v3 on an Intel
> Server Board S1200V3RPS using the SDV.RP.B8 "Release" build here:
> https://firmware.intel.com/sites/default/files/UEFIDevKit_S1200RP_vB8.zip
>
> Signed-off-by: Peter Jones 
> ---
>  grub-core/kern/i386/tsc_pmtimer.c | 109 +++---
>  1 file changed, 89 insertions(+), 20 deletions(-)
>
> diff --git a/grub-core/kern/i386/tsc_pmtimer.c b/grub-core/kern/i386/tsc_pmtimer.c
> index c9c36169978..ca15c3aacd7 100644
> --- a/grub-core/kern/i386/tsc_pmtimer.c
> +++ b/grub-core/kern/i386/tsc_pmtimer.c
> @@ -28,40 +28,101 @@
>  #include 
>  #include 
>
> +/*
> + * Define GRUB_PMTIMER_IGNORE_BAD_READS if you're trying to test a timer that's
> + * present but doesn't keep time well.
> + */
> +// #define GRUB_PMTIMER_IGNORE_BAD_READS
> +
>  grub_uint64_t
>  grub_pmtimer_wait_count_tsc (grub_port_t pmtimer,
>grub_uint16_t num_pm_ticks)
>  {
>grub_uint32_t start;
> -  grub_uint32_t last;
> -  grub_uint32_t cur, end;
> +  grub_uint64_t cur, end;
>grub_uint64_t start_tsc;
>grub_uint64_t end_tsc;
> -  int num_iter = 0;
> +  unsigned int num_iter = 0;
> +#ifndef GRUB_PMTIMER_IGNORE_BAD_READS
> +  int bad_reads = 0;
> +#endif
>
> -  start = grub_inl (pmtimer) & 0xffffff;
> -  last = start;
> +  /*
> +   * Some timers are 24-bit and some are 32-bit, but it doesn't make much
> +   * difference to us.  Caring which one we have isn't really worth it since
> +   * the low-order digits will give us enough data to calibrate TSC.  So just
> +   * mask the top-order byte off.
> +   */
> +  cur = start = grub_inl (pmtimer) & 0xffffffUL;

Just for the sake of readability I would do s/0xffffffUL/0x00ffffffUL/ here
and below.

>end = start + num_pm_ticks;
>start_tsc = grub_get_tsc ();
>while (1)
>  {
> -  cur = grub_inl (pmtimer) & 0xffffff;
> -  if (cur < last)
> - cur |= 0x1000000;
> -  num_iter++;
> +  cur &= 0xffffffffff000000ULL;

Could you put a comment before this line? It took me some time
to get it and required to take a look below. This is not obvious
at first sight.
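
To illustrate what such a comment would need to explain, here is a minimal
sketch of the idea, assuming the mask is meant to keep the bits above the low
24 as a running carry and to refresh only the low 24 bits from each read.  It
is a variation rather than the patch's exact logic: it compares against the
previous value instead of the start value, and the names PM_MASK and
extend_pm_count are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low 24 bits: valid on both 24-bit and 32-bit PM timers. */
#define PM_MASK 0xffffffULL

/* Merge a fresh raw read into a 64-bit running count.  The upper bits of
   `prev' carry the wraps seen so far; only the low 24 bits are replaced. */
static uint64_t
extend_pm_count (uint64_t prev, uint32_t raw)
{
  uint64_t cur = (prev & ~PM_MASK) | (raw & PM_MASK);

  if (cur < prev)          /* low 24 bits wrapped since the last read */
    cur += PM_MASK + 1;
  assert (cur >= prev);
  return cur;
}
```

Called once per loop iteration with the previous result, this gives a value
that only ever increases, provided the loop polls often enough that the low
24 bits cannot wrap twice between reads.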

> +  cur |= grub_inl (pmtimer) & 0xffffffUL;
> +
> +  end_tsc = grub_get_tsc();
> +
> +#ifndef GRUB_PMTIMER_IGNORE_BAD_READS
> +  /*
> +   * If we get 10 reads in a row that are obviously dead pins, ther