On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
> Hi
> 
> Is there an easy/accurate way to calculate the tsc timecounter frequency?
> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
> Sandy Bridge cpu)
> 
> Another reference Converting Sandy Bridge TSC to wall clock time
> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
> 
> The code below works but i don't really know how accurate it is, at best
> 10,000 Hz.
> 
> Cheers
> Adam
>

Hi,

First of all, it's not clear why you want to calculate the
TSC frequency in a userland program.  The kernel already does
it and prints the result to the system message buffer (viewed
with the dmesg command).

The second thing worth pointing out is that gettimeofday is a
syscall that takes the timestamp from the timecounter code
(updated every 10ms) and adds the current delta read directly
from the hardware, so you get an accurate reading.  That
reading is then adjusted according to the system time
adjustment rules imposed by things like NTP and settimeofday,
so essentially it's not monotonic (unless you can ensure that
nothing is adjusting the time while you're performing your
measurement).  There's also a way for userland to query a
precise, monotonically increasing timestamp: clock_gettime
with CLOCK_MONOTONIC as the clock_id.  In this case the
returned timestamp is relative to the moment the system was
brought up, but this doesn't matter if all you need is a
difference.

The third thing to know is where this hardware reading comes
from and what its precision is.  Running "sysctl -n
kern.timecounter.hardware" will tell you what is currently
selected as the source, and then you can locate that device in
your dmesg (with the exception of i8254, which is the 1.19 MHz
PIT).  For instance, on my laptop it's the ACPI HPET that is
providing a running counter with a frequency of 14 MHz.  This
is what's going to limit the precision of your measurement.
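
To put numbers on that: one tick of a counter running at f Hz
lasts 1e9 / f nanoseconds, and nothing shorter than a tick can
be resolved.  A small sketch of the arithmetic (tick_period_ns
is a hypothetical helper, and the frequencies you feed it here
would be the rounded 14 MHz and 1.19 MHz figures from above):

```c
/*
 * Hypothetical helper: length of one counter tick in
 * nanoseconds for a counter running at freq_hz.
 * Returns 0.0 for a non-positive frequency.
 */
static double
tick_period_ns(double freq_hz)
{
    if (freq_hz <= 0.0)
        return 0.0;
    return 1e9 / freq_hz;
}
```

With the rounded figures this gives roughly 71 ns per tick for
the HPET and roughly 840 ns for the i8254.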

To get a better reading you may try to take a series of, say,
10 measurements and calculate the average.
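
Assuming the result of each run has been stored in an array,
the averaging could be sketched like this (mean_u64 is a
hypothetical helper; ten readings of a few GHz each fit easily
into a 64-bit sum):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: integer mean of n samples.
 * Returns 0 when there are no samples.
 */
static uint64_t
mean_u64(const uint64_t *samples, size_t n)
{
    uint64_t sum = 0;
    size_t i;

    if (n == 0)
        return 0;
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / n;
}
```

A median would be less sensitive to a single run disturbed by
a context switch, but the mean is the simplest place to start.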

The difference between RDTSC and RDTSCP is that the latter
also tells you on which CPU the instruction was executed.  This
raises a valid question: is the TSC frequency the same across
a multi-socket system?  I don't have an answer for that one.
AFAIU, this boils down to the motherboard design.  If the
manufacturer has chosen to use different quartz crystals for
different sockets, then, since no two quartz crystals are
exactly alike, the frequencies sourced from them and
multiplied by the clock generator PLLs to produce the bus and
then the core clock signals will be slightly different between
sockets.  I believe there's a way to compensate for that, but
OpenBSD doesn't do this currently.
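
For illustration, here is a sketch of reading that extra value
with the __rdtscp intrinsic that GCC and clang ship in
<x86intrin.h>.  The instruction returns the contents of the
IA32_TSC_AUX MSR next to the counter; whether that holds a
usable CPU number depends on what the OS has written there,
which is an assumption you would need to check on your system.
read_tsc_and_aux is a made-up name, and the code only builds
and runs on x86 CPUs that support RDTSCP:

```c
#include <stdint.h>
#include <x86intrin.h>

/*
 * Hypothetical helper: return the TSC and store the
 * IA32_TSC_AUX value that RDTSCP delivers alongside it.
 * What the aux value means is OS-specific.
 */
static uint64_t
read_tsc_and_aux(uint32_t *aux)
{
    unsigned int a;
    uint64_t tsc;

    tsc = __rdtscp(&a);
    *aux = a;
    return tsc;
}
```

If the aux value differs between the start and the end of a
measurement, the two readings may have come from different
counters.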

However, your code doesn't check on which CPU the RDTSC has
been executed, so you can just use RDTSC and hope that the TSC
frequencies are the same and that all counters on all cores
were started at the same time by the firmware (whether that is
actually true is another question).

The CPUID call that you can see used there is there to
provide serialization.  I believe there's no need for it in
this case.  In fact, we haven't observed adverse effects w/o
an additional serializing instruction on Skylake, where TSC
is used as the default timecounter (i.e. instead of an HPET).

The other issue with your program is that you don't account
for how long it takes to perform a syscall.  You could time it
before running the loop and then subtract the reading.

And finally, a userland program will not run a 1 second loop
uninterrupted, since the scheduler will always attempt to
select a different process every 10ms.  This means that a
potential context switch and a 10ms timeslice of another
process might make their way into your measurement.

This all brings me back to the question I asked at the
beginning: why do you want to calculate the TSC frequency in a
userland program?


> #include <stdint.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/time.h>
> 
> uint64_t rdtscp()
> {
>     uint32_t lo, hi;
>      __asm__ __volatile__ ("RDTSCP\n\t"
>                            "mov %%edx, %0\n\t"
>                            "mov %%eax, %1\n\t"
>                            "CPUID\n\t": "=r" (hi), "=r" (lo)::
>                            "%rax", "%rbx", "%rcx", "%rdx");
>     return (uint64_t)hi << 32 | lo;
> }
> 
> uint64_t rdtsc()
> {
>     uint32_t lo, hi;
>      __asm__ __volatile__ ("CPUID\n\t"
>                            "RDTSC\n\t"
>                            "mov %%edx, %0\n\t"
>                            "mov %%eax, %1\n\t": "=r" (hi), "=r" (lo)::
>                            "%rax", "%rbx", "%rcx", "%rdx");
>     return (uint64_t)hi << 32 | lo;
> }
> 
> uint64_t get_tsc_freq_hz()
> {
>     uint64_t start_timestamp, end_timestamp;
>     struct timeval tv_start, tv_end;
> 
>     gettimeofday(&tv_start, NULL);
>     start_timestamp = rdtsc();
>     while (1) {
>         gettimeofday(&tv_end, NULL);
>         if (tv_end.tv_sec > tv_start.tv_sec + 1)
>             break;
>     }
>     end_timestamp = rdtscp();
> 
>     uint64_t cycles = end_timestamp - start_timestamp;
>     uint64_t usec = (tv_end.tv_sec - tv_start.tv_sec) * 1000000 +
>         (tv_end.tv_usec - tv_start.tv_usec);
>     // to convert to cycles per second, multiply the result by 1000000
>     uint64_t tsc_freq = 1000000 * cycles / usec;
> 
>     return tsc_freq;
> }
> 
> int main (int argc, char *argv[])
> {
>     uint64_t tsc_freq = get_tsc_freq_hz();
> 
>     printf("TSC frequency = %llu Hz\n", (unsigned long long)tsc_freq);
> 
>     return 0;
> }
