On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
> Hi
>
> Is there an easy/accurate way to calculate the tsc timecounter frequency?
> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
> Sandy Bridge cpu)
>
> Another reference Converting Sandy Bridge TSC to wall clock time
> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
>
> The code below works but i don't really know how accurate it is, at best
> 10,000 Hz.
>
> Cheers
> Adam
>

Hi,

First of all, it's not clear why you want to calculate the TSC
frequency in a userland program. The kernel does it and prints the
result to the system message buffer (viewed with the dmesg command).

The second thing worth pointing out is that gettimeofday is a syscall
that queries the timestamp from the timecounter code (updated every
10ms) and adds the current delta read directly from the hardware, so
you get an accurate reading. However, the result is then adjusted
according to the system time adjustment rules imposed by things like
NTP and settimeofday, so essentially it's not monotonic (unless you can
ensure that nothing is adjusting the time while you're performing your
measurement). There's also a way for userland to query a precise,
monotonically increasing timestamp: clock_gettime with CLOCK_MONOTONIC
as the clock_id. In this case the returned timestamp is relative to the
moment the system was brought up, but this doesn't matter if all you
need is a difference.

The third thing to know is where this hardware reading comes from and
what its precision is. Running "sysctl -n kern.timecounter.hardware"
will tell you what is currently selected as the source, and then you
can locate that device in your dmesg (with the exception of i8254,
which is the 1.19 MHz PIT). For instance, on my laptop it's the ACPI
HPET that is providing a running counter with a frequency of 14 MHz.
This is what's going to limit the precision of your measurement. To get
a better reading you may try to take a series of, say, 10 measurements
and calculate the average.

The difference that matters here between RDTSC and RDTSCP is that the
latter also tells you on which CPU the instruction was executed. This
poses a valid question: is the TSC frequency the same on a multi-socket
system? And I don't have an answer for that one.
AFAIU, this boils down to the motherboard design. If the manufacturer
has chosen to use different quartz crystals for different sockets,
then, since no two quartz crystals are exactly alike, the frequencies
sourced from them and multiplied by the clock generator PLLs to produce
the bus and then the core clock signals will be slightly different
between sockets. I believe there's a way to compensate for that, but
OpenBSD doesn't do this currently.

However, your code doesn't check on which CPU the instruction has been
executed, so you can just use RDTSC and hope that the TSC frequencies
are the same and that all counters on all cores have been started at
the same time by the firmware (whether or not this is actually true is
another question).

The CPUID call that you can see used there is there to provide
serialization. I believe there's no need to do it in this case. In
fact, we haven't observed adverse effects w/o an additional
serialization instruction on Skylake, where the TSC is used as the
default timecounter (e.g. instead of an HPET).

The other issue with your program is that you don't account for how
long it takes to perform a syscall. You could time it before running
the loop and then subtract the reading.

And finally, a userland program will not run a 1 second loop
uninterrupted, since the scheduler will attempt to select a different
process every 10ms. This means that a potential context switch and a
10ms timeslice of another process might make their way into your
measurement.

This all raises the same question I asked in the beginning: why do you
want to calculate the TSC frequency in a userland program?

> #include <stdio.h>
> #include <unistd.h>
> #include <sys/time.h>
>
> uint64_t rdtscp()
> {
>     uint32_t lo, hi;
>     __asm__ __volatile__ ("RDTSCP\n\t"
>                           "mov %%edx, %0\n\t"
>                           "mov %%eax, %1\n\t"
>                           "CPUID\n\t": "=r" (hi), "=r" (lo):: "%rax",
>                           "%rbx", "%rcx", "%rdx");
>     return (uint64_t)hi << 32 | lo;
> }
>
> uint64_t rdtsc()
> {
>     uint32_t lo, hi;
>     __asm__ __volatile__ ("CPUID\n\t"
>                           "RDTSC\n\t"
>                           "mov %%edx, %0\n\t"
>                           "mov %%eax, %1\n\t": "=r" (hi), "=r" (lo)::
>                           "%rax", "%rbx", "%rcx", "%rdx");;
>     return (uint64_t)hi << 32 | lo;
> }
>
> uint64_t get_tsc_freq_hz()
> {
>     uint64_t start_timestamp, end_timestamp;
>     struct timeval tv_start, tv_end;
>
>     gettimeofday(&tv_start, NULL);
>     start_timestamp = rdtsc();
>     while (1) {
>         gettimeofday(&tv_end, NULL);
>         if (tv_end.tv_sec > tv_start.tv_sec + 1)
>             break;
>     }
>     end_timestamp = rdtscp();
>
>     uint64_t cycles = end_timestamp - start_timestamp;
>     uint64_t usec = (tv_end.tv_sec - tv_start.tv_sec) * 1000000 +
>                     (tv_end.tv_usec - tv_start.tv_usec);
>     // convert to cycles per second need to muliple the result by 1000000
>     uint64_t tsc_freq = 1000000 * cycles / usec;
>
>     return tsc_freq;
> }
>
> int main (int argc, char *argv[])
> {
>     uint64_t tsc_freq = get_tsc_freq_hz();
>
>     printf("TSC frequency = %llu Hz\n", tsc_freq);
>
>     return 0;
> }