Re: Calculate the frequency of the tsc timecounter

2017-08-02 Thread Adam Steen
On Tue, Aug 1, 2017 at 6:43 PM, Adam Steen  wrote:
> Hi Mike
>
> Please see the output below (I did have to update a few DPRINTF's with
> the change to clang, did you want a diff for checking in?)
> I appreciate you having a look.
>
> Cheers
> Adam
>
> root on sd0a (15cc7df693e2251e.a) swap on sd0b dump on sd0b
> vm_impl_init_vmx: created vm_map @ 0x80b99000
> vm_resetcpu: resetting vm 1 vcpu 0 to power on defaults
> guest eptp = 0x39eb8f01e
> vmm_alloc_vpid: allocated VPID/ASID 1
> vmx_handle_exit: unhandled exit 2147483681 (unknown)
> vcpu @ 0x800032ffc000
>  rax=0x rbx=0x rcx=0x
>  rdx=0x rbp=0x rdi=0x5000
>  rsi=0x  r8=0x  r9=0x
>  r10=0x r11=0x r12=0x
>  r13=0x r14=0x r15=0x
>  rip=0x0010 rsp=0x1ff8
>  cr0=0x0020 (pg cd nw am wp NE et ts em mp pe)
>  cr2=0x
>  cr3=0x (pwt pcd)
>  cr4=0x2000 (pke smap smep osxsave pcide fsgsbase smxe
> VMXE osxmmexcpt osfxsr pce pge mce pae pse de tsd pvi vme)
>  --Guest Segment Info--
>  cs=0x0008 rpl=0 base=0x limit=0x a/r=0xa099
>   granularity=1 dib=0 l(64 bit)=1 present=1 sys=1 type=code, x only, accessed
> code, r/x
>  ds=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
>   granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
>  es=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
>   granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
>  fs=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
>   granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
>  gs=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
>   granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
>  ss=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
>   granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
>  tr=0x base=0x limit=0x a/r=0x008b
>   granularity=0 dib=0 l(64 bit)=0 present=1 sys=0 type=tss (busy)
>  gdtr base=0x1000 limit=0x0017
>  idtr base=0x limit=0x
>  ldtr=0x base=0x limit=0x a/r=0x1
>   (unusable)
>  --Guest MSRs @ 0xff039b869000 (paddr: 0x00039b869000)--
>   MSR 0 @ 0xff039b869000 : 0xc080 (EFER),
> value=0x0500 (sce LME LMA nxe)
>   MSR 1 @ 0xff039b869010 : 0xc081 (STAR), value=0x
>   MSR 2 @ 0xff039b869020 : 0xc082 (LSTAR), value=0x
>   MSR 3 @ 0xff039b869030 : 0xc083 (CSTAR), value=0x
>   MSR 4 @ 0xff039b869040 : 0xc084 (SFMASK), value=0x
>   MSR 5 @ 0xff039b869050 : 0xc102 (KGSBASE), value=0x
> vcpu @ 0x800032ffc000
> parent vm @ 0xff0395ee7000
> mode: VMX
> pinbased ctls: 0x7f0016
> true pinbased ctls: 0x7f0016
>  EXTERNAL_INT_EXITING: Can set:Yes Can clear:Yes
>  NMI_EXITING: Can set:Yes Can clear:Yes
>  VIRTUAL_NMIS: Can set:Yes Can clear:Yes
>  ACTIVATE_VMX_PREEMPTION_TIMER: Can set:Yes Can clear:Yes
>  PROCESS_POSTED_INTERRUPTS: Can set:No Can clear:Yes
> procbased ctls: 0xfff9fffe0401e172
> true procbased ctls: 0xfff9fffe04006172
>  INTERRUPT_WINDOW_EXITING: Can set:Yes Can clear:Yes
>  USE_TSC_OFFSETTING: Can set:Yes Can clear:Yes
>  HLT_EXITING: Can set:Yes Can clear:Yes
>  INVLPG_EXITING: Can set:Yes Can clear:Yes
>  MWAIT_EXITING: Can set:Yes Can clear:Yes
>  RDPMC_EXITING: Can set:Yes Can clear:Yes
>  RDTSC_EXITING: Can set:Yes Can clear:Yes
>  CR3_LOAD_EXITING: Can set:Yes Can clear:Yes
>  CR3_STORE_EXITING: Can set:Yes Can clear:Yes
>  CR8_LOAD_EXITING: Can set:Yes Can clear:Yes
>  CR8_STORE_EXITING: Can set:Yes Can clear:Yes
>  USE_TPR_SHADOW: Can set:Yes Can clear:Yes
>  NMI_WINDOW_EXITING: Can set:Yes Can clear:Yes
>  MOV_DR_EXITING: Can set:Yes Can clear:Yes
>  UNCONDITIONAL_IO_EXITING: Can set:Yes Can clear:Yes
>  USE_IO_BITMAPS: Can set:Yes Can clear:Yes
>  MONITOR_TRAP_FLAG: Can set:Yes Can clear:Yes
>  USE_MSR_BITMAPS: Can set:Yes Can clear:Yes
>  MONITOR_EXITING: Can set:Yes Can clear:Yes
>  PAUSE_EXITING: Can set:Yes Can clear:Yes
> procbased2 ctls: 0xff
>  VIRTUALIZE_APIC: Can set:Yes Can clear:Yes
>  ENABLE_EPT: Can set:Yes Can clear:Yes
>  DESCRIPTOR_TABLE_EXITING: Can set:Yes Can clear:Yes
>  ENABLE_RDTSCP: Can set:Yes Can clear:Yes
>  VIRTUALIZE_X2APIC_MODE: Can set:Yes Can clear:Yes
>  ENABLE_VPID: Can set:Yes 

Re: Calculate the frequency of the tsc timecounter

2017-08-02 Thread Mike Larkin
On Thu, Aug 03, 2017 at 07:56:11AM +0800, Adam Steen wrote:
> On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov  wrote:
> > On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
> >> Ted Unangst  wrote:
> >> > we don't currently export this info, but we could add some sysctls. there's
> >> > some cpufeatures stuff there, but generally stuff isn't exported until
> >> > somebody finds a use for it... it shouldn't be too hard to add something to
> >> > amd64/machdep.c sysctl if you're interested.
> >>
> >> I am interested, as i need the info, i will look into it and hopefully
> >> come back with a patch.
> >
> > This is a bad idea because TSC as the time source is only usable
> > by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
> > frequency in the CPUID. All older CPUs have their TSCs measured
> > against the PIT. Currently the measurement done by the kernel isn't
> > very precise and if TSC is selected as a timecounter, the machine
> > would be gaining time at a pace that cannot be corrected by our NTP
> > daemon. (IIRC, about an hour a day on my Haswell running with NTP).
> >
> > To be able to use TSC as a timecounter source on OpenBSD or Solo5
> > you'd have to improve the in-kernel measurement of the TSC frequency
> > first. I've tried to perform 10 measurements and take an average and
> > it does improve accuracy, however I believe we need to poach another
> > bit from Linux and re-calibrate TSC via HPET:
> >
> >  
> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
> >
> > I think this is the most sane thing we can do. Here's a complete
> > procedure that Linux kernel undertakes:
> >
> >  
> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
> >
> > Regards,
> > Mike
> 
> Hi Mike and Ted
> 
> I understand that using the TSC as a timecounter on processors other
> than Skylake and Kaby Lake is inaccurate, but this is my first real foray
> into kernel programming, so I wanted to start off slow. Below is a diff to
> expose whether the TSC is invariant and the TSC frequency via sysctl
> machdep. I would like to get this committed first and then move on to
> improving the in-kernel measurement of the TSC frequency as Mike
> describes above.
> 
> Sorry it's taken a while to get back to you; I have been working with
> Mike Larkin on vmm and my port of Solo5/ukvm.
> 
> Cheers
> Adam
> 
> comments?
> 

Everything in these sysctls can be obtained from CPUID on the processors you
want (Skylake and later), and since CPUID can be executed at any CPL, why is
a kernel interface needed for this? The only thing that would be missing is
the TSC frequency on older-than-Skylake CPUs, but I don't think this is what
you are after, is it? (And even if you wanted this information on pre-Skylake
CPUs, as mikeb points out, the accuracy of that value would be very suspect
and probably not usable anyway.)

-ml
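
For illustration, the CPUID route described above might look roughly like
the sketch below. It assumes a GCC or clang toolchain that ships <cpuid.h>;
the function names are made up for this example and are not from any
OpenBSD source. Leaf 0x15 reports the TSC/crystal ratio in EBX/EAX and the
crystal frequency in ECX, but ECX may be reported as 0 on some parts, in
which case this sketch simply gives up and returns 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * TSC frequency from CPUID leaf 0x15 register values:
 * EAX = ratio denominator, EBX = ratio numerator, ECX = crystal clock in Hz.
 * Returns 0 when the CPU does not enumerate all three values.
 */
uint64_t
tsc_hz_from_leaf15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
	if (eax == 0 || ebx == 0 || ecx == 0)
		return 0;
	return (uint64_t)ecx * ebx / eax;
}

/* The invariant TSC flag is CPUID.80000007H:EDX[8]. */
int
invariant_tsc_from_edx(uint32_t edx)
{
	return (edx >> 8) & 1;
}

/* Hypothetical helper: query the running CPU and print both values. */
void
print_tsc_info(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	/* __get_cpuid() returns 0 if the leaf is not supported. */
	if (__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
		printf("invariant TSC: %d\n", invariant_tsc_from_edx(edx));
	eax = ebx = ecx = edx = 0;
	if (__get_cpuid(0x15, &eax, &ebx, &ecx, &edx))
		printf("TSC frequency: %llu Hz (0 = not enumerated)\n",
		    (unsigned long long)tsc_hz_from_leaf15(eax, ebx, ecx));
#else
	puts("not an x86 CPU");
#endif
}
```

Calling print_tsc_info() from main() is enough to try it. The arithmetic is
kept in separate functions so it can be checked without touching the
hardware.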

> Index: sys/arch/amd64/amd64/identcpu.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/identcpu.c,v
> retrieving revision 1.87
> diff -u -p -u -p -r1.87 identcpu.c
> --- sys/arch/amd64/amd64/identcpu.c 20 Jun 2017 05:34:41 - 1.87
> +++ sys/arch/amd64/amd64/identcpu.c 2 Aug 2017 23:45:54 -
> @@ -63,6 +63,8 @@ struct timecounter tsc_timecounter = {
>   tsc_get_timecount, NULL, ~0u, 0, "tsc", -1000, NULL
>  };
> 
> +u_int64_t amd64_tsc_freq = 0;
> +int amd64_has_invariant_tsc;
>  int amd64_has_xcrypt;
>  #ifdef CRYPTO
>  int amd64_has_pclmul;
> @@ -566,9 +568,12 @@ identifycpu(struct cpu_info *ci)
>   /* Check if it's an invariant TSC */
>   if (cpu_apmi_edx & CPUIDEDX_ITSC)
>   ci->ci_flags |= CPUF_INVAR_TSC;
> +
> +amd64_has_invariant_tsc = (ci->ci_flags & CPUF_INVAR_TSC) != 0;
>   }
> 
>   ci->ci_tsc_freq = cpu_tsc_freq(ci);
> +amd64_tsc_freq = ci->ci_tsc_freq;
> 
>   amd_cpu_cacheinfo(ci);
> 
> Index: sys/arch/amd64/amd64/machdep.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/machdep.c,v
> retrieving revision 1.231
> diff -u -p -u -p -r1.231 machdep.c
> --- sys/arch/amd64/amd64/machdep.c 12 Jul 2017 06:26:32 - 1.231
> +++ sys/arch/amd64/amd64/machdep.c 2 Aug 2017 23:45:54 -
> @@ -425,7 +425,9 @@ int
>  cpu_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp,
>  size_t newlen, struct proc *p)
>  {
> +extern u_int64_t amd64_tsc_freq;
>   extern int amd64_has_xcrypt;
> + extern int amd64_has_invariant_tsc;
>   dev_t consdev;
>   dev_t dev;
>   int val, error;
> @@ -496,6 +498,10 @@ cpu_sysctl(int *name, u_int namelen, voi
>   pckbc_release_console();
>   return (error);
>  #endif
> +case CPU_TSCFREQ:
> +return (sysctl_rdquad(oldp, oldlenp, newp, amd64_tsc_freq));
> + case CPU_INVARIANTTSC:
> + return (sysctl_rdint(oldp, oldlenp, newp, amd64_has_invariant_tsc));
>   default:
>   return (EOPNOTSUPP);
>   }
> Index: 

Re: Calculate the frequency of the tsc timecounter

2017-08-02 Thread Adam Steen
On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov  wrote:
> On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
>> Ted Unangst  wrote:
>> > we don't currently export this info, but we could add some sysctls. there's
>> > some cpufeatures stuff there, but generally stuff isn't exported until
>> > somebody finds a use for it... it shouldn't be too hard to add something to
>> > amd64/machdep.c sysctl if you're interested.
>>
>> I am interested, as i need the info, i will look into it and hopefully
>> come back with a patch.
>
> This is a bad idea because TSC as the time source is only usable
> by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
> frequency in the CPUID. All older CPUs have their TSCs measured
> against the PIT. Currently the measurement done by the kernel isn't
> very precise and if TSC is selected as a timecounter, the machine
> would be gaining time at a pace that cannot be corrected by our NTP
> daemon. (IIRC, about an hour a day on my Haswell running with NTP).
>
> To be able to use TSC as a timecounter source on OpenBSD or Solo5
> you'd have to improve the in-kernel measurement of the TSC frequency
> first. I've tried to perform 10 measurements and take an average and
> it does improve accuracy, however I believe we need to poach another
> bit from Linux and re-calibrate TSC via HPET:
>
>  
> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
>
> I think this is the most sane thing we can do. Here's a complete
> procedure that Linux kernel undertakes:
>
>  
> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
>
> Regards,
> Mike

Hi Mike and Ted

I understand that using the TSC as a timecounter on processors other
than Skylake and Kaby Lake is inaccurate, but this is my first real foray
into kernel programming, so I wanted to start off slow. Below is a diff to
expose whether the TSC is invariant and the TSC frequency via sysctl
machdep. I would like to get this committed first and then move on to
improving the in-kernel measurement of the TSC frequency as Mike
describes above.
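
As a rough sketch of how a userland consumer such as ukvm might use the
value, assuming the diff in this message is applied so that CPU_TSCFREQ
exists in <machine/cpu.h>: read_tsc_freq() and tsc_ticks_to_ns() are
hypothetical names chosen for this example. On systems without the new
sysctl id the read simply fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __OpenBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#include <machine/cpu.h>
#endif

/*
 * Read the TSC frequency through sysctl(2).
 * Returns 0 on success, -1 on failure or when CPU_TSCFREQ is not available.
 */
int
read_tsc_freq(uint64_t *hz)
{
#if defined(__OpenBSD__) && defined(CPU_TSCFREQ)
	int mib[2] = { CTL_MACHDEP, CPU_TSCFREQ };
	size_t len = sizeof(*hz);

	return sysctl(mib, 2, hz, &len, NULL, 0) == -1 ? -1 : 0;
#else
	(void)hz;
	return -1;
#endif
}

/*
 * Convert a TSC tick count to nanoseconds for a given frequency.
 * Whole seconds and the remainder are converted separately so that
 * ticks * 1e9 cannot overflow 64 bits (assumes tsc_hz below about 18 GHz).
 */
uint64_t
tsc_ticks_to_ns(uint64_t ticks, uint64_t tsc_hz)
{
	assert(tsc_hz != 0);
	return (ticks / tsc_hz) * 1000000000ULL +
	    (ticks % tsc_hz) * 1000000000ULL / tsc_hz;
}
```

The invariant flag could be read the same way with CPU_INVARIANTTSC and an
int-sized buffer.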

Sorry it's taken a while to get back to you; I have been working with
Mike Larkin on vmm and my port of Solo5/ukvm.

Cheers
Adam

comments?

Index: sys/arch/amd64/amd64/identcpu.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/identcpu.c,v
retrieving revision 1.87
diff -u -p -u -p -r1.87 identcpu.c
--- sys/arch/amd64/amd64/identcpu.c 20 Jun 2017 05:34:41 - 1.87
+++ sys/arch/amd64/amd64/identcpu.c 2 Aug 2017 23:45:54 -
@@ -63,6 +63,8 @@ struct timecounter tsc_timecounter = {
  tsc_get_timecount, NULL, ~0u, 0, "tsc", -1000, NULL
 };

+u_int64_t amd64_tsc_freq = 0;
+int amd64_has_invariant_tsc;
 int amd64_has_xcrypt;
 #ifdef CRYPTO
 int amd64_has_pclmul;
@@ -566,9 +568,12 @@ identifycpu(struct cpu_info *ci)
  /* Check if it's an invariant TSC */
  if (cpu_apmi_edx & CPUIDEDX_ITSC)
  ci->ci_flags |= CPUF_INVAR_TSC;
+
+amd64_has_invariant_tsc = (ci->ci_flags & CPUF_INVAR_TSC) != 0;
  }

  ci->ci_tsc_freq = cpu_tsc_freq(ci);
+amd64_tsc_freq = ci->ci_tsc_freq;

  amd_cpu_cacheinfo(ci);

Index: sys/arch/amd64/amd64/machdep.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/machdep.c,v
retrieving revision 1.231
diff -u -p -u -p -r1.231 machdep.c
--- sys/arch/amd64/amd64/machdep.c 12 Jul 2017 06:26:32 - 1.231
+++ sys/arch/amd64/amd64/machdep.c 2 Aug 2017 23:45:54 -
@@ -425,7 +425,9 @@ int
 cpu_sysctl(int *name, u_int namelen, void *oldp, size_t *oldlenp, void *newp,
 size_t newlen, struct proc *p)
 {
+extern u_int64_t amd64_tsc_freq;
  extern int amd64_has_xcrypt;
+ extern int amd64_has_invariant_tsc;
  dev_t consdev;
  dev_t dev;
  int val, error;
@@ -496,6 +498,10 @@ cpu_sysctl(int *name, u_int namelen, voi
  pckbc_release_console();
  return (error);
 #endif
+case CPU_TSCFREQ:
+return (sysctl_rdquad(oldp, oldlenp, newp, amd64_tsc_freq));
+ case CPU_INVARIANTTSC:
+ return (sysctl_rdint(oldp, oldlenp, newp, amd64_has_invariant_tsc));
  default:
  return (EOPNOTSUPP);
  }
Index: sys/arch/amd64/include/cpu.h
===
RCS file: /cvs/src/sys/arch/amd64/include/cpu.h,v
retrieving revision 1.113
diff -u -p -u -p -r1.113 cpu.h
--- sys/arch/amd64/include/cpu.h 12 Jul 2017 06:26:32 - 1.113
+++ sys/arch/amd64/include/cpu.h 2 Aug 2017 23:45:56 -
@@ -429,7 +429,9 @@ void mp_setperf_init(void);
 #define CPU_XCRYPT 12 /* supports VIA xcrypt in userland */
 #define CPU_LIDACTION 14 /* action caused by lid close */
 #define CPU_FORCEUKBD 15 /* Force ukbd(4) as console keyboard */
-#define CPU_MAXID 16 /* number of valid machdep ids */
+#define CPU_TSCFREQ 16 /* tsc frequency */
+#define CPU_INVARIANTTSC 17 /* has invariant tsc */
+#define CPU_MAXID 18 /* number of valid machdep ids */

 #define CTL_MACHDEP_NAMES { \
  { 0, 0 }, \
@@ 

Re: Calculate the frequency of the tsc timecounter

2017-08-01 Thread Adam Steen
Hi Mike

Please see the output below (I did have to update a few DPRINTF's with
the change to clang, did you want a diff for checking in?)
I appreciate you having a look.

Cheers
Adam

root on sd0a (15cc7df693e2251e.a) swap on sd0b dump on sd0b
vm_impl_init_vmx: created vm_map @ 0x80b99000
vm_resetcpu: resetting vm 1 vcpu 0 to power on defaults
guest eptp = 0x39eb8f01e
vmm_alloc_vpid: allocated VPID/ASID 1
vmx_handle_exit: unhandled exit 2147483681 (unknown)
vcpu @ 0x800032ffc000
 rax=0x rbx=0x rcx=0x
 rdx=0x rbp=0x rdi=0x5000
 rsi=0x  r8=0x  r9=0x
 r10=0x r11=0x r12=0x
 r13=0x r14=0x r15=0x
 rip=0x0010 rsp=0x1ff8
 cr0=0x0020 (pg cd nw am wp NE et ts em mp pe)
 cr2=0x
 cr3=0x (pwt pcd)
 cr4=0x2000 (pke smap smep osxsave pcide fsgsbase smxe
VMXE osxmmexcpt osfxsr pce pge mce pae pse de tsd pvi vme)
 --Guest Segment Info--
 cs=0x0008 rpl=0 base=0x limit=0x a/r=0xa099
  granularity=1 dib=0 l(64 bit)=1 present=1 sys=1 type=code, x only, accessed
code, r/x
 ds=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
  granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
 es=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
  granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
 fs=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
  granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
 gs=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
  granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
 ss=0x0010 rpl=0 base=0x limit=0x a/r=0xc093
  granularity=1 dib=1 l(64 bit)=0 present=1 sys=1 type=data, r/w, accessed
 tr=0x base=0x limit=0x a/r=0x008b
  granularity=0 dib=0 l(64 bit)=0 present=1 sys=0 type=tss (busy)
 gdtr base=0x1000 limit=0x0017
 idtr base=0x limit=0x
 ldtr=0x base=0x limit=0x a/r=0x1
  (unusable)
 --Guest MSRs @ 0xff039b869000 (paddr: 0x00039b869000)--
  MSR 0 @ 0xff039b869000 : 0xc080 (EFER),
value=0x0500 (sce LME LMA nxe)
  MSR 1 @ 0xff039b869010 : 0xc081 (STAR), value=0x
  MSR 2 @ 0xff039b869020 : 0xc082 (LSTAR), value=0x
  MSR 3 @ 0xff039b869030 : 0xc083 (CSTAR), value=0x
  MSR 4 @ 0xff039b869040 : 0xc084 (SFMASK), value=0x
  MSR 5 @ 0xff039b869050 : 0xc102 (KGSBASE), value=0x
vcpu @ 0x800032ffc000
parent vm @ 0xff0395ee7000
mode: VMX
pinbased ctls: 0x7f0016
true pinbased ctls: 0x7f0016
 EXTERNAL_INT_EXITING: Can set:Yes Can clear:Yes
 NMI_EXITING: Can set:Yes Can clear:Yes
 VIRTUAL_NMIS: Can set:Yes Can clear:Yes
 ACTIVATE_VMX_PREEMPTION_TIMER: Can set:Yes Can clear:Yes
 PROCESS_POSTED_INTERRUPTS: Can set:No Can clear:Yes
procbased ctls: 0xfff9fffe0401e172
true procbased ctls: 0xfff9fffe04006172
 INTERRUPT_WINDOW_EXITING: Can set:Yes Can clear:Yes
 USE_TSC_OFFSETTING: Can set:Yes Can clear:Yes
 HLT_EXITING: Can set:Yes Can clear:Yes
 INVLPG_EXITING: Can set:Yes Can clear:Yes
 MWAIT_EXITING: Can set:Yes Can clear:Yes
 RDPMC_EXITING: Can set:Yes Can clear:Yes
 RDTSC_EXITING: Can set:Yes Can clear:Yes
 CR3_LOAD_EXITING: Can set:Yes Can clear:Yes
 CR3_STORE_EXITING: Can set:Yes Can clear:Yes
 CR8_LOAD_EXITING: Can set:Yes Can clear:Yes
 CR8_STORE_EXITING: Can set:Yes Can clear:Yes
 USE_TPR_SHADOW: Can set:Yes Can clear:Yes
 NMI_WINDOW_EXITING: Can set:Yes Can clear:Yes
 MOV_DR_EXITING: Can set:Yes Can clear:Yes
 UNCONDITIONAL_IO_EXITING: Can set:Yes Can clear:Yes
 USE_IO_BITMAPS: Can set:Yes Can clear:Yes
 MONITOR_TRAP_FLAG: Can set:Yes Can clear:Yes
 USE_MSR_BITMAPS: Can set:Yes Can clear:Yes
 MONITOR_EXITING: Can set:Yes Can clear:Yes
 PAUSE_EXITING: Can set:Yes Can clear:Yes
procbased2 ctls: 0xff
 VIRTUALIZE_APIC: Can set:Yes Can clear:Yes
 ENABLE_EPT: Can set:Yes Can clear:Yes
 DESCRIPTOR_TABLE_EXITING: Can set:Yes Can clear:Yes
 ENABLE_RDTSCP: Can set:Yes Can clear:Yes
 VIRTUALIZE_X2APIC_MODE: Can set:Yes Can clear:Yes
 ENABLE_VPID: Can set:Yes Can clear:Yes
 WBINVD_EXITING: Can set:Yes Can clear:Yes
 UNRESTRICTED_GUEST: Can set:Yes Can clear:Yes
 APIC_REGISTER_VIRTUALIZATION: Can set:No Can clear:Yes
 VIRTUAL_INTERRUPT_DELIVERY: Can set:No Can clear:Yes
 PAUSE_LOOP_EXITING: Can 

Re: Calculate the frequency of the tsc timecounter

2017-08-01 Thread Mike Larkin
On Tue, Aug 01, 2017 at 07:32:19AM +0800, Adam Steen wrote:
> On Tue, Aug 1, 2017 at 7:26 AM, Adam Steen  wrote:
> > Mike Belopuhov wrote:
> >
> >> To be able to use TSC as a timecounter source on OpenBSD or Solo5
> >> you'd have to improve the in-kernel measurement of the TSC frequency
> >> first. I've tried to perform 10 measurements and take an average and
> >> it does improve accuracy, however I believe we need to poach another
> >> bit from Linux and re-calibrate TSC via HPET:
> >>
> >>  
> >> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
> >>
> >> I think this is the most sane thing we can do. Here's a complete
> >> procedure that Linux kernel undertakes:
> >>
> >>  
> >> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
> >>
> >> Regards,
> >> Mike
> >
> > Looks like i have more sort out!
> >
> > Mike Larkin wrote:
> >> If you point me to a bootable image that causes this failure, I might be
> >> able to figure out what vmm(4) doesn't like.
> >>
> >> Nothing in lines 122-134 of the file indicated above should cause this.
> >
> > This is where things get a little more interesting, Solo5
> > (https://github.com/adamsteen/solo5) is actually two parts Solo5 the
> > Unikernel and ukvm the userland side of a hypervisor (currently
> > running with kvm and bhyve), I have been porting to run ukvm directly
> > with vmm. I expect the cause of "vmx_handle_exit: unhandled exit
> > 2147483681 (unknown)" is the register setup in
> > https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c,
> > lines 118-147
> >
> > the constants are ukvm constants.
> >
> > struct vm_resetcpu_params vrp = {
> > .vrp_vm_id = hvb->vcp_id,
> > .vrp_vcpu_id = hvb->vcpu_id,
> > .vrp_init_state = {
> > .vrs_gprs[VCPU_REGS_RFLAGS] = X86_RFLAGS_INIT,
> > .vrs_gprs[VCPU_REGS_RIP] = gpa_ep,
> > .vrs_gprs[VCPU_REGS_RSP] = hv->mem_size - 8,
> > .vrs_gprs[VCPU_REGS_RDI] = X86_BOOT_INFO_BASE,
> > .vrs_crs[VCPU_REGS_CR0] = X86_CR0_INIT,
> > .vrs_crs[VCPU_REGS_CR3] = X86_CR3_INIT,
> > .vrs_crs[VCPU_REGS_CR4] = X86_CR4_INIT,
> > .vrs_sregs[VCPU_REGS_CS] = sreg_to_vsi(_x86_sreg_code),
> > .vrs_sregs[VCPU_REGS_DS] = sreg_to_vsi(_x86_sreg_data),
> > .vrs_sregs[VCPU_REGS_ES] = sreg_to_vsi(_x86_sreg_data),
> > .vrs_sregs[VCPU_REGS_FS] = sreg_to_vsi(_x86_sreg_data),
> > .vrs_sregs[VCPU_REGS_GS] = sreg_to_vsi(_x86_sreg_data),
> > .vrs_sregs[VCPU_REGS_SS] = sreg_to_vsi(_x86_sreg_data),
> > .vrs_gdtr = { 0x0, X86_GDTR_LIMIT, 0x0, X86_GDT_BASE},
> > .vrs_idtr = { 0x0, 0x, 0x0, 0x0},
> > .vrs_sregs[VCPU_REGS_LDTR] = 
> > sreg_to_vsi(_x86_sreg_unusable),
> > .vrs_sregs[VCPU_REGS_TR] = sreg_to_vsi(_x86_sreg_tr),
> > .vrs_msrs[VCPU_REGS_EFER] = X86_EFER_INIT,
> > .vrs_msrs[VCPU_REGS_STAR] = 0ULL,
> > .vrs_msrs[VCPU_REGS_LSTAR] = 0ULL,
> > .vrs_msrs[VCPU_REGS_CSTAR] = 0ULL,
> > .vrs_msrs[VCPU_REGS_SFMASK] = 0ULL,
> > .vrs_msrs[VCPU_REGS_KGSBASE] = 0ULL,
> > .vrs_crs[VCPU_REGS_XCR0] = XCR0_X87
> > }
> > };
> >
> > the three specific OpenBSD files are
> > https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.h
> > https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.c
> > https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c
> > with small changes in ukvm/ukvm_elf.c and ukvm/ukvm_module_net.c
> >
> > I could upload a binary image for you but It won't run with vmd its
> > has ukvm specific hypercalls designed to simplify things.
> >
> > Cheers
> > Adam
> >
> > ps i am currently trying to document the differences in what vmm is
> > expecting and ukvm is expecting.
> 

I'd recommend enabling VMM_DEBUG and seeing if that prints more useful
information after the unhandled exit. That error code is usually because of
invalid VMCS content, but since you're rolling your own vmm interface, it's
not clear what might have been missed. If you send me that information
(from dmesg, it will be a lot) I may be able to help.

-ml
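
For reference, the exit number in the log can be taken apart by hand. The
sketch below assumes the layout of the VMX exit reason field from the Intel
SDM (bit 31 flags a VM-entry failure, bits 15:0 hold the basic exit reason);
the struct and function names are invented for this example and are not
vmm(4) identifiers. 2147483681 is 0x80000021, i.e. entry failure with basic
reason 33, which the SDM lists as "VM-entry failure due to invalid guest
state".

```c
#include <assert.h>
#include <stdint.h>

#define EXIT_ENTRY_FAILURE	(1U << 31)	/* bit 31: VM-entry failure */
#define EXIT_BASIC_MASK		0xffffU		/* bits 15:0: basic exit reason */

/* Hypothetical container for the two fields of interest. */
struct exit_info {
	unsigned int basic;
	int entry_failure;
};

/* Split a raw 32-bit exit reason into its basic reason and failure flag. */
struct exit_info
decode_exit_reason(uint32_t raw)
{
	struct exit_info ei;

	ei.basic = raw & EXIT_BASIC_MASK;
	ei.entry_failure = (raw & EXIT_ENTRY_FAILURE) != 0;
	return ei;
}

/* Names for the basic reasons that report a failed VM entry. */
const char *
entry_failure_name(unsigned int basic)
{
	switch (basic) {
	case 33:
		return "invalid guest state";
	case 34:
		return "MSR loading";
	case 41:
		return "machine-check event";
	default:
		return "unknown";
	}
}
```

Passing 2147483681 to decode_exit_reason() gives basic == 33 with the entry
failure flag set.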


> One more thing
> 
> Please note currently i have to build the bootable binary image of
> solo5 with a cross compiler as i have not figured out the
> discrepancies between OpenBSD's ld and solo5's linker script.
> 
> Cheers
> Adam



Re: Calculate the frequency of the tsc timecounter

2017-07-31 Thread Adam Steen
On Tue, Aug 1, 2017 at 7:26 AM, Adam Steen  wrote:
> Mike Belopuhov wrote:
>
>> To be able to use TSC as a timecounter source on OpenBSD or Solo5
>> you'd have to improve the in-kernel measurement of the TSC frequency
>> first. I've tried to perform 10 measurements and take an average and
>> it does improve accuracy, however I believe we need to poach another
>> bit from Linux and re-calibrate TSC via HPET:
>>
>>  
>> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
>>
>> I think this is the most sane thing we can do. Here's a complete
>> procedure that Linux kernel undertakes:
>>
>>  
>> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
>>
>> Regards,
>> Mike
>
> Looks like i have more sort out!
>
> Mike Larkin wrote:
>> If you point me to a bootable image that causes this failure, I might be
>> able to figure out what vmm(4) doesn't like.
>>
>> Nothing in lines 122-134 of the file indicated above should cause this.
>
> This is where things get a little more interesting, Solo5
> (https://github.com/adamsteen/solo5) is actually two parts Solo5 the
> Unikernel and ukvm the userland side of a hypervisor (currently
> running with kvm and bhyve), I have been porting to run ukvm directly
> with vmm. I expect the cause of "vmx_handle_exit: unhandled exit
> 2147483681 (unknown)" is the register setup in
> https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c,
> lines 118-147
>
> the constants are ukvm constants.
>
> struct vm_resetcpu_params vrp = {
> .vrp_vm_id = hvb->vcp_id,
> .vrp_vcpu_id = hvb->vcpu_id,
> .vrp_init_state = {
> .vrs_gprs[VCPU_REGS_RFLAGS] = X86_RFLAGS_INIT,
> .vrs_gprs[VCPU_REGS_RIP] = gpa_ep,
> .vrs_gprs[VCPU_REGS_RSP] = hv->mem_size - 8,
> .vrs_gprs[VCPU_REGS_RDI] = X86_BOOT_INFO_BASE,
> .vrs_crs[VCPU_REGS_CR0] = X86_CR0_INIT,
> .vrs_crs[VCPU_REGS_CR3] = X86_CR3_INIT,
> .vrs_crs[VCPU_REGS_CR4] = X86_CR4_INIT,
> .vrs_sregs[VCPU_REGS_CS] = sreg_to_vsi(_x86_sreg_code),
> .vrs_sregs[VCPU_REGS_DS] = sreg_to_vsi(_x86_sreg_data),
> .vrs_sregs[VCPU_REGS_ES] = sreg_to_vsi(_x86_sreg_data),
> .vrs_sregs[VCPU_REGS_FS] = sreg_to_vsi(_x86_sreg_data),
> .vrs_sregs[VCPU_REGS_GS] = sreg_to_vsi(_x86_sreg_data),
> .vrs_sregs[VCPU_REGS_SS] = sreg_to_vsi(_x86_sreg_data),
> .vrs_gdtr = { 0x0, X86_GDTR_LIMIT, 0x0, X86_GDT_BASE},
> .vrs_idtr = { 0x0, 0x, 0x0, 0x0},
> .vrs_sregs[VCPU_REGS_LDTR] = sreg_to_vsi(_x86_sreg_unusable),
> .vrs_sregs[VCPU_REGS_TR] = sreg_to_vsi(_x86_sreg_tr),
> .vrs_msrs[VCPU_REGS_EFER] = X86_EFER_INIT,
> .vrs_msrs[VCPU_REGS_STAR] = 0ULL,
> .vrs_msrs[VCPU_REGS_LSTAR] = 0ULL,
> .vrs_msrs[VCPU_REGS_CSTAR] = 0ULL,
> .vrs_msrs[VCPU_REGS_SFMASK] = 0ULL,
> .vrs_msrs[VCPU_REGS_KGSBASE] = 0ULL,
> .vrs_crs[VCPU_REGS_XCR0] = XCR0_X87
> }
> };
>
> the three specific OpenBSD files are
> https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.h
> https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.c
> https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c
> with small changes in ukvm/ukvm_elf.c and ukvm/ukvm_module_net.c
>
> I could upload a binary image for you but It won't run with vmd its
> has ukvm specific hypercalls designed to simplify things.
>
> Cheers
> Adam
>
> ps i am currently trying to document the differences in what vmm is
> expecting and ukvm is expecting.

One more thing

Please note that I currently have to build the bootable binary image of
Solo5 with a cross compiler, as I have not figured out the
discrepancies between OpenBSD's ld and Solo5's linker script.

Cheers
Adam



Re: Calculate the frequency of the tsc timecounter

2017-07-31 Thread Adam Steen
Mike Belopuhov wrote:

> To be able to use TSC as a timecounter source on OpenBSD or Solo5
> you'd have to improve the in-kernel measurement of the TSC frequency
> first. I've tried to perform 10 measurements and take an average and
> it does improve accuracy, however I believe we need to poach another
> bit from Linux and re-calibrate TSC via HPET:
>
>  
> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
>
> I think this is the most sane thing we can do. Here's a complete
> procedure that Linux kernel undertakes:
>
>  
> http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
>
> Regards,
> Mike

Looks like I have more to sort out!
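
To get a feel for the "10 measurements and take an average" idea, here is a
userland sketch. It is not the in-kernel calibration against the PIT or HPET
that Mike describes; it swaps in clock_gettime(CLOCK_MONOTONIC) as the
reference clock, and all function names are made up for this example. It
assumes an x86 compiler that provides <x86intrin.h>.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

/* Arithmetic mean of n frequency samples. */
uint64_t
mean_hz(const uint64_t *samples, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += samples[i];
	return sum / n;
}

/*
 * One sample: TSC ticks per second of CLOCK_MONOTONIC, measured over a
 * busy-wait of at least 10 ms. Returns 0 on non-x86 builds.
 */
uint64_t
sample_tsc_hz(void)
{
#if defined(__x86_64__) || defined(__i386__)
	struct timespec t0, t1;
	uint64_t c0, c1;
	int64_t ns;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	c0 = __rdtsc();
	do {
		clock_gettime(CLOCK_MONOTONIC, &t1);
		c1 = __rdtsc();
		ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
		    (t1.tv_nsec - t0.tv_nsec);
	} while (ns < 10000000LL);
	return (c1 - c0) * 1000000000ULL / (uint64_t)ns;
#else
	return 0;
#endif
}

/* Take 10 samples and return their mean. */
uint64_t
calibrate_tsc_hz(void)
{
	uint64_t samples[10];
	size_t i;

	for (i = 0; i < 10; i++)
		samples[i] = sample_tsc_hz();
	return mean_hz(samples, 10);
}
```

The result is only as good as the reference clock and the scheduling noise
during each 10 ms window, so it should be treated as an estimate.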

Mike Larkin wrote:
> If you point me to a bootable image that causes this failure, I might be
> able to figure out what vmm(4) doesn't like.
>
> Nothing in lines 122-134 of the file indicated above should cause this.

This is where things get a little more interesting. Solo5
(https://github.com/adamsteen/solo5) is actually two parts: Solo5, the
unikernel, and ukvm, the userland side of a hypervisor (currently
running with kvm and bhyve). I have been porting ukvm to run directly
with vmm. I expect the cause of "vmx_handle_exit: unhandled exit
2147483681 (unknown)" is the register setup in
https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c,
lines 118-147.

The constants are ukvm constants.

struct vm_resetcpu_params vrp = {
.vrp_vm_id = hvb->vcp_id,
.vrp_vcpu_id = hvb->vcpu_id,
.vrp_init_state = {
.vrs_gprs[VCPU_REGS_RFLAGS] = X86_RFLAGS_INIT,
.vrs_gprs[VCPU_REGS_RIP] = gpa_ep,
.vrs_gprs[VCPU_REGS_RSP] = hv->mem_size - 8,
.vrs_gprs[VCPU_REGS_RDI] = X86_BOOT_INFO_BASE,
.vrs_crs[VCPU_REGS_CR0] = X86_CR0_INIT,
.vrs_crs[VCPU_REGS_CR3] = X86_CR3_INIT,
.vrs_crs[VCPU_REGS_CR4] = X86_CR4_INIT,
.vrs_sregs[VCPU_REGS_CS] = sreg_to_vsi(_x86_sreg_code),
.vrs_sregs[VCPU_REGS_DS] = sreg_to_vsi(_x86_sreg_data),
.vrs_sregs[VCPU_REGS_ES] = sreg_to_vsi(_x86_sreg_data),
.vrs_sregs[VCPU_REGS_FS] = sreg_to_vsi(_x86_sreg_data),
.vrs_sregs[VCPU_REGS_GS] = sreg_to_vsi(_x86_sreg_data),
.vrs_sregs[VCPU_REGS_SS] = sreg_to_vsi(_x86_sreg_data),
.vrs_gdtr = { 0x0, X86_GDTR_LIMIT, 0x0, X86_GDT_BASE},
.vrs_idtr = { 0x0, 0x, 0x0, 0x0},
.vrs_sregs[VCPU_REGS_LDTR] = sreg_to_vsi(_x86_sreg_unusable),
.vrs_sregs[VCPU_REGS_TR] = sreg_to_vsi(_x86_sreg_tr),
.vrs_msrs[VCPU_REGS_EFER] = X86_EFER_INIT,
.vrs_msrs[VCPU_REGS_STAR] = 0ULL,
.vrs_msrs[VCPU_REGS_LSTAR] = 0ULL,
.vrs_msrs[VCPU_REGS_CSTAR] = 0ULL,
.vrs_msrs[VCPU_REGS_SFMASK] = 0ULL,
.vrs_msrs[VCPU_REGS_KGSBASE] = 0ULL,
.vrs_crs[VCPU_REGS_XCR0] = XCR0_X87
}
};

The three OpenBSD-specific files are
https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.h
https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd.c
https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_openbsd_x86_64.c
with small changes in ukvm/ukvm_elf.c and ukvm/ukvm_module_net.c

I could upload a binary image for you, but it won't run with vmd; it
has ukvm-specific hypercalls designed to simplify things.

Cheers
Adam

P.S. I am currently trying to document the differences between what vmm
expects and what ukvm expects.



Re: Calculate the frequency of the tsc timecounter

2017-07-31 Thread Mike Larkin
On Mon, Jul 31, 2017 at 07:19:36AM +0800, Adam Steen wrote:
> Sorry, i sent that before i had finished.
> 
> I am trying to find an equivalent of the following code for FreeBSD
> 
> size_t outsz = sizeof bi->cpu.tsc_freq;
> int ret = sysctlbyname("machdep.tsc_freq", &bi->cpu.tsc_freq, &outsz,
>     NULL, 0);
> if (ret == -1)
>     err(1, "sysctl(machdep.tsc_freq)");
> int invariant_tsc = 0;
> outsz = sizeof invariant_tsc;
> ret = sysctlbyname("kern.timecounter.invariant_tsc", &invariant_tsc,
>     &outsz, NULL, 0);
> if (ret == -1)
>     err(1, "sysctl(kern.timecounter.invariant_tsc)");
> if (invariant_tsc != 1)
>     errx(1, "Host TSC is not invariant, cannot continue");
> 
> 
> line 122-134 of
> https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_freebsd_x86_64.c
> 
> Currently the port fails with "vmx_handle_exit: unhandled exit
> 2147483681 (unknown)" (0x80000021 in hex). I am setting up the
> registers with incorrect values, but need to read more to understand
> what OpenBSD vmm and ukvm are expecting from each other.
> 

If you point me to a bootable image that causes this failure, I might be
able to figure out what vmm(4) doesn't like.

Nothing in lines 122-134 of the file indicated above should cause this.

-ml

> Cheers
> Adam
> 
> > Hi Mike
> >
> > In short i don't want to calculate the TSC frequency, i would like to
> > get it from the kernel if possible? (also checking if it was
> > invariant.) I knew my code was inaccurate, but didn't know another
> > way.
> >
> > I did check the output of dmesg: "cpu0: TSC frequency 2492310500 Hz",
> > but couldn't find a way to get this programmatically, e.g. via sysctl.
> >
> > Long story, i am in the process of porting Solo5 to OpenBSD (see
> > https://github.com/adamsteen/solo5) and need the tsc frequency to
> > support the ukvm tsc clock.
> >
> > Is there currently a way to get the TSC frequency programmatically?
> > and is CPUID the best way to check if the tsc is invariant?
> > (CPUID.80000007H:EDX[8], Intel's Designer's vol3b, section 16.11.1
> > Invariant TSC)
> >
> > On Mon, Jul 31, 2017 at 1:57 AM, Mike Belopuhov <m...@belopuhov.com> wrote:
> >> On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
> >>> Hi
> >>>
> >>> Is there an easy/accurate way to calculate the tsc timecounter frequency?
> >>> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
> >>> Sandy Bridge cpu)
> >>>
> >>> Another reference Converting Sandy Bridge TSC to wall clock time
> >>> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
> >>>
> >>> The code below works but i don't really know how accurate it is, at best
> >>> 10,000 Hz.
> >>>
> >>> Cheers
> >>> Adam
> >>>
> >>
> >> Hi,
> >>
> >> First of all, it's not clear why you want to calculate the TSC
> >> frequency in a userland program.  The kernel does it and
> >> prints the result to the system message buffer (viewed with
> >> the dmesg command).
> >>
> >> The second thing worth pointing out is that gettimeofday is a
> >> syscall that queries the timestamp from the timecounter code
> >> (updated every 10ms) with a current delta read directly from
> >> the hardware so that you get an accurate reading, but then
> >> it's adjusted according to the system time adjustment rules
> >> imposed by things like NTP and settimeofday, so essentially
> >> it's not monotonic (unless you can ensure there is no actor
> >> present that is adjusting the time while you're performing
> >> your measurement).  There's also a way for userland to query
> >> a precise monotonically increasing timestamp: clock_gettime
> >> with CLOCK_MONOTONIC as the clock_id.  In this case the
> >> returned timestamp is relative to the moment the system was
> >> brought up but this doesn't matter if all you need is
> >> difference.
> >>
> >> The third thing to know is where this hardware reading
> >> comes from and what its precision is.  Running "sysctl -n
> >> kern.timecounter.hardware" will tell you what is currently
> >> selected as the source and then you can locate that device
> >> in your dmesg (with the exception of i8254 -- that 1.19 MHz PIT).
> >> For instance on my laptop it's ACPI HPET that is providing a

Re: Calculate the frequency of the tsc timecounter

2017-07-31 Thread Mike Belopuhov
On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
> Ted Unangst  wrote:
> > we don't currently export this info, but we could add some sysctls. there's
> > some cpufeatures stuff there, but generally stuff isn't exported until
> > somebody finds a use for it... it shouldn't be too hard to add something to
> > amd64/machdep.c sysctl if you're interested.
> 
> I am interested, as i need the info, i will look into it and hopefully
> come back with a patch.

This is a bad idea because TSC as the time source is only usable
by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
frequency in the CPUID. All older CPUs have their TSCs measured
against the PIT. Currently the measurement done by the kernel isn't
very precise and if TSC is selected as a timecounter, the machine
would be gaining time at a pace that cannot be corrected by our NTP
daemon. (IIRC, about an hour a day on my Haswell running with NTP).
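
For reference, the CPUID encoding mentioned above is presumably leaf
15H, where Intel documents EAX as the denominator and EBX as the
numerator of the TSC/crystal clock ratio, and ECX as the nominal
crystal frequency in Hz. A minimal userland sketch, assuming an x86
machine and the GCC/clang <cpuid.h> helpers; the function names are
made up for illustration and this is not the OpenBSD kernel code:

```c
#include <stdint.h>
#include <cpuid.h>	/* GCC/clang helper header, x86 only */

/*
 * Hypothetical helper: TSC Hz = crystal_hz * numerator / denominator.
 * Returns 0 when any field is not enumerated (reported as zero).
 */
static uint64_t
tsc_hz_from_leaf15(uint32_t denominator, uint32_t numerator,
    uint32_t crystal_hz)
{
	if (denominator == 0 || numerator == 0 || crystal_hz == 0)
		return 0;
	return (uint64_t)crystal_hz * numerator / denominator;
}

/* Hypothetical helper: query CPUID leaf 15H, 0 if it is not available. */
static uint64_t
cpuid_tsc_hz(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid_max(0, 0) < 0x15)
		return 0;
	__cpuid_count(0x15, 0, eax, ebx, ecx, edx);
	(void)edx;	/* reserved in this leaf */
	return tsc_hz_from_leaf15(eax, ebx, ecx);
}
```

Some CPUs that implement the leaf still report ECX as zero, in which
case this sketch returns 0 and the caller has to fall back to a
measured value.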

To be able to use TSC as a timecounter source on OpenBSD or Solo5
you'd have to improve the in-kernel measurement of the TSC frequency
first. I've tried to perform 10 measurements and take an average and
it does improve accuracy, however I believe we need to poach another
bit from Linux and re-calibrate TSC via HPET:

 
http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409

I think this is the most sane thing we can do. Here's a complete
procedure that Linux kernel undertakes:

 
http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751

Regards,
Mike



Re: Calculate the frequency of the tsc timecounter

2017-07-30 Thread Adam Steen
Ted Unangst  wrote:
> we don't currently export this info, but we could add some sysctls. there's
> some cpufeatures stuff there, but generally stuff isn't exported until
> somebody finds a use for it... it shouldn't be too hard to add something to
> amd64/machdep.c sysctl if you're interested.

I am interested, as i need the info, i will look into it and hopefully
come back with a patch.



Re: Calculate the frequency of the tsc timecounter

2017-07-30 Thread Ted Unangst
Adam Steen wrote:
> Sorry, i sent that before i had finished.
> 
> I am trying to find an equivalent of the following code for FreeBSD

we don't currently export this info, but we could add some sysctls. there's
some cpufeatures stuff there, but generally stuff isn't exported until
somebody finds a use for it... it shouldn't be too hard to add something to
amd64/machdep.c sysctl if you're interested.

> 
> size_t outsz = sizeof bi->cpu.tsc_freq;
> int ret = sysctlbyname("machdep.tsc_freq", &bi->cpu.tsc_freq, &outsz,
> NULL, 0);



Re: Calculate the frequency of the tsc timecounter

2017-07-30 Thread Adam Steen
Sorry, i sent that before i had finished.

I am trying to find an equivalent of the following code for FreeBSD

    size_t outsz = sizeof bi->cpu.tsc_freq;
    int ret = sysctlbyname("machdep.tsc_freq", &bi->cpu.tsc_freq, &outsz,
        NULL, 0);
    if (ret == -1)
        err(1, "sysctl(machdep.tsc_freq)");
    int invariant_tsc = 0;
    outsz = sizeof invariant_tsc;
    ret = sysctlbyname("kern.timecounter.invariant_tsc", &invariant_tsc,
        &outsz, NULL, 0);
    if (ret == -1)
        err(1, "sysctl(kern.timecounter.invariant_tsc)");
    if (invariant_tsc != 1)
        errx(1, "Host TSC is not invariant, cannot continue");


lines 122-134 of
https://github.com/adamsteen/solo5/blob/master/ukvm/ukvm_hv_freebsd_x86_64.c

Currently the port fails with "vmx_handle_exit: unhandled exit
2147483681 (unknown)" (0x80000021 in hex). I am setting up the
registers to incorrect values, but need to read more to understand
what OpenBSD vmm and ukvm are expecting from each other.

Cheers
Adam

> Hi Mike
>
> In short i don't want to calculate the TSC frequency, i would like to
> get it from the kernel if possible? (also checking if it was
> invariant.) I knew my code was inaccurate, but didn't know another
> way.
>
> I did check the output of dmesg: "cpu0: TSC frequency 2492310500 Hz",
> but couldn't find a way to get this programmatically, e.g. via sysctl.
>
> Long story, i am in the process of porting Solo5 to OpenBSD (see
> https://github.com/adamsteen/solo5) and need the tsc frequency to
> support the ukvm tsc clock.
>
> Is there currently a way to get the TSC frequency programmatically?
> and is CPUID the best way to check if the tsc is invariant?
> (CPUID.80000007H:EDX[8], Intel's Designer's vol3b, section 16.11.1
> Invariant TSC)
>
> On Mon, Jul 31, 2017 at 1:57 AM, Mike Belopuhov <m...@belopuhov.com> wrote:
>> On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
>>> Hi
>>>
>>> Is there an easy/accurate way to calculate the tsc timecounter frequency?
>>> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
>>> Sandy Bridge cpu)
>>>
>>> Another reference Converting Sandy Bridge TSC to wall clock time
>>> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
>>>
>>> The code below works but i don't really know how accurate it is, at best
>>> 10,000 Hz.
>>>
>>> Cheers
>>> Adam
>>>
>>
>> Hi,
>>
>> First of all it's not clear why do you want to calculate TSC
>> frequency in the userland program?  The kernel does it and
>> prints the result to the system message buffer (viewed with
>> the dmesg command).
>>
>> The second thing worth pointing out is that gettimeofday is a
>> syscall that queries the timestamp from the timecounter code
>> (updated every 10ms) with a current delta read directly from
>> the hardware so that you get an accurate reading, but then
>> it's adjusted according to the system time adjustment rules
>> imposed by things like NTP and settimeofday, so essentially
>> it's not monotonic (unless you can ensure there is no actor
>> present that is adjusting the time while you're performing
>> your measurement).  There's also a way for userland to query
>> a precise monotonically increasing timestamp: clock_gettime
>> with CLOCK_MONOTONIC as the clock_id.  In this case the
>> returned timestamp is relative to the moment the system was
>> brought up but this doesn't matter if all you need is
>> difference.
>>
>> The third thing to know is where this hardware reading
>> comes from and what its precision is.  Running "sysctl -n
>> kern.timecounter.hardware" will tell you what is currently
>> selected as the source and then you can locate that device
>> in your dmesg (with the exception of i8254 -- that 1.19 MHz PIT).
>> For instance on my laptop it's ACPI HPET that is providing a
>> running counter with the frequency of 14 MHz. This is what's
>> going to limit the precision of your measurement.
>>
>> To get a better reading you may try to take a series of say
>> 10 measurements and calculate the average.
>>
>> The difference between RDTSC and RDTSCP is that the latter
>> tells you on which CPU the instruction was executed.  This
>> poses a valid question: is TSC frequency the same on a multi-
>> socket system.  And I don't have an answer for that one.
>> AFAIU, this boils down to the motherboard design and if the
>> manufacturer has selected to use different quartz crystals
>> for different sockets, then as we know for a fact that no two
>> 

Re: Calculate the frequency of the tsc timecounter

2017-07-30 Thread Adam Steen
Hi Mike

In short i don't want to calculate the TSC frequency, i would like to
get it from the kernel if possible? (also checking if it was
invariant.) I knew my code was inaccurate, but didn't know another
way.

I did check the output of dmesg: "cpu0: TSC frequency 2492310500 Hz",
but couldn't find a way to get this programmatically, e.g. via sysctl.

Long story, i am in the process of porting Solo5 to OpenBSD (see
https://github.com/adamsteen/solo5) and need the tsc frequency to
support the ukvm tsc clock.

Is there currently a way to get the TSC frequency programmatically?
and is CPUID the best way to check if the tsc is invariant?
(CPUID.80000007H:EDX[8], Intel's Designer's vol3b, section 16.11.1
Invariant TSC)
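
For what it's worth, the CPUID check referred to here (leaf 80000007H,
EDX bit 8) can be sketched from userland as follows. This assumes an
x86 machine and the GCC/clang <cpuid.h> helper __get_cpuid(); the
function name has_invariant_tsc is made up for illustration:

```c
#include <cpuid.h>	/* GCC/clang helper header, x86 only */

/*
 * Hypothetical helper: returns 1 if CPUID.80000007H:EDX[8]
 * (invariant TSC) is set, 0 if it is clear or the leaf is absent.
 */
static int
has_invariant_tsc(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the requested leaf is unsupported. */
	if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
		return 0;
	return (edx >> 8) & 1;
}
```

This only reports what the (possibly virtualised) CPU advertises; it
says nothing about how accurately the frequency is known.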

On Mon, Jul 31, 2017 at 1:57 AM, Mike Belopuhov <m...@belopuhov.com> wrote:
> On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
>> Hi
>>
>> Is there an easy/accurate way to calculate the tsc timecounter frequency?
>> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
>> Sandy Bridge cpu)
>>
>> Another reference Converting Sandy Bridge TSC to wall clock time
>> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
>>
>> The code below works but i don't really know how accurate it is, at best
>> 10,000 Hz.
>>
>> Cheers
>> Adam
>>
>
> Hi,
>
> First of all it's not clear why do you want to calculate TSC
> frequency in the userland program?  The kernel does it and
> prints the result to the system message buffer (viewed with
> the dmesg command).
>
> The second thing worth pointing out is that gettimeofday is a
> syscall that queries the timestamp from the timecounter code
> (updated every 10ms) with a current delta read directly from
> the hardware so that you get an accurate reading, but then
> it's adjusted according to the system time adjustment rules
> imposed by things like NTP and settimeofday, so essentially
> it's not monotonic (unless you can ensure there is no actor
> present that is adjusting the time while you're performing
> your measurement).  There's also a way for userland to query
> a precise monotonically increasing timestamp: clock_gettime
> with CLOCK_MONOTONIC as the clock_id.  In this case the
> returned timestamp is relative to the moment the system was
> brought up but this doesn't matter if all you need is
> difference.
>
> The third thing to know is where this hardware reading
> comes from and what its precision is.  Running "sysctl -n
> kern.timecounter.hardware" will tell you what is currently
> selected as the source and then you can locate that device
> in your dmesg (with the exception of i8254 -- that 1.19 MHz PIT).
> For instance on my laptop it's ACPI HPET that is providing a
> running counter with the frequency of 14 MHz. This is what's
> going to limit the precision of your measurement.
>
> To get a better reading you may try to take a series of say
> 10 measurements and calculate the average.
>
> The difference between RDTSC and RDTSCP is that the latter
> tells you on which CPU the instruction was executed.  This
> poses a valid question: is TSC frequency the same on a multi-
> socket system.  And I don't have an answer for that one.
> AFAIU, this boils down to the motherboard design and if the
> manufacturer has selected to use different quartz crystals
> for different sockets, then as we know for a fact that no two
> quartz crystals are created the same and thus frequency
> sourced from them and multiplied by clock generator PLLs to
> produce bus and then core clock signals will be slightly
> different between sockets.  I believe there's a way to
> compensate for that but OpenBSD doesn't do this currently.
>
> However, your code doesn't check on which CPU the RDTSC has
> been executed so you can just use RDTSC and hope that TSC
> frequencies are the same and all counters on all cores have
> been started at the same time by the firmware (which is
> another question whether or not this is actually true).
>
> The CPUID call that you can see used there is there to
> provide serialization.  I believe there's no need to do it in
> this case.  In fact we haven't observed adverse effects w/o
> an additional serialization instruction on Skylake where TSC
> is used as the default timecounter (e.g. instead of an HPET).
>
> The other issue with your program is that you don't account
> for how long does it take to perform a syscall operation.  You
> could time it before running the loop and then subtract the
> reading.
>
> And finally, a userland program will not run a 1 second loop
> uninterrupted since the scheduler will always attempt to
> select a different process every 10ms.  Which means that
> a p

Re: Calculate the frequency of the tsc timecounter

2017-07-30 Thread Mike Belopuhov
On Wed, Jul 26, 2017 at 19:24 +0800, Adam Steen wrote:
> Hi
> 
> Is there an easy/accurate way to calculate the tsc timecounter frequency?
> like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
> Sandy Bridge cpu)
> 
> Another reference Converting Sandy Bridge TSC to wall clock time
> <https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.
> 
> The code below works but i don't really know how accurate it is, at best
> 10,000 Hz.
> 
> Cheers
> Adam
>

Hi,

First of all it's not clear why do you want to calculate TSC
frequency in the userland program?  The kernel does it and
prints the result to the system message buffer (viewed with
the dmesg command).

The second thing worth pointing out is that gettimeofday is a
syscall that queries the timestamp from the timecounter code
(updated every 10ms) with a current delta read directly from
the hardware so that you get an accurate reading, but then
it's adjusted according to the system time adjustment rules
imposed by things like NTP and settimeofday, so essentially
it's not monotonic (unless you can ensure there is no actor
present that is adjusting the time while you're performing
your measurement).  There's also a way for userland to query
a precise monotonically increasing timestamp: clock_gettime
with CLOCK_MONOTONIC as the clock_id.  In this case the
returned timestamp is relative to the moment the system was
brought up but this doesn't matter if all you need is
difference.
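
A minimal sketch of taking such a monotonic reading, assuming a POSIX
system that provides CLOCK_MONOTONIC; the helper name monotonic_ns is
made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L	/* expose clock_gettime() under strict modes */
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: nanoseconds on the monotonic clock.  The
 * absolute value is relative to an arbitrary start point, so only
 * differences between two readings are meaningful.
 */
static uint64_t
monotonic_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Two calls bracketing the code under test give the elapsed time as
end - start, unaffected by settimeofday.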

The third thing to know is where this hardware reading
comes from and what its precision is.  Running "sysctl -n
kern.timecounter.hardware" will tell you what is currently
selected as the source and then you can locate that device
in your dmesg (with the exception of i8254 -- that 1.19 MHz PIT).
For instance on my laptop it's ACPI HPET that is providing a
running counter with the frequency of 14 MHz. This is what's
going to limit the precision of your measurement.

To get a better reading you may try to take a series of say
10 measurements and calculate the average.
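
A rough sketch of that averaging, assuming x86 with the __rdtsc()
intrinsic from <x86intrin.h> and using CLOCK_MONOTONIC as the
reference clock. The function names and the choice of rounds and
interval are illustrative only:

```c
#define _POSIX_C_SOURCE 200809L	/* expose clock_gettime() under strict modes */
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <x86intrin.h>	/* __rdtsc(), GCC/clang on x86 */

static uint64_t
monotonic_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* One sample: TSC ticks counted across a busy-wait of about interval_ns. */
static uint64_t
sample_tsc_hz(uint64_t interval_ns)
{
	uint64_t t0 = monotonic_ns();
	uint64_t c0 = __rdtsc();
	uint64_t t1, c1;

	do {
		t1 = monotonic_ns();
	} while (t1 - t0 < interval_ns);
	c1 = __rdtsc();

	/* ticks per nanosecond scaled to ticks per second */
	return (uint64_t)((double)(c1 - c0) * 1e9 / (double)(t1 - t0));
}

/* Average of `rounds` independent samples. */
static uint64_t
average_tsc_hz(int rounds, uint64_t interval_ns)
{
	uint64_t sum = 0;
	int i;

	assert(rounds > 0);
	for (i = 0; i < rounds; i++)
		sum += sample_tsc_hz(interval_ns);
	return sum / (uint64_t)rounds;
}
```

For example, average_tsc_hz(10, 100000000) takes ten samples of
roughly 100 ms each. Averaging reduces random error but not a
systematic offset in the reference clock.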

The difference between RDTSC and RDTSCP is that the latter
tells you on which CPU the instruction was executed.  This
poses a valid question: is TSC frequency the same on a multi-
socket system.  And I don't have an answer for that one.
AFAIU, this boils down to the motherboard design and if the
manufacturer has selected to use different quartz crystals
for different sockets, then as we know for a fact that no two
quartz crystals are created the same and thus frequency
sourced from them and multiplied by clock generator PLLs to
produce bus and then core clock signals will be slightly
different between sockets.  I believe there's a way to
compensate for that but OpenBSD doesn't do this currently.

However, your code doesn't check on which CPU the RDTSC has
been executed so you can just use RDTSC and hope that TSC
frequencies are the same and all counters on all cores have
been started at the same time by the firmware (which is
another question whether or not this is actually true).
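
If you did want to know where a reading was taken, a sketch along
these lines returns the auxiliary value together with the counter.
It assumes x86 with the GCC/clang <cpuid.h> and <x86intrin.h> headers
and checks the RDTSCP feature bit (CPUID.80000001H:EDX[27]) first.
The value is whatever the OS has stored in IA32_TSC_AUX; whether it
identifies the CPU on a given OS is an assumption to verify, and the
function names are made up for illustration:

```c
#include <stdint.h>
#include <cpuid.h>	/* __get_cpuid(), GCC/clang on x86 */
#include <x86intrin.h>	/* __rdtscp(), GCC/clang on x86 */

/* Returns 1 if the CPU advertises RDTSCP, 0 otherwise. */
static int
has_rdtscp(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return 0;
	return (edx >> 27) & 1;
}

/*
 * Hypothetical helper: reads the TSC and IA32_TSC_AUX in a single
 * instruction.  Returns 0 on success, -1 if RDTSCP is unavailable.
 */
static int
read_tsc_with_aux(uint64_t *tsc, uint32_t *aux)
{
	unsigned int a;

	if (!has_rdtscp())
		return -1;
	*tsc = __rdtscp(&a);
	*aux = a;
	return 0;
}
```

Comparing the aux value from the start and end readings would show
whether both were taken in the same place, provided the OS sets it
per CPU.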

The CPUID call that you can see used there is there to
provide serialization.  I believe there's no need to do it in
this case.  In fact we haven't observed adverse effects w/o
an additional serialization instruction on Skylake where TSC
is used as the default timecounter (e.g. instead of an HPET).

The other issue with your program is that you don't account
for how long does it take to perform a syscall operation.  You
could time it before running the loop and then subtract the
reading.
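
A sketch of timing the call itself, assuming x86 with the __rdtsc()
intrinsic; the function name and the number of rounds are
illustrative only:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <x86intrin.h>	/* __rdtsc(), GCC/clang on x86 */

/*
 * Hypothetical helper: average cost of one gettimeofday() call in
 * TSC ticks, measured over `rounds` back-to-back calls (rounds > 0).
 */
static uint64_t
gettimeofday_cost_ticks(unsigned int rounds)
{
	struct timeval tv;
	uint64_t c0, c1;
	unsigned int i;

	c0 = __rdtsc();
	for (i = 0; i < rounds; i++)
		gettimeofday(&tv, NULL);
	c1 = __rdtsc();
	return (c1 - c0) / rounds;
}
```

The result can be subtracted from the cycle count of the measurement
loop; it is an average, so a single call may still cost more or less.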

And finally, a userland program will not run a 1 second loop
uninterrupted since the scheduler will always attempt to
select a different process every 10ms.  Which means that
a potential context switch and a 10ms timeslice of another
process might make its way into your measurement.

This all begs the same question I asked in the beginning: why
do you want to calculate the TSC frequency in the userland
program?


> #include <stdio.h>
> #include <stdint.h>
> #include <sys/time.h>
> 
> uint64_t rdtscp()
> {
> uint32_t lo, hi;
>  __asm__ __volatile__ ("RDTSCP\n\t"
>"mov %%edx, %0\n\t"
>"mov %%eax, %1\n\t"
>"CPUID\n\t": "=r" (hi), "=r" (lo):: "%rax",
> "%rbx", "%rcx", "%rdx");
> return (uint64_t)hi << 32 | lo;
> }
> 
> uint64_t rdtsc()
> {
> uint32_t lo, hi;
>  __asm__ __volatile__ ("CPUID\n\t"
>"RDTSC\n\t"
>"mov %%edx, %0\n\t"
>"mov %%eax, %1\n\t": "=r" (hi), "=r" (lo)::
>"%rax", "%rbx", "%rcx", "%rdx"

Calculate the frequency of the tsc timecounter

2017-07-26 Thread Adam Steen
Hi

Is there an easy/accurate way to calculate the tsc timecounter frequency?
like Time Stamp Counters <http://blog.tinola.com/?e=54> on Linux. (on a
Sandy Bridge cpu)

Another reference Converting Sandy Bridge TSC to wall clock time
<https://software.intel.com/en-us/forums/intel-isa-extensions/topic/284137>.

The code below works but i don't really know how accurate it is, at best
10,000 Hz.

Cheers
Adam

#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>

/* End-of-interval read: RDTSCP first, then CPUID as a serializing fence. */
uint64_t rdtscp(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("RDTSCP\n\t"
                          "mov %%edx, %0\n\t"
                          "mov %%eax, %1\n\t"
                          "CPUID\n\t": "=r" (hi), "=r" (lo):: "%rax",
                          "%rbx", "%rcx", "%rdx");
    return (uint64_t)hi << 32 | lo;
}

/* Start-of-interval read: CPUID as a serializing fence, then RDTSC. */
uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("CPUID\n\t"
                          "RDTSC\n\t"
                          "mov %%edx, %0\n\t"
                          "mov %%eax, %1\n\t": "=r" (hi), "=r" (lo)::
                          "%rax", "%rbx", "%rcx", "%rdx");
    return (uint64_t)hi << 32 | lo;
}

uint64_t get_tsc_freq_hz(void)
{
    uint64_t start_timestamp, end_timestamp;
    struct timeval tv_start, tv_end;

    gettimeofday(&tv_start, NULL);
    start_timestamp = rdtsc();
    /* spin until the seconds field has advanced by at least 2 */
    while (1) {
        gettimeofday(&tv_end, NULL);
        if (tv_end.tv_sec > tv_start.tv_sec + 1)
            break;
    }
    end_timestamp = rdtscp();

    uint64_t cycles = end_timestamp - start_timestamp;
    uint64_t usec = (tv_end.tv_sec - tv_start.tv_sec) * 1000000 +
        (tv_end.tv_usec - tv_start.tv_usec);
    // usec is in microseconds, so multiply by 1000000 to get cycles per second
    uint64_t tsc_freq = 1000000 * cycles / usec;

    return tsc_freq;
}

int main (int argc, char *argv[])
{
    uint64_t tsc_freq = get_tsc_freq_hz();

    /* cast so the argument matches %llu on every platform */
    printf("TSC frequency = %llu Hz\n", (unsigned long long)tsc_freq);

    return 0;
}