Re: [CFT] Paravirtualized KVM clock

2015-01-21 Thread Peter Jeremy
On 2015-Jan-04 11:56:14 -0600, Bryan Venteicher bry...@daemoninthecloset.org wrote:
 For the last few weeks, I've been working on adding support for KVM clock
 in the projects/paravirt branch. Currently, a KVM VM guest will end up
 selecting either the HPET or ACPI as the timecounter source. Unfortunately,
 this is very costly since every timecounter fetch causes a VM exit. KVM
 clock allows the guest to use the TSC instead; it is very similar to the
 existing Xen timer.

A somewhat late response, but have you looked at this commit?
https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f
I've been running it[*] on a Google Compute Engine instance for about 6
months without problems.

[*] I had to patch out the test for KVM_FEATURE_CLOCKSOURCE_STABLE_BIT but
I think that's a GCE issue.

-- 
Peter Jeremy




Re: [CFT] Paravirtualized KVM clock

2015-01-21 Thread Bryan Venteicher
On Wed, Jan 21, 2015 at 3:15 PM, Peter Jeremy pe...@rulingia.com wrote:

 On 2015-Jan-04 11:56:14 -0600, Bryan Venteicher bry...@daemoninthecloset.org wrote:
 For the last few weeks, I've been working on adding support for KVM clock
 in the projects/paravirt branch. Currently, a KVM VM guest will end up
 selecting either the HPET or ACPI as the timecounter source. Unfortunately,
 this is very costly since every timecounter fetch causes a VM exit. KVM
 clock allows the guest to use the TSC instead; it is very similar to the
 existing Xen timer.

 A somewhat late response but have you looked at

 https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f
 I've been running this[*] on a Google Compute Engine instance for about 6
 months without problems.


A goal of my work was to put a bit of infrastructure in place so FreeBSD
can support pvops across a variety of hypervisors. KVMCLOCK happens to be
about the easiest to implement, and has a decent performance win for many
situations.

I think that commit is broken on SMP guests: CPU_FOREACH() does not switch
the current CPU, so it just keeps writing to the MSR on the BSP.
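
To make the failure mode concrete, here is a small user-space simulation. It
is not kernel code and not taken from either patch; every name in it
(sim_wrmsr, sim_curcpu, and so on) is made up for illustration. The assumption
it models is that a WRMSR always acts on the CPU the thread is currently
running on, so a loop that only iterates over CPU IDs programs a single CPU.
In the kernel, the usual way to really run on each CPU is something like
smp_rendezvous() or sched_bind(); the simulation stands in for that with an
explicit "switch CPU" step.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_NCPU 4

static uint64_t sim_msr[SIM_NCPU];	/* one simulated MSR per CPU */
static int sim_curcpu;			/* CPU the thread runs on; 0 is the BSP */

/* A WRMSR only ever touches the CPU the caller is running on. */
static void
sim_wrmsr(uint64_t val)
{
	assert(sim_curcpu >= 0 && sim_curcpu < SIM_NCPU);
	sim_msr[sim_curcpu] = val;
}

/* Hypothetical per-CPU time-info address, with the enable bit set. */
static uint64_t
sim_time_info_addr(int cpu)
{
	return ((0x1000u * (uint64_t)(cpu + 1)) | 1);
}

static void
sim_reset(void)
{
	memset(sim_msr, 0, sizeof(sim_msr));
	sim_curcpu = 0;
}

/* Broken: the loop variable changes, the executing CPU does not. */
static void
sim_setup_broken(void)
{
	for (int cpu = 0; cpu < SIM_NCPU; cpu++)
		sim_wrmsr(sim_time_info_addr(cpu));
}

/* Fixed: run on each CPU before writing its MSR. */
static void
sim_setup_fixed(void)
{
	for (int cpu = 0; cpu < SIM_NCPU; cpu++) {
		sim_curcpu = cpu;	/* stands in for binding to that CPU */
		sim_wrmsr(sim_time_info_addr(cpu));
	}
	sim_curcpu = 0;
}

/* Number of CPUs whose MSR has been programmed. */
static int
sim_count_programmed(void)
{
	int n = 0;

	for (int cpu = 0; cpu < SIM_NCPU; cpu++)
		if (sim_msr[cpu] != 0)
			n++;
	return (n);
}
```

With the broken variant only the BSP's MSR ends up set (and it holds the
address meant for the last CPU); with the fixed variant every CPU gets its own.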

 [*] I had to patch out the test for KVM_FEATURE_CLOCKSOURCE_STABLE_BIT but
 I think that's a GCE issue.

 --
 Peter Jeremy

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


[CFT] Paravirtualized KVM clock

2015-01-04 Thread Bryan Venteicher
For the last few weeks, I've been working on adding support for KVM clock
in the projects/paravirt branch. Currently, a KVM VM guest will end up
selecting either the HPET or ACPI as the timecounter source. Unfortunately,
this is very costly since every timecounter fetch causes a VM exit. KVM
clock allows the guest to use the TSC instead; it is very similar to the
existing Xen timer.
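
For readers unfamiliar with the mechanism, the following is a minimal
user-space sketch of how a pvclock-style time reading is usually derived from
the TSC. It assumes the field layout of the KVM pvclock ABI (a version
counter, a TSC snapshot, a system time in nanoseconds, and a multiplier/shift
pair); the struct and function names are hypothetical and this is not the code
from the patch. The current TSC value is passed in as a parameter so the
sketch runs anywhere; real code would read the TSC inside the retry loop and
use memory barriers around the version checks.

```c
#include <assert.h>
#include <stdint.h>

/* Per-vCPU time info the host keeps up to date in shared memory. */
struct pvclock_time_info_sketch {
	uint32_t version;		/* odd while the host is updating */
	uint32_t pad0;
	uint64_t tsc_timestamp;		/* TSC value at the last update */
	uint64_t system_time;		/* nanoseconds at the last update */
	uint32_t tsc_to_system_mul;	/* 32-bit binary fraction */
	int8_t   tsc_shift;		/* pre-scaling shift for the delta */
	uint8_t  flags;
	uint8_t  pad[2];
};

/*
 * Scale a 64-bit TSC delta: shift it, then multiply by a 32-bit fraction,
 * i.e. (delta * mul_frac) >> 32, done in two halves to avoid 128-bit math.
 */
static uint64_t
scale_delta_sketch(uint64_t delta, uint32_t mul_frac, int shift)
{
	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	return ((delta >> 32) * mul_frac +
	    (((delta & 0xffffffffu) * mul_frac) >> 32));
}

/* Return nanoseconds for the given TSC value; no VM exit is involved. */
static uint64_t
pvclock_read_sketch(const volatile struct pvclock_time_info_sketch *ti,
    uint64_t tsc_now)
{
	uint32_t version;
	uint64_t ns;

	do {
		version = ti->version;
		ns = ti->system_time + scale_delta_sketch(
		    tsc_now - ti->tsc_timestamp, ti->tsc_to_system_mul,
		    ti->tsc_shift);
		/* Retry if an update was in progress or happened meanwhile. */
	} while ((version & 1) != 0 || version != ti->version);
	return (ns);
}
```

For example, a multiplier of 0x80000000 with a shift of 0 means half a
nanosecond per TSC tick (a 2 GHz TSC), so a delta of 2000 ticks adds 1000 ns
to system_time.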

The performance difference between HPET/ACPI and KVMCLOCK can be dramatic:
a simple disk benchmark goes from 10K IOPS to 100K IOPS.

The patch is attached or available at [1]. I'd appreciate any
testing.

Also as a part of this, I've tried to generalize a bit of our existing
hypervisor guest code, with the eventual goal of being able to support more
invasive PV operations. The patch series is viewable in Phabricator.

https://reviews.freebsd.org/D1429 - paravirt: Generalize parts of the XEN
timer code into pvclock
https://reviews.freebsd.org/D1430 - paravirt: Add interface to calculate
the TSC frequency from pvclock
https://reviews.freebsd.org/D1431 - paravirt: Add simple hypervisor
registration and detection interface
https://reviews.freebsd.org/D1432 - paravirt: Add detection of bhyve using
new hypervisor interface
https://reviews.freebsd.org/D1433 - paravirt: Add detection of VMware using
new hypervisor interface
https://reviews.freebsd.org/D1434 - paravirt: Add detection of KVM using
new hypervisor interface
https://reviews.freebsd.org/D1435 - paravirt: Add KVM clock timecounter
support
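
As a rough illustration of what hypervisor detection involves, here is a
hedged sketch in plain C. It is not the interface from the reviews above; the
table and function names are hypothetical. It relies only on the widely
documented convention that, when CPUID.1:ECX bit 31 (hypervisor present) is
set, CPUID leaf 0x40000000 returns a 12-byte vendor signature in EBX, ECX and
EDX. The register values are passed in as an array so the sketch runs without
executing CPUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct hv_sig_sketch {
	const char *name;
	const char  sig[13];	/* 12 signature bytes plus a NUL */
};

/* Vendor signatures reported in CPUID leaf 0x40000000. */
static const struct hv_sig_sketch hv_table_sketch[] = {
	{ "KVM",    "KVMKVMKVM\0\0\0" },
	{ "bhyve",  "bhyve bhyve " },
	{ "VMware", "VMwareVMware" },
};

/*
 * regs[] holds EBX, ECX, EDX from CPUID leaf 0x40000000.
 * Returns the hypervisor name, or NULL if the signature is unknown.
 */
static const char *
hv_identify_sketch(const uint32_t regs[3])
{
	char sig[12];
	size_t i;
	int r, b;

	assert(regs != NULL);

	/* Unpack each register byte by byte, least significant first. */
	for (r = 0; r < 3; r++)
		for (b = 0; b < 4; b++)
			sig[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);

	for (i = 0; i < sizeof(hv_table_sketch) / sizeof(hv_table_sketch[0]); i++)
		if (memcmp(sig, hv_table_sketch[i].sig, sizeof(sig)) == 0)
			return (hv_table_sketch[i].name);
	return (NULL);
}
```

A table like this keeps each hypervisor's detection self-contained, so a new
entry can be added without touching the matching loop.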

My current plan is to MFC this series to 10-STABLE, and commit a
self-contained KVM clock to the other stable branches.

[1] - https://people.freebsd.org/~bryanv/patches/kvm_clock-1.patch
diff --git a/sys/amd64/include/pvclock.h b/sys/amd64/include/pvclock.h
new file mode 100644
index 000..f01fac6
--- /dev/null
+++ b/sys/amd64/include/pvclock.h
@@ -0,0 +1,6 @@
+/*-
+ * This file is in the public domain.
+ */
+/* $FreeBSD$ */
+
+#include <x86/pvclock.h>
diff --git a/sys/conf/files.amd64 b/sys/conf/files.amd64
index bbbe827..7d85742 100644
--- a/sys/conf/files.amd64
+++ b/sys/conf/files.amd64
@@ -555,13 +555,17 @@ x86/isa/nmi.c			standard
 x86/isa/orm.c			optional	isa
 x86/pci/pci_bus.c		optional	pci
 x86/pci/qpi.c			optional	pci
+x86/x86/bhyve.c			standard
 x86/x86/busdma_bounce.c		standard
 x86/x86/busdma_machdep.c	standard
 x86/x86/dump_machdep.c		standard
 x86/x86/fdt_machdep.c		optional	fdt
+x86/x86/hypervisor.c		standard
 x86/x86/identcpu.c		standard
 x86/x86/intr_machdep.c		standard
 x86/x86/io_apic.c		standard
+x86/x86/kvm.c			standard
+x86/x86/kvm_clock.c		standard
 x86/x86/legacy.c		standard
 x86/x86/local_apic.c		standard
 x86/x86/mca.c			standard
@@ -569,8 +573,10 @@ x86/x86/mptable.c		optional	mptable
 x86/x86/mptable_pci.c		optional	mptable pci
 x86/x86/msi.c			optional	pci
 x86/x86/nexus.c			standard
+x86/x86/pvclock.c		standard
 x86/x86/tsc.c			standard
 x86/x86/delay.c			standard
+x86/x86/vmware.c		standard
 x86/xen/hvm.c			optional	xenhvm
 x86/xen/xen_intr.c		optional	xen | xenhvm
 x86/xen/pv.c			optional	xenhvm
diff --git a/sys/conf/files.i386 b/sys/conf/files.i386
index 96879b8..ca83c4c 100644
--- a/sys/conf/files.i386
+++ b/sys/conf/files.i386
@@ -573,13 +573,17 @@ x86/isa/nmi.c			standard
 x86/isa/orm.c			optional isa
 x86/pci/pci_bus.c		optional pci
 x86/pci/qpi.c			optional pci
+x86/x86/bhyve.c			standard
 x86/x86/busdma_bounce.c		standard
 x86/x86/busdma_machdep.c	standard
 x86/x86/dump_machdep.c		standard
 x86/x86/fdt_machdep.c		optional fdt
+x86/x86/hypervisor.c		standard
 x86/x86/identcpu.c		standard
 x86/x86/intr_machdep.c		standard
 x86/x86/io_apic.c		optional apic
+x86/x86/kvm.c			standard
+x86/x86/kvm_clock.c		standard
 x86/x86/legacy.c		optional native
 x86/x86/local_apic.c		optional apic
 x86/x86/mca.c			standard
@@ -588,7 +592,9 @@ x86/x86/mptable_pci.c		optional apic native pci
 x86/x86/msi.c			optional apic pci
 x86/x86/nexus.c			standard
 x86/x86/tsc.c			standard
+x86/x86/pvclock.c		standard
 x86/x86/delay.c			standard
+x86/x86/vmware.c		standard
 x86/xen/hvm.c			optional xenhvm
 x86/xen/xen_intr.c		optional xen | xenhvm
 x86/xen/xen_apic.c		optional xenhvm
diff --git a/sys/dev/xen/timer/timer.c b/sys/dev/xen/timer/timer.c
index 5743076..53aff0a 100644
--- a/sys/dev/xen/timer/timer.c
+++ b/sys/dev/xen/timer/timer.c
@@ -59,6 +59,7 @@ __FBSDID("$FreeBSD$");
 #include <machine/clock.h>
 #include <machine/_inttypes.h>
 #include <machine/smp.h>
+#include <machine/pvclock.h>
 
 #include <dev/xen/timer/timer.h>
 
@@ -95,9 +96,6 @@ struct xentimer_softc {
 	struct eventtimer et;
 };
 
-/* Last time; this guarantees a monotonically increasing clock. */
-volatile uint64_t xen_timer_last_time = 0;
-
 static void
 xentimer_identify(driver_t *driver, device_t parent)
 {
@@ -148,128 +146,20 @@ xentimer_probe(device_t dev)
 	return (BUS_PROBE_NOWILDCARD);
 }
 
-/*
- * Scale a 64-bit delta by scaling and multiplying by a 32-bit fraction,
- * yielding a 64-bit result.
- */
-static inline uint64_t

Re: [CFT] Paravirtualized KVM clock

2015-01-04 Thread Adrian Chadd
... so, out of pure curiosity - what's making the benchmark go
faster? Is it userland side of things calling clock methods, or
something in the kernel, or both?



-adrian




Re: [CFT] Paravirtualized KVM clock

2015-01-04 Thread Jim Harris
On Sun, Jan 4, 2015 at 12:00 PM, Adrian Chadd adr...@freebsd.org wrote:

 ... so, out of pure curiosity - what's making the benchmark go
 faster? Is it userland side of things calling clock methods, or
 something in the kernel, or both?


Most likely GEOM statistics gathering in the kernel, but Bryan would have to
confirm.

I intermittently saw this same kind of massive slowdown in nvme(4)
performance a couple of years back due to a bug in the TSC self-check code
which has since been fixed.  The bug would result in falling back to HPET
and all of the clock calls from the GEOM code for each I/O would kill
performance.





Re: [CFT] Paravirtualized KVM clock

2015-01-04 Thread Bryan Venteicher
On Sun, Jan 4, 2015 at 8:01 PM, Jim Harris jim.har...@gmail.com wrote:



 On Sun, Jan 4, 2015 at 12:00 PM, Adrian Chadd adr...@freebsd.org wrote:

 ... so, out of pure curiosity - what's making the benchmark go
 faster? Is it userland side of things calling clock methods, or
 something in the kernel, or both?


 Most likely GEOM statistic gathering in the kernel but Bryan would have to
 confirm.


Yes, that's the main source. A similar issue exists in the network stack's
BPF.


I haven't looked into or thought much about whether it makes sense, or is
even possible, to use kvmclock in userland too (I think kib@ added fast
gettimeofday() and friends support a few years back).

