Re: [PATCH v2 00/12] KVM: Add idempotent controls for migrating system counter state

2021-07-22 Thread Oliver Upton
On Wed, Jul 21, 2021 at 8:28 AM Andrew Jones wrote:
>
> Oops, I see there's a v3 of this series. I'll switch to reviewing that. I
> think my comments / r-b's apply to that version as well though.

Hey Drew,

Thanks for the review. I'll address your comments from both v2 and v3
in the next series.

--
Thanks,
Oliver
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 00/12] KVM: Add idempotent controls for migrating system counter state

2021-07-21 Thread Andrew Jones

Oops, I see there's a v3 of this series. I'll switch to reviewing that. I
think my comments / r-b's apply to that version as well though.

Thanks,
drew 



Re: [PATCH v2 00/12] KVM: Add idempotent controls for migrating system counter state

2021-07-16 Thread Oliver Upton
On Fri, Jul 16, 2021 at 2:26 PM Oliver Upton wrote:
>
> KVM's current means of saving/restoring system counters is plagued with
> temporal issues. At least on ARM64 and x86, we migrate the guest's
> system counter by-value through the respective guest system register
> values (cntvct_el0, ia32_tsc). Restoring system counters by-value is
> brittle as the state is not idempotent: the host system counter is still
> advancing between the attempted save and restore. Furthermore, VMMs
> may wish to transparently live migrate guest VMs, meaning that they
> include the elapsed time due to live migration blackout in the guest
> system counter view. The VMM thread could be preempted for any number of
> reasons (scheduler, L0 hypervisor under nested) between the time that
> it calculates the desired guest counter value and when KVM actually sets
> this counter state.
>
> Despite the value-based interface that we present to userspace, KVM
> actually has idempotent guest controls by way of system counter offsets.
> We can avoid all of the issues associated with a value-based interface
> by exposing these offset controls to userspace. This series
> introduces new vCPU device attributes to provide userspace access to the
> vCPU's system counter offset.
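
To make the idempotency argument concrete, here is a toy model of a single
host. It is only a sketch: the function names and the numbers are made up
for illustration and are not KVM interfaces. It assumes the guest view is
simply the host counter plus a per-guest offset, modulo 2^64.

```python
MASK64 = (1 << 64) - 1  # system counters are 64 bits wide and wrap


def guest_counter(host_counter, offset):
    """Guest view of the counter: host counter plus a per-guest offset."""
    return (host_counter + offset) & MASK64


def offset_for_value(host_counter_now, desired_guest_value):
    """Model of a by-value write: the effective offset is derived from
    whatever the host counter reads when the write finally lands."""
    return (desired_guest_value - host_counter_now) & MASK64


# Save: the host counter reads 1000 and the guest offset is -400.
saved_offset = (-400) & MASK64
saved_value = guest_counter(1_000, saved_offset)  # guest sees 600

# By-value restore: the outcome depends on how late the write lands.
early = offset_for_value(1_050, saved_value)  # lands 50 ticks after save
late = offset_for_value(1_300, saved_value)   # lands 300 ticks after save

# By-offset restore: writing saved_offset again, at any time and any
# number of times, reproduces the same guest view.
host_now = 1_300
print(guest_counter(host_now, saved_offset))  # 900: no time lost
print(guest_counter(host_now, early))         # 850: 50 ticks lost
print(guest_counter(host_now, late))          # 600: 300 ticks lost
```

In this model every tick of delay between computing the value and applying
it is a tick the guest never sees, whereas the offset can be rewritten
freely without changing the result.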
>
> Patch 1 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK
> ioctls to provide userspace with a (host_tsc, realtime) instant. This is
> essential for a VMM to perform precise migration of the guest's system
> counters.
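
One way a VMM could combine such an instant with the offset on the
destination is sketched below. This is an illustration under stated
assumptions, not code from the series: the function and parameter names
are hypothetical, the realtime clocks of both hosts are assumed to be
synchronized, and the guest counter is assumed to run at the same
frequency on both sides.

```python
MASK64 = (1 << 64) - 1
NSEC_PER_SEC = 1_000_000_000


def destination_offset(src_host_tsc, src_realtime_ns, src_offset,
                       dst_host_tsc, dst_realtime_ns, guest_tsc_hz):
    """Compute the offset to program on the destination host.

    (src_host_tsc, src_realtime_ns) and (dst_host_tsc, dst_realtime_ns)
    are instants sampled on the source and destination respectively.
    """
    # Guest counter value at the source instant.
    guest_tsc_src = (src_host_tsc + src_offset) & MASK64
    # Wall-clock time that passed between the two instants (the blackout).
    elapsed_ns = dst_realtime_ns - src_realtime_ns
    # Convert the elapsed time into guest counter ticks.
    elapsed_ticks = elapsed_ns * guest_tsc_hz // NSEC_PER_SEC
    # Value the guest should observe at the destination instant.
    guest_tsc_dst = (guest_tsc_src + elapsed_ticks) & MASK64
    # Offset relative to the destination's host counter at that instant.
    return (guest_tsc_dst - dst_host_tsc) & MASK64


# Example: 1 GHz guest counter, 2 ms between the two instants.
off = destination_offset(
    src_host_tsc=5_000_000_000,
    src_realtime_ns=1_000_000_000_000,
    src_offset=(-1_000_000_000) & MASK64,
    dst_host_tsc=123_456,
    dst_realtime_ns=1_000_002_000_000,
    guest_tsc_hz=1_000_000_000,
)
print(off)                         # 4001876544
print((123_456 + off) & MASK64)    # 4002000000
```

Because the result is an offset tied to a sampled instant rather than a
value tied to "now", it does not matter how long the VMM thread is
preempted before the offset is actually written.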
>
> Patches 2-3 add support for x86 by shoehorning the new controls into the
> pre-existing synchronization heuristics.
>
> Patches 4-5 implement a test for the new additions to
> KVM_{GET,SET}_CLOCK.
>
> Patches 6-7 implement a test for the tsc offset attribute introduced in
> patch 3.
>
> Patch 8 adds a device attribute for the arm64 virtual counter-timer
> offset.
>
> Patch 9 extends the test from patch 7 to cover the arm64 virtual
> counter-timer offset.
>
> Patch 10 adds a device attribute for the arm64 physical counter-timer
> offset. Currently, this is implemented as a synthetic register, forcing
> the guest to trap to the host and emulating the offset in the fast exit
> path. Later down the line we will have hardware with FEAT_ECV, which
> allows the hypervisor to perform physical counter-timer offsetting in
> hardware (CNTPOFF_EL2).
>
> Patch 11 extends the test from patch 7 to cover the arm64 physical
> counter-timer offset.
>
> Patch 12 introduces a benchmark to measure the overhead of emulation in
> patch 10.
>
> Physical counter benchmark
> --------------------------
>
> The following data was collected by running 1 iterations of the
> benchmark test from patch 12 on an Ampere Mt. Jade reference server, a 2S
> machine with two 80-core Ampere Altra SoCs. Measurements were collected
> for both VHE and nVHE operation using the `kvm-arm.mode=` command-line
> parameter.
>
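> nVHE
> ----
>
> +--------------------+--------+---------+
> |       Metric       | Native | Trapped |
> +--------------------+--------+---------+
> | Average            | 54ns   | 148ns   |
> | Standard Deviation | 124ns  | 122ns   |
> | 95th Percentile    | 258ns  | 348ns   |
> +--------------------+--------+---------+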
>
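> VHE
> ---
>
> +--------------------+--------+---------+
> |       Metric       | Native | Trapped |
> +--------------------+--------+---------+
> | Average            | 53ns   | 152ns   |
> | Standard Deviation | 92ns   | 94ns    |
> | 95th Percentile    | 204ns  | 307ns   |
> +--------------------+--------+---------+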
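
For a quick read of the two tables, the mean cost of trapping can be
derived from the averages alone. The snippet below is just that
arithmetic; the dictionary layout and the function name are illustrative.

```python
# Mean latencies in nanoseconds, copied from the tables above.
MEAN_NS = {
    "nVHE": {"native": 54, "trapped": 148},
    "VHE": {"native": 53, "trapped": 152},
}


def trap_overhead(native_ns, trapped_ns):
    """Return (absolute overhead in ns, slowdown factor) for a trapped access."""
    return trapped_ns - native_ns, trapped_ns / native_ns


for mode, mean in MEAN_NS.items():
    extra_ns, factor = trap_overhead(mean["native"], mean["trapped"])
    print(f"{mode}: +{extra_ns}ns on average, {factor:.2f}x native")
```

This works out to roughly 94ns (nVHE) and 99ns (VHE) of added mean
latency, a bit under 3x the native figure in both modes.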
>
> This series applies cleanly to the following commit:
>
> 1889228d80fe ("KVM: selftests: smm_test: Test SMM enter from L2")

v1: https://lore.kernel.org/kvm/20210608214742.1897483-1-oup...@google.com/

> v1 -> v2:
>   - Reimplemented as vCPU device attributes instead of a distinct ioctl.
>   - Added the (realtime, host_tsc) instant support to
>     KVM_{GET,SET}_CLOCK
>   - Changed the arm64 implementation to broadcast counter offset values
>     to all vCPUs in a guest. This upholds the architectural expectations
>     of a consistent counter-timer across CPUs.
>   - Fixed a bug with traps in VHE mode. We now configure traps on every
>     transition into a guest to handle differing VMs (trapped, emulated).
>
> Oliver Upton (12):
>   KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
>   KVM: x86: Refactor tsc synchronization code
>   KVM: x86: Expose TSC offset controls to userspace
>   tools: arch: x86: pull in pvclock headers
>   selftests: KVM: Add test for KVM_{GET,SET}_CLOCK
>   selftests: KVM: Add helpers for vCPU device attributes
>   selftests: KVM: Introduce system counter offset test
>   KVM: arm64: Allow userspace to configure a vCPU's virtual offset
>   selftests: KVM: Add support for aarch64 to system_counter_offset_test
>   KVM: arm64: Provide userspace access to the physical counter offset
>   selftests: KVM: Test physical counter offsetting
>   selftests: KVM: Add counter emulation