Re: [PATCH 0/3] hvf x86 correctness and efficiency improvements

Phil Dennis-Jordan Mon, 16 Oct 2023 09:46:45 -0700

Hi Paolo,

On Mon, 16 Oct 2023 at 16:39, Paolo Bonzini <pbonz...@redhat.com> wrote:
>
> On 9/22/23 16:09, Phil Dennis-Jordan wrote:
> > Patch 1 enables the INVTSC CPUID bit when running with hvf. This can
> > enable some optimisations in the guest OS, and I've not found any reason
> > it shouldn't be allowed for hvf based hosts.
>
> It can be enabled, but it should include a migration blocker.  In fact,
> probably HVF itself should include a migration blocker because QEMU
> doesn't support TSC scaling.

I didn't think Qemu's HVF backend supported migration in any form at this
point anyway? Or do you mean machine model versioning of the default
setting?

> > Patch 2 fixes hvf_kick_vcpu_thread so it actually forces a VM exit
instead of
> > doing nothing. I guess this previously didn't cause any huge issues
because
> > hvf's hv_vcpu_run() would exit so extremely frequently on its own
accord.
>
> No, it's because signal delivery should already kick the vCPU out of
> guest mode.  I guess it does with hv_vcpu_run(), but not with
> hv_vcpu_run_until()?

It's possible! At any rate, hv_vcpu_run() only seems to run for VERY short
time slices in practice. I'll try gathering some data on exit reasons/kick
delivery latencies with the various combinations of kick methods and vcpu
run APIs if that helps.

> hv_vcpu_interrupt() is a problematic API, in that it is not clear how it
> handles races with hv_vcpu_run().  In particular, whether this causes an
> immediate vmexit or not:
>
>             thread 1                         thread 2
>             -----------------------          -----------------------
>                                              <CPU not in guest mode>
>             hv_vcpu_interrupt()
>                                              hv_vcpu_run() (or run_until)

Roman brought me up to speed on this issue in the thread here:
https://lists.gnu.org/archive/html/qemu-devel/2023-10/msg02156.html
including a patch and discussion on the subject from 2020. I've now updated
Roman's old patch which addressed it thanks to your feedback at the time
but which never got merged. (Judging by the list archives, v4 just never
got reviewed.) Details on my latest progress here:

https://lists.gnu.org/archive/html/qemu-devel/2023-10/msg04752.html

> Not that the current code is any better; there is no guarantee that the
> signal is delivered before hv_vcpu_run() is called.  However, if as you
> said hv_vcpu_run() will exit often on its own accord but
> hv_vcpu_run_until() does not, then this could cause difficult to
> reproduce bugs by switching to the latter.

At least with Roman's updated kick patch, according to tracing points I
inserted locally, I'm not seeing any slow or undelivered kicks with
hv_vcpu_run_until. However, I'm still observing the (S)VGA issues to which
Roman drew my attention at the same time, so I'm wondering if the issues
we're seeing with hv_vcpu_run_until aren't (all) related to interrupt
delivery. (To be clear, switching to hv_vcpu_run_until() WITHOUT
hv_vcpu_interrupt() causes some very obvious problems where the vCPU simply
doesn't exit at all for long periods.)

>From my point of view, there are at least 2 strong incentives for switching
to hv_vcpu_run_until():

1. hv_vcpu_run() exits very frequently, and often there is actually nothing
for the VMM to do except call hv_vcpu_run() again. With Qemu's current hvf
backend, each exit causes a BQL acquisition, and VMs with a bunch of vCPUs
rapidly become limited by BQL contention according to my profiling.

2. The HVF in macOS 12 and newer contains an in-kernel APIC implementation,
in a similar vein to KVM’s kernel irqchip. This should further reduce
apparent VM exits, but using it is conditional on using hv_vcpu_run_until().
Calling hv_vcpu_run() errors out with that enabled. Integrating that into
Qemu is still a work in progress, but I’ve got pretty far. (most of the
APIC and IOAPIC suites in kvm-unit-tests pass)

For those 2 reasons I’m pretty motivated to make things work with
hv_vcpu_run_until()
one way or another.

> > The temp variable is needed because the pointer expected by the
hv_vcpu_interrupt()
> > call doesn't match the fd field's type in the hvf accel's struct
AccelCPUState.
> > I'm unsure if it would be better to change that struct field to the
relevant
> > architecture's handle types, hv_vcpuid_t (x86, unsigned int) and
hv_vcpu_t
> > (aarch64, uint64_t), perhaps via an intermediate typedef?
>
> I think fd and the HVF type should be placed in an anonymous union.

That’s a good idea, I’ll put a patch together doing exactly that.

Thanks,
Phil

Re: [PATCH 0/3] hvf x86 correctness and efficiency improvements

Reply via email to