On Sun, 15 Mar 2026 at 16:39, Zenghui Yu <[email protected]> wrote:
>
> It seems that hvf doesn't deal with the abort generated when guest tries to
> execute instructions outside of the valid physical memory range, for
> unknown reason. The abort is forwarded to userspace and QEMU doesn't handle
> it either, which ends up with faulting on the same instruction infinitely.
>
> This was noticed by the kvm-unit-tests/selftest-vectors-kernel failure:
>
>   timeout -k 1s --foreground 90s /opt/homebrew/bin/qemu-system-aarch64 \
>     -nodefaults -machine virt -accel hvf -cpu host \
>     -device virtio-serial-device -device virtconsole,chardev=ctd \
>     -chardev testdev,id=ctd -device pci-testdev -display none \
>     -serial stdio -kernel arm/selftest.flat -smp 1 -append vectors-kernel
>
>   PASS: selftest: vectors-kernel: und
>   PASS: selftest: vectors-kernel: svc
>   qemu-system-aarch64: 0xffffc000: unhandled exception ec=0x20
>   qemu-system-aarch64: 0xffffc000: unhandled exception ec=0x20
>   qemu-system-aarch64: 0xffffc000: unhandled exception ec=0x20
>   [...]
>
> It's apparent that the guest is braindead and it's unsure what prevents hvf
> from injecting an abort directly in that case. Try to deal with the insane
> guest in QEMU by injecting an SEA back into it in the EC_INSNABORT
> emulation path.

Shouldn't that be an AddressSize fault, not an external abort?
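
To make the distinction concrete, here is a minimal standalone sketch of the two syndromes in question. It is not QEMU code: build_insn_abort_syndrome() is a hypothetical stand-in that only mirrors the argument order of the syn_insn_abort() call in the patch. The constants are the ESR_ELx encodings as I read them in the Arm ARM: IFSC 0x10 is a synchronous external abort, and 0x00..0x03 are Address size faults at levels 0 to 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx layout: EC in bits [31:26], IL in bit 25, ISS in bits [24:0]. */
#define EC_SHIFT            26
#define EC_INSNABORT_LOWER  0x20u   /* Instruction Abort from a lower EL */
#define EC_INSNABORT_SAME   0x21u   /* Instruction Abort, same EL */
#define ESR_IL              (1u << 25)

/* IFSC values (ISS bits [5:0]) relevant to this thread. */
#define IFSC_ADDR_SIZE_L0   0x00u   /* Address size fault, level 0 */
#define IFSC_SYNC_EXTERNAL  0x10u   /* Synchronous external abort */

/* Address size fault code for translation level 0..3. */
static inline uint32_t ifsc_addr_size(unsigned level)
{
    return IFSC_ADDR_SIZE_L0 + (level & 3u);
}

/*
 * Hypothetical helper (an assumption for illustration, not the QEMU
 * function): assemble an instruction-abort syndrome from its fields.
 */
static inline uint32_t build_insn_abort_syndrome(bool same_el, bool ea,
                                                 bool s1ptw, uint32_t ifsc)
{
    uint32_t ec = same_el ? EC_INSNABORT_SAME : EC_INSNABORT_LOWER;

    return (ec << EC_SHIFT) | ESR_IL |
           ((uint32_t)ea << 9) | ((uint32_t)s1ptw << 7) | (ifsc & 0x3fu);
}
```

With this sketch the patch's choice corresponds to passing IFSC_SYNC_EXTERNAL, while an Address size fault would pass ifsc_addr_size(level), where the level depends on where the walk produced the out-of-range address.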

My guess would be that hvf is handing us the EC_INSNABORT
cases for the same reason it hands us EC_DATABORT cases --
we might have some ability to emulate the access. We probably
also get this for cases like "guest tries to execute out of
an MMIO device".

What happens for a data access to this kind of
out-of-the-physical-memory-range address? Does hvf
pass it back to us, or handle it internally?

Is the problem here a bogus virtual address from the guest's
point of view, or a valid virtual address that the guest's
page tables have translated to an invalid (intermediate)
physical address?

> Signed-off-by: Zenghui Yu <[email protected]>
> ---
>  target/arm/hvf/hvf.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
> index aabc7d32c1..54d6ea469c 100644
> --- a/target/arm/hvf/hvf.c
> +++ b/target/arm/hvf/hvf.c
> @@ -2332,9 +2332,32 @@ static int hvf_handle_exception(CPUState *cpu, hv_vcpu_exit_exception_t *excp)
>          bool ea = (syndrome >> 9) & 1;
>          bool s1ptw = (syndrome >> 7) & 1;
>          uint32_t ifsc = (syndrome >> 0) & 0x3f;
> +        uint64_t ipa = excp->physical_address;
> +        AddressSpace *as = cpu_get_address_space(cpu, ARMASIdx_NS);
> +        hwaddr xlat;
> +        MemoryRegion *mr;
> +
> +        cpu_synchronize_state(cpu);
>
>          trace_hvf_insn_abort(env->pc, set, fnv, ea, s1ptw, ifsc);
>
> +        /*
> +         * TODO: If s1ptw, this is an error in the guest os page tables.
> +         * Inject the exception into the guest.
> +         */
> +        assert(!s1ptw);
> +
> +        mr = address_space_translate(as, ipa, &xlat, NULL, false,
> +                                     MEMTXATTRS_UNSPECIFIED);
> +        if (unlikely(!memory_region_is_ram(mr))) {

This doesn't look like the right kind of check, given the
stated problem. Addresses can be in range but not have RAM.
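
As a rough standalone sketch of why these are different questions (all names here are hypothetical, and the IPA width and RAM layout are inputs the caller would have to supply, not anything hvf or QEMU provides in this form):

```c
#include <stddef.h>
#include <stdint.h>

/* Three cases that a single "is it RAM?" test collapses into two. */
typedef enum {
    IPA_OUT_OF_RANGE,     /* beyond the supported IPA width */
    IPA_IN_RANGE_NO_RAM,  /* valid address, but MMIO or unmapped */
    IPA_RAM               /* backed by RAM */
} IpaClass;

/* Hypothetical description of one RAM-backed range. */
typedef struct {
    uint64_t base;
    uint64_t size;
} RamRange;

static inline IpaClass classify_ipa(uint64_t ipa, unsigned ipa_bits,
                                    const RamRange *ram, size_t n)
{
    /* Out of range: any bit set at or above the IPA width. */
    if (ipa_bits < 64 && (ipa >> ipa_bits) != 0) {
        return IPA_OUT_OF_RANGE;
    }
    for (size_t i = 0; i < n; i++) {
        /* Subtract first so the comparison cannot overflow. */
        if (ipa >= ram[i].base && ipa - ram[i].base < ram[i].size) {
            return IPA_RAM;
        }
    }
    return IPA_IN_RANGE_NO_RAM;
}
```

A !memory_region_is_ram() style test treats IPA_OUT_OF_RANGE and IPA_IN_RANGE_NO_RAM identically, which is the mismatch with the stated problem.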

> +            uint32_t syn;
> +
> +            /* inject an SEA back into the guest */
> +            syn = syn_insn_abort(arm_current_el(env) == 1, ea, false, 0x10);
> +            hvf_raise_exception(cpu, EXCP_PREFETCH_ABORT, syn, 1);
> +            break;
> +        }
> +
>          /* fall through */

This "fall through" is still not correct, I think, and it's kind
of a big part of the problem here -- if we get an EC_INSNABORT
handed to us by hvf, then we could:
 * stop execution, exiting QEMU (as a "situation we can't
   handle and don't know what to do with")
 * advance the PC over the insn (questionable...)
 * feed some kind of exception into the guest

but "continue execution of the guest without changing PC at all"
is definitely wrong. A fix for this problem ought to involve
changing the EC_INSNABORT case so that it no longer does that
"fall through to default" at all.
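
The three options listed above can be modelled in a toy sketch like the following. Everything in it is hypothetical (VcpuModel, AbortAction and apply_insn_abort_action() are illustration only, not hvf.c code); the point is just that every branch changes the vCPU state, so there is no path that re-enters the guest at the same PC with nothing changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three outcomes listed above; there is deliberately no "resume as-is". */
typedef enum {
    ABORT_ACTION_STOP,       /* give up and exit */
    ABORT_ACTION_SKIP_INSN,  /* advance PC over the insn (questionable) */
    ABORT_ACTION_INJECT      /* feed an exception into the guest */
} AbortAction;

/* Toy vCPU state, an assumption made purely for this sketch. */
typedef struct {
    uint64_t pc;
    bool exception_pending;
    bool stopped;
} VcpuModel;

static inline VcpuModel apply_insn_abort_action(VcpuModel v, AbortAction a)
{
    switch (a) {
    case ABORT_ACTION_INJECT:
        v.exception_pending = true;
        return v;
    case ABORT_ACTION_SKIP_INSN:
        v.pc += 4;              /* A64 instructions are 4 bytes long */
        return v;
    case ABORT_ACTION_STOP:
    default:
        /* Unknown actions stop rather than silently resuming. */
        v.stopped = true;
        return v;
    }
}
```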

thanks
-- PMM
