Linux HVM fails to start with PANIC: early exception 0x00 IP 0010:clear_page_orig+0x12/0x40 error 0
Hi,

I've got a report[1] that, after an update, Linux HVM guests fail to start with the error quoted in the subject. It looks to be caused by some change between Xen 4.17.3 and 4.17.4. The failure here is on Linux 6.6.25 (both dom0 and domU), but 6.1.62, which worked with the older Xen, now fails too. The full error (logged via earlyprintk=xen) is:

[0.009500] Using GB pages for direct mapping
PANIC: early exception 0x00 IP 10:b01c32e2 error 0 cr2 0xa08649801000
[0.009606] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.25-1.qubes.fc37.x86_64 #1
[0.009665] Hardware name: Xen HVM domU, BIOS 4.17.4 04/26/2024
[0.009710] RIP: 0010:clear_page_orig+0x12/0x40
[0.009766] Code: 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 31 c0 b9 40 00 00 00 0f 1f 44 00 00 ff c9 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
[0.009862] RSP: :b0e03d58 EFLAGS: 00010016 ORIG_RAX:
[0.009915] RAX: RBX: RCX: 003f
[0.009967] RDX: 9801 RSI: RDI: a08649801000
[0.010015] RBP: 0001 R08: 0001 R09: 6b7f283562d74b16
[0.010063] R10: R11: R12: 0001
[0.010112] R13: R14: b0e22a08 R15: a0864000
[0.010161] FS: () GS:b16ea000() knlGS:
[0.010214] CS: 0010 DS: ES: CR0: 80050033
[0.010257] CR2: a08649801000 CR3: 08e8 CR4: 00b0
[0.010310] Call Trace:
[0.010341] <TASK>
[0.010372] ? early_fixup_exception+0xf7/0x190
[0.010416] ? early_idt_handler_common+0x2f/0x3a
[0.010460] ? clear_page_orig+0x12/0x40
[0.010501] ? alloc_low_pages+0xeb/0x150
[0.010541] ? __kernel_physical_mapping_init+0x1d2/0x630
[0.010588] ? init_memory_mapping+0x83/0x160
[0.010631] ? init_mem_mapping+0x9a/0x460
[0.010669] ? memblock_reserve+0x6d/0xf0
[0.010709] ? setup_arch+0x796/0xf90
[0.010748] ? start_kernel+0x63/0x420
[0.010787] ? x86_64_start_reservations+0x18/0x30
[0.010828] ? x86_64_start_kernel+0x96/0xa0
[0.010868] ? secondary_startup_64_no_verify+0x18f/0x19b
[0.010918] </TASK>

I'm pretty sure the exception 0x00 is misleading here; I don't see how it could be #DE.
More logs (including the full hypervisor log) are attached to the linked issue. This is on an HP 240 G7, and my educated guess is that it's the Intel Celeron N4020 CPU. I cannot reproduce the issue on different hardware. PVH domains seem to work.

Any ideas what could have happened here?

[1] https://github.com/QubesOS/qubes-issues/issues/9217

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
[linux-linus test] 185982: tolerable FAIL - PUSHED
flight 185982 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/185982/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-vhd 10 host-ping-check-xen fail in 185977 pass in 185982
 test-armhf-armhf-xl-qcow2     8 xen-boot            fail in 185977 pass in 185982
 test-armhf-armhf-xl-credit2   8 xen-boot            fail in 185977 pass in 185982
 test-amd64-amd64-xl-pvshim   20 guest-localmigrate/x10     fail pass in 185977
 test-amd64-amd64-qemuu-freebsd11-amd64 21 guest-start/freebsd.repeat fail pass in 185977
 test-armhf-armhf-xl-multivcpu 8 xen-boot                   fail pass in 185977
 test-armhf-armhf-libvirt      8 xen-boot                   fail pass in 185977

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt     16 saverestore-support-check fail in 185977 like 185972
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check    fail in 185977 never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail in 185977 never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail in 185977 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop        fail like 185972
 test-armhf-armhf-examine      8 reboot                    fail like 185972
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 185972
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt     15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail never pass
 test-arm64-arm64-xl          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail never pass
 test-arm64-arm64-xl-vhd      14 migrate-support-check     fail never pass
 test-arm64-arm64-xl-vhd      15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl          16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-vhd 14 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-vhd 15 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-qcow2    14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-qcow2    15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-raw      14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      15 saverestore-support-check fail never pass

version targeted for testing:
 linux                cf87f46fd34d6c19283d9625a7822f20d90b64a4
baseline version:
 linux                f4345f05c0dfc73c617e66f3b809edb8ddd41075

Last test of basis   185972  2024-05-10 18:14:02 Z    1 days
Testing same since   185977  2024-05-11 02:27:16 Z    0 days    2 attempts
[PATCH] x86/cpufreq: Rename cpuid variable/parameters to cpu
Various functions have a parameter or local variable called cpuid, but this triggers a MISRA Rule 5.3 violation, because we also have a function called cpuid() which wraps the real CPUID instruction. In all these cases it's a Xen CPU index, which is far more commonly named just cpu in our code.

While adjusting these, fix a couple of other issues:

 * cpufreq_cpu_init() is on the end of a hypercall (with in-memory parameters, even), making EFAULT the wrong error to use. Use EOPNOTSUPP instead.
 * check_est_cpu() is wrong to tie EIST to just Intel, and nowhere else using EIST makes this restriction. Just check the feature itself, which is more succinctly done after being folded into its single caller.
 * In powernow_cpufreq_update(), replace an open-coded cpu_online().

Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Roger Pau Monné
CC: Stefano Stabellini
CC: Nicola Vetrini
CC: consult...@bugseng.com

cpu needs to stay signed for now in set_px_pminfo(), because of get_cpu_id(). This can be cleaned up by making better use of BAD_APICID, but that's a much more involved change.
---
 xen/arch/x86/acpi/cpu_idle.c              | 14
 xen/arch/x86/acpi/cpufreq/cpufreq.c       | 24 +++--
 xen/arch/x86/acpi/cpufreq/hwp.c           |  6 ++--
 xen/arch/x86/acpi/cpufreq/powernow.c      |  6 ++--
 xen/drivers/cpufreq/cpufreq.c             | 18 +-
 xen/drivers/cpufreq/utility.c             | 43 +++
 xen/include/acpi/cpufreq/cpufreq.h        |  6 ++--
 xen/include/acpi/cpufreq/processor_perf.h |  8 ++---
 xen/include/xen/pmstat.h                  |  6 ++--
 9 files changed, 57 insertions(+), 74 deletions(-)

diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index cfce4cc0408f..c8db1aa9913a 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -1498,14 +1498,14 @@ static void amd_cpuidle_init(struct acpi_processor_power *power)
     vendor_override = -1;
 }
 
-uint32_t pmstat_get_cx_nr(uint32_t cpuid)
+uint32_t pmstat_get_cx_nr(unsigned int cpu)
 {
-    return processor_powers[cpuid] ? processor_powers[cpuid]->count : 0;
+    return processor_powers[cpu] ? processor_powers[cpu]->count : 0;
 }
 
-int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat)
+int pmstat_get_cx_stat(unsigned int cpu, struct pm_cx_stat *stat)
 {
-    struct acpi_processor_power *power = processor_powers[cpuid];
+    struct acpi_processor_power *power = processor_powers[cpu];
     uint64_t idle_usage = 0, idle_res = 0;
     uint64_t last_state_update_tick, current_stime, current_tick;
     uint64_t usage[ACPI_PROCESSOR_MAX_POWER] = { 0 };
@@ -1522,7 +1522,7 @@ int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat)
         return 0;
     }
 
-    stat->idle_time = get_cpu_idle_time(cpuid);
+    stat->idle_time = get_cpu_idle_time(cpu);
     nr = min(stat->nr, power->count);
 
     /* mimic the stat when detail info hasn't been registered by dom0 */
@@ -1572,7 +1572,7 @@ int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat)
         idle_res += res[i];
     }
 
-    get_hw_residencies(cpuid, &hw_res);
+    get_hw_residencies(cpu, &hw_res);
 
 #define PUT_xC(what, n) do { \
     if ( stat->nr_##what >= n && \
@@ -1613,7 +1613,7 @@ int pmstat_get_cx_stat(uint32_t cpuid, struct pm_cx_stat *stat)
     return 0;
 }
 
-int pmstat_reset_cx_stat(uint32_t cpuid)
+int pmstat_reset_cx_stat(unsigned int cpu)
 {
     return 0;
 }
diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index 2b6ef99678ae..a341f2f02063 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -55,17 +55,6 @@ struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 static bool __read_mostly acpi_pstate_strict;
 boolean_param("acpi_pstate_strict", acpi_pstate_strict);
 
-static int check_est_cpu(unsigned int cpuid)
-{
-    struct cpuinfo_x86 *cpu = &cpu_data[cpuid];
-
-    if (cpu->x86_vendor != X86_VENDOR_INTEL ||
-        !cpu_has(cpu, X86_FEATURE_EIST))
-        return 0;
-
-    return 1;
-}
-
 static unsigned extract_io(u32 value, struct acpi_cpufreq_data *data)
 {
     struct processor_performance *perf;
@@ -530,7 +519,7 @@ static int cf_check acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
         if (cpufreq_verbose)
             printk("xen_pminfo: @acpi_cpufreq_cpu_init,"
                    "HARDWARE addr space\n");
-        if (!check_est_cpu(cpu)) {
+        if (!cpu_has(c, X86_FEATURE_EIST)) {
             result = -ENODEV;
             goto err_unreg;
         }
@@ -690,15 +679,12 @@ static int __init cf_check cpufreq_driver_late_init(void)
 }
 __initcall(cpufreq_driver_late_init);
 
-int cpufreq_cpu_init(unsigned int cpuid)
+int cpufreq_cpu_init(unsigned int cpu)
 {
-    int ret;
-
     /* Currently we only handle Intel, AMD and Hygon processor */
     if ( boot_cpu_data.x86_vendor &
          (X86_VENDOR_INTEL | X86_VENDOR_AMD |
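The check_est_cpu() change above can be modeled stand-alone: the vendor test drops out and only the feature flag matters. This is a minimal sketch with stub types and illustrative constants, not Xen's real cpu_has()/cpu_data definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stubs, not Xen's real definitions. */
#define X86_VENDOR_INTEL 1
#define X86_VENDOR_AMD   2

struct cpuinfo {
    int x86_vendor;
    bool has_eist;   /* stands in for cpu_has(c, X86_FEATURE_EIST) */
};

/* Old predicate: tied EIST to Intel, for no architectural reason. */
static bool check_est_cpu_old(const struct cpuinfo *c)
{
    return c->x86_vendor == X86_VENDOR_INTEL && c->has_eist;
}

/* New predicate: trust the feature bit alone, as the patch does
 * after folding the check into its single caller. */
static bool check_est_cpu_new(const struct cpuinfo *c)
{
    return c->has_eist;
}
```

The observable difference is a non-Intel CPU advertising EIST: the old check rejected it, the new one accepts it.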
[xen-unstable test] 185980: tolerable FAIL - PUSHED
flight 185980 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/185980/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop        fail like 185975
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop        fail like 185975
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop        fail like 185975
 test-armhf-armhf-libvirt     16 saverestore-support-check fail like 185975
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop        fail like 185975
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 185975
 test-amd64-amd64-libvirt     15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl          16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd      14 migrate-support-check     fail never pass
 test-arm64-arm64-xl-vhd      15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-qcow2    14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-qcow2    15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check    fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-vhd 14 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-vhd 15 saverestore-support-check fail never pass

version targeted for testing:
 xen                  46aa3031ae89ac1771f4159972edab65710e7349
baseline version:
 xen                  996576b965ccdf0de17aafa14282925e408e1200

Last test of basis   185975  2024-05-11 01:37:06 Z    0 days
Testing same since   185980  2024-05-11 10:13:22 Z    0 days    1 attempts

People who touched revisions under test:
 Andrew Cooper
 Daniel P. Smith
 Demi Marie Obenour
 Leigh Brown
 Marek Marczykowski-Górecki
 Nicola Vetrini
 Roger Pau Monne
 Roger Pau Monné
 Stefano Stabellini
 Stefano Stabellini

jobs:
 build-amd64-xsm    pass
 build-arm64-xsm    pass
 build-i386-xsm     pass
 build-amd64-xtf    pass
 build-amd64        pass
 build-arm64
[linux-linus test] 185977: regressions - FAIL
flight 185977 linux-linus real [real]
flight 185981 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/185977/
http://logs.test-lab.xenproject.org/osstest/logs/185981/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:
 test-armhf-armhf-xl-qcow2     8 xen-boot            fail REGR. vs. 185972

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-vhd 10 host-ping-check-xen fail pass in 185981-retest
 test-armhf-armhf-xl-credit2   8 xen-boot            fail pass in 185981-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit2  15 migrate-support-check fail in 185981 never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail in 185981 never pass
 test-armhf-armhf-libvirt-vhd 14 migrate-support-check fail in 185981 never pass
 test-armhf-armhf-libvirt-vhd 15 saverestore-support-check fail in 185981 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop        fail like 185972
 test-armhf-armhf-examine      8 reboot                    fail like 185972
 test-armhf-armhf-libvirt     16 saverestore-support-check fail like 185972
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop        fail like 185972
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 185972
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt     15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail never pass
 test-arm64-arm64-xl          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd      14 migrate-support-check     fail never pass
 test-arm64-arm64-xl-vhd      15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check fail never pass
 test-armhf-armhf-xl          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl          16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check    fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      15 saverestore-support-check fail never pass

version targeted for testing:
 linux                cf87f46fd34d6c19283d9625a7822f20d90b64a4
baseline version:
 linux                f4345f05c0dfc73c617e66f3b809edb8ddd41075

Last test of basis   185972  2024-05-10 18:14:02 Z    0 days
Testing same since   185977  2024-05-11 02:27:16 Z    0 days    1 attempts

People who touched revisions under test:
 "Paul E. McKenney"
 Agustin Gutierrez
 Alex Deucher
 Alexander Potapenko
 Andi Shyti
 Andrew Morton
 Animesh Manna
 Barry Song
 Bartosz Golaszewski
 Chaitanya Kumar Borah
 Christoph Hellwig
 Daniel
[PATCH] x86/pv: Sanity check bytes_per_rep in rep_outs()
I was playing with GCC 14, and gave the new static analyser (-fanalyzer) a try. One of the more serious things it came back with was this:

In file included from ./arch/x86/include/asm/mm.h:10,
                 from ./include/xen/mm.h:233,
                 from ./include/xen/domain_page.h:12,
                 from arch/x86/pv/emul-priv-op.c:10:
In function '__copy_from_guest_pv',
    inlined from 'rep_outs' at arch/x86/pv/emul-priv-op.c:718:20:
./arch/x86/include/asm/uaccess.h:174:9: warning: stack-based buffer overflow [CWE-121] [-Wanalyzer-out-of-bounds]
  174 |         __asm__ __volatile__( \
      |         ^~~
./arch/x86/include/asm/uaccess.h:229:13: note: in expansion of macro 'get_unsafe_asm'
  229 |     case 8: get_unsafe_asm(x, ptr, grd, retval, "", "=r", errret); break; \
      |             ^~
./arch/x86/include/asm/uaccess.h:236:5: note: in expansion of macro 'get_unsafe_size'
  236 |     get_unsafe_size(x, ptr, size, UA_KEEP, retval, errret)
      |     ^~~
./arch/x86/include/asm/uaccess.h:308:13: note: in expansion of macro 'get_guest_size'
  308 |             get_guest_size(*(uint64_t *)to, from, 8, ret, 8);
      |             ^~
  'rep_outs': events 1-3
    |
    | arch/x86/pv/emul-priv-op.c:674:21:
    |  674 | static int cf_check rep_outs(
    |      |                     ^~~~
    |      |                     |
    |      |                     (1) entry to 'rep_outs'
    |......
    |  688 |     if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
    |      |           |
    |      |           (3) calling 'guest_io_okay' from 'rep_outs'
    |......
    |  710 |     unsigned int data = 0;
    |      |                  |
    |      |                  (2) capacity: 4 bytes
    |
    +--> 'guest_io_okay': events 4-5
           |
           |  159 | static bool guest_io_okay(unsigned int port, unsigned int bytes,
           |      |             ^
           |      |             |
           |      |             (4) entry to 'guest_io_okay'
           |......
           |  165 |     if ( iopl_ok(v, regs) )
           |      |          |
           |      |          (5) calling 'iopl_ok' from 'guest_io_okay'
           |
           +--> 'iopl_ok': event 6
                  |
                  |  148 | static bool iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
                  |      | ^~~
                  |      | |
                  |      | (6) entry to 'iopl_ok'
                  |
                'iopl_ok': event 7
                  |
                  | ./include/xen/bug.h:141:13:
                  |  141 |     do { if ( unlikely(!(p)) ) assert_failed(#p); } while (0)
                  |      |             ^
                  |      |             |
                  |      |             (7) following 'false' branch...
                  | arch/x86/pv/emul-priv-op.c:153:5: note: in expansion of macro 'ASSERT'
                  |  153 |     ASSERT((v->arch.pv.iopl & ~X86_EFLAGS_IOPL) == 0);
                  |      |     ^~
                  |
                'iopl_ok': event 8
                  |
                  | ./include/xen/bug.h:124:31:
                  |  124 | #define assert_failed(msg) do { \
                  |      |                               ^
                  |      |                               |
                  |      |                               (8) ...to here
                  | ./include/xen/bug.h:141:32: note: in expansion of macro 'assert_failed'
                  |  141 |     do { if ( unlikely(!(p)) ) assert_failed(#p); } while (0)
                  |      |                                ^
                  | arch/x86/pv/emul-priv-op.c:153:5: note: in expansion of macro 'ASSERT'
                  |  153 |     ASSERT((v->arch.pv.iopl & ~X86_EFLAGS_IOPL) == 0);
                  |      |     ^~
                  |
           <--+
           |
         'guest_io_okay': event 9
           |
           |  165 |     if ( iopl_ok(v, regs) )
           |      |          ^~~~
           |      |          |
           |      |          (9) returning to 'guest_io_okay' from 'iopl_ok'
           |
    <--+
    |
  'rep_outs': events 10-13
    |
    |  688 |     if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
    |      |     ~    ^~~~
    |      |     |    |
    |      |     |    (10) returning to 'rep_outs' from 'guest_io_okay'
    |      |     (11)
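The essence of what the subject line proposes — rejecting a bytes_per_rep that would overflow the 4-byte stack buffer before copying guest data into it — can be sketched stand-alone. This is a minimal model mirroring the names in the trace above, not the actual Xen code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-alone model of the rep_outs() copy step: 'data' is a 4-byte
 * stack buffer, so only widths of 1, 2 or 4 bytes are sane for a
 * port write. */
static int rep_outs_copy(const void *guest_src, unsigned int bytes_per_rep,
                         uint32_t *out)
{
    uint32_t data = 0;

    /* The sanity check: reject widths that would overflow 'data'
     * (the analyzer's complaint was the 8-byte case reaching the
     * 4-byte buffer). */
    if (bytes_per_rep != 1 && bytes_per_rep != 2 && bytes_per_rep != 4)
        return -1;

    memcpy(&data, guest_src, bytes_per_rep);  /* now provably in bounds */
    *out = data;
    return 0;
}
```

With the check in place, the 8-byte path the analyzer flagged is unreachable.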
[libvirt test] 185978: tolerable all pass - PUSHED
flight 185978 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/185978/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt     16 saverestore-support-check fail like 185968
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt     15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt     15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt     16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-arm64-arm64-libvirt-qcow2 15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-vhd 14 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-vhd 15 saverestore-support-check fail never pass

version targeted for testing:
 libvirt              d4528bb9dbf21464e68beb9175a38aaf6484536e
baseline version:
 libvirt              dda10ac8acd0b5ffa03f9659c5678c2bddd9eed4

Last test of basis   185968  2024-05-10 04:18:49 Z    1 days
Testing same since   185978  2024-05-11 04:20:38 Z    0 days    1 attempts

People who touched revisions under test:
 Andi Chandler

jobs:
 build-amd64-xsm      pass
 build-arm64-xsm      pass
 build-i386-xsm       pass
 build-amd64          pass
 build-arm64          pass
 build-armhf          pass
 build-i386           pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops    pass
 build-arm64-pvops    pass
 build-armhf-pvops    pass
 build-i386-pvops     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm   pass
 test-arm64-arm64-libvirt-xsm   pass
 test-amd64-amd64-libvirt       pass
 test-arm64-arm64-libvirt       pass
 test-armhf-armhf-libvirt       pass
 test-amd64-amd64-libvirt-pair  pass
 test-amd64-amd64-libvirt-qcow2 pass
 test-arm64-arm64-libvirt-qcow2 pass
 test-amd64-amd64-libvirt-raw   pass
 test-arm64-arm64-libvirt-raw   pass
 test-amd64-amd64-libvirt-vhd   pass
 test-armhf-armhf-libvirt-vhd   pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Pushing revision :

To xenbits.xen.org:/home/xen/git/libvirt.git
   dda10ac8ac..d4528bb9db  d4528bb9dbf21464e68beb9175a38aaf6484536e -> xen-tested-master
Re: [PATCH v2 1/4] xen/arm: Alloc hypervisor reserved pages as magic pages for Dom0less DomUs
Hi Henry,

On 11/05/2024 01:56, Henry Wang wrote:
> There are use cases (for example using the PV driver) in Dom0less setup
> that require Dom0less DomUs to start immediately with Dom0, but
> initialize XenStore later, after Dom0's successful boot and call to the
> init-dom0less application.
>
> An error message can be seen from the init-dom0less application on
> 1:1 direct-mapped domains:
> ```
> Allocating magic pages
> memory.c:238:d0v0 mfn 0x39000 doesn't belong to d1
> Error on alloc magic pages
> ```
>
> The "magic page" is a terminology used in the toolstack for reserved
> pages through which the VM gets access to virtual platform capabilities.
> Currently the magic pages for Dom0less DomUs are populated by the
> init-dom0less app through populate_physmap(), and populate_physmap()
> automatically assumes gfn == mfn for 1:1 direct-mapped domains. This
> cannot be true for the magic pages that are allocated later from the
> init-dom0less application executed in Dom0. For domains using statically
> allocated memory but not 1:1 direct-mapped, a similar error, "failed to
> retrieve a reserved page", can be seen, as the reserved memory list is
> empty at that time.
>
> To solve the above issue, this commit allocates hypervisor reserved
> pages (currently used as the magic pages) for Arm Dom0less DomUs at
> domain construction time. The base address/PFN of the region will be
> noted and communicated to the init-dom0less application in Dom0.
>
> Reported-by: Alec Kwapis
> Suggested-by: Daniel P. Smith
> Signed-off-by: Henry Wang
> ---
> v2:
> - Reword the commit msg to explain what is "magic page" and use generic
>   terminology "hypervisor reserved pages" in commit msg. (Daniel)
> - Also move the offset definition of magic pages. (Michal)
> - Extract the magic page allocation logic to a function. (Michal)
> ---
>  tools/libs/guest/xg_dom_arm.c | 6 --
>  xen/arch/arm/dom0less-build.c | 32
>  xen/include/public/arch-arm.h | 6 ++
>  3 files changed, 38 insertions(+), 6 deletions(-)
>
> diff --git a/tools/libs/guest/xg_dom_arm.c b/tools/libs/guest/xg_dom_arm.c
> index 2fd8ee7ad4..8c579d7576 100644
> --- a/tools/libs/guest/xg_dom_arm.c
> +++ b/tools/libs/guest/xg_dom_arm.c
> @@ -25,12 +25,6 @@
>
>  #include "xg_private.h"
>
> -#define NR_MAGIC_PAGES 4
> -#define CONSOLE_PFN_OFFSET 0
> -#define XENSTORE_PFN_OFFSET 1
> -#define MEMACCESS_PFN_OFFSET 2
> -#define VUART_PFN_OFFSET 3
> -
>  #define LPAE_SHIFT 9
>
>  #define PFN_4K_SHIFT (0)
> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
> index 74f053c242..4b96ddd9ce 100644
> --- a/xen/arch/arm/dom0less-build.c
> +++ b/xen/arch/arm/dom0less-build.c
> @@ -739,6 +739,34 @@ static int __init alloc_xenstore_evtchn(struct domain *d)
>      return 0;
>  }
>
> +static int __init alloc_magic_pages(struct domain *d)
> +{
> +    struct page_info *magic_pg;
> +    mfn_t mfn;
> +    gfn_t gfn;
> +    int rc;
> +
> +    d->max_pages += NR_MAGIC_PAGES;
> +    magic_pg = alloc_domheap_pages(d, get_order_from_pages(NR_MAGIC_PAGES), 0);
> +    if ( magic_pg == NULL )
> +        return -ENOMEM;
> +
> +    mfn = page_to_mfn(magic_pg);
> +    if ( !is_domain_direct_mapped(d) )
> +        gfn = gaddr_to_gfn(GUEST_MAGIC_BASE);
> +    else
> +        gfn = gaddr_to_gfn(mfn_to_maddr(mfn));

Summarizing the discussion we had on Matrix.

Regions like the extended area and shared memory may not be direct mapped. So unfortunately, I think it is possible that the GFN could clash with one of those.

At least in the shared memory case, the user can provide the address. But as you use the domheap allocator, the address returned could easily change if you tweak your setup.

I am not entirely sure what the best solution is. We could ask the user to provide the information for the reserved region, but it feels like we are exposing a bit too much to the user. So possibly we would want to use the same approach as extended regions: once we have processed all the mappings, find some space for the hypervisor regions.

Any other suggestions?

Cheers,

--
Julien Grall
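The "find some space after processing all the mappings" idea can be sketched in isolation: given a sorted list of occupied GFN ranges (RAM banks, shared memory, extended regions, ...), pick the first gap large enough for the magic pages. All names here are illustrative, not Xen's real interfaces:

```c
#include <stdint.h>

/* Half-open occupied guest-frame range: [start, end). */
struct gfn_range {
    uint64_t start, end;
};

/* Return the lowest GFN at which 'nr_pages' consecutive free frames
 * fit, given 'used' sorted by start and non-overlapping. This mirrors
 * the extended-regions approach: scan the gaps between mappings. */
static uint64_t find_magic_gfn(const struct gfn_range *used, int nr,
                               uint64_t nr_pages)
{
    uint64_t candidate = 0;

    for (int i = 0; i < nr; i++) {
        if (used[i].start >= candidate + nr_pages)
            return candidate;           /* gap before this range fits */
        if (used[i].end > candidate)
            candidate = used[i].end;    /* skip past the occupied range */
    }
    return candidate;                   /* space after the last range */
}
```

A real implementation would also have to respect the guest address-map limits and alignment, but the gap-scan is the core of the suggestion.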
[xen-unstable test] 185975: tolerable FAIL - PUSHED
flight 185975 xen-unstable real [real]
flight 185979 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/185975/
http://logs.test-lab.xenproject.org/osstest/logs/185979/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qcow2    22 guest-start.2       fail pass in 185979-retest
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 185979-retest
 test-armhf-armhf-xl-qcow2     8 xen-boot            fail pass in 185979-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-qcow2    14 migrate-support-check fail in 185979 never pass
 test-armhf-armhf-xl-qcow2    15 saverestore-support-check fail in 185979 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop        fail like 185970
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop        fail like 185970
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop        fail like 185970
 test-armhf-armhf-libvirt     16 saverestore-support-check fail like 185970
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop        fail like 185970
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 185970
 test-amd64-amd64-libvirt     15 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          15 migrate-support-check     fail never pass
 test-arm64-arm64-xl          16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check     fail never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl          15 migrate-support-check     fail never pass
 test-armhf-armhf-xl          16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check   fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check     fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd      14 migrate-support-check     fail never pass
 test-arm64-arm64-xl-vhd      15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt     15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      14 migrate-support-check     fail never pass
 test-armhf-armhf-xl-raw      15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check    fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-vhd 14 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-vhd 15 saverestore-support-check fail never pass

version targeted for testing:
 xen                  996576b965ccdf0de17aafa14282925e408e1200
baseline version:
 xen                  b0082b908391b29b7c4dd5e6c389ebd6481926f8

Last test of basis   185970  2024-05-10 14:25:40 Z    0 days
Testing same since   185975  2024-05-11 01:37:06 Z    0 days    1 attempts

People who touched revisions under test:
 Jan Beulich
 Juergen Gross
 Julien Grall

jobs:
 build-amd64-xsm    pass
 build-arm64-xsm    pass
Re: [PATCH v2 3/4] tools/init-dom0less: Avoid hardcoding GUEST_MAGIC_BASE
Hi Henry,

On 11/05/2024 01:56, Henry Wang wrote:
Currently the GUEST_MAGIC_BASE in the init-dom0less application is hardcoded, which will lead to failures for 1:1 direct-mapped Dom0less DomUs.

Since the guest magic region is now allocated from the hypervisor, instead of hardcoding the guest magic pages region, use xc_hvm_param_get() to get the guest magic region PFN, and based on that the XenStore PFN can be calculated. Also, we don't need to set the max mem anymore, so drop the call to xc_domain_setmaxmem(). Rename alloc_xs_page() to get_xs_page() to reflect the changes. Take the opportunity to do some coding style improvements where possible.

Reported-by: Alec Kwapis
Signed-off-by: Henry Wang
---
v2:
- Update HVMOP keys name.
---
 tools/helpers/init-dom0less.c | 40 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
index fee93459c4..04039a2a66 100644
--- a/tools/helpers/init-dom0less.c
+++ b/tools/helpers/init-dom0less.c
@@ -19,24 +19,20 @@
 #define XENSTORE_PFN_OFFSET 1

Based on previous discussion, I think this wants to be 0 so the hypervisor only needs to allocate one page (rather than 2 with the current offset).
 #define STR_MAX_LENGTH 128

-static int alloc_xs_page(struct xc_interface_core *xch,
-                         libxl_dominfo *info,
-                         uint64_t *xenstore_pfn)
+static int get_xs_page(struct xc_interface_core *xch, libxl_dominfo *info,
+                       uint64_t *xenstore_pfn)
 {
     int rc;
-    const xen_pfn_t base = GUEST_MAGIC_BASE >> XC_PAGE_SHIFT;
-    xen_pfn_t p2m = (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET;
+    xen_pfn_t magic_base_pfn;

-    rc = xc_domain_setmaxmem(xch, info->domid,
-                             info->max_memkb + (XC_PAGE_SIZE/1024));
-    if (rc < 0)
-        return rc;
-
-    rc = xc_domain_populate_physmap_exact(xch, info->domid, 1, 0, 0, &p2m);
-    if (rc < 0)
-        return rc;
+    rc = xc_hvm_param_get(xch, info->domid, HVM_PARAM_HV_RSRV_BASE_PFN,
+                          &magic_base_pfn);
+    if (rc < 0) {
+        printf("Failed to get HVM_PARAM_HV_RSRV_BASE_PFN\n");
+        return 1;
+    }

If we go down the route of using HVM_PARAM... then we should avoid relying on the hypervisor to provide a large enough region. So we should check the offset against the size.

Cheers,

-- 
Julien Grall
Re: [PATCH v2 1/4] xen/arm: Alloc hypervisor reserved pages as magic pages for Dom0less DomUs
Hi Henry,

On 11/05/2024 10:02, Henry Wang wrote:
+
+    rc = guest_physmap_add_pages(d, gfn, mfn, NR_MAGIC_PAGES);
+    if ( rc )
+    {
+        free_domheap_pages(magic_pg, get_order_from_pages(NR_MAGIC_PAGES));
+        return rc;
+    }
+
+    return 0;
+}
+
 static int __init construct_domU(struct domain *d,
                                  const struct dt_device_node *node)
 {
@@ -840,6 +868,10 @@ static int __init construct_domU(struct domain *d,
         if ( rc < 0 )
             return rc;
         d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
+
+        rc = alloc_magic_pages(d);
+        if ( rc < 0 )
+            return rc;

This will only be allocated if xenstore is enabled. But I don't think some of the magic pages really require xenstore to work. In the future we may need some more fine-grained choice (see my comment in patch #2 as well).

Sorry, but it seems that by the time that I am writing this reply, I didn't get the email for the patch #2 comment. I will reply to both together when I see it.

That was expected, I knew what I wanted to mention about patch #2 but the e-mail was not fully written. You should have it in your inbox now.

     }

     return rc;

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 289af81bd6..186520d01f 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -476,6 +476,12 @@ typedef uint64_t xen_callback_t;
 #define GUEST_MAGIC_BASE  xen_mk_ullong(0x39000000)
 #define GUEST_MAGIC_SIZE  xen_mk_ullong(0x01000000)

+#define NR_MAGIC_PAGES 4

This name and the others below are far too generic to be part of the shared header. For NR_MAGIC_PAGES, can you explain why GUEST_MAGIC_SIZE cannot be used? Is it because it is "too" big?

Yes, I don't really want to allocate the whole 16MB when we merely need 4 pages. What about updating GUEST_MAGIC_SIZE to 0x4000?

It doesn't make any sense to have two defines which have pretty much the same meaning. Then on the toolstack side, you can check that NR_MAGIC_PAGES is smaller than or equal to GUEST_MAGIC_SIZE.

For the rest below...
+#define CONSOLE_PFN_OFFSET 0
+#define XENSTORE_PFN_OFFSET 1
+#define MEMACCESS_PFN_OFFSET 2
+#define VUART_PFN_OFFSET 3

... the order in which we allocate the magic pages is purely an internal decision of the toolstack. For the rest of the system, they need to access HVM_PARAM_*. So it doesn't feel like they should be part of the shared headers. If there is a strong reason to include them, then I think they should all be prefixed with GUEST_MAGIC_*.

One of the benefits, as Michal pointed out in the comments for v1 [1], would be that this would also allow init-dom0less.h not to re-define XENSTORE_PFN_OFFSET.

At the moment, yes, they have the same values. But with this series, XENSTORE_PFN_OFFSET should technically be 0 in init-dom0less. This is because Xen may only allocate one page (you don't need 4) for the reserved area. So...

I will rename them as suggested if both of you are ok with moving them to the arch header.

... I don't think they should be shared.

Cheers,

-- 
Julien Grall
Re: [PATCH v2 2/4] xen/arm: Add new HVM_PARAM_HV_RSRV_{BASE_PFN,SIZE} keys in HVMOP
Hi Henry,

On 11/05/2024 01:56, Henry Wang wrote:
For use cases such as Dom0less PV drivers, a mechanism to communicate Dom0less DomU's static data with the runtime control plane (Dom0) is needed. Since on Arm HVMOP is already the existing approach to address such use cases (for example the allocation of HVM_PARAM_CALLBACK_IRQ), add new HVMOP keys HVM_PARAM_HV_RSRV_{BASE_PFN,SIZE} for storing the hypervisor reserved pages region base PFN and size.

Currently, the hypervisor reserved pages region is used as the Arm Dom0less DomU guest magic pages region. Therefore protect the HVMOP keys with "#if defined(__arm__) || defined(__aarch64__)". The values will be set at Dom0less DomU construction time after the Dom0less DomU's magic pages region has been allocated.

Reported-by: Alec Kwapis
Signed-off-by: Henry Wang
---
v2:
- Rename the HVMOP keys to HVM_PARAM_HV_RSRV_{BASE_PFN,SIZE}. (Daniel)
- Add comment on top of HVM_PARAM_HV_RSRV_{BASE_PFN,SIZE} to describe its usage. Protect them with #ifdef. (Daniel, Jan)
---
 xen/arch/arm/dom0less-build.c   |  3 +++
 xen/arch/arm/hvm.c              |  2 ++
 xen/include/public/hvm/params.h | 11 ++++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 4b96ddd9ce..5bb53ebb47 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -764,6 +764,9 @@ static int __init alloc_magic_pages(struct domain *d)
         return rc;
     }

+    d->arch.hvm.params[HVM_PARAM_HV_RSRV_BASE_PFN] = gfn_x(gfn);
+    d->arch.hvm.params[HVM_PARAM_HV_RSRV_SIZE] = NR_MAGIC_PAGES;
+
     return 0;
 }

diff --git a/xen/arch/arm/hvm.c b/xen/arch/arm/hvm.c
index 0989309fea..949d804f8b 100644
--- a/xen/arch/arm/hvm.c
+++ b/xen/arch/arm/hvm.c
@@ -55,6 +55,8 @@ static int hvm_allow_get_param(const struct domain *d, unsigned int param)
     case HVM_PARAM_STORE_EVTCHN:
     case HVM_PARAM_CONSOLE_PFN:
     case HVM_PARAM_CONSOLE_EVTCHN:
+    case HVM_PARAM_HV_RSRV_BASE_PFN:
+    case HVM_PARAM_HV_RSRV_SIZE:

We should not allow the guest to
read HVM_PARAM_HV_*. So this would need to be handled like HVM_PARAM_HV_*.

But I have some reservations about using HVM params. It means we are setting in stone how we communicate/allocate the reserved pages. I can see a few issues that may bite back in the future.

We always pre-allocate the reserved pages. But most of the features may not be used, which ends up being a waste of memory. For instance, you only seem to need one page for dom0less, but allocate 4. I understand that libxenguest is doing the same today; however, we have the flexibility to change it by just not allocating the page.

Similarly, we expect the hypervisor to provide enough reserved pages for the toolstack to work. Yet, I believe some of the features (e.g. memaccess) don't need to be known in advance.

The number of pages is tiny at the moment (4) on Xen. But given the name, I could see someone asking to set aside more reserved space, and a single range may start to be a problem for direct-mapped domains (unless we don't need to map the magic pages 1:1).

Overall, if we want to make it part of the stable ABI, then it would be better to provide a (large) reserved space that will be allocated on demand (IOW there is no associated physical memory). If that doesn't work for direct-mapped domains, then another solution is to have a toolstack hypercall to allocate a GFN (they don't need to be contiguous) and the backing memory.

Cheers,

-- 
Julien Grall
Re: [PATCH v2 1/4] xen/arm: Alloc hypervisor reserved pages as magic pages for Dom0less DomUs
Hi Julien,

On 5/11/2024 4:46 PM, Julien Grall wrote:
Hi Henry,

On 11/05/2024 01:56, Henry Wang wrote:
There are use cases (for example using the PV driver) in a Dom0less setup that require Dom0less DomUs to start immediately with Dom0, but initialize XenStore later after Dom0's successful boot and call to the init-dom0less application.

An error message can be seen from the init-dom0less application on 1:1 direct-mapped domains:
```
Allocating magic pages
memory.c:238:d0v0 mfn 0x39000 doesn't belong to d1
Error on alloc magic pages
```

The "magic page" is a terminology used in the toolstack for reserved pages that give the VM access to virtual platform capabilities. Currently the magic pages for Dom0less DomUs are populated by the init-dom0less app through populate_physmap(), and populate_physmap() automatically assumes gfn == mfn for 1:1 direct-mapped domains. This cannot be true for the magic pages that are allocated later from the init-dom0less application executed in Dom0.

For domains using statically allocated memory but not 1:1 direct-mapped, a similar error "failed to retrieve a reserved page" can be seen, as the reserved memory list is empty at that time.

To solve the above issue, this commit allocates hypervisor reserved pages (currently used as the magic pages) for Arm Dom0less DomUs at domain construction time. The base address/PFN of the region will be noted and communicated to the init-dom0less application in Dom0.

Reported-by: Alec Kwapis
Suggested-by: Daniel P. Smith
Signed-off-by: Henry Wang
---
v2:
- Reword the commit msg to explain what a "magic page" is and use the generic terminology "hypervisor reserved pages" in the commit msg. (Daniel)
- Also move the offset definition of magic pages. (Michal)
- Extract the magic page allocation logic to a function.
(Michal)
---
 tools/libs/guest/xg_dom_arm.c | 6 ------
 xen/arch/arm/dom0less-build.c | 32 ++++++++++++++++++++++++++++++++
 xen/include/public/arch-arm.h |  6 ++++++
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/tools/libs/guest/xg_dom_arm.c b/tools/libs/guest/xg_dom_arm.c
index 2fd8ee7ad4..8c579d7576 100644
--- a/tools/libs/guest/xg_dom_arm.c
+++ b/tools/libs/guest/xg_dom_arm.c
@@ -25,12 +25,6 @@

 #include "xg_private.h"

-#define NR_MAGIC_PAGES 4
-#define CONSOLE_PFN_OFFSET 0
-#define XENSTORE_PFN_OFFSET 1
-#define MEMACCESS_PFN_OFFSET 2
-#define VUART_PFN_OFFSET 3
-
 #define LPAE_SHIFT 9

 #define PFN_4K_SHIFT (0)

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 74f053c242..4b96ddd9ce 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -739,6 +739,34 @@ static int __init alloc_xenstore_evtchn(struct domain *d)
     return 0;
 }

+static int __init alloc_magic_pages(struct domain *d)
+{
+    struct page_info *magic_pg;
+    mfn_t mfn;
+    gfn_t gfn;
+    int rc;
+
+    d->max_pages += NR_MAGIC_PAGES;

Here you bump d->max_pages by NR_MAGIC_PAGES but...

+    magic_pg = alloc_domheap_pages(d, get_order_from_pages(NR_MAGIC_PAGES), 0);

... here you will allocate using a power-of-two order. Which may end up failing, as there is nothing guaranteeing that NR_MAGIC_PAGES is suitably aligned. For now NR_MAGIC_PAGES seems suitably aligned, so a BUILD_BUG_ON() would be ok.

Great catch! I will add BUILD_BUG_ON(NR_MAGIC_PAGES & (NR_MAGIC_PAGES - 1)); Thanks.

+    if ( magic_pg == NULL )
+        return -ENOMEM;
+
+    mfn = page_to_mfn(magic_pg);
+    if ( !is_domain_direct_mapped(d) )
+        gfn = gaddr_to_gfn(GUEST_MAGIC_BASE);
+    else
+        gfn = gaddr_to_gfn(mfn_to_maddr(mfn));

Allocating the magic pages contiguously is only necessary for direct-mapped domains. For the others it might be preferable to allocate page by page. That said, NR_MAGIC_PAGES is not big enough. So it would be okay.
+
+    rc = guest_physmap_add_pages(d, gfn, mfn, NR_MAGIC_PAGES);
+    if ( rc )
+    {
+        free_domheap_pages(magic_pg, get_order_from_pages(NR_MAGIC_PAGES));
+        return rc;
+    }
+
+    return 0;
+}
+
 static int __init construct_domU(struct domain *d,
                                  const struct dt_device_node *node)
 {
@@ -840,6 +868,10 @@ static int __init construct_domU(struct domain *d,
         if ( rc < 0 )
             return rc;
         d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
+
+        rc = alloc_magic_pages(d);
+        if ( rc < 0 )
+            return rc;

This will only be allocated if xenstore is enabled. But I don't think some of the magic pages really require xenstore to work. In the future we may need some more fine-grained choice (see my comment in patch #2 as well).

Sorry, but it seems that by the time that I am writing this reply, I didn't get the email for the patch #2 comment. I will reply to both together when I see it.

     }

     return rc;

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 289af81bd6..186520d01f 100644
---
Re: [PATCH v2 1/4] xen/arm: Alloc hypervisor reserved pages as magic pages for Dom0less DomUs
Hi Henry,

On 11/05/2024 01:56, Henry Wang wrote:
There are use cases (for example using the PV driver) in a Dom0less setup that require Dom0less DomUs to start immediately with Dom0, but initialize XenStore later after Dom0's successful boot and call to the init-dom0less application.

An error message can be seen from the init-dom0less application on 1:1 direct-mapped domains:
```
Allocating magic pages
memory.c:238:d0v0 mfn 0x39000 doesn't belong to d1
Error on alloc magic pages
```

The "magic page" is a terminology used in the toolstack for reserved pages that give the VM access to virtual platform capabilities. Currently the magic pages for Dom0less DomUs are populated by the init-dom0less app through populate_physmap(), and populate_physmap() automatically assumes gfn == mfn for 1:1 direct-mapped domains. This cannot be true for the magic pages that are allocated later from the init-dom0less application executed in Dom0.

For domains using statically allocated memory but not 1:1 direct-mapped, a similar error "failed to retrieve a reserved page" can be seen, as the reserved memory list is empty at that time.

To solve the above issue, this commit allocates hypervisor reserved pages (currently used as the magic pages) for Arm Dom0less DomUs at domain construction time. The base address/PFN of the region will be noted and communicated to the init-dom0less application in Dom0.

Reported-by: Alec Kwapis
Suggested-by: Daniel P. Smith
Signed-off-by: Henry Wang
---
v2:
- Reword the commit msg to explain what a "magic page" is and use the generic terminology "hypervisor reserved pages" in the commit msg. (Daniel)
- Also move the offset definition of magic pages. (Michal)
- Extract the magic page allocation logic to a function.
(Michal)
---
 tools/libs/guest/xg_dom_arm.c | 6 ------
 xen/arch/arm/dom0less-build.c | 32 ++++++++++++++++++++++++++++++++
 xen/include/public/arch-arm.h |  6 ++++++
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/tools/libs/guest/xg_dom_arm.c b/tools/libs/guest/xg_dom_arm.c
index 2fd8ee7ad4..8c579d7576 100644
--- a/tools/libs/guest/xg_dom_arm.c
+++ b/tools/libs/guest/xg_dom_arm.c
@@ -25,12 +25,6 @@

 #include "xg_private.h"

-#define NR_MAGIC_PAGES 4
-#define CONSOLE_PFN_OFFSET 0
-#define XENSTORE_PFN_OFFSET 1
-#define MEMACCESS_PFN_OFFSET 2
-#define VUART_PFN_OFFSET 3
-
 #define LPAE_SHIFT 9

 #define PFN_4K_SHIFT (0)

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 74f053c242..4b96ddd9ce 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -739,6 +739,34 @@ static int __init alloc_xenstore_evtchn(struct domain *d)
     return 0;
 }

+static int __init alloc_magic_pages(struct domain *d)
+{
+    struct page_info *magic_pg;
+    mfn_t mfn;
+    gfn_t gfn;
+    int rc;
+
+    d->max_pages += NR_MAGIC_PAGES;

Here you bump d->max_pages by NR_MAGIC_PAGES but...

+    magic_pg = alloc_domheap_pages(d, get_order_from_pages(NR_MAGIC_PAGES), 0);

... here you will allocate using a power-of-two order. Which may end up failing, as there is nothing guaranteeing that NR_MAGIC_PAGES is suitably aligned. For now NR_MAGIC_PAGES seems suitably aligned, so a BUILD_BUG_ON() would be ok.

+    if ( magic_pg == NULL )
+        return -ENOMEM;
+
+    mfn = page_to_mfn(magic_pg);
+    if ( !is_domain_direct_mapped(d) )
+        gfn = gaddr_to_gfn(GUEST_MAGIC_BASE);
+    else
+        gfn = gaddr_to_gfn(mfn_to_maddr(mfn));

Allocating the magic pages contiguously is only necessary for direct-mapped domains. For the others it might be preferable to allocate page by page. That said, NR_MAGIC_PAGES is not big enough. So it would be okay.
+
+    rc = guest_physmap_add_pages(d, gfn, mfn, NR_MAGIC_PAGES);
+    if ( rc )
+    {
+        free_domheap_pages(magic_pg, get_order_from_pages(NR_MAGIC_PAGES));
+        return rc;
+    }
+
+    return 0;
+}
+
 static int __init construct_domU(struct domain *d,
                                  const struct dt_device_node *node)
 {
@@ -840,6 +868,10 @@ static int __init construct_domU(struct domain *d,
         if ( rc < 0 )
             return rc;
         d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
+
+        rc = alloc_magic_pages(d);
+        if ( rc < 0 )
+            return rc;

This will only be allocated if xenstore is enabled. But I don't think some of the magic pages really require xenstore to work. In the future we may need some more fine-grained choice (see my comment in patch #2 as well).

     }

     return rc;

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 289af81bd6..186520d01f 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -476,6 +476,12 @@ typedef uint64_t xen_callback_t;
 #define GUEST_MAGIC_BASE  xen_mk_ullong(0x39000000)
 #define GUEST_MAGIC_SIZE  xen_mk_ullong(0x01000000)

+#define NR_MAGIC_PAGES 4

This name and the other below are far too generic to be part of the
Re: [PATCH 02/15] xen/arm/gic: Enable interrupt assignment to running VM
Hi Julien,

On 5/11/2024 4:22 PM, Julien Grall wrote:
Hi Henry,

On 11/05/2024 08:29, Henry Wang wrote:
```
+        /*
+         * Handle the LR where the physical interrupt is de-assigned from the
+         * guest before it was EOIed
+         */
+        struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
```

This will return a vCPU from the current affinity. This may not be where the interrupt was injected. From a brief look, I can't tell whether we have an easy way to know where the interrupt was injected (other than the pending_irq being in the lr_queue/inflight list).

I doubt we need to handle more than this - I think if the pending_irq is not in the lr_queue/inflight list, it would not belong to the corner case we are talking about (?).

I didn't suggest we would need to handle the case where the pending_irq is not in any of the queues. I was pointing out that I think we don't directly store the vCPU ID where we injected the IRQ. Instead, the pending_irq is just in a list, so we will possibly need to store the vCPU ID for convenience.

Sorry for the misunderstanding. Yeah, you are definitely correct. Also thank you so much for the suggestion! Before seeing this suggestion, I was struggling to find the correct vCPU by "for_each_vcpus" and comparison... but now I realize your suggestion is way more clever :)

Kind regards,
Henry
Re: [PATCH 02/15] xen/arm/gic: Enable interrupt assignment to running VM
Hi Henry,

On 11/05/2024 08:29, Henry Wang wrote:
```
+        /*
+         * Handle the LR where the physical interrupt is de-assigned from the
+         * guest before it was EOIed
+         */
+        struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
```

This will return a vCPU from the current affinity. This may not be where the interrupt was injected. From a brief look, I can't tell whether we have an easy way to know where the interrupt was injected (other than the pending_irq being in the lr_queue/inflight list).

I doubt we need to handle more than this - I think if the pending_irq is not in the lr_queue/inflight list, it would not belong to the corner case we are talking about (?).

I didn't suggest we would need to handle the case where the pending_irq is not in any of the queues. I was pointing out that I think we don't directly store the vCPU ID where we injected the IRQ. Instead, the pending_irq is just in a list, so we will possibly need to store the vCPU ID for convenience.

Cheers,

-- 
Julien Grall
Re: [PATCH 02/15] xen/arm/gic: Enable interrupt assignment to running VM
Hi Julien,

On 5/10/2024 4:54 PM, Julien Grall wrote:
Hi,

On 09/05/2024 16:31, Henry Wang wrote:
On 5/9/2024 4:46 AM, Julien Grall wrote:
Hi Henry,

[...]

```
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a775f886ed..d3f9cd2299 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -135,16 +135,6 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int virq,
     ASSERT(virq < vgic_num_irqs(d));
     ASSERT(!is_lpi(virq));

-    /*
-     * When routing an IRQ to guest, the virtual state is not synced
-     * back to the physical IRQ. To prevent get unsync, restrict the
-     * routing to when the Domain is been created.
-     */
-#ifndef CONFIG_OVERLAY_DTB
-    if ( d->creation_finished )
-        return -EBUSY;
-#endif
-
     ret = vgic_connect_hw_irq(d, NULL, virq, desc, true);
```

This is checking if the interrupt is already enabled. Do we also need to check for active/pending?

Thank you for raising this! I assume you meant this?

```
@@ -444,7 +444,9 @@ int vgic_connect_hw_irq(struct domain *d, struct vcpu *v, unsigned int virq,
     {
         /* The VIRQ should not be already enabled by the guest */
         if ( !p->desc &&
-             !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+             !test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_ACTIVE, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
             p->desc = desc;
         else
             ret = -EBUSY;
```

I think adding the check for active/pending at the time of routing the IRQ makes sense, so I will add them (both for the old and new vGIC implementations).

```
     if ( ret )
         return ret;
@@ -169,20 +159,40 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned int virq,
     ASSERT(test_bit(_IRQ_GUEST, &desc->status));
     ASSERT(!is_lpi(virq));

-    /*
-     * Removing an interrupt while the domain is running may have
-     * undesirable effect on the vGIC emulation.
```
```
-     */
-#ifndef CONFIG_OVERLAY_DTB
-    if ( !d->is_dying )
-        return -EBUSY;
-#endif
-
     desc->handler->shutdown(desc);

     /* EOI the IRQ if it has not been done by the guest */
     if ( test_bit(_IRQ_INPROGRESS, &desc->status) )
+    {
```

I assume this is just a PoC state, but I just want to point out that this will not work with the new vGIC (some of the functions don't exist there).

Thank you. Yes, currently we can discuss the old vGIC implementation. After we reach the final conclusion I will do the changes for both the old and new vGIC.

```
+        /*
+         * Handle the LR where the physical interrupt is de-assigned from the
+         * guest before it was EOIed
+         */
+        struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], virq);
```

This will return a vCPU from the current affinity. This may not be where the interrupt was injected. From a brief look, I can't tell whether we have an easy way to know where the interrupt was injected (other than the pending_irq being in the lr_queue/inflight list).

I doubt we need to handle more than this - I think if the pending_irq is not in the lr_queue/inflight list, it would not belong to the corner case we are talking about (?).

```
+    }
+    spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
+
+    vgic_lock_rank(v_target, rank, flags);
+    vgic_disable_irqs(v_target, (~rank->ienable) & rank->ienable, rank->index);
+    vgic_unlock_rank(v_target, rank, flags);
```

Why do you need to call vgic_disable_irqs()?

I will drop this part.

Kind regards,
Henry