Hello Teddy,
On 1/23/26 12:30 PM, Teddy Astie wrote:
Hello,
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
Introduce architecture-specific functions to create and destroy VCPUs.
Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
timer and interrupt controller are not yet implemented.
As part of this change, add continue_new_vcpu(), which will be used after
the first context_switch() of a new vCPU. Since this functionality is not
yet implemented, continue_new_vcpu() is currently provided as a stub.
Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
other architectures) for allocating the vCPU stack.
Introduce struct cpu_info to store per-vCPU state that lives at the top
of the vCPU's stack.
Signed-off-by: Oleksii Kurochko <[email protected]>
---
Changes in v2:
- Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
- Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
allocate vCPU's stack.
- Drop printk() inside arch_vcpu_create() to not have potential big noise
in console as it could be that an amount of vCPUs is pretty big.
- Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
happens with a usage of vzalloc() now.
- Drop stack field as it is enough to have only cpu_info as stack pointer
could be calculated based on cpu_info.
- Drop cast when v.arch.cpu_info is inialized as it is not necessary
to have it.
- Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
---
xen/arch/riscv/Makefile | 1 +
xen/arch/riscv/domain.c | 59 ++++++++++++++++++++++++++++
xen/arch/riscv/include/asm/config.h | 3 +-
xen/arch/riscv/include/asm/current.h | 6 +++
xen/arch/riscv/include/asm/domain.h | 2 +
xen/arch/riscv/stubs.c | 10 -----
6 files changed, 70 insertions(+), 11 deletions(-)
create mode 100644 xen/arch/riscv/domain.c
diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 87c1148b0010..8863d4b15605 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -1,5 +1,6 @@
obj-y += aplic.o
obj-y += cpufeature.o
+obj-y += domain.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-y += entry.o
obj-y += imsic.o
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
new file mode 100644
index 000000000000..9c546267881b
--- /dev/null
+++ b/xen/arch/riscv/domain.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/vmap.h>
+
+static void continue_new_vcpu(struct vcpu *prev)
+{
+ BUG_ON("unimplemented\n");
+}
+
+static void __init __maybe_unused build_assertions(void)
+{
+ /*
+ * Enforce the requirement documented in struct cpu_info that
+ * guest_cpu_user_regs must be the first field.
+ */
+ BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+ int rc = 0;
+ void *stack = vzalloc(STACK_SIZE);
+
Are there alignment constraints for the stack ?
vzalloc() allocates page-aligned memory as far as I can see:
...
va = __vmap(mfn, 1, pages, 1, PAGE_HYPERVISOR, type);
...
where 1 -> means align and what will lead that in vm_alloc():
...
start =(start +align)&~(align -1); ...
+ if ( !stack )
+ return -ENOMEM;
+
+ v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
+ memset(v->arch.cpu_info, 0, sizeof(*v->arch.cpu_info));
+
Given that vzalloc ensures that the memory is cleared, you don't need to
clear it another time.
Oh, right, missed to drop memset.
+ v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
+ v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+
You probably also want to set the first parameter of continue_new_vcpu
(struct vcpu *prev), or otherwise I don't think we want the "prev"
parameter in continue_new_vcpu if it's always going to be 0.
It will be set by RISC-V ABI (prev will be stored in a0) when __context_switch()
will be called in context_switch():
void context_switch(struct vcpu *prev, struct vcpu *next)
{
ASSERT(local_irq_is_enabled());
ASSERT(prev != next);
ASSERT(!vcpu_cpu_dirty(next));
local_irq_disable();
set_current(next);
prev = __context_switch(prev, next);
schedule_tail(prev);
}
First call of the __context_switch() will lead to jump to continue_new_vcpu()
function which will have a0=prev.
IIRC continue_new_vcpu is only meant to bootstrap the new vCPU, not just
perform a context switch to it.
+ /* Idle VCPUs don't need the rest of this setup */
+ if ( is_idle_vcpu(v) )
+ return rc;
+
+ /*
+ * As the vtimer and interrupt controller (IC) are not yet implemented,
+ * return an error.
+ *
+ * TODO: Drop this once the vtimer and IC are implemented.
+ */
+ rc = -EOPNOTSUPP;
+ goto fail;
+
+ return rc;
+
+ fail:
+ arch_vcpu_destroy(v);
+ return rc;
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+ vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
+}
That doesn't look correct, you want to vfree what was allocated as the
bottom of stack, i.e
v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE
Agree formula should be correct, now it points to the end of the stack
memory.
Or alternatively introduce bottom of stack as a additional vcpu_arch field.
There is not too much sense to have the separate field for that.
Thanks.
~ Oleksii