Hello Teddy,

On 1/23/26 12:30 PM, Teddy Astie wrote:
Hello,

Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
Introduce architecture-specific functions to create and destroy VCPUs.
Note that arch_vcpu_create() currently returns -EOPNOTSUPP, as the virtual
timer and interrupt controller are not yet implemented.

As part of this change, add continue_new_vcpu(), which will be used after
the first context_switch() of a new vCPU. Since this functionality is not
yet implemented, continue_new_vcpu() is currently provided as a stub.

Update the STACK_SIZE definition and introduce STACK_ORDER (to align with
other architectures) for allocating the vCPU stack.

Introduce struct cpu_info to store per-vCPU state that lives at the top
of the vCPU's stack.

Signed-off-by: Oleksii Kurochko <[email protected]>
---
Changes in v2:
   - Drop BUILD_BUG_ON() in arch_vcpu_create() as a check isn't very useful.
   - Use vzalloc() instead of alloc_xenheap_page() to use the larger domheap to
     allocate vCPU's stack.
   - Drop printk() inside arch_vcpu_create() to not have potential big noise
     in console as it could be that an amount of vCPUs is pretty big.
   - Use XVFREE() instead of free_xenheap_pages() as vCPU's stack allocation
     happens with a usage of vzalloc() now.
   - Drop stack field as it is enough to have only cpu_info as stack pointer
     could be calculated based on cpu_info.
   - Drop cast when v.arch.cpu_info is inialized as it is not necessary
          to have it.
   - Drop memset() for arch.cpu_info() as it is enough to have vzalloc().
---
   xen/arch/riscv/Makefile              |  1 +
   xen/arch/riscv/domain.c              | 59 ++++++++++++++++++++++++++++
   xen/arch/riscv/include/asm/config.h  |  3 +-
   xen/arch/riscv/include/asm/current.h |  6 +++
   xen/arch/riscv/include/asm/domain.h  |  2 +
   xen/arch/riscv/stubs.c               | 10 -----
   6 files changed, 70 insertions(+), 11 deletions(-)
   create mode 100644 xen/arch/riscv/domain.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 87c1148b0010..8863d4b15605 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -1,5 +1,6 @@
   obj-y += aplic.o
   obj-y += cpufeature.o
+obj-y += domain.o
   obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
   obj-y += entry.o
   obj-y += imsic.o
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
new file mode 100644
index 000000000000..9c546267881b
--- /dev/null
+++ b/xen/arch/riscv/domain.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/vmap.h>
+
+static void continue_new_vcpu(struct vcpu *prev)
+{
+    BUG_ON("unimplemented\n");
+}
+
+static void __init __maybe_unused build_assertions(void)
+{
+    /*
+     * Enforce the requirement documented in struct cpu_info that
+     * guest_cpu_user_regs must be the first field.
+     */
+    BUILD_BUG_ON(offsetof(struct cpu_info, guest_cpu_user_regs) != 0);
+}
+
+int arch_vcpu_create(struct vcpu *v)
+{
+    int rc = 0;
+    void *stack = vzalloc(STACK_SIZE);
+
Are there alignment constraints for the stack ?

vzalloc() allocates page-aligned memory as far as I can see:
  ...
  va = __vmap(mfn, 1, pages, 1, PAGE_HYPERVISOR, type);
  ...
where 1 -> means align and what will lead that in vm_alloc():
     ...
     start =(start +align)&~(align -1); ...


+    if ( !stack )
+        return -ENOMEM;
+
+    v->arch.cpu_info = stack + STACK_SIZE - sizeof(struct cpu_info);
+    memset(v->arch.cpu_info, 0, sizeof(*v->arch.cpu_info));
+
Given that vzalloc ensures that the memory is cleared, you don't need to
clear it another time.

Oh, right, missed to drop memset.


+    v->arch.xen_saved_context.sp = (register_t)v->arch.cpu_info;
+    v->arch.xen_saved_context.ra = (register_t)continue_new_vcpu;
+
You probably also want to set the first parameter of continue_new_vcpu
(struct vcpu *prev), or otherwise I don't think we want the "prev"
parameter in continue_new_vcpu if it's always going to be 0.

It will be set by RISC-V ABI (prev will be stored in a0) when __context_switch()
will be called in context_switch():
  void context_switch(struct vcpu *prev, struct vcpu *next)
  {
    ASSERT(local_irq_is_enabled());
    ASSERT(prev != next);
    ASSERT(!vcpu_cpu_dirty(next));

    local_irq_disable();

    set_current(next);

    prev = __context_switch(prev, next);

    schedule_tail(prev);
  }
First call of the __context_switch() will lead to jump to continue_new_vcpu()
function which will have a0=prev.


IIRC continue_new_vcpu is only meant to bootstrap the new vCPU, not just
perform a context switch to it.

+    /* Idle VCPUs don't need the rest of this setup */
+    if ( is_idle_vcpu(v) )
+        return rc;
+
+    /*
+     * As the vtimer and interrupt controller (IC) are not yet implemented,
+     * return an error.
+     *
+     * TODO: Drop this once the vtimer and IC are implemented.
+     */
+    rc = -EOPNOTSUPP;
+    goto fail;
+
+    return rc;
+
+ fail:
+    arch_vcpu_destroy(v);
+    return rc;
+}
+
+void arch_vcpu_destroy(struct vcpu *v)
+{
+    vfree((char *)v->arch.cpu_info + sizeof(struct cpu_info));
+}
That doesn't look correct, you want to vfree what was allocated as the
bottom of stack, i.e

v->arch.cpu_info + sizeof(struct cpu_info) - STACK_SIZE

Agree formula should be correct, now it points to the end of the stack
memory.


Or alternatively introduce bottom of stack as a additional vcpu_arch field.

There is not too much sense to have the separate field for that.

Thanks.

~ Oleksii


Reply via email to