On 1/24/26 11:44 PM, Teddy Astie wrote:
Le 22/01/2026 à 17:49, Oleksii Kurochko a écrit :
Introduce vcpu_csr_init() to set up the initial hypervisor-visible CSR
state for a vCPU before it is first scheduled.
To me, "hypervisor-visible CSR state" sounds like some state of the
guest the hypervisor has view on. But to what I read, it's more
hypervisor state CSRs regarding what to intercept and various
virtualization behavior.
I'll rephrase then to:
Introduce vcpu_csr_init() to initialise hypervisor CSRs that control
vCPU execution and virtualization behaviour before the vCPU is first
scheduled.
The function configures trap and interrupt delegation to VS-mode by
setting the appropriate bits in the hedeleg and hideleg registers,
initializes hstatus so that execution enters VS-mode when control is
passed to the guest, and restricts guest access to hardware performance
counters by initializing hcounteren, as unrestricted access would
require additional handling in Xen.
When the Smstateen and SSAIA extensions are available, access to AIA
CSRs and IMSIC guest interrupt files is enabled by setting the
corresponding bits in hstateen0, avoiding unnecessary traps into Xen
(note that SVSLCT(Supervisor Virtual Select) name is used intead of
CSRIND as OpenSBI uses such name and riscv_encoding.h is mostly based
on it).
If the Svpbmt extension is supported, the PBMTE bit is set in
henvcfg to allow its use for VS-stage address translation. Guest
access to the ENVCFG CSR is also enabled by setting ENVCFG bit in
hstateen0, as a guest may need to control certain characteristics of
the U-mode (VU-mode when V=1) execution environment.
For CSRs that may contain read-only bits (e.g. hedeleg, hideleg,
hstateen0), the written value is re-read from hardware and cached in
vcpu->arch to avoid divergence between the software state and the actual
CSR contents.
AFAIU from RISC-V isa document, at least for some CSRs the read-only-0
bits are well-defined and means "can't be configured" due to not having
a meaning (e.g hedeleg defines as read-only "Environment call from
HS-mode" because guest can't be in a "actual" HS-mode).
It was said about bits which hypervisor tries to set in the mentioned
registers and IIRC all of them could be r/o-0 as, for example, M-mode
can decide not to delegate them to HS-mode.
As hstatus is not part of struct arch_vcpu (it already resides in
struct cpu_user_regs), introduce vcpu_guest_cpu_user_regs() to provide
a uniform way to access hstatus and other guest CPU user registers.
This establishes a consistent and well-defined initial CSR state for
vCPUs prior to their first context switch.
Signed-off-by: Oleksii Kurochko <[email protected]>
---
Changes in v2:
- As hstatus isn't a part of arch_vcpu structure (as it is already a part of
cpu_user_regs) introduce vcpu_guest_cpu_user_regs() to be able to access
hstatus and other CPU user regs.
Sounds like hstatus wants to be splitted from guest_cpu_user_regs (which
are supposed to be GPR ?).
Generally, agree. But hstatus want to be saved during a trap even before guest
is started and considering that we already have the code which stores it in
guest_cpu_user_regs structure and handle it during the trap, I've decided just
to drop hstatus from arch_vcpu.
- Sort hideleg bit setting by value. Drop a stray blank.
- Drop | when the first initialization of hcounteren and hennvcfg happen.
- Introduce HEDELEG_DEFAULT. Sort set bits by value and use BIT() macros
instead of open-coding it.
- Apply pattern csr_write() -> csr_read() for hedeleg and hideleg instead
of direct bit setting in v->arch.h{i,e}deleg as it could be that for some
reason some bits of hedeleg and hideleg are r/o.
The similar patter is used for hstateen0 as some of the bits could be r/o.
- Add check that SSAIA is avaialable before setting of SMSTATEEN0_AIA |
SMSTATEEN0_IMSIC | SMSTATEEN0_SVSLCT bits.
- Drop local variables hstatus, hideleg and hedeleg as they aren't used
anymore.
---
xen/arch/riscv/cpufeature.c | 1 +
xen/arch/riscv/domain.c | 73 +++++++++++++++++++++
xen/arch/riscv/include/asm/cpufeature.h | 1 +
xen/arch/riscv/include/asm/current.h | 2 +
xen/arch/riscv/include/asm/riscv_encoding.h | 2 +
5 files changed, 79 insertions(+)
diff --git a/xen/arch/riscv/cpufeature.c b/xen/arch/riscv/cpufeature.c
index 02b68aeaa49f..03e27b037be0 100644
--- a/xen/arch/riscv/cpufeature.c
+++ b/xen/arch/riscv/cpufeature.c
@@ -137,6 +137,7 @@ const struct riscv_isa_ext_data __initconst riscv_isa_ext[]
= {
RISCV_ISA_EXT_DATA(zbb),
RISCV_ISA_EXT_DATA(zbs),
RISCV_ISA_EXT_DATA(smaia),
+ RISCV_ISA_EXT_DATA(smstateen),
RISCV_ISA_EXT_DATA(ssaia),
RISCV_ISA_EXT_DATA(svade),
RISCV_ISA_EXT_DATA(svpbmt),
diff --git a/xen/arch/riscv/domain.c b/xen/arch/riscv/domain.c
index 9c546267881b..3ae5fa3a8805 100644
--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -5,6 +5,77 @@
#include <xen/sched.h>
#include <xen/vmap.h>
+#include <asm/cpufeature.h>
+#include <asm/csr.h>
+#include <asm/riscv_encoding.h>
+
+#define HEDELEG_DEFAULT (BIT(CAUSE_MISALIGNED_FETCH, U) | \
+ BIT(CAUSE_FETCH_ACCESS, U) | \
+ BIT(CAUSE_ILLEGAL_INSTRUCTION, U) | \
+ BIT(CAUSE_BREAKPOINT, U) | \
+ BIT(CAUSE_MISALIGNED_LOAD, U) | \
+ BIT(CAUSE_LOAD_ACCESS, U) | \
+ BIT(CAUSE_MISALIGNED_STORE, U) | \
+ BIT(CAUSE_STORE_ACCESS, U) | \
+ BIT(CAUSE_USER_ECALL, U) | \
+ BIT(CAUSE_FETCH_PAGE_FAULT, U) | \
+ BIT(CAUSE_LOAD_PAGE_FAULT, U) | \
+ BIT(CAUSE_STORE_PAGE_FAULT, U))
+
+#define HIDELEG_DEFAULT (MIP_VSSIP | MIP_VSTIP | MIP_VSEIP)
+
+static void vcpu_csr_init(struct vcpu *v)
+{
+ register_t hstateen0;
+
+ csr_write(CSR_HEDELEG, HEDELEG_DEFAULT);
+ v->arch.hedeleg = csr_read(CSR_HEDELEG);
+
+ vcpu_guest_cpu_user_regs(v)->hstatus = HSTATUS_SPV | HSTATUS_SPVP;
+
+ csr_write(CSR_HIDELEG, HIDELEG_DEFAULT);
+ v->arch.hideleg = csr_read(CSR_HIDELEG);
+
To me, that feels odd to set machine CSR here. Especially if (I guess)
that we would update them anyway during context switches.
Yes, they will be updated anyway.
When this approach was initially suggested, I considered the case where
some code might attempt to use these bits before the context-switch logic
runs. In that situation, we could end up executing code that assumes the
feature is available (because the bit is set in|v->arch.some_reg|), only
to later discover that the corresponding CSR bit is read-only zero.
I think it would be better to have :
- vcpu_csr_init -> sets initial state CSR in vcpu structure, doesn't
touch machine CSR
- context switching logic -> loads vcpu-specific machine CSR from vcpu
structure
We would have to make a context switch to enter the vcpu for the first
time; but that's to be expected.
With my proposal, we would also want to move the vcpu_csr_init()
invocation to the place we initialize the vcpu_arch structure rather
than the first time we schedule inside that vcpu.
I think I may be misunderstanding something here. vcpu_csr_init() is now
called from arch_vcpu_create(), which initialises architecture-specific
arch_vcpu fields. Am I missing something?
That would also allow to take account of per-domain configuration if we
need to (e.g feature flags).
Thanks.
~ Oleksii