Re: [PATCH 1/6] x86: Introduce x86_decode_lite()

2024-04-23 Thread Andrew Cooper
On 23/04/2024 10:17 am, Jan Beulich wrote:
> On 22.04.2024 20:14, Andrew Cooper wrote:
>> --- /dev/null
>> +++ b/xen/arch/x86/x86_emulate/decode-lite.c
>> @@ -0,0 +1,245 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +
>> +#include "private.h"
>> +
>> +#define Imm8   (1 << 0)
>> +#define Imm    (1 << 1)
>> +#define Branch (1 << 5) /* ... that we care about */
>> +/*  ModRM  (1 << 6) */
>> +#define Known  (1 << 7)
>> +
>> +#define ALU_OPS \
>> +(Known|ModRM),  \
>> +(Known|ModRM),  \
>> +(Known|ModRM),  \
>> +(Known|ModRM),  \
>> +(Known|Imm8),   \
>> +(Known|Imm)
>> +
>> +static const uint8_t init_or_livepatch_const onebyte[256] = {
>> +[0x00] = ALU_OPS, /* ADD */ [0x08] = ALU_OPS, /* OR  */
>> +[0x10] = ALU_OPS, /* ADC */ [0x18] = ALU_OPS, /* SBB */
>> +[0x20] = ALU_OPS, /* AND */ [0x28] = ALU_OPS, /* SUB */
>> +[0x30] = ALU_OPS, /* XOR */ [0x38] = ALU_OPS, /* CMP */
>> +
>> +[0x50 ... 0x5f] = (Known), /* PUSH/POP %reg */
>> +
>> +[0x70 ... 0x7f] = (Known|Branch|Imm8), /* Jcc disp8 */
>> +[0x80]  = (Known|ModRM|Imm8),
>> +[0x81]  = (Known|ModRM|Imm),
>> +
>> +[0x83]  = (Known|ModRM|Imm8),
>> +[0x84 ... 0x8e] = (Known|ModRM),   /* TEST/XCHG/MOV/MOV-SREG/LEA r/rm */
>> +
>> +[0xb0 ... 0xb7] = (Known|Imm8),/* MOV $imm8, %reg */
> I'm surprised you get away without at least NOP also marked as known.
> Imo the whole 0x90 row should be marked so.
>
> Similarly I'm not convinced that leaving the 0xA0 row unpopulated is a
> good idea. It's at best a trap set up for somebody to fall into rather
> sooner than later.
>
>> +[0xb8 ... 0xbf] = (Known|Imm), /* MOV $imm32, %reg */
>> +
>> +[0xcc]  = (Known), /* INT3 */
>> +[0xcd]  = (Known|Imm8),/* INT $imm8 */
> Like above, what about in particular any of the shifts/rotates and the
> MOV that's in the 0xC0 row?
>
> While the last sentence in the description is likely meant to cover
> that, I think the description wants to go further as to any pretty
> common but omitted insns. Already "x86: re-work memset()" and "x86: re-
> work memcpy()" (v2 pending for, soon, 3 years) would make it necessary
> to touch this table, thus increasing complexity of those changes to an
> area they shouldn't be concerned about at all.
>
>> +[0xe8 ... 0xe9] = (Known|Branch|Imm),  /* CALL/JMP disp32 */
>> +[0xeb]  = (Known|Branch|Imm8), /* JMP disp8 */
> 0xe0 ... 0xe7 and 0xec ... 0xef would imo also better be covered, as
> they easily can be (much like you also cover e.g. CMC despite it
> appearing pretty unlikely that we use that insn anywhere).
>
>> +[0xf1]  = (Known), /* ICEBP */
>> +[0xf4]  = (Known), /* HLT */
>> +[0xf5]  = (Known), /* CMC */
>> +[0xf6 ... 0xf7] = (Known|ModRM),   /* Grp3 */
>> +[0xf8 ... 0xfd] = (Known), /* CLC ... STD */
>> +[0xfe ... 0xff] = (Known|ModRM),   /* Grp4 */
>> +};
>> +static const uint8_t init_or_livepatch_const twobyte[256] = {
>> +[0x00 ... 0x01] = (Known|ModRM),   /* Grp6/Grp7 */
> LAR / LSL? CLTS? WBINVD? UD2?
>
>> +[0x18 ... 0x1f] = (Known|ModRM),   /* Grp16 (Hint Nop) */
>> +
>> +[0x20 ... 0x23] = (Known|ModRM),   /* MOV CR/DR */
>> +
>> +[0x30 ... 0x34] = (Known), /* WRMSR ... RDPMC */
> 0x34 is SYSENTER, isn't it?
>
>> +[0x40 ... 0x4f] = (Known|ModRM),   /* CMOVcc */
>> +
>> +[0x80 ... 0x8f] = (Known|Branch|Imm),  /* Jcc disp32 */
> What about things like VMREAD/VMWRITE?
>
>> +[0x90 ... 0x9f] = (Known|ModRM),   /* SETcc */
> PUSH/POP fs/gs? CPUID?
>
>> +[0xab]  = (Known|ModRM),   /* BTS */
> BTS (and BTC below) but not BT and BTR?
>
>> +[0xac]  = (Known|ModRM|Imm8),  /* SHRD $imm8 */
>> +[0xad ... 0xaf] = (Known|ModRM),   /* SHRD %cl / Grp15 / IMUL */
> SHRD but not SHLD?
>
> CMPXCHG?
>
>> +[0xb8 ... 0xb9] = (Known|ModRM),   /* POPCNT/Grp10 (UD1) */
>> +[0xba]  = (Known|ModRM|Imm8),  /* Grp8 */
>> +[0xbb ... 0xbf] = (Known|ModRM),   /* BSR/BSF/BSR/MOVSX */
> Nit (comment only): 0xbb is BTC.
>
> MOVSX but not MOVZX and also not MOVSXD (in the 1-byte table)?
>
>> +};
> XADD, MOVNTI, and the whole 0xc7-based group?

As you may have guessed, I filled in the opcode table until I could
parse all replacements.

When, at the end of this, I didn't need the osize=8 movs, I took the
decoding complexity back out.

In an ideal world I would have finished the userspace harness too, but I
ran out of time.  Where it's just populating the table, that's fine. 
Where it complicates the decoding logic, that needs more thought.
>> +/*
>> + * Bare minimum x86 instruction 

Re: [XEN PATCH] automation/eclair: add deviations for MISRA C:2012 Rule 16.4

2024-04-23 Thread Jan Beulich
On 23.04.2024 12:02, Federico Serafini wrote:
> --- a/docs/misra/deviations.rst
> +++ b/docs/misra/deviations.rst
> @@ -302,6 +302,19 @@ Deviations related to MISRA C:2012 Rules:
> leave such files as is.
>   - Tagged as `deliberate` for ECLAIR.
>  
> +   * - R16.4
> + - Switch statements having a controlling expression of enum type
> +   deliberately do not have a default case: gcc -Wall enables -Wswitch
> +   which warns (and breaks the build as we use -Werror) if one of the enum
> +   labels is missing from the switch.
> + - Tagged as `deliberate` for ECLAIR.
> +
> +   * - R16.4
> + - A switch statement with a single switch clause and no default label may
> +   be used in place of an equivalent if statement if it is considered to
> +   improve readability.

First a terminology-related comment here: I'm afraid "switch clause" can be
interpreted multiple ways, when I think we want to leave no room for
interpretation here. It's not even clear to me whether

switch ( x )
{
case 1: case 2: case 3: case 4:
...
break;
}

would be covered by the deviation, or whether the multiple case labels
wouldn't already be too much.

And then it is not clear to me why

switch ( x )
{
case 1:
...
break;
default:
...
break;
}

shouldn't also be covered, as potentially a readability improvement /
future change simplification over

if ( x == 1 )
{
...
}
else
{
...
}

Jan
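For context, a minimal illustration (not from the patch) of the -Wswitch
behaviour the first deviation wants to preserve: with a default label present,
gcc stays silent when a new enumerator is added but not handled; without one,
-Wall (which enables -Wswitch) plus -Werror breaks the build.

enum color { RED, GREEN, BLUE };

int to_rgb(enum color c)
{
    switch ( c )
    {
    case RED:   return 0xff0000;
    case GREEN: return 0x00ff00;
    case BLUE:  return 0x0000ff;
    /*
     * Deliberately no default: adding e.g. ALPHA to enum color now
     * triggers "enumeration value 'ALPHA' not handled in switch".
     */
    }

    return 0;
}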



[XEN PATCH] automation/eclair: add deviations for MISRA C:2012 Rule 16.4

2024-04-23 Thread Federico Serafini
Update ECLAIR configuration to take into account the deviations
agreed during MISRA meetings for Rule 16.4.

Signed-off-by: Federico Serafini 
---
 automation/eclair_analysis/ECLAIR/deviations.ecl |  8 ++++++++
 docs/misra/deviations.rst                        | 13 +++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/automation/eclair_analysis/ECLAIR/deviations.ecl b/automation/eclair_analysis/ECLAIR/deviations.ecl
index d21f112a9b..f09ad71acf 100644
--- a/automation/eclair_analysis/ECLAIR/deviations.ecl
+++ b/automation/eclair_analysis/ECLAIR/deviations.ecl
@@ -384,6 +384,14 @@ explicit comment indicating the fallthrough intention is present."
-config=MC3R1.R16.3,reports+={safe, "any_area(end_loc(any_exp(text(^(?s).*/\\* [fF]all ?through.? \\*/.*$,0..1))))"}
 -doc_end
 
+-doc_begin="Switch statements having a controlling expression of enum type 
deliberately do not have a default case: gcc -Wall enables -Wswitch which warns 
(and breaks the build as we use -Werror) if one of the enum labels is missing 
from the switch."
+-config=MC3R1.R16.4,reports+={deliberate,'any_area(kind(context)&&^.* has no `default.*$&&stmt(node(switch_stmt)&&child(cond,skip(__non_syntactic_paren_stmts,type(canonical(enum_underlying_type(any())))))))'}
+-doc_end
+
+-doc_begin="A switch statement with a single switch clause and no default 
label may be used in place of an equivalent if statement if it is considered to 
improve readability."
+-config=MC3R1.R16.4,switch_clauses+={deliberate,"switch(1)&&default(0)"}
+-doc_end
+
 -doc_begin="A switch statement with a single switch clause and no default 
label may be used in place of an equivalent if statement if it is considered to 
improve readability."
 -config=MC3R1.R16.6,switch_clauses+={deliberate, "default(0)"}
 -doc_end
diff --git a/docs/misra/deviations.rst b/docs/misra/deviations.rst
index ed0c1e8ed0..df87239b7d 100644
--- a/docs/misra/deviations.rst
+++ b/docs/misra/deviations.rst
@@ -302,6 +302,19 @@ Deviations related to MISRA C:2012 Rules:
leave such files as is.
  - Tagged as `deliberate` for ECLAIR.
 
+   * - R16.4
+ - Switch statements having a controlling expression of enum type
+   deliberately do not have a default case: gcc -Wall enables -Wswitch
+   which warns (and breaks the build as we use -Werror) if one of the enum
+   labels is missing from the switch.
+ - Tagged as `deliberate` for ECLAIR.
+
+   * - R16.4
+ - A switch statement with a single switch clause and no default label may
+   be used in place of an equivalent if statement if it is considered to
+   improve readability.
+ - Tagged as `deliberate` for ECLAIR.
+
* - R16.3
  - Switch clauses ending with continue, goto, return statements are safe.
  - Tagged as `safe` for ECLAIR.
-- 
2.34.1




Re: [XEN PATCH v2 4/5] xen/arm: allow dynamically assigned SGI handlers

2024-04-23 Thread Jens Wiklander
Hi Julien,

On Mon, Apr 22, 2024 at 12:57 PM Julien Grall  wrote:
>
> Hi Jens,
>
> On 22/04/2024 08:37, Jens Wiklander wrote:
> > Updates so request_irq() can be used with a dynamically assigned SGI irq
> > as input. This prepares for a later patch where an FF-A schedule
> > receiver interrupt handler is installed for an SGI generated by the
> > secure world.
>
> I would like to understand the use-case a bit more. Who is responsible
> to decide the SGI number? Is it Xen or the firmware?
>
> If the later, how can we ever guarantee the ID is not going to clash
> with what the OS/hypervisor is using? Is it described in a
> specification? If so, please give a pointer.

The firmware decides the SGI number. Given that the firmware doesn't
know which SGIs Xen is using, it typically needs to donate one of the
secure SGIs, but that is transparent to Xen.


>
> >
> > gic_route_irq_to_xen() don't gic_set_irq_type() for SGIs since they are
> > always edge triggered.
> >
> > gic_interrupt() is updated to route the dynamically assigned SGIs to
> > do_IRQ() instead of do_sgi(). The latter still handles the statically
> > assigned SGI handlers like for instance GIC_SGI_CALL_FUNCTION.
> >
> > Signed-off-by: Jens Wiklander 
> > ---
> > v1->v2
> > - Update patch description as requested
> > ---
> >   xen/arch/arm/gic.c | 5 +++--
> >   xen/arch/arm/irq.c | 7 +--
>
> I am not sure where to write the comment. But I think the comment on top
> of irq_set_affinity() in setup_irq() should also be updated.
>
> >   2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index 44c40e86defe..e9aeb7138455 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -117,7 +117,8 @@ void gic_route_irq_to_xen(struct irq_desc *desc, 
> > unsigned int priority)
> >
> >   desc->handler = gic_hw_ops->gic_host_irq_type;
> >
> > -gic_set_irq_type(desc, desc->arch.type);
> > +if ( desc->irq >= NR_GIC_SGI)
> > +gic_set_irq_type(desc, desc->arch.type);
>
> So above, you say that the SGIs are always edge-triggered interrupt. So
> I assume desc->arch.type. So are you skipping the call because it is
> unnessary or it could do the wrong thing?
>
> Ideally, the outcome of the answer be part of the comment on top of the
> check.

gic_set_irq_type() has an assert "ASSERT(type != IRQ_TYPE_INVALID)"
which is triggered without this check.
So it's both unnecessary and wrong. I suppose we could update the
bookkeeping of all SGIs to be edge-triggered instead of
IRQ_TYPE_INVALID. It would still be unnecessary though. What do you
suggest?

>
> >   gic_set_irq_priority(desc, priority);
> >   }
> >
> > @@ -375,7 +376,7 @@ void gic_interrupt(struct cpu_user_regs *regs, int 
> > is_fiq)
> >   /* Reading IRQ will ACK it */
> >   irq = gic_hw_ops->read_irq();
> >
> > -if ( likely(irq >= 16 && irq < 1020) )
> > +if ( likely(irq >= GIC_SGI_MAX && irq < 1020) )
>
> This check is now rather confusing as one could think that do_IRQ()
> would still not be reached for dynamic SGI. I think it would be clearer
> GIC_SGI_MAX needs to be renamed to GIC_SGI_STATIC_MAX and do_sgi() to
> do_static_sgi().

Thanks, I'll update.

>
> >   {
> >   isb();
> >   do_IRQ(regs, irq, is_fiq);
> > diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> > index bcce80a4d624..fdb214560978 100644
> > --- a/xen/arch/arm/irq.c
> > +++ b/xen/arch/arm/irq.c
> > @@ -224,9 +224,12 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int 
> > irq, int is_fiq)
> >
> >   perfc_incr(irqs);
> >
> > -ASSERT(irq >= 16); /* SGIs do not come down this path */
> > +/* Statically assigned SGIs do not come down this path */
> > +ASSERT(irq >= GIC_SGI_MAX);
>
>
> With this change, I think the path with vgic_inject_irq() now needs to
> gain an ASSERT(irq >= NR_GIC_SGI) because the path is not supposed to be
> taken for SGIs.

I'm sorry, I don't see the connection. If I add
ASSERT(virq >= NR_GIC_SGI);
at the top of vgic_inject_irq() it will panic when injecting a
Schedule Receiver or Notification Pending Interrupt for a guest.


>
> >
> > -if ( irq < 32 )
> > +if ( irq < NR_GIC_SGI )
> > +perfc_incr(ipis);
> > +else if ( irq < NR_GIC_LOCAL_IRQS )
> >   perfc_incr(ppis);
> >   else
> >   perfc_incr(spis);
>

Thanks,
Jens
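For readers unfamiliar with the GIC interrupt ID spaces being discussed, a
plain sketch of the classification do_IRQ() performs; NR_GIC_SGI and
NR_GIC_LOCAL_IRQS are assumed to be 16 and 32, the GICv2/v3 architectural
boundaries:

#define NR_GIC_SGI        16   /* SGIs: IDs 0-15 */
#define NR_GIC_LOCAL_IRQS 32   /* PPIs: IDs 16-31; SPIs start at 32 */

enum irq_class { IRQ_SGI, IRQ_PPI, IRQ_SPI };

static enum irq_class classify_irq(unsigned int irq)
{
    if ( irq < NR_GIC_SGI )
        return IRQ_SGI;        /* counted via perfc_incr(ipis) */
    if ( irq < NR_GIC_LOCAL_IRQS )
        return IRQ_PPI;        /* counted via perfc_incr(ppis) */
    return IRQ_SPI;            /* counted via perfc_incr(spis) */
}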



Re: [PATCH 1/6] x86: Introduce x86_decode_lite()

2024-04-23 Thread Jan Beulich
On 22.04.2024 20:14, Andrew Cooper wrote:
> In order to relocate all IP-relative fields in an alternative replacement
> block, we need to decode the instructions enough to obtain their length and
> any relative fields.
> 
> Full x86_decode() is far too heavyweight, so introduce a minimal form which
> can make several simplifying assumptions.
> 
> This logic is sufficient to decode all alternative blocks that exist in Xen
> right now.
> 
> Signed-off-by: Andrew Cooper 
> 
> --- a/xen/arch/x86/x86_emulate/Makefile
> +++ b/xen/arch/x86/x86_emulate/Makefile
> @@ -3,6 +3,7 @@ obj-y += 0fae.o
>  obj-y += 0fc7.o
>  obj-y += blk.o
>  obj-y += decode.o
> +obj-y += decode-lite.o

When LIVEPATCH=n, this could do with conversion to *.init.o, like is
done in the parent directory for alternative.c as well.

> --- /dev/null
> +++ b/xen/arch/x86/x86_emulate/decode-lite.c
> @@ -0,0 +1,245 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "private.h"
> +
> +#define Imm8   (1 << 0)
> +#define Imm    (1 << 1)
> +#define Branch (1 << 5) /* ... that we care about */
> +/*  ModRM  (1 << 6) */
> +#define Known  (1 << 7)
> +
> +#define ALU_OPS \
> +(Known|ModRM),  \
> +(Known|ModRM),  \
> +(Known|ModRM),  \
> +(Known|ModRM),  \
> +(Known|Imm8),   \
> +(Known|Imm)
> +
> +static const uint8_t init_or_livepatch_const onebyte[256] = {
> +[0x00] = ALU_OPS, /* ADD */ [0x08] = ALU_OPS, /* OR  */
> +[0x10] = ALU_OPS, /* ADC */ [0x18] = ALU_OPS, /* SBB */
> +[0x20] = ALU_OPS, /* AND */ [0x28] = ALU_OPS, /* SUB */
> +[0x30] = ALU_OPS, /* XOR */ [0x38] = ALU_OPS, /* CMP */
> +
> +[0x50 ... 0x5f] = (Known), /* PUSH/POP %reg */
> +
> +[0x70 ... 0x7f] = (Known|Branch|Imm8), /* Jcc disp8 */
> +[0x80]  = (Known|ModRM|Imm8),
> +[0x81]  = (Known|ModRM|Imm),
> +
> +[0x83]  = (Known|ModRM|Imm8),
> +[0x84 ... 0x8e] = (Known|ModRM),   /* TEST/XCHG/MOV/MOV-SREG/LEA r/rm */
> +
> +[0xb0 ... 0xb7] = (Known|Imm8),/* MOV $imm8, %reg */

I'm surprised you get away without at least NOP also marked as known.
Imo the whole 0x90 row should be marked so.

Similarly I'm not convinced that leaving the 0xA0 row unpopulated is a
good idea. It's at best a trap set up for somebody to fall into rather
sooner than later.

> +[0xb8 ... 0xbf] = (Known|Imm), /* MOV $imm32, %reg */
> +
> +[0xcc]  = (Known), /* INT3 */
> +[0xcd]  = (Known|Imm8),/* INT $imm8 */

Like above, what about in particular any of the shifts/rotates and the
MOV that's in the 0xC0 row?

While the last sentence in the description is likely meant to cover
that, I think the description wants to go further as to any pretty
common but omitted insns. Already "x86: re-work memset()" and "x86: re-
work memcpy()" (v2 pending for, soon, 3 years) would make it necessary
to touch this table, thus increasing complexity of those changes to an
area they shouldn't be concerned about at all.

> +[0xe8 ... 0xe9] = (Known|Branch|Imm),  /* CALL/JMP disp32 */
> +[0xeb]  = (Known|Branch|Imm8), /* JMP disp8 */

0xe0 ... 0xe7 and 0xec ... 0xef would imo also better be covered, as
they easily can be (much like you also cover e.g. CMC despite it
appearing pretty unlikely that we use that insn anywhere).

> +[0xf1]  = (Known), /* ICEBP */
> +[0xf4]  = (Known), /* HLT */
> +[0xf5]  = (Known), /* CMC */
> +[0xf6 ... 0xf7] = (Known|ModRM),   /* Grp3 */
> +[0xf8 ... 0xfd] = (Known), /* CLC ... STD */
> +[0xfe ... 0xff] = (Known|ModRM),   /* Grp4 */
> +};
> +static const uint8_t init_or_livepatch_const twobyte[256] = {
> +[0x00 ... 0x01] = (Known|ModRM),   /* Grp6/Grp7 */

LAR / LSL? CLTS? WBINVD? UD2?

> +[0x18 ... 0x1f] = (Known|ModRM),   /* Grp16 (Hint Nop) */
> +
> +[0x20 ... 0x23] = (Known|ModRM),   /* MOV CR/DR */
> +
> +[0x30 ... 0x34] = (Known), /* WRMSR ... RDPMC */

0x34 is SYSENTER, isn't it?

> +[0x40 ... 0x4f] = (Known|ModRM),   /* CMOVcc */
> +
> +[0x80 ... 0x8f] = (Known|Branch|Imm),  /* Jcc disp32 */

What about things like VMREAD/VMWRITE?

> +[0x90 ... 0x9f] = (Known|ModRM),   /* SETcc */

PUSH/POP fs/gs? CPUID?

> +[0xab]  = (Known|ModRM),   /* BTS */

BTS (and BTC below) but not BT and BTR?

> +[0xac]  = (Known|ModRM|Imm8),  /* SHRD $imm8 */
> +[0xad ... 0xaf] = (Known|ModRM),   /* SHRD %cl / Grp15 / IMUL */

SHRD but not SHLD?

CMPXCHG?

> +[0xb8 ... 0xb9] = (Known|ModRM),   /* POPCNT/Grp10 (UD1) */
> +[0xba]  = (Known|ModRM|Imm8),  /* Grp8 */
> +[0xbb ... 0xbf] = (Known|ModRM),   /* BSR/BSF/BSR/MOVSX */

Nit (comment only): 0xbb is BTC.

MOVSX but not MOVZX and also not MOVSXD (in the 1-byte table)?

> +};

XADD, MOVNTI, and the whole 0xc7-based group?

Jan

[XEN PATCH v1 7/7] x86/MCE: optional build of AMD/Intel MCE code

2024-04-23 Thread Sergiy Kibrik
Separate Intel/AMD-specific MCE code using the CONFIG_{INTEL,AMD} config
options. Now we can avoid building mcheck code if support for a specific
platform is intentionally disabled by the configuration.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/Makefile    | 6 ++----
 xen/arch/x86/cpu/mcheck/non-fatal.c | 6 ++++++
 xen/arch/x86/cpu/mcheck/vmce.h      | 1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/Makefile b/xen/arch/x86/cpu/mcheck/Makefile
index f927f10b4d..5b3f6d875c 100644
--- a/xen/arch/x86/cpu/mcheck/Makefile
+++ b/xen/arch/x86/cpu/mcheck/Makefile
@@ -1,12 +1,10 @@
-obj-y += amd_nonfatal.o
-obj-y += mce_amd.o
 obj-y += mcaction.o
 obj-y += barrier.o
-obj-y += intel-nonfatal.o
 obj-y += mctelem.o
 obj-y += mce.o
 obj-y += mce-apei.o
-obj-y += mce_intel.o
+obj-$(CONFIG_AMD) += mce_amd.o amd_nonfatal.o
+obj-$(CONFIG_INTEL) += mce_intel.o intel-nonfatal.o
 obj-y += non-fatal.o
 obj-y += util.o
 obj-y += vmce.o
diff --git a/xen/arch/x86/cpu/mcheck/non-fatal.c 
b/xen/arch/x86/cpu/mcheck/non-fatal.c
index 33cacd15c2..2d91a3b1e0 100644
--- a/xen/arch/x86/cpu/mcheck/non-fatal.c
+++ b/xen/arch/x86/cpu/mcheck/non-fatal.c
@@ -24,14 +24,20 @@ static int __init cf_check init_nonfatal_mce_checker(void)
 * Check for non-fatal errors every MCE_RATE s
 */
switch (c->x86_vendor) {
+#ifdef CONFIG_AMD
case X86_VENDOR_AMD:
case X86_VENDOR_HYGON:
/* Assume we are on K8 or newer AMD or Hygon CPU here */
amd_nonfatal_mcheck_init(c);
break;
+#endif
+#ifdef CONFIG_INTEL
case X86_VENDOR_INTEL:
intel_nonfatal_mcheck_init(c);
break;
+#endif
+   default:
+   return -ENODEV;
}
printk(KERN_INFO "mcheck_poll: Machine check polling timer started.\n");
return 0;
-- 
2.25.1




[XEN PATCH v1 6/7] x86/MCE: guard call to Intel-specific intel_get_extended_msrs()

2024-04-23 Thread Sergiy Kibrik
Add a check for the CONFIG_INTEL build option to the conditional call of this
routine, so that if Intel support is disabled the call is eliminated.

No functional change intended.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/mce.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 42e84e76b7..de2e16a6e2 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -328,7 +328,8 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks 
*bankmask,
 ASSERT(mig);
 mca_init_global(mc_flags, mig);
 /* A hook here to get global extended msrs */
-if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+if ( IS_ENABLED(CONFIG_INTEL) &&
+ boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
 intel_get_extended_msrs(mig, mci);
 }
 }
-- 
2.25.1
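As background on the idiom used throughout this series: a simplified,
self-contained sketch of how an IS_ENABLED()-style macro turns a Kconfig
option into a compile-time constant. Xen has its own macro machinery for
this; the version below is an illustration only. The point is that the
guarded call is still parsed and type-checked (unlike with #ifdef), while
constant folding removes the branch, and with it the reference to
intel_get_extended_msrs(), when CONFIG_INTEL is off:

#include <stdio.h>

#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define __is_defined(x) ___is_defined(x)
#define IS_ENABLED(option) __is_defined(option)

/* Kconfig defines enabled options to 1; comment the next line out and
 * the branch below is compiled away. */
#define CONFIG_INTEL 1

static int intel_only_work(void) { return 42; }

int main(void)
{
    int ret = -1;

    /* Expands to 1 when CONFIG_INTEL is defined as 1, to 0 otherwise. */
    if ( IS_ENABLED(CONFIG_INTEL) )
        ret = intel_only_work();

    printf("%d\n", ret);
    return 0;
}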




[XEN PATCH v1 5/7] x86/MCE: guard {intel/amd}_mcheck_init() calls

2024-04-23 Thread Sergiy Kibrik
Guard calls to the CPU-specific mcheck init routines in common MCE code
using the new INTEL/AMD config options.

The purpose is to avoid building platform-specific mcheck code, and the
calls to it, when that platform is disabled in the config.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/mce.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 72dfaf28cb..42e84e76b7 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -761,7 +761,8 @@ void mcheck_init(struct cpuinfo_x86 *c, bool bsp)
 {
 case X86_VENDOR_AMD:
 case X86_VENDOR_HYGON:
-inited = amd_mcheck_init(c, bsp);
+inited = IS_ENABLED(CONFIG_AMD) ? amd_mcheck_init(c, bsp) :
+  mcheck_unset;
 break;
 
 case X86_VENDOR_INTEL:
@@ -769,7 +770,8 @@ void mcheck_init(struct cpuinfo_x86 *c, bool bsp)
 {
 case 6:
 case 15:
-inited = intel_mcheck_init(c, bsp);
+inited = IS_ENABLED(CONFIG_INTEL) ? intel_mcheck_init(c, bsp) :
+mcheck_unset;
 break;
 }
 break;
-- 
2.25.1




[XEN PATCH v1 4/7] x86/MCE: guard lmce_support/cmci_support

2024-04-23 Thread Sergiy Kibrik
Guard access to the Intel-specific lmce_support & cmci_support variables in
common MCE/VMCE code. These are set in Intel-specific parts of the mcheck
code and can potentially be skipped when building for a non-Intel platform
by disabling the CONFIG_INTEL option.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/mce.c  | 4 ++--
 xen/arch/x86/cpu/mcheck/vmce.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 2844685983..72dfaf28cb 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -611,7 +611,7 @@ static void set_poll_bankmask(struct cpuinfo_x86 *c)
 mb = per_cpu(poll_bankmask, cpu);
 BUG_ON(!mb);
 
-if ( cmci_support && opt_mce )
+if ( IS_ENABLED(CONFIG_INTEL) && cmci_support && opt_mce )
 {
 const struct mca_banks *cmci = per_cpu(no_cmci_banks, cpu);
 
@@ -1607,7 +1607,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
 break;
 
 case XEN_MC_INJECT_TYPE_LMCE:
-if ( !lmce_support )
+if ( IS_ENABLED(CONFIG_INTEL) && !lmce_support )
 {
 ret = x86_mcerr("No LMCE support", -EINVAL);
 break;
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index be229684a4..6051ab2b2e 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -546,7 +546,7 @@ int vmce_enable_mca_cap(struct domain *d, uint64_t cap)
 
 if ( cap & XEN_HVM_MCA_CAP_LMCE )
 {
-if ( !lmce_support )
+if ( IS_ENABLED(CONFIG_INTEL) && !lmce_support )
 return -EINVAL;
 for_each_vcpu(d, v)
 v->arch.vmce.mcg_cap |= MCG_LMCE_P;
-- 
2.25.1




[XEN PATCH v1 3/7] x86/MCE: guard access to Intel/AMD-specific MCA MSRs

2024-04-23 Thread Sergiy Kibrik
Add build-time checks for the newly introduced INTEL/AMD config options when
calling the vmce_{intel/amd}_{rdmsr/wrmsr}() routines.
This way platform-specific code can be omitted from the vmce code if that
platform is disabled in the config.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/vmce.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index c437f62c0a..be229684a4 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -141,12 +141,14 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t 
msr, uint64_t *val)
 case X86_VENDOR_CENTAUR:
 case X86_VENDOR_SHANGHAI:
 case X86_VENDOR_INTEL:
-ret = vmce_intel_rdmsr(v, msr, val);
+ret = IS_ENABLED(CONFIG_INTEL) ?
+  vmce_intel_rdmsr(v, msr, val) : -ENODEV;
 break;
 
 case X86_VENDOR_AMD:
 case X86_VENDOR_HYGON:
-ret = vmce_amd_rdmsr(v, msr, val);
+ret = IS_ENABLED(CONFIG_AMD) ?
+  vmce_amd_rdmsr(v, msr, val) : -ENODEV;
 break;
 
 default:
@@ -272,12 +274,14 @@ static int bank_mce_wrmsr(struct vcpu *v, uint32_t msr, 
uint64_t val)
 switch ( boot_cpu_data.x86_vendor )
 {
 case X86_VENDOR_INTEL:
-ret = vmce_intel_wrmsr(v, msr, val);
+ret = IS_ENABLED(CONFIG_INTEL) ?
+  vmce_intel_wrmsr(v, msr, val) : -ENODEV;
 break;
 
 case X86_VENDOR_AMD:
 case X86_VENDOR_HYGON:
-ret = vmce_amd_wrmsr(v, msr, val);
+ret = IS_ENABLED(CONFIG_AMD) ?
+  vmce_amd_wrmsr(v, msr, val) : -ENODEV;
 break;
 
 default:
-- 
2.25.1




[XEN PATCH v1 2/7] x86/intel: guard vmce_has_lmce() with INTEL option

2024-04-23 Thread Sergiy Kibrik
Since the MCG_LMCE_P bit is specific to Intel CPUs, the code checking it can
possibly be excluded from the build if !CONFIG_INTEL. With these guards,
calls to vmce_has_lmce() are eliminated and mce_intel.c can end up
not being built.

Also replace the boilerplate code that checks for the MCG_LMCE_P flag with
vmce_has_lmce(), which might contribute to readability a bit.

Signed-off-by: Sergiy Kibrik 
---
 xen/arch/x86/cpu/mcheck/vmce.c | 4 ++--
 xen/arch/x86/msr.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 353d4f19b2..c437f62c0a 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -199,7 +199,7 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
  * bits are always set in guest MSR_IA32_FEATURE_CONTROL by Xen, so it
  * does not need to check them here.
  */
-if ( cur->arch.vmce.mcg_cap & MCG_LMCE_P )
+if ( IS_ENABLED(CONFIG_INTEL) && vmce_has_lmce(cur) )
 {
 *val = cur->arch.vmce.mcg_ext_ctl;
 mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL %#"PRIx64"\n",
@@ -324,7 +324,7 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
 break;
 
 case MSR_IA32_MCG_EXT_CTL:
-if ( (cur->arch.vmce.mcg_cap & MCG_LMCE_P) &&
+if ( IS_ENABLED(CONFIG_INTEL) && vmce_has_lmce(cur) &&
  !(val & ~MCG_EXT_CTL_LMCE_EN) )
 cur->arch.vmce.mcg_ext_ctl = val;
 else
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 9babd441f9..4010c87d93 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -86,7 +86,7 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
 goto gp_fault;
 
 *val = IA32_FEATURE_CONTROL_LOCK;
-if ( vmce_has_lmce(v) )
+if ( IS_ENABLED(CONFIG_INTEL) && vmce_has_lmce(v) )
 *val |= IA32_FEATURE_CONTROL_LMCE_ON;
 if ( cp->basic.vmx )
 *val |= IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
-- 
2.25.1
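For context, vmce_has_lmce() is presumably a thin wrapper around the very
check being replaced. A sketch under that assumption, with a stand-in for
the relevant slice of struct vcpu:

#include <stdbool.h>
#include <stdint.h>

/* MCG_LMCE_P is bit 27 of MSR_IA32_MCG_CAP (Intel SDM). */
#define MCG_LMCE_P (1ULL << 27)

/* Stand-in for the relevant slice of Xen's struct vcpu. */
struct vcpu { struct { uint64_t mcg_cap; } vmce; };

static inline bool vmce_has_lmce(const struct vcpu *v)
{
    return v->vmce.mcg_cap & MCG_LMCE_P;
}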




[XEN PATCH v1 1/7] x86/vpmu: separate amd/intel vPMU code

2024-04-23 Thread Sergiy Kibrik
Build the AMD vPMU when CONFIG_AMD is on, and the Intel vPMU when CONFIG_INTEL
is on, respectively, allowing for a platform-specific build. Also separate
the arch_vpmu_ops initializers using these options and static inline stubs.

No functional change intended.

Signed-off-by: Sergiy Kibrik 
CC: Andrew Cooper 
CC: Jan Beulich 

---
changes in v1:
 - switch to CONFIG_{AMD,INTEL} instead of CONFIG_{SVM,VMX}


 xen/arch/x86/cpu/Makefile   |  4 +++-
 xen/arch/x86/include/asm/vpmu.h | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/Makefile b/xen/arch/x86/cpu/Makefile
index 35561fe51d..eafce5f204 100644
--- a/xen/arch/x86/cpu/Makefile
+++ b/xen/arch/x86/cpu/Makefile
@@ -10,4 +10,6 @@ obj-y += intel.o
 obj-y += intel_cacheinfo.o
 obj-y += mwait-idle.o
 obj-y += shanghai.o
-obj-y += vpmu.o vpmu_amd.o vpmu_intel.o
+obj-y += vpmu.o
+obj-$(CONFIG_AMD) += vpmu_amd.o
+obj-$(CONFIG_INTEL) += vpmu_intel.o
diff --git a/xen/arch/x86/include/asm/vpmu.h b/xen/arch/x86/include/asm/vpmu.h
index dae9b43dac..e7a8f211f8 100644
--- a/xen/arch/x86/include/asm/vpmu.h
+++ b/xen/arch/x86/include/asm/vpmu.h
@@ -11,6 +11,7 @@
 #define __ASM_X86_HVM_VPMU_H_
 
 #include 
+#include <xen/err.h>
 
 #define vcpu_vpmu(vcpu)   (&(vcpu)->arch.vpmu)
 #define vpmu_vcpu(vpmu)   container_of((vpmu), struct vcpu, arch.vpmu)
@@ -42,9 +43,27 @@ struct arch_vpmu_ops {
 #endif
 };
 
+#ifdef CONFIG_INTEL
 const struct arch_vpmu_ops *core2_vpmu_init(void);
+#else
+static inline const struct arch_vpmu_ops *core2_vpmu_init(void)
+{
+return ERR_PTR(-ENODEV);
+}
+#endif
+#ifdef CONFIG_AMD
 const struct arch_vpmu_ops *amd_vpmu_init(void);
 const struct arch_vpmu_ops *hygon_vpmu_init(void);
+#else
+static inline const struct arch_vpmu_ops *amd_vpmu_init(void)
+{
+return ERR_PTR(-ENODEV);
+}
+static inline const struct arch_vpmu_ops *hygon_vpmu_init(void)
+{
+return ERR_PTR(-ENODEV);
+}
+#endif
 
 struct vpmu_struct {
 u32 flags;
-- 
2.25.1
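The stub pattern above relies on ERR_PTR()-style error encoding, so that
callers keep compiling and linking when a vendor's vPMU is configured out.
A self-contained sketch of the idiom follows; Xen provides its own helpers
for this, so the definitions below are an illustration only:

#include <stdio.h>

#define MAX_ERRNO 4095
#define ENODEV    19

/* Encode a small negative errno in the top of the pointer space. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct arch_vpmu_ops { int dummy; };

/* With CONFIG_INTEL off, the header supplies this stub instead of the
 * real initializer, so common code needs no #ifdef at the call site. */
static inline const struct arch_vpmu_ops *core2_vpmu_init(void)
{
    return ERR_PTR(-ENODEV);
}

int main(void)
{
    const struct arch_vpmu_ops *ops = core2_vpmu_init();

    if ( IS_ERR(ops) )
        printf("vPMU unavailable: error %ld\n", -PTR_ERR(ops));
    return 0;
}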




[XEN PATCH v1 0/7] x86: make Intel/AMD vPMU & MCE support configurable

2024-04-23 Thread Sergiy Kibrik
Here's an attempt to further separate support for Intel & AMD CPUs in the Xen
build. The code to drive both platforms used to be built unconditionally,
until the recent introduction of the CONFIG_{AMD,INTEL} Kconfig options.

This series extends coverage of these options to the vpmu and mcheck
subsystems, i.e. Intel or AMD vpmu/mcheck support is not built if the CPU
vendor's support was explicitly disabled.

Sergiy Kibrik (7):
  x86/vpmu: separate amd/intel vPMU code
  x86/intel: guard vmce_has_lmce() with INTEL option
  x86/MCE: guard access to Intel/AMD-specific MCA MSRs
  x86/MCE: guard lmce_support/cmci_support
  x86/MCE: guard {intel/amd}_mcheck_init() calls
  x86/MCE: guard call to Intel-specific intel_get_extended_msrs()
  x86/MCE: optional build of AMD/Intel MCE code

 xen/arch/x86/cpu/Makefile           |  4 +++-
 xen/arch/x86/cpu/mcheck/Makefile    |  6 ++----
 xen/arch/x86/cpu/mcheck/mce.c       | 13 ++++++++-----
 xen/arch/x86/cpu/mcheck/non-fatal.c |  6 ++++++
 xen/arch/x86/cpu/mcheck/vmce.c      | 18 +++++++++++-------
 xen/arch/x86/cpu/mcheck/vmce.h      |  1 +
 xen/arch/x86/include/asm/vpmu.h     | 19 +++++++++++++++++++
 xen/arch/x86/msr.c                  |  2 +-
 8 files changed, 51 insertions(+), 18 deletions(-)

-- 
2.25.1




[PATCH 5/7] xen/arm: Rework heap page allocation outside allocate_bank_memory

2024-04-23 Thread Luca Fancellu
The function allocate_bank_memory allocates pages from the heap and
maps them to the guest using guest_physmap_add_page.

As preparatory work to support static shared memory banks when the
host physical address is not provided, Xen needs to allocate memory
from the heap, so rework allocate_bank_memory, moving the page
allocation out into a new function called allocate_domheap_memory.

The function allocate_domheap_memory takes a callback function and
a pointer to some extra information passed to the callback; the
callback will be invoked for every allocation, until the requested
size is reached.

In order to keep allocate_bank_memory functionality, the callback
passed to allocate_domheap_memory is a wrapper for
guest_physmap_add_page.

Let allocate_domheap_memory be externally visible, in order to use
it in the future from the static shared memory module.

Take the opportunity to change the signature of allocate_bank_memory
and remove the 'struct domain' parameter, which can be retrieved from
'struct kernel_info'.

No functional change is intended.

Signed-off-by: Luca Fancellu 
---
 xen/arch/arm/dom0less-build.c   |  4 +-
 xen/arch/arm/domain_build.c | 77 +
 xen/arch/arm/include/asm/domain_build.h |  9 ++-
 3 files changed, 62 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 74f053c242f4..20ddf6f8f250 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -60,12 +60,12 @@ static void __init allocate_memory(struct domain *d, struct 
kernel_info *kinfo)
 
 mem->nr_banks = 0;
 bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
-if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
+if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
bank_size) )
 goto fail;
 
 bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
-if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
+if ( !allocate_bank_memory(kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
bank_size) )
 goto fail;
 
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 0784e4c5e315..148db06b8ca3 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -417,26 +417,13 @@ static void __init allocate_memory_11(struct domain *d,
 }
 
 #ifdef CONFIG_DOM0LESS_BOOT
-bool __init allocate_bank_memory(struct domain *d, struct kernel_info *kinfo,
- gfn_t sgfn, paddr_t tot_size)
+bool __init allocate_domheap_memory(struct domain *d, paddr_t tot_size,
+alloc_domheap_mem_cb cb, void *extra)
 {
-struct membanks *mem = kernel_info_get_mem(kinfo);
-int res;
+unsigned int max_order = UINT_MAX;
 struct page_info *pg;
-struct membank *bank;
-unsigned int max_order = ~0;
 
-/*
- * allocate_bank_memory can be called with a tot_size of zero for
- * the second memory bank. It is not an error and we can safely
- * avoid creating a zero-size memory bank.
- */
-if ( tot_size == 0 )
-return true;
-
-bank = &mem->bank[mem->nr_banks];
-bank->start = gfn_to_gaddr(sgfn);
-bank->size = tot_size;
+BUG_ON(!cb);
 
 while ( tot_size > 0 )
 {
@@ -463,17 +450,61 @@ bool __init allocate_bank_memory(struct domain *d, struct 
kernel_info *kinfo,
 continue;
 }
 
-res = guest_physmap_add_page(d, sgfn, page_to_mfn(pg), order);
-if ( res )
-{
-dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
+if ( cb(d, pg, order, extra) )
 return false;
-}
 
-sgfn = gfn_add(sgfn, 1UL << order);
 tot_size -= (1ULL << (PAGE_SHIFT + order));
 }
 
+return true;
+}
+
+static int __init guest_map_pages(struct domain *d, struct page_info *pg,
+  unsigned int order, void *extra)
+{
+gfn_t *sgfn = (gfn_t *)extra;
+int res;
+
+BUG_ON(!sgfn);
+res = guest_physmap_add_page(d, *sgfn, page_to_mfn(pg), order);
+if ( res )
+{
+dprintk(XENLOG_ERR, "Failed map pages to DOMU: %d", res);
+return res;
+}
+
+*sgfn = gfn_add(*sgfn, 1UL << order);
+
+return 0;
+}
+
+bool __init allocate_bank_memory(struct kernel_info *kinfo, gfn_t sgfn,
+ paddr_t tot_size)
+{
+struct membanks *mem = kernel_info_get_mem(kinfo);
+struct domain *d = kinfo->d;
+struct membank *bank;
+
+/*
+ * allocate_bank_memory can be called with a tot_size of zero for
+ * the second memory bank. It is not an error and we can safely
+ * avoid creating a zero-size memory bank.
+ */
+if ( tot_size == 0 )
+return true;
+
+bank = &mem->bank[mem->nr_banks];
+bank->start = gfn_to_gaddr(sgfn);
+bank->size = tot_size;
+
+/*
+ * 

[PATCH 2/7] xen/arm: Wrap shared memory mapping code in one function

2024-04-23 Thread Luca Fancellu
Wrap the code and logic that calls assign_shared_memory and
map_regions_p2mt into a new function 'handle_shared_mem_bank';
it will become useful later, when the code will allow the user to
omit the host physical address.

Signed-off-by: Luca Fancellu 
---
 xen/arch/arm/static-shmem.c | 71 +++--
 1 file changed, 45 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index f6cf74e58a83..24e40495a481 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -185,6 +185,47 @@ append_shm_bank_to_domain(struct shared_meminfo 
*kinfo_shm_mem, paddr_t start,
 return 0;
 }
 
+static int __init handle_shared_mem_bank(struct domain *d, paddr_t gbase,
+ bool owner_dom_io,
+ const char *role_str,
+ const struct membank *shm_bank)
+{
+paddr_t pbase, psize;
+int ret;
+
+BUG_ON(!shm_bank);
+
+pbase = shm_bank->start;
+psize = shm_bank->size;
+/*
+ * DOMID_IO is a fake domain and is not described in the Device-Tree.
+ * Therefore when the owner of the shared region is DOMID_IO, we will
+ * only find the borrowers.
+ */
+if ( (owner_dom_io && !is_shm_allocated_to_domio(pbase)) ||
+ (!owner_dom_io && strcmp(role_str, "owner") == 0) )
+{
+/*
+ * We found the first borrower of the region, the owner was not
+ * specified, so they should be assigned to dom_io.
+ */
+ret = assign_shared_memory(owner_dom_io ? dom_io : d, gbase, shm_bank);
+if ( ret )
+return ret;
+}
+
+if ( owner_dom_io || (strcmp(role_str, "borrower") == 0) )
+{
+/* Set up P2M foreign mapping for borrower domain. */
+ret = map_regions_p2mt(d, _gfn(PFN_UP(gbase)), PFN_DOWN(psize),
+   _mfn(PFN_UP(pbase)), p2m_map_foreign_rw);
+if ( ret )
+return ret;
+}
+
+return 0;
+}
+
 int __init process_shm(struct domain *d, struct kernel_info *kinfo,
const struct dt_device_node *node)
 {
@@ -249,32 +290,10 @@ int __init process_shm(struct domain *d, struct 
kernel_info *kinfo,
if ( dt_property_read_string(shm_node, "role", &role_str) == 0 )
 owner_dom_io = false;
 
-/*
- * DOMID_IO is a fake domain and is not described in the Device-Tree.
- * Therefore when the owner of the shared region is DOMID_IO, we will
- * only find the borrowers.
- */
-if ( (owner_dom_io && !is_shm_allocated_to_domio(pbase)) ||
- (!owner_dom_io && strcmp(role_str, "owner") == 0) )
-{
-/*
- * We found the first borrower of the region, the owner was not
- * specified, so they should be assigned to dom_io.
- */
-ret = assign_shared_memory(owner_dom_io ? dom_io : d, gbase,
-   boot_shm_bank);
-if ( ret )
-return ret;
-}
-
-if ( owner_dom_io || (strcmp(role_str, "borrower") == 0) )
-{
-/* Set up P2M foreign mapping for borrower domain. */
-ret = map_regions_p2mt(d, _gfn(PFN_UP(gbase)), PFN_DOWN(psize),
-   _mfn(PFN_UP(pbase)), p2m_map_foreign_rw);
-if ( ret )
-return ret;
-}
+ret = handle_shared_mem_bank(d, gbase, owner_dom_io, role_str,
+ boot_shm_bank);
+if ( ret )
+return ret;
 
 /*
  * Record static shared memory region info for later setting
-- 
2.34.1




[PATCH 3/7] xen/p2m: put reference for superpage

2024-04-23 Thread Luca Fancellu
From: Penny Zheng 

We are doing foreign memory mapping for static shared memory, and
there is a good chance that it could be super-mapped.
But today, p2m_put_l3_page cannot handle superpages.

This commit implements a new function p2m_put_superpage to handle superpages,
specifically for helping to put extra references for foreign superpages.

Signed-off-by: Penny Zheng 
Signed-off-by: Luca Fancellu 
---
v1:
 - patch from 
https://patchwork.kernel.org/project/xen-devel/patch/20231206090623.1932275-9-penny.zh...@arm.com/
---
 xen/arch/arm/mmu/p2m.c | 58 +++---
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 41fcca011cf4..479a80fbd4cf 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -753,17 +753,9 @@ static int p2m_mem_access_radix_set(struct p2m_domain 
*p2m, gfn_t gfn,
 return rc;
 }
 
-/*
- * Put any references on the single 4K page referenced by pte.
- * TODO: Handle superpages, for now we only take special references for leaf
- * pages (specifically foreign ones, which can't be super mapped today).
- */
-static void p2m_put_l3_page(const lpae_t pte)
+/* Put any references on the single 4K page referenced by mfn. */
+static void p2m_put_l3_page(mfn_t mfn, unsigned type)
 {
-mfn_t mfn = lpae_get_mfn(pte);
-
-ASSERT(p2m_is_valid(pte));
-
 /*
  * TODO: Handle other p2m types
  *
@@ -771,16 +763,53 @@ static void p2m_put_l3_page(const lpae_t pte)
  * flush the TLBs if the page is reallocated before the end of
  * this loop.
  */
-if ( p2m_is_foreign(pte.p2m.type) )
+if ( p2m_is_foreign(type) )
 {
 ASSERT(mfn_valid(mfn));
 put_page(mfn_to_page(mfn));
 }
 /* Detect the xenheap page and mark the stored GFN as invalid. */
-else if ( p2m_is_ram(pte.p2m.type) && is_xen_heap_mfn(mfn) )
+else if ( p2m_is_ram(type) && is_xen_heap_mfn(mfn) )
 page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN);
 }
 
+/* Put any references on the superpage referenced by mfn. */
+static void p2m_put_superpage(mfn_t mfn, unsigned int next_level, unsigned type)
+{
+unsigned int i;
+unsigned int level_order = XEN_PT_LEVEL_ORDER(next_level);
+
+for ( i = 0; i < XEN_PT_LPAE_ENTRIES; i++ )
+{
+if ( next_level == 3 )
+p2m_put_l3_page(mfn, type);
+else
+p2m_put_superpage(mfn, next_level + 1, type);
+
+mfn = mfn_add(mfn, 1 << level_order);
+}
+}
+
+/* Put any references on the page referenced by pte. */
+static void p2m_put_page(const lpae_t pte, unsigned int level)
+{
+mfn_t mfn = lpae_get_mfn(pte);
+
+ASSERT(p2m_is_valid(pte));
+
+/*
+ * We are either having a first level 1G superpage or a
+ * second level 2M superpage.
+ */
+if ( p2m_is_superpage(pte, level) )
+return p2m_put_superpage(mfn, level + 1, pte.p2m.type);
+else
+{
+ASSERT(level == 3);
+return p2m_put_l3_page(mfn, pte.p2m.type);
+}
+}
+
 /* Free lpae sub-tree behind an entry */
 static void p2m_free_entry(struct p2m_domain *p2m,
lpae_t entry, unsigned int level)
@@ -809,9 +838,8 @@ static void p2m_free_entry(struct p2m_domain *p2m,
 #endif
 
 p2m->stats.mappings[level]--;
-/* Nothing to do if the entry is a super-page. */
-if ( level == 3 )
-p2m_put_l3_page(entry);
+p2m_put_page(entry, level);
+
 return;
 }
 
-- 
2.34.1
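For reference, the LPAE geometry assumed by the recursion in
p2m_put_superpage() above (a sketch, not the actual Xen macros): with a
4KB granule every table level holds 512 entries, so a level-3 entry maps
4KB, a level-2 entry 2MB and a level-1 entry 1GB. The loop therefore
visits 512 children per level, stepping the MFN by one child's worth of
4KB pages each iteration:

#define LPAE_SHIFT   9
#define LPAE_ENTRIES (1u << LPAE_SHIFT)   /* 512 entries per level */

/* Order (log2, in 4KB pages) covered by one entry at a given level:
 * level 3 -> 0 (4KB), level 2 -> 9 (2MB), level 1 -> 18 (1GB). */
static unsigned int level_order(unsigned int level)
{
    return (3 - level) * LPAE_SHIFT;
}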




[PATCH 7/7] xen/docs: Describe static shared memory when host address is not provided

2024-04-23 Thread Luca Fancellu
From: Penny Zheng 

This commit describes the new scenario where the host address is not provided
in the "xen,shared-mem" property; a new example is added to the page to
explain it in detail.

Take the occasion to fix some typos in the page.

Signed-off-by: Penny Zheng 
Signed-off-by: Luca Fancellu 
---
v1:
 - patch from 
https://patchwork.kernel.org/project/xen-devel/patch/20231206090623.1932275-10-penny.zh...@arm.com/
   with some changes in the commit message.
---
 docs/misc/arm/device-tree/booting.txt | 52 ---
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index bbd955e9c2f6..ac4bad6fe5e0 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -590,7 +590,7 @@ communication.
 An array takes a physical address, which is the base address of the
 shared memory region in host physical address space, a size, and a guest
 physical address, as the target address of the mapping.
-e.g. xen,shared-mem = < [host physical address] [guest address] [size] >
+e.g. xen,shared-mem = < [host physical address] [guest address] [size] >;
 
 It shall also meet the following criteria:
 1) If the SHM ID matches with an existing region, the address range of the
@@ -601,8 +601,8 @@ communication.
 The number of cells for the host address (and size) is the same as the
 guest pseudo-physical address and they are inherited from the parent node.
 
-Host physical address is optional, when missing Xen decides the location
-(currently unimplemented).
+Host physical address is optional, when missing Xen decides the location.
+e.g. xen,shared-mem = < [guest address] [size] >;
 
 - role (Optional)
 
@@ -629,7 +629,7 @@ chosen {
 role = "owner";
 xen,shm-id = "my-shared-mem-0";
 xen,shared-mem = <0x10000000 0x10000000 0x10000000>;
-}
+};
 
 domU1 {
 compatible = "xen,domain";
@@ -640,25 +640,36 @@ chosen {
 vpl011;
 
 /*
- * shared memory region identified as 0x0(xen,shm-id = <0x0>)
- * is shared between Dom0 and DomU1.
+ * shared memory region "my-shared-mem-0" is shared
+ * between Dom0 and DomU1.
  */
 domU1-shared-mem@10000000 {
 compatible = "xen,domain-shared-memory-v1";
 role = "borrower";
 xen,shm-id = "my-shared-mem-0";
 xen,shared-mem = <0x10000000 0x50000000 0x10000000>;
-}
+};
 
 /*
- * shared memory region identified as 0x1(xen,shm-id = <0x1>)
- * is shared between DomU1 and DomU2.
+ * shared memory region "my-shared-mem-1" is shared between
+ * DomU1 and DomU2.
  */
 domU1-shared-mem@50000000 {
 compatible = "xen,domain-shared-memory-v1";
 xen,shm-id = "my-shared-mem-1";
 xen,shared-mem = <0x50000000 0x60000000 0x20000000>;
-}
+};
+
+/*
+ * shared memory region "my-shared-mem-2" is shared between
+ * DomU1 and DomU2.
+ */
+domU1-shared-mem-2 {
+compatible = "xen,domain-shared-memory-v1";
+xen,shm-id = "my-shared-mem-2";
+role = "owner";
+xen,shared-mem = <0x80000000 0x20000000>;
+};
 
 ..
 
@@ -672,14 +683,21 @@ chosen {
 cpus = <1>;
 
 /*
- * shared memory region identified as 0x1(xen,shm-id = <0x1>)
- * is shared between domU1 and domU2.
+ * shared memory region "my-shared-mem-1" is shared between
+ * domU1 and domU2.
  */
 domU2-shared-mem@50000000 {
 compatible = "xen,domain-shared-memory-v1";
 xen,shm-id = "my-shared-mem-1";
 xen,shared-mem = <0x50000000 0x70000000 0x20000000>;
-}
+};
+
+domU2-shared-mem-2 {
+compatible = "xen,domain-shared-memory-v1";
+xen,shm-id = "my-shared-mem-2";
+role = "borrower";
+xen,shared-mem = <0x90000000 0x20000000>;
+};
 
 ..
 };
@@ -699,3 +717,11 @@ shared between DomU1 and DomU2. It will get mapped at 0x60000000 in DomU1 guest
 physical address space, and at 0x70000000 in DomU2 guest physical address space.
 DomU1 and DomU2 are both the borrower domain, the owner domain is the default
 owner domain DOMID_IO.
+
+For the static shared memory region "my-shared-mem-2", since host physical
+address is not provided by user, Xen will automatically allocate 512MB
+from heap as static shared memory to be shared between DomU1 and DomU2.
+The automatically allocated static shared memory will get mapped at
+0x80000000 in DomU1 guest physical address space, and at 0x90000000 in DomU2
+guest physical address space. DomU1 is explicitly defined as the owner domain,
+and DomU2 is the borrower domain.
-- 
2.34.1




[PATCH 4/7] xen/arm: Parse xen,shared-mem when host phys address is not provided

2024-04-23 Thread Luca Fancellu
Handle the parsing of the 'xen,shared-mem' property when the host physical
address is not provided. This commit introduces the logic to parse it,
but the functionality is still not implemented and will be part of future
commits.

Rework the logic inside process_shm_node to check the shm_id before doing
the other checks, because it eases the logic itself, and add more comments
on the logic.
Now, when the host physical address is not provided, the value
INVALID_PADDR is chosen to signal this condition and it is stored as the
start of the bank. Due to that change, early_print_info_shmem and
init_sharedmem_pages are also changed, so that they don't handle banks
whose start equals INVALID_PADDR.

Another change is done inside meminfo_overlap_check, to skip banks whose
start address is INVALID_PADDR. That function is used to check banks from
reserved memory and ACPI, and it's unlikely for those banks to have
INVALID_PADDR as their start address. The change holds because of this
consideration.

Signed-off-by: Luca Fancellu 
---
 xen/arch/arm/setup.c|   3 +-
 xen/arch/arm/static-shmem.c | 129 +---
 2 files changed, 93 insertions(+), 39 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index d242674381d4..f15d40a85a5f 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -297,7 +297,8 @@ static bool __init meminfo_overlap_check(const struct 
membanks *mem,
 bank_start = mem->bank[i].start;
 bank_end = bank_start + mem->bank[i].size;
 
-if ( region_end <= bank_start || region_start >= bank_end )
+if ( INVALID_PADDR == bank_start || region_end <= bank_start ||
+ region_start >= bank_end )
 continue;
 else
 {
diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index 24e40495a481..1c03bb7f1882 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -264,6 +264,12 @@ int __init process_shm(struct domain *d, struct 
kernel_info *kinfo,
 pbase = boot_shm_bank->start;
 psize = boot_shm_bank->size;
 
+if ( INVALID_PADDR == pbase )
+{
+printk("%pd: host physical address must be chosen by users at the 
moment.", d);
+return -EINVAL;
+}
+
 /*
  * xen,shared-mem = ;
  * TODO: pbase is optional.
@@ -382,7 +388,8 @@ int __init process_shm_node(const void *fdt, int node, 
uint32_t address_cells,
 {
 const struct fdt_property *prop, *prop_id, *prop_role;
 const __be32 *cell;
-paddr_t paddr, gaddr, size, end;
+paddr_t paddr = INVALID_PADDR;
+paddr_t gaddr, size, end;
 struct membanks *mem = bootinfo_get_shmem();
 struct shmem_membank_extra *shmem_extra = bootinfo_get_shmem_extra();
 unsigned int i;
@@ -437,24 +444,37 @@ int __init process_shm_node(const void *fdt, int node, 
uint32_t address_cells,
 if ( !prop )
 return -ENOENT;
 
+cell = (const __be32 *)prop->data;
 if ( len != dt_cells_to_size(address_cells + size_cells + address_cells) )
 {
-if ( len == dt_cells_to_size(size_cells + address_cells) )
-printk("fdt: host physical address must be chosen by users at the 
moment.\n");
-
-printk("fdt: invalid `xen,shared-mem` property.\n");
-return -EINVAL;
+if ( len == dt_cells_to_size(address_cells + size_cells) )
+device_tree_get_reg(&cell, address_cells, size_cells, &gaddr,
+&size);
+else
+{
+printk("fdt: invalid `xen,shared-mem` property.\n");
+return -EINVAL;
+}
 }
+else
+{
+device_tree_get_reg(&cell, address_cells, address_cells, &paddr,
+&gaddr);
+size = dt_next_cell(size_cells, &cell);
 
-cell = (const __be32 *)prop->data;
-device_tree_get_reg(&cell, address_cells, address_cells, &paddr, &gaddr);
-size = dt_next_cell(size_cells, &cell);
+if ( !IS_ALIGNED(paddr, PAGE_SIZE) )
+{
+printk("fdt: physical address 0x%"PRIpaddr" is not suitably 
aligned.\n",
+paddr);
+return -EINVAL;
+}
 
-if ( !IS_ALIGNED(paddr, PAGE_SIZE) )
-{
-printk("fdt: physical address 0x%"PRIpaddr" is not suitably 
aligned.\n",
-   paddr);
-return -EINVAL;
+end = paddr + size;
+if ( end <= paddr )
+{
+printk("fdt: static shared memory region %s overflow\n", shm_id);
+return -EINVAL;
+}
 }
 
 if ( !IS_ALIGNED(gaddr, PAGE_SIZE) )
@@ -476,41 +496,69 @@ int __init process_shm_node(const void *fdt, int node, 
uint32_t address_cells,
 return -EINVAL;
 }
 
-end = paddr + size;
-if ( end <= paddr )
-{
-printk("fdt: static shared memory region %s overflow\n", shm_id);
-return -EINVAL;
-}
-
 for ( i = 0; i < mem->nr_banks; i++ )
 {
 /*
  * Meet the following check:
+ * when 

[PATCH 6/7] xen/arm: Implement the logic for static shared memory from Xen heap

2024-04-23 Thread Luca Fancellu
This commit implements the logic to have the static shared memory banks
allocated from the Xen heap instead of having the host physical address
passed by the user.

When the host physical address is not supplied, the physical memory is
taken from the Xen heap using allocate_domheap_memory. The allocation
needs to occur at the first handled DT node and the allocated banks
need to be saved somewhere, so introduce the 'shm_heap_banks' static
global variable of type 'struct meminfo' that will hold the banks
allocated from the heap. Its field .shmem_extra will be used to point
to the bootinfo shared memory banks' .shmem_extra space, so that there
is no further allocation of memory and every bank in shm_heap_banks
can be safely identified by its shm_id, preserving its traceability
and whether it was allocated or not.

A search into 'shm_heap_banks' will reveal whether the banks were
allocated or not, in case the host address is not passed. The callback
given to allocate_domheap_memory will store the banks in the structure
and map them to the current domain; to do that, some changes to
acquire_shared_memory_bank are made to let it differentiate whether the
bank is from the heap, and if it is, assign_pages is called for every
bank.

When the bank is already allocated, handle_shared_mem_bank is called for
every bank allocated with the corresponding shm_id, and the mappings
are done.

Signed-off-by: Luca Fancellu 
---
 xen/arch/arm/static-shmem.c | 193 +---
 1 file changed, 157 insertions(+), 36 deletions(-)

diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index 1c03bb7f1882..10396ed52ff1 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -9,6 +9,18 @@
 #include 
 #include 
 
+typedef struct {
+struct domain *d;
+paddr_t gbase;
+bool owner_dom_io;
+const char *role_str;
+struct shmem_membank_extra *bank_extra_info;
+} alloc_heap_pages_cb_extra;
+
+static struct meminfo __initdata shm_heap_banks = {
+.common.max_banks = NR_MEM_BANKS
+};
+
 static void __init __maybe_unused build_assertions(void)
 {
 /*
@@ -66,7 +78,8 @@ static bool __init is_shm_allocated_to_domio(paddr_t pbase)
 }
 
 static mfn_t __init acquire_shared_memory_bank(struct domain *d,
-   paddr_t pbase, paddr_t psize)
+   paddr_t pbase, paddr_t psize,
+   bool bank_from_heap)
 {
 mfn_t smfn;
 unsigned long nr_pfns;
@@ -86,19 +99,31 @@ static mfn_t __init acquire_shared_memory_bank(struct 
domain *d,
 d->max_pages += nr_pfns;
 
 smfn = maddr_to_mfn(pbase);
-res = acquire_domstatic_pages(d, smfn, nr_pfns, 0);
+if ( bank_from_heap )
+/*
+ * When host address is not provided, static shared memory is
+ * allocated from heap and shall be assigned to owner domain.
+ */
+res = assign_pages(maddr_to_page(pbase), nr_pfns, d, 0);
+else
+res = acquire_domstatic_pages(d, smfn, nr_pfns, 0);
+
 if ( res )
 {
-printk(XENLOG_ERR
-   "%pd: failed to acquire static memory: %d.\n", d, res);
-d->max_pages -= nr_pfns;
-return INVALID_MFN;
+printk(XENLOG_ERR "%pd: failed to %s static memory: %d.\n", d,
+   bank_from_heap ? "assign" : "acquire", res);
+goto fail;
 }
 
 return smfn;
+
+ fail:
+d->max_pages -= nr_pfns;
+return INVALID_MFN;
 }
 
 static int __init assign_shared_memory(struct domain *d, paddr_t gbase,
+   bool bank_from_heap,
const struct membank *shm_bank)
 {
 mfn_t smfn;
@@ -113,10 +138,7 @@ static int __init assign_shared_memory(struct domain *d, 
paddr_t gbase,
 psize = shm_bank->size;
 nr_borrowers = shm_bank->shmem_extra->nr_shm_borrowers;
 
-printk("%pd: allocate static shared memory BANK 
%#"PRIpaddr"-%#"PRIpaddr".\n",
-   d, pbase, pbase + psize);
-
-smfn = acquire_shared_memory_bank(d, pbase, psize);
+smfn = acquire_shared_memory_bank(d, pbase, psize, bank_from_heap);
 if ( mfn_eq(smfn, INVALID_MFN) )
 return -EINVAL;
 
@@ -188,6 +210,7 @@ append_shm_bank_to_domain(struct shared_meminfo 
*kinfo_shm_mem, paddr_t start,
 static int __init handle_shared_mem_bank(struct domain *d, paddr_t gbase,
  bool owner_dom_io,
  const char *role_str,
+ bool bank_from_heap,
  const struct membank *shm_bank)
 {
 paddr_t pbase, psize;
@@ -197,6 +220,10 @@ static int __init handle_shared_mem_bank(struct domain *d, 
paddr_t gbase,
 
 pbase = shm_bank->start;
 psize = shm_bank->size;
+
+printk("%pd: SHMEM map from %s: mphys 0x%"PRIpaddr" -> gphys 
0x%"PRIpaddr", size 0x%"PRIpaddr"\n",
+   

[PATCH 0/7] Static shared memory followup v2 - pt2

2024-04-23 Thread Luca Fancellu
This series is a partial rework of this other series:
https://patchwork.kernel.org/project/xen-devel/cover/20231206090623.1932275-1-penny.zh...@arm.com/

The original series addresses an issue of the static shared memory feature
that impacts the memory footprint of other components when the feature is
enabled; another issue impacts the device tree generation for the guests when
the feature is enabled and used; and the last one is a missing feature, namely
the option to have a static shared memory region that is not from the host
address space.

This series handles some comments on the original series and splits the
rework in two parts: the first part addresses the memory footprint issue
and the device tree generation, and is currently fully reviewed by Michal
(https://patchwork.kernel.org/project/xen-devel/cover/20240418073652.3622828-1-luca.fance...@arm.com/);
this series addresses the static shared memory allocation from the Xen heap.

This series is meant to be applied on top of:
https://patchwork.kernel.org/project/xen-devel/cover/20240418073652.3622828-1-luca.fance...@arm.com/
where the last patch was amended in favour of:
https://patchwork.kernel.org/project/xen-devel/patch/20240422110207.204968-1-luca.fance...@arm.com/

Luca Fancellu (5):
  xen/arm: Lookup bootinfo shm bank during the mapping
  xen/arm: Wrap shared memory mapping code in one function
  xen/arm: Parse xen,shared-mem when host phys address is not provided
  xen/arm: Rework heap page allocation outside allocate_bank_memory
  xen/arm: Implement the logic for static shared memory from Xen heap

Penny Zheng (2):
  xen/p2m: put reference for superpage
  xen/docs: Describe static shared memory when host address is not
provided

 docs/misc/arm/device-tree/booting.txt   |  52 ++-
 xen/arch/arm/dom0less-build.c   |   4 +-
 xen/arch/arm/domain_build.c |  77 +++--
 xen/arch/arm/include/asm/domain_build.h |   9 +-
 xen/arch/arm/mmu/p2m.c  |  58 +++-
 xen/arch/arm/setup.c|   3 +-
 xen/arch/arm/static-shmem.c | 430 +---
 7 files changed, 463 insertions(+), 170 deletions(-)

-- 
2.34.1




[PATCH 1/7] xen/arm: Lookup bootinfo shm bank during the mapping

2024-04-23 Thread Luca Fancellu
The current static shared memory code uses the bootinfo banks when it
needs to find the number of borrowers, so every time assign_shared_memory
is called, the bank is searched in the bootinfo.shmem structure.

There is nothing wrong with that, however the bank can also be used to
retrieve the start address and size, and to pass fewer arguments to
assign_shared_memory. When retrieving the information from the bootinfo
bank, it's also possible to move the checks on alignment to
process_shm_node in the early stages.

So create a new function find_shm() which takes a 'struct membanks'
pointer and the shared memory ID and looks for a bank with a matching ID,
take the physical host address and size from the bank, pass the bank to
assign_shared_memory() removing the now unnecessary arguments, and
finally remove the acquire_nr_borrower_domain() function since the
number of borrowers can now be extracted from the passed bank.
Also move the "xen,shm-id" parsing early in process_shm to bail out
quickly in case of (unlikely) errors and, as said above, move the checks
on alignment to process_shm_node.

A drawback of this change is that the bootinfo is now used also when the
bank doesn't need to be allocated; however it will be convenient later
to use the bank as an argument for assign_shared_memory when dealing with
the use case where the host physical address is not supplied by the user.
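
For illustration, the caller side in process_shm then looks roughly like
this (a hypothetical fragment reconstructed from the hunks below, with
error handling abbreviated):

    const struct membank *boot_shm_bank;

    boot_shm_bank = find_shm(bootinfo_get_shmem(), shm_id);
    if ( !boot_shm_bank )
        return -ENOENT;

    /* The bank carries pbase/psize and the number of borrowers. */
    ret = assign_shared_memory(d, gbase, boot_shm_bank);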

Signed-off-by: Luca Fancellu 
---
 xen/arch/arm/static-shmem.c | 105 
 1 file changed, 58 insertions(+), 47 deletions(-)

diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index 09f474ec6050..f6cf74e58a83 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -19,29 +19,24 @@ static void __init __maybe_unused build_assertions(void)
  offsetof(struct shared_meminfo, bank)));
 }
 
-static int __init acquire_nr_borrower_domain(struct domain *d,
- paddr_t pbase, paddr_t psize,
- unsigned long *nr_borrowers)
+static const struct membank __init *find_shm(const struct membanks *shmem,
+ const char *shm_id)
 {
-const struct membanks *shmem = bootinfo_get_shmem();
 unsigned int bank;
 
-/* Iterate reserved memory to find requested shm bank. */
+BUG_ON(!shmem || !shm_id);
+
 for ( bank = 0 ; bank < shmem->nr_banks; bank++ )
 {
-paddr_t bank_start = shmem->bank[bank].start;
-paddr_t bank_size = shmem->bank[bank].size;
-
-if ( (pbase == bank_start) && (psize == bank_size) )
+if ( strncmp(shm_id, shmem->bank[bank].shmem_extra->shm_id,
+ MAX_SHM_ID_LENGTH) == 0 )
 break;
 }
 
 if ( bank == shmem->nr_banks )
-return -ENOENT;
-
-*nr_borrowers = shmem->bank[bank].shmem_extra->nr_shm_borrowers;
+return NULL;
 
-return 0;
+return &shmem->bank[bank];
 }
 
 /*
@@ -103,14 +98,20 @@ static mfn_t __init acquire_shared_memory_bank(struct domain *d,
 return smfn;
 }
 
-static int __init assign_shared_memory(struct domain *d,
-   paddr_t pbase, paddr_t psize,
-   paddr_t gbase)
+static int __init assign_shared_memory(struct domain *d, paddr_t gbase,
+   const struct membank *shm_bank)
 {
 mfn_t smfn;
 int ret = 0;
 unsigned long nr_pages, nr_borrowers, i;
 struct page_info *page;
+paddr_t pbase, psize;
+
+BUG_ON(!shm_bank || !shm_bank->shmem_extra);
+
+pbase = shm_bank->start;
+psize = shm_bank->size;
+nr_borrowers = shm_bank->shmem_extra->nr_shm_borrowers;
 
 printk("%pd: allocate static shared memory BANK 
%#"PRIpaddr"-%#"PRIpaddr".\n",
d, pbase, pbase + psize);
@@ -135,14 +136,6 @@ static int __init assign_shared_memory(struct domain *d,
 }
 }
 
-/*
- * Get the right amount of references per page, which is the number of
- * borrower domains.
- */
-ret = acquire_nr_borrower_domain(d, pbase, psize, &nr_borrowers);
-if ( ret )
-return ret;
-
 /*
  * Instead of letting borrower domain get a page ref, we add as many
  * additional reference as the number of borrowers when the owner
@@ -199,6 +192,7 @@ int __init process_shm(struct domain *d, struct kernel_info *kinfo,
 
 dt_for_each_child_node(node, shm_node)
 {
+const struct membank *boot_shm_bank;
 const struct dt_property *prop;
 const __be32 *cells;
 uint32_t addr_cells, size_cells;
@@ -212,6 +206,23 @@ int __init process_shm(struct domain *d, struct kernel_info *kinfo,
 if ( !dt_device_is_compatible(shm_node, "xen,domain-shared-memory-v1") )
 continue;
 
+if ( dt_property_read_string(shm_node, "xen,shm-id", &shm_id) )
+{
+printk("%pd: invalid \"xen,shm-id\" property", d);
+

Re: [PATCH] x86/MTRR: correct inadvertently inverted WC check

2024-04-23 Thread Roger Pau Monné
On Tue, Apr 23, 2024 at 09:51:46AM +0200, Jan Beulich wrote:
> The ! clearly got lost by mistake.
> 
> Fixes: e9e0eb30d4d6 ("x86/MTRR: avoid several indirect calls")
> Reported-by: Marek Marczykowski-Górecki 
> Signed-off-by: Jan Beulich 

Acked-by: Roger Pau Monné 

Thanks, Roger.



Re: [PATCH] x86/rtc: Avoid UIP flag being set for longer than expected

2024-04-23 Thread Jan Beulich
On 22.04.2024 18:38, Ross Lagerwall wrote:
> On Thu, Apr 18, 2024 at 2:24 PM Jan Beulich  wrote:
>> On 08.04.2024 18:26, Ross Lagerwall wrote:
>>> --- a/xen/arch/x86/hvm/rtc.c
>>> +++ b/xen/arch/x86/hvm/rtc.c
>>> @@ -202,6 +202,7 @@ static void check_update_timer(RTCState *s)
>>>  }
>>>  else
>>>  {
>>> +s->hw.cmos_data[RTC_REG_A] &= ~RTC_UIP;
>>>  next_update_time = (USEC_PER_SEC - guest_usec - 244) * NS_PER_USEC;
>>>  expire_time = NOW() + next_update_time;
>>>  s->next_update_time = expire_time;
>>
>> Is rtc_update_timer2() clearing UIP then still necessary, ahead of calling
>> here? Hmm, yes, it is; the question is rather why the function calls
>> check_update_timer() when all that'll do (due to UF now being set) is stop
>> the other timer (in case it's also running) and clear ->use_timer.
> 
> Are you suggesting something like this?
> 
> diff --git a/xen/arch/x86/hvm/rtc.c b/xen/arch/x86/hvm/rtc.c
> index 4bb1c7505534..eb4901bf129a 100644
> --- a/xen/arch/x86/hvm/rtc.c
> +++ b/xen/arch/x86/hvm/rtc.c
> @@ -240,7 +240,8 @@ static void cf_check rtc_update_timer2(void *opaque)
>  s->hw.cmos_data[RTC_REG_C] |= RTC_UF;
>  s->hw.cmos_data[RTC_REG_A] &= ~RTC_UIP;
>  rtc_update_irq(s);
> -check_update_timer(s);
> +stop_timer(&s->update_timer);
> +s->use_timer = 0;
>  }
>  spin_unlock(&s->lock);
>  }
> 
> That may indeed be an improvement but I'm not sure it is really related
> to this patch?

Well, the connection is the initial question of this part of my earlier
reply. That's not to say the further change needs (or even wants) doing
right here. However, upon looking again I get the impression that I was
mixing up timer and timer2. The code path you alter is one where timer
is set, not timer2.

Overall:
Reviewed-by: Jan Beulich 

Jan



Re: [PATCH] x86/MTRR: avoid several indirect calls

2024-04-23 Thread Jan Beulich
On 23.04.2024 03:17, Marek Marczykowski-Górecki wrote:
> On Wed, Jan 17, 2024 at 10:32:53AM +0100, Jan Beulich wrote:
>> --- a/xen/arch/x86/cpu/mtrr/main.c
>> +++ b/xen/arch/x86/cpu/mtrr/main.c
>> @@ -328,7 +316,7 @@ int mtrr_add_page(unsigned long base, un
>>  }
>>  
>>  /*  If the type is WC, check that this processor supports it  */
>> -if ((type == X86_MT_WC) && !have_wrcomb()) {
>> +if ((type == X86_MT_WC) && mtrr_have_wrcomb()) {
> 
> Is reversing the condition intentional here? 

Clearly not. Patch sent.

Jan




[PATCH] x86/MTRR: correct inadvertently inverted WC check

2024-04-23 Thread Jan Beulich
The ! clearly got lost by mistake.

Fixes: e9e0eb30d4d6 ("x86/MTRR: avoid several indirect calls")
Reported-by: Marek Marczykowski-Górecki 
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/cpu/mtrr/main.c
+++ b/xen/arch/x86/cpu/mtrr/main.c
@@ -316,7 +316,7 @@ int mtrr_add_page(unsigned long base, un
}
 
/*  If the type is WC, check that this processor supports it  */
-   if ((type == X86_MT_WC) && mtrr_have_wrcomb()) {
+   if ((type == X86_MT_WC) && !mtrr_have_wrcomb()) {
printk(KERN_WARNING
   "mtrr: your processor doesn't support 
write-combining\n");
return -EOPNOTSUPP;



[linux-linus test] 185763: regressions - FAIL

2024-04-23 Thread osstest service owner
flight 185763 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/185763/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf                   6 xen-build    fail REGR. vs. 185750

Tests which did not succeed, but are not blocking:
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-examine  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-qcow2 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-raw   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop    fail like 185750
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop    fail like 185750
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop    fail like 185750
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop    fail like 185750
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 185750
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check    fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check    fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check    fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check    fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check    fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check    fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check    fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check    fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check    fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check    fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check    fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check    fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check    fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check    fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check    fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check    fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check    fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check    fail never pass

version targeted for testing:
 linux    4d2008430ce87061c9cefd4f83daf2d5bb323a96
baseline version:
 linux    977b1ef51866aa7170409af80740788d4f9c4841

Last test of basis   185750  2024-04-21 12:13:57 Z   1 days
Failing since        185753  2024-04-21 22:44:09 Z   1 days   3 attempts
Testing same since   185763  2024-04-22 19:14:49 Z   0 days   1 attempts


People who touched revisions under test:
  Alan Stern 
  Alexander Usyskin 
  Andy Shevchenko 
  Andy Shevchenko 
  AngeloGioacchino Del Regno 
  bolan wang 
  Borislav Petkov (AMD) 
  Carlos Llamas 
  Christian A. Ehrhardt 
  Chuanhong Guo 
  Coia Prant 
  Dan Carpenter 
  Daniele Palmas 
  Dave Hansen 
  Dave Hansen  # for x86
  Emil Kronborg 
  Eric Biggers 
  Fabio Estevam 
  Finn Thain 
  Georgi Djakov 
  Gil Fine 
  Greg Kroah-Hartman 
  H. Peter Anvin (Intel) 
  Hans de Goede 
  Hou Wenlong 
  Ingo Molnar 
  Jerry Meng 
  Johan Hovold 
  Jonathan Corbet 
  Josh Poimboeuf 
  Kai-Heng Feng 
  Klara Modin 
  Konrad Dybcio 
  Krzysztof Kozlowski 
  Kyle Tso 
  Linus Torvalds 
  Mathias Nyman 
  Mathieu Desnoyers 
  Matthias Kaehlcke 
  Michael Ellerman 
  Mika Westerberg 
  Mike Tipton 
  Minas Harutyunyan 
  Nikita Zhandarovich 
  Norihiko Hama 
  Oliver Neukum 
  Paul Cercueil 
  Peter Collingbourne 
  Randy Dunlap 
  Richard Genoud 
  Ricky Wu 
  Sakari Ailus 
  Samuel Thibault 
  Tejun Heo 
  Thinh Nguyen 
  Thorsten Leemhuis 
  Todd Kjos 
  Tomas Winkler 
  Tony Lindgren 
  Uwe Kleine-König 
  Vanillan Wang 
  Wentong 

Re: [RFC PATCH v3 2/3] x86/mm: Do not validate/devalidate PGT_none type

2024-04-23 Thread Jan Beulich
On 31.10.2023 17:22, Matias Ezequiel Vara Larsen wrote:
> This commit prevents PGT_none type pages to be validated/devalidated.

This isn't quite true. It is rather the case that (de)validation of this
type is trivial, and hence can be short-circuited just like is done for
certain other types.

> This is required for the use-case in which a guest-agnostic resource is
> allocated. In this case, these pages may be accessible by an "owning" PV
> domain. To lock the page from guest pov, pages are required to be marked
> with PGT_none with a single type ref. In particular, this commit makes
> the stats_table resource type to use this flag during
> get_page_and_type(). 

Imo the change here wants to come first, so that stats_vcpu_alloc_mfn()
can use PGT_none right away (rather than being transiently broken).

Beyond that I think justification needs extending here. In particular,
"lock the page" is at best misleading, considering page_lock() and
PGT_locked that we have. What you mean is "lock the page read-only". You
may want to further refer to the place where this is also done
(share_xen_page_with_guest()), just without get_page_and_type(). This is
not the least to explain why share_xen_page_with_guest() cannot be used
for your purpose, or why doing it this way is a better approach
(potentially then wanting to be used elsewhere as well).
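
For reference, the pattern being discussed is roughly the following
(a hypothetical sketch, not the patch itself):

    /*
     * Pin the page to PGT_none with a single type reference.  With its
     * type held as PGT_none the page cannot be validated as any other
     * type, i.e. it is locked read-only from the guest's point of view.
     */
    if ( !get_page_and_type(page, d, PGT_none) )
        return -EACCES;

    /* ... expose the page via the resource ... */

    /* On teardown, drop the type reference again. */
    put_page_and_type(page);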

Jan


