[PATCH 3/3] target/ppc/cpu_init: Remove "PowerPC" prefix from the CPU list

2024-04-19 Thread Thomas Huth
Printing a "PowerPC" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
and add an "Available CPUs:" line at the very top, like most other
target architectures already do for their CPU help output.

Signed-off-by: Thomas Huth 
---
 target/ppc/cpu_init.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 6241de62ce..8b932dccfb 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -7063,7 +7063,7 @@ static void ppc_cpu_list_entry(gpointer data, gpointer user_data)
 }
 
 name = cpu_model_from_type(typename);
-qemu_printf("PowerPC %-16s PVR %08x\n", name, pcc->pvr);
+qemu_printf("  %-16s PVR %08x\n", name, pcc->pvr);
 for (i = 0; ppc_cpu_aliases[i].alias != NULL; i++) {
 PowerPCCPUAlias *alias = &ppc_cpu_aliases[i];
 ObjectClass *alias_oc = ppc_cpu_class_by_name(alias->model);
@@ -7076,10 +7076,10 @@ static void ppc_cpu_list_entry(gpointer data, gpointer user_data)
  * avoid printing the wrong alias here and use "preferred" instead
  */
 if (strcmp(alias->alias, family->desc) == 0) {
-qemu_printf("PowerPC %-16s (alias for preferred %s CPU)\n",
+qemu_printf("  %-16s (alias for preferred %s CPU)\n",
 alias->alias, family->desc);
 } else {
-qemu_printf("PowerPC %-16s (alias for %s)\n",
+qemu_printf("  %-16s (alias for %s)\n",
 alias->alias, name);
 }
 }
@@ -7090,6 +7090,7 @@ void ppc_cpu_list(void)
 {
 GSList *list;
 
+qemu_printf("Available CPUs:\n");
 list = object_class_get_list(TYPE_POWERPC_CPU, false);
 list = g_slist_sort(list, ppc_cpu_list_compare);
 g_slist_foreach(list, ppc_cpu_list_entry, NULL);
@@ -7097,7 +7098,7 @@ void ppc_cpu_list(void)
 
 #ifdef CONFIG_KVM
 qemu_printf("\n");
-qemu_printf("PowerPC %s\n", "host");
+qemu_printf("  %s\n", "host");
 #endif
 }
 
-- 
2.44.0
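
For readers unfamiliar with the format string used in the new output, here is a minimal sketch of what "  %-16s PVR %08x\n" produces. The helper name format_cpu_entry and the CPU name and PVR value used with it are hypothetical illustrations, not QEMU code; the real patch calls qemu_printf() directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of QEMU: formats one CPU list entry into
 * buf using the layout from the patch (two-space indent, name left-justified
 * and padded to 16 columns, PVR printed as 8 zero-padded hex digits).
 * Returns the number of characters written, as snprintf() does.
 */
static int format_cpu_entry(char *buf, size_t len, const char *name,
                            uint32_t pvr)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "  %-16s PVR %08x\n", name, (unsigned int)pvr);
}
```

Calling format_cpu_entry(buf, sizeof(buf), "examplecpu", 0x1234) yields a line that starts with two spaces followed by the bare name, so what the user sees in the list is exactly what they would pass to "-cpu".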




[PATCH 1/3] target/i386/cpu: Remove "x86" prefix from the CPU list

2024-04-19 Thread Thomas Huth
Printing an "x86" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now and use two spaces at the
beginning of the lines for the indentation of the entries instead,
like most other target architectures already do for their CPU help
output.

Signed-off-by: Thomas Huth 
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 33760a2ee1..fd46e264a2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5572,7 +5572,7 @@ static void x86_cpu_list_entry(gpointer data, gpointer user_data)
 desc = g_strdup_printf("%s (deprecated)", olddesc);
 }
 
-qemu_printf("x86 %-20s  %s\n", name, desc);
+qemu_printf("  %-20s  %s\n", name, desc);
 }
 
 /* list available CPU models and flags */
-- 
2.44.0




[PATCH 0/3] Remove useless architecture prefix from the CPU list

2024-04-19 Thread Thomas Huth
Printing an architecture prefix in front of each CPU name is not helpful
at all: It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and it also
takes some precious space in the dense output of the CPU entries. Let's
simply remove those now.

Thomas Huth (3):
  target/i386/cpu: Remove "x86" prefix from the CPU list
  target/s390x/cpu_models: Rework the output of "-cpu help"
  target/ppc/cpu_init: Remove "PowerPC" prefix from the CPU list

 target/i386/cpu.c         | 2 +-
 target/ppc/cpu_init.c     | 9 +++++----
 target/s390x/cpu_models.c | 9 +++++----
 3 files changed, 11 insertions(+), 9 deletions(-)

-- 
2.44.0




[PATCH 2/3] target/s390x/cpu_models: Rework the output of "-cpu help"

2024-04-19 Thread Thomas Huth
Printing an "s390x" in front of each CPU name is not helpful at all:
It is confusing for the users since they don't know whether they
have to specify these letters for the "-cpu" parameter, too, and
it also takes some precious space in the dense output of the CPU
entries. Let's simply remove this now!

While we're at it, use two spaces at the beginning of the lines for
the indentation of the entries, and add an "Available CPUs:" line at
the very top, like most other target architectures already do for
their "-cpu help" output.

Signed-off-by: Thomas Huth 
---
 target/s390x/cpu_models.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 8ed3bb6a27..58c58f05a0 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -355,9 +355,9 @@ static void s390_print_cpu_model_list_entry(gpointer data, gpointer user_data)
 /* strip off the -s390x-cpu */
 g_strrstr(name, "-" TYPE_S390_CPU)[0] = 0;
 if (details->len) {
-qemu_printf("s390 %-15s %-35s (%s)\n", name, scc->desc, details->str);
+qemu_printf("  %-15s %-35s (%s)\n", name, scc->desc, details->str);
 } else {
-qemu_printf("s390 %-15s %-35s\n", name, scc->desc);
+qemu_printf("  %-15s %-35s\n", name, scc->desc);
 }
 g_free(name);
 }
@@ -402,6 +402,7 @@ void s390_cpu_list(void)
 S390Feat feat;
 GSList *list;
 
+qemu_printf("Available CPUs:\n");
 list = object_class_get_list(TYPE_S390_CPU, false);
 list = g_slist_sort(list, s390_cpu_list_compare);
 g_slist_foreach(list, s390_print_cpu_model_list_entry, NULL);
@@ -411,14 +412,14 @@ void s390_cpu_list(void)
 for (feat = 0; feat < S390_FEAT_MAX; feat++) {
 const S390FeatDef *def = s390_feat_def(feat);
 
-qemu_printf("%-20s %s\n", def->name, def->desc);
+qemu_printf("  %-20s %s\n", def->name, def->desc);
 }
 
 qemu_printf("\nRecognized feature groups:\n");
 for (group = 0; group < S390_FEAT_GROUP_MAX; group++) {
 const S390FeatGroupDef *def = s390_feat_group_def(group);
 
-qemu_printf("%-20s %s\n", def->name, def->desc);
+qemu_printf("  %-20s %s\n", def->name, def->desc);
 }
 }
 
-- 
2.44.0




Re: [PATCH 4/5] docs/system/target-sparc: Improve the Sparc documentation

2024-04-19 Thread Mark Cave-Ayland

On 20/04/2024 00:14, Brad Smith wrote:


On 2024-04-18 4:27 p.m., Mark Cave-Ayland wrote:

On 07/03/2024 17:43, Thomas Huth wrote:


Add some words about how to enable or disable boolean features,
and remove the note about a Linux kernel being available on the
QEMU website (they have been removed long ago already).

Signed-off-by: Thomas Huth 
---
  docs/system/target-sparc.rst | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/system/target-sparc.rst b/docs/system/target-sparc.rst
index 9ec8c90c14..9f418b9d3e 100644
--- a/docs/system/target-sparc.rst
+++ b/docs/system/target-sparc.rst
@@ -27,6 +27,11 @@ architecture machines:
  The emulation is somewhat complete. SMP up to 16 CPUs is supported, but
  Linux limits the number of usable CPUs to 4.
  +The list of available CPUs can be viewed by starting QEMU with ``-cpu help``.
+Optional boolean features can be added with a "+" in front of the feature name,
+or disabled with a "-" in front of the name, for example
+``-cpu TI-SuperSparc-II,+float128``.
+
  QEMU emulates the following sun4m peripherals:
    -  IOMMU
@@ -55,8 +60,7 @@ OpenBIOS is a free (GPL v2) portable firmware implementation. The goal

  is to implement a 100% IEEE 1275-1994 (referred to as Open Firmware)
  compliant firmware.
  -A sample Linux 2.6 series kernel and ram disk image are available on the
-QEMU web site. There are still issues with NetBSD and OpenBSD, but most
+There are still issues with NetBSD and OpenBSD, but most
  kernel versions work. Please note that currently older Solaris kernels
  don't work probably due to interface issues between OpenBIOS and
  Solaris.


Just curious as to what current issues exist with NetBSD and OpenBSD? At least both 
my NetBSD and OpenBSD test images survive a casual boot test here with latest git.


I was just trying OpenBSD/sparc64 with 8.2 recently and found hme(4) does
not work. I tried with the NE2k driver as I remember adding the driver to the
OpenBSD kernel before an hme driver existed and it sort of worked, but there
were still issues.

I'll re-test with 9 now and see what happens.


Thanks for the update: my local tests for SPARC changes are boot tests, so it's 
entirely possible I could miss an issue with the hme device.


Feel free to open a GitLab issue with the relevant information and I'll take a look 
as time allows.



ATB,

Mark.




Re: [PATCH] target/i386/translate.c: always write 32-bits for SGDT and SIDT

2024-04-19 Thread Mark Cave-Ayland

On 20/04/2024 02:21, Richard Henderson wrote:


On 4/19/24 12:51, Mark Cave-Ayland wrote:

The various Intel CPU manuals claim that SGDT and SIDT can write either 24 bits
or 32 bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.

In fact, tests on real hardware show that the CPU always writes 32 bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11
with Win32s to function correctly. Remove the masking applied due to the operand
size for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.

Signed-off-by: Mark Cave-Ayland 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198

--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
---
  target/i386/tcg/translate.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76a42c679c..3026eb6774 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
  gen_op_st_v(s, MO_16, s->T0, s->A0);
  gen_add_A0_im(s, 2);
  tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base));
-    if (dflag == MO_16) {
-    tcg_gen_andi_tl(s->T0, s->T0, 0xffffff);
-    }
+    /*
+ * NB: Despite claims to the contrary in Intel CPU documentation,
+ * all 32-bits are written regardless of operand size.
+ */


Current documentation agrees that all 32 bits are written, so I don't think you need 
this comment:


Ah that's good to know the docs are now correct. I added the comment as there was a 
lot of conflicting information around for older CPUs so I thought it was worth an 
explicit mention.


If everyone agrees a version without comments is preferable, I can re-send an updated 
version without them included.



   IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility Mode *)
     THEN
   DEST[0:15] := GDTR(Limit);
   DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *)
   FI;


Anyway,
Reviewed-by: Richard Henderson 


Thanks!


ATB,

Mark.




[PATCH v9 2/6] ui/console: new dmabuf.h and dmabuf.c for QemuDmaBuf struct and helpers

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

New header and source files are added for containing QemuDmaBuf struct
definition and newly introduced helpers for creating/freeing the struct
and accessing its data.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/console.h |  20 +
 include/ui/dmabuf.h  |  81 +
 ui/dmabuf.c  | 206 +++
 ui/meson.build   |   1 +
 4 files changed, 289 insertions(+), 19 deletions(-)
 create mode 100644 include/ui/dmabuf.h
 create mode 100644 ui/dmabuf.c

diff --git a/include/ui/console.h b/include/ui/console.h
index 0bc7a00ac0..a208a68b88 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -7,6 +7,7 @@
 #include "qapi/qapi-types-ui.h"
 #include "ui/input.h"
 #include "ui/surface.h"
+#include "ui/dmabuf.h"
 
 #define TYPE_QEMU_CONSOLE "qemu-console"
 OBJECT_DECLARE_TYPE(QemuConsole, QemuConsoleClass, QEMU_CONSOLE)
@@ -185,25 +186,6 @@ struct QEMUGLParams {
 int minor_ver;
 };
 
-typedef struct QemuDmaBuf {
-int   fd;
-uint32_t  width;
-uint32_t  height;
-uint32_t  stride;
-uint32_t  fourcc;
-uint64_t  modifier;
-uint32_t  texture;
-uint32_t  x;
-uint32_t  y;
-uint32_t  backing_width;
-uint32_t  backing_height;
-bool  y0_top;
-void  *sync;
-int   fence_fd;
-bool  allow_fences;
-bool  draw_submitted;
-} QemuDmaBuf;
-
 enum display_scanout {
 SCANOUT_NONE,
 SCANOUT_SURFACE,
diff --git a/include/ui/dmabuf.h b/include/ui/dmabuf.h
new file mode 100644
index 0000000000..d01bddf523
--- /dev/null
+++ b/include/ui/dmabuf.h
@@ -0,0 +1,81 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * QemuDmaBuf struct and helpers used for accessing its data
+ *
+ * Copyright (c) 2024 Dongwon Kim 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef DMABUF_H
+#define DMABUF_H
+
+typedef struct QemuDmaBuf {
+int   fd;
+uint32_t  width;
+uint32_t  height;
+uint32_t  stride;
+uint32_t  fourcc;
+uint64_t  modifier;
+uint32_t  texture;
+uint32_t  x;
+uint32_t  y;
+uint32_t  backing_width;
+uint32_t  backing_height;
+bool  y0_top;
+void  *sync;
+int   fence_fd;
+bool  allow_fences;
+bool  draw_submitted;
+} QemuDmaBuf;
+
+QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
+   uint32_t stride, uint32_t x,
+   uint32_t y, uint32_t backing_width,
+   uint32_t backing_height, uint32_t fourcc,
+   uint64_t modifier, int32_t dmabuf_fd,
+   bool allow_fences, bool y0_top);
+void qemu_dmabuf_free(QemuDmaBuf *dmabuf);
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(QemuDmaBuf, qemu_dmabuf_free);
+
+int32_t qemu_dmabuf_get_fd(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_width(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_height(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_stride(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_fourcc(QemuDmaBuf *dmabuf);
+uint64_t qemu_dmabuf_get_modifier(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_texture(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_x(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_y(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_backing_width(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_backing_height(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_y0_top(QemuDmaBuf *dmabuf);
+void *qemu_dmabuf_get_sync(QemuDmaBuf *dmabuf);
+int32_t qemu_dmabuf_get_fence_fd(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_allow_fences(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_draw_submitted(QemuDmaBuf *dmabuf);
+void qemu_dmabuf_set_texture(QemuDmaBuf *dmabuf, uint32_t texture);
+void qemu_dmabuf_set_fence_fd(QemuDmaBuf *dmabuf, int32_t fence_fd);
+void qemu_

[PATCH v9 3/6] ui/console: Use qemu_dmabuf_get_..() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit updates all instances where fields within the QemuDmaBuf
struct are directly accessed, replacing them with calls to these new
helper functions.

v6: fix typos in helper names in ui/spice-display.c

v7: removed prefix, "dpy_gl_" from all helpers

v8: Introduction of helpers was removed as those were already added
by the previous commit

Suggested-by: Marc-André Lureau 
Reviewed-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 hw/display/vhost-user-gpu.c |  6 ++--
 hw/display/virtio-gpu-udmabuf.c |  7 +++--
 hw/vfio/display.c   | 15 +++---
 ui/console.c|  4 +--
 ui/dbus-console.c   |  9 --
 ui/dbus-listener.c  | 43 +---
 ui/egl-headless.c   | 23 ++-
 ui/egl-helpers.c| 47 ++-
 ui/gtk-egl.c| 48 ---
 ui/gtk-gl-area.c| 37 
 ui/gtk.c|  6 ++--
 ui/spice-display.c  | 50 +++--
 12 files changed, 187 insertions(+), 108 deletions(-)

diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 709c8a02a1..ea9a6c5d10 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -249,6 +249,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 case VHOST_USER_GPU_DMABUF_SCANOUT: {
 VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
 int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
+int old_fd;
 QemuDmaBuf *dmabuf;
 
 if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
@@ -262,8 +263,9 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 g->parent_obj.enable = 1;
 con = g->parent_obj.scanout[m->scanout_id].con;
 dmabuf = &g->dmabuf[m->scanout_id];
-if (dmabuf->fd >= 0) {
-close(dmabuf->fd);
+old_fd = qemu_dmabuf_get_fd(dmabuf);
+if (old_fd >= 0) {
+close(old_fd);
 dmabuf->fd = -1;
 }
 dpy_gl_release_dmabuf(con, dmabuf);
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
index d51184d658..c90eba281e 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gpu-udmabuf.c
@@ -206,6 +206,7 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 {
 struct virtio_gpu_scanout *scanout = &g->parent_obj.scanout[scanout_id];
 VGPUDMABuf *new_primary, *old_primary = NULL;
+uint32_t width, height;
 
 new_primary = virtio_gpu_create_dmabuf(g, scanout_id, res, fb, r);
 if (!new_primary) {
@@ -216,10 +217,10 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 old_primary = g->dmabuf.primary[scanout_id];
 }
 
+width = qemu_dmabuf_get_width(&new_primary->buf);
+height = qemu_dmabuf_get_height(&new_primary->buf);
 g->dmabuf.primary[scanout_id] = new_primary;
-qemu_console_resize(scanout->con,
-new_primary->buf.width,
-new_primary->buf.height);
+qemu_console_resize(scanout->con, width, height);
 dpy_gl_scanout_dmabuf(scanout->con, &new_primary->buf);
 
 if (old_primary) {
diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index 1aa440c663..4861c8161d 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -259,9 +259,13 @@ static VFIODMABuf *vfio_display_get_dmabuf(VFIOPCIDevice *vdev,
 
 static void vfio_display_free_one_dmabuf(VFIODisplay *dpy, VFIODMABuf *dmabuf)
 {
+int fd;
+
 QTAILQ_REMOVE(&dpy->dmabuf.bufs, dmabuf, next);
+
+fd = qemu_dmabuf_get_fd(&dmabuf->buf);
 dpy_gl_release_dmabuf(dpy->con, &dmabuf->buf);
-close(dmabuf->buf.fd);
+close(fd);
 g_free(dmabuf);
 }
 
@@ -286,6 +290,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 VFIOPCIDevice *vdev = opaque;
 VFIODisplay *dpy = vdev->dpy;
 VFIODMABuf *primary, *cursor;
+uint32_t width, height;
 bool free_bufs = false, new_cursor = false;
 
 primary = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_PRIMARY);
@@ -296,10 +301,12 @@ static void vfio_display_dmabuf_update(void *opaque)
 return;
 }
 
+width = qemu_dmabuf_get_width(&primary->buf);
+height = qemu_dmabuf_get_height(&primary->buf);
+
 if (dpy->dmabuf.primary != primary) {
 dpy->dmabuf.primary = primary;
-qemu_console_resize(dpy->con,
-primary->buf.width, primary->buf.height);
+qemu_console_resize(dpy->con, width, height);
 dpy_gl_scanout_dmabuf(dpy->con, &primary->buf);
 free_bufs = true;
 }
@@ -328,7 +335,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 cursor->pos_updates = 0;
 }
 
-dpy_gl_update(dpy->con, 0, 0, primary->buf.width, primary->buf.height);
+dpy_gl_up

[PATCH v9 5/6] ui/console: Use qemu_dmabuf_new() and free() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit introduces utility functions for the creation and deallocation
of QemuDmaBuf instances. Additionally, it updates all relevant sections
of the codebase to utilize these new utility functions.

v7: remove prefix, "dpy_gl_" from all helpers
qemu_dmabuf_free() returns without doing anything if input is null
(Daniel P. Berrangé )
call G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemu_dmabuf_free()
(Daniel P. Berrangé )

v8: Introduction of helpers was removed as those were already added
by the previous commit

v9: set dmabuf->allow_fences to 'true' when dmabuf is created in
virtio_gpu_create_dmabuf()/virtio-gpu-udmabuf.c

removed unnecessary spaces that were accidentally added in the patch,
'ui/console: Use qemu_dmabuf_new() a...'

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/hw/vfio/vfio-common.h   |  2 +-
 include/hw/virtio/virtio-gpu.h  |  4 ++--
 hw/display/vhost-user-gpu.c | 32 ++--
 hw/display/virtio-gpu-udmabuf.c | 24 +---
 hw/vfio/display.c   | 26 --
 ui/dbus-listener.c  | 28 
 6 files changed, 54 insertions(+), 62 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b9da6c08ef..d66e27db02 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -148,7 +148,7 @@ typedef struct VFIOGroup {
 } VFIOGroup;
 
 typedef struct VFIODMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t pos_x, pos_y, pos_updates;
 uint32_t hot_x, hot_y, hot_updates;
 int dmabuf_id;
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index ed44cdad6b..56d6e821bf 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -169,7 +169,7 @@ struct VirtIOGPUBaseClass {
 DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)
 
 typedef struct VGPUDMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t scanout_id;
 QTAILQ_ENTRY(VGPUDMABuf) next;
 } VGPUDMABuf;
@@ -238,7 +238,7 @@ struct VhostUserGPU {
 VhostUserBackend *vhost;
 int vhost_gpu_fd; /* closed by the chardev */
 CharBackend vhost_chr;
-QemuDmaBuf dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
+QemuDmaBuf *dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
 bool backend_blocked;
 };
 
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index ea9a6c5d10..c91a800455 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -250,6 +250,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
 int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
 int old_fd;
+uint64_t modifier = 0;
 QemuDmaBuf *dmabuf;
 
 if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
@@ -262,31 +263,34 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 
 g->parent_obj.enable = 1;
 con = g->parent_obj.scanout[m->scanout_id].con;
-dmabuf = &g->dmabuf[m->scanout_id];
-old_fd = qemu_dmabuf_get_fd(dmabuf);
-if (old_fd >= 0) {
-close(old_fd);
-dmabuf->fd = -1;
+dmabuf = g->dmabuf[m->scanout_id];
+if (dmabuf) {
+old_fd = qemu_dmabuf_get_fd(dmabuf);
+if (old_fd >= 0) {
+close(old_fd);
+qemu_dmabuf_set_fd(dmabuf, -1);
+}
 }
 dpy_gl_release_dmabuf(con, dmabuf);
+g_clear_pointer(&dmabuf, qemu_dmabuf_free);
 if (fd == -1) {
 dpy_gl_scanout_disable(con);
 break;
 }
-*dmabuf = (QemuDmaBuf) {
-.fd = fd,
-.width = m->fd_width,
-.height = m->fd_height,
-.stride = m->fd_stride,
-.fourcc = m->fd_drm_fourcc,
-.y0_top = m->fd_flags & VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP,
-};
+
 if (msg->request == VHOST_USER_GPU_DMABUF_SCANOUT2) {
 VhostUserGpuDMABUFScanout2 *m2 = &msg->payload.dmabuf_scanout2;
-dmabuf->modifier = m2->modifier;
+modifier = m2->modifier;
 }
 
+dmabuf = qemu_dmabuf_new(m->fd_width, m->fd_height,
+ m->fd_stride, 0, 0, 0, 0,
+ m->fd_drm_fourcc, modifier,
+ fd, false, m->fd_flags &
+ VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP);
+
 dpy_gl_scanout_dmabuf(con, dmabuf);
+g->dmabuf[m->scanout_id] = dmabuf;
 break;
 }
 case VHOST_USER_GPU_DMABUF_UPDATE: {
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
index c90eba281e..c02ec6d37d 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gp

[PATCH v9 0/6] ui/console: Private QemuDmaBuf struct

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This series introduces privacy enhancements to the QemuDmaBuf struct
and its contained data to bolster security. It accomplishes this by
introducing helper functions for allocating, deallocating, and
accessing individual fields within the struct, and by replacing all
direct references to individual fields in the struct with calls to
those helpers throughout the codebase.

This change was made based on a suggestion from Marc-André Lureau.


(Resubmitting the same patch series with this new cover letter)

v6: fixed some typos in patch
(ui/console: Introduce dpy_gl_qemu_dmabuf_get_..() helpers)

v7: included minor fix (ui/gtk: Check if fence_fd is equal to or greater than 0)
(Marc-André Lureau )

migrated all helpers and QemuDmaBuf struct into dmabuf.c and their prototypes
to dmabuf.h for better encapsulation (ui/dmabuf: New dmabuf.c and dmabuf.h..)
(Daniel P. Berrangé  and
 Marc-André Lureau )

removed 'dpy_gl' from all helpers' names
Defined autoptr clean up function for QemuDmaBuf*
(Daniel P. Berrangé )

Minor corrections

v8: Introduce new dmabuf.c and dmabuf.h and all helper functions in the second
patch in the series (ui/console: new dmabuf.h and dmabuf.c for QemuDma)
(Philippe Mathieu-Daudé )

v9: set dmabuf->allow_fences to true when it is created in virtio-gpu-udmabuf

removed unnecessary spaces that were added in the patch,
'ui/console: Use qemu_dmabuf_new() a...'

Dongwon Kim (6):
  ui/gtk: Check if fence_fd is equal to or greater than 0
  ui/console: new dmabuf.h and dmabuf.c for QemuDmaBuf struct and
helpers
  ui/console: Use qemu_dmabuf_get_..() helpers instead
  ui/console: Use qemu_dmabuf_set_..() helpers instead
  ui/console: Use qemu_dmabuf_new() and free() helpers instead
  ui/console: move QemuDmaBuf struct def to dmabuf.c

 include/hw/vfio/vfio-common.h   |   2 +-
 include/hw/virtio/virtio-gpu.h  |   4 +-
 include/ui/console.h|  20 +--
 include/ui/dmabuf.h |  64 +
 hw/display/vhost-user-gpu.c |  32 +++--
 hw/display/virtio-gpu-udmabuf.c |  27 ++--
 hw/vfio/display.c   |  35 ++---
 ui/console.c|   4 +-
 ui/dbus-console.c   |   9 +-
 ui/dbus-listener.c  |  71 +-
 ui/dmabuf.c | 225 
 ui/egl-headless.c   |  23 +++-
 ui/egl-helpers.c|  59 +
 ui/gtk-egl.c|  52 +---
 ui/gtk-gl-area.c|  41 --
 ui/gtk.c|  12 +-
 ui/spice-display.c  |  50 ---
 ui/meson.build  |   1 +
 18 files changed, 539 insertions(+), 192 deletions(-)
 create mode 100644 include/ui/dmabuf.h
 create mode 100644 ui/dmabuf.c

-- 
2.34.1




[PATCH v9 6/6] ui/console: move QemuDmaBuf struct def to dmabuf.c

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

To complete the process of making QemuDmaBuf private, the QemuDmaBuf
struct definition is moved to dmabuf.c.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/dmabuf.h | 19 +--
 ui/dmabuf.c | 19 +++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/ui/dmabuf.h b/include/ui/dmabuf.h
index d01bddf523..f2026c351a 100644
--- a/include/ui/dmabuf.h
+++ b/include/ui/dmabuf.h
@@ -27,24 +27,7 @@
 #ifndef DMABUF_H
 #define DMABUF_H
 
-typedef struct QemuDmaBuf {
-int   fd;
-uint32_t  width;
-uint32_t  height;
-uint32_t  stride;
-uint32_t  fourcc;
-uint64_t  modifier;
-uint32_t  texture;
-uint32_t  x;
-uint32_t  y;
-uint32_t  backing_width;
-uint32_t  backing_height;
-bool  y0_top;
-void  *sync;
-int   fence_fd;
-bool  allow_fences;
-bool  draw_submitted;
-} QemuDmaBuf;
+typedef struct QemuDmaBuf QemuDmaBuf;
 
 QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
uint32_t stride, uint32_t x,
diff --git a/ui/dmabuf.c b/ui/dmabuf.c
index f0878aa3a1..80169533b6 100644
--- a/ui/dmabuf.c
+++ b/ui/dmabuf.c
@@ -27,6 +27,25 @@
 #include "qemu/osdep.h"
 #include "ui/dmabuf.h"
 
+struct QemuDmaBuf {
+int   fd;
+uint32_t  width;
+uint32_t  height;
+uint32_t  stride;
+uint32_t  fourcc;
+uint64_t  modifier;
+uint32_t  texture;
+uint32_t  x;
+uint32_t  y;
+uint32_t  backing_width;
+uint32_t  backing_height;
+bool  y0_top;
+void  *sync;
+int   fence_fd;
+bool  allow_fences;
+bool  draw_submitted;
+};
+
 QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
 uint32_t stride, uint32_t x,
 uint32_t y, uint32_t backing_width,
-- 
2.34.1
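
The pattern this patch completes (an incomplete type in the header, the full definition in the .c file, and access only through helpers) can be sketched in a single file as follows. All names here (DemoBuf, demo_buf_*) are hypothetical stand-ins and the struct is cut down to three fields; this is not the QEMU code, and it uses plain calloc()/free() where QEMU would typically use GLib allocators.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical names) ---- */
typedef struct DemoBuf DemoBuf;     /* incomplete type: fields are hidden */

DemoBuf *demo_buf_new(uint32_t width, uint32_t height, int fd);
void demo_buf_free(DemoBuf *buf);
uint32_t demo_buf_get_width(const DemoBuf *buf);
int demo_buf_get_fd(const DemoBuf *buf);
void demo_buf_set_fd(DemoBuf *buf, int fd);

/* ---- What stays private to the implementation file ---- */
struct DemoBuf {
    int fd;
    uint32_t width;
    uint32_t height;
};

DemoBuf *demo_buf_new(uint32_t width, uint32_t height, int fd)
{
    DemoBuf *buf = calloc(1, sizeof(*buf));

    if (buf == NULL) {
        return NULL;
    }
    buf->width = width;
    buf->height = height;
    buf->fd = fd;
    return buf;
}

/* Accepts NULL and does nothing in that case, since free(NULL) is a no-op. */
void demo_buf_free(DemoBuf *buf)
{
    free(buf);
}

uint32_t demo_buf_get_width(const DemoBuf *buf)
{
    assert(buf != NULL);
    return buf->width;
}

int demo_buf_get_fd(const DemoBuf *buf)
{
    assert(buf != NULL);
    return buf->fd;
}

void demo_buf_set_fd(DemoBuf *buf, int fd)
{
    assert(buf != NULL);
    buf->fd = fd;
}
```

Once the definition lives only in the implementation file, code that includes just the header can hold a DemoBuf pointer but can no longer embed the struct by value or touch its fields. That is presumably why the earlier patch in the series switches the embedded "QemuDmaBuf buf" members to pointers.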




[PATCH v9 1/6] ui/gtk: Check if fence_fd is equal to or greater than 0

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

'fence_fd' always needs to be validated before being referenced,
and the passing condition should include '== 0' as 0 is a valid
value for the file descriptor.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 ui/gtk-egl.c |  2 +-
 ui/gtk-gl-area.c |  2 +-
 ui/gtk.c | 10 ++
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 3af5ac5bcf..955234429d 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -99,7 +99,7 @@ void gd_egl_draw(VirtualConsole *vc)
 #ifdef CONFIG_GBM
 if (dmabuf) {
 egl_dmabuf_create_fence(dmabuf);
-if (dmabuf->fence_fd > 0) {
+if (dmabuf->fence_fd >= 0) {
 qemu_set_fd_handler(dmabuf->fence_fd, gd_hw_gl_flushed, NULL, vc);
 return;
 }
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 52dcac161e..7fffd0544e 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -86,7 +86,7 @@ void gd_gl_area_draw(VirtualConsole *vc)
 #ifdef CONFIG_GBM
 if (dmabuf) {
 egl_dmabuf_create_fence(dmabuf);
-if (dmabuf->fence_fd > 0) {
+if (dmabuf->fence_fd >= 0) {
 qemu_set_fd_handler(dmabuf->fence_fd, gd_hw_gl_flushed, NULL, vc);
 return;
 }
diff --git a/ui/gtk.c b/ui/gtk.c
index 810d7fc796..7819a86321 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -597,10 +597,12 @@ void gd_hw_gl_flushed(void *vcon)
 VirtualConsole *vc = vcon;
 QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
 
-qemu_set_fd_handler(dmabuf->fence_fd, NULL, NULL, NULL);
-close(dmabuf->fence_fd);
-dmabuf->fence_fd = -1;
-graphic_hw_gl_block(vc->gfx.dcl.con, false);
+if (dmabuf->fence_fd >= 0) {
+qemu_set_fd_handler(dmabuf->fence_fd, NULL, NULL, NULL);
+close(dmabuf->fence_fd);
+dmabuf->fence_fd = -1;
+graphic_hw_gl_block(vc->gfx.dcl.con, false);
+}
 }
 
 /** DisplayState Callbacks (opengl version) **/
-- 
2.34.1




[PATCH v9 4/6] ui/console: Use qemu_dmabuf_set_..() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit updates all occurrences where these fields were
set directly so that they use the helper functions instead.

v7: removed prefix, "dpy_gl_" from all helpers

v8: Introduction of helpers was removed as those were already added
by the previous commit

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 ui/egl-helpers.c | 16 +---
 ui/gtk-egl.c |  4 ++--
 ui/gtk-gl-area.c |  4 ++--
 ui/gtk.c |  6 +++---
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/ui/egl-helpers.c b/ui/egl-helpers.c
index 3f96e63d25..99b2ebbe23 100644
--- a/ui/egl-helpers.c
+++ b/ui/egl-helpers.c
@@ -348,8 +348,8 @@ void egl_dmabuf_import_texture(QemuDmaBuf *dmabuf)
 return;
 }
 
-glGenTextures(1, &dmabuf->texture);
-texture = qemu_dmabuf_get_texture(dmabuf);
+glGenTextures(1, &texture);
+qemu_dmabuf_set_texture(dmabuf, texture);
 glBindTexture(GL_TEXTURE_2D, texture);
 glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
 glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
@@ -368,7 +368,7 @@ void egl_dmabuf_release_texture(QemuDmaBuf *dmabuf)
 }
 
 glDeleteTextures(1, &texture);
-dmabuf->texture = 0;
+qemu_dmabuf_set_texture(dmabuf, 0);
 }
 
 void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
@@ -382,7 +382,7 @@ void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
 sync = eglCreateSyncKHR(qemu_egl_display,
 EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
 if (sync != EGL_NO_SYNC_KHR) {
-dmabuf->sync = sync;
+qemu_dmabuf_set_sync(dmabuf, sync);
 }
 }
 }
@@ -390,12 +390,14 @@ void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
 void egl_dmabuf_create_fence(QemuDmaBuf *dmabuf)
 {
 void *sync = qemu_dmabuf_get_sync(dmabuf);
+int fence_fd;
 
 if (sync) {
-dmabuf->fence_fd = eglDupNativeFenceFDANDROID(qemu_egl_display,
-  sync);
+fence_fd = eglDupNativeFenceFDANDROID(qemu_egl_display,
+  sync);
+qemu_dmabuf_set_fence_fd(dmabuf, fence_fd);
 eglDestroySyncKHR(qemu_egl_display, sync);
-dmabuf->sync = NULL;
+qemu_dmabuf_set_sync(dmabuf, NULL);
 }
 }
 
diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 7a45daefa1..ec0bf45482 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -87,7 +87,7 @@ void gd_egl_draw(VirtualConsole *vc)
 if (!qemu_dmabuf_get_draw_submitted(dmabuf)) {
 return;
 } else {
-dmabuf->draw_submitted = false;
+qemu_dmabuf_set_draw_submitted(dmabuf, false);
 }
 }
 #endif
@@ -381,7 +381,7 @@ void gd_egl_flush(DisplayChangeListener *dcl,
 if (vc->gfx.guest_fb.dmabuf &&
 !qemu_dmabuf_get_draw_submitted(vc->gfx.guest_fb.dmabuf)) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
-vc->gfx.guest_fb.dmabuf->draw_submitted = true;
+qemu_dmabuf_set_draw_submitted(vc->gfx.guest_fb.dmabuf, true);
 gtk_egl_set_scanout_mode(vc, true);
 gtk_widget_queue_draw_area(area, x, y, w, h);
 return;
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 2d70280803..9a3f3d0d71 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -63,7 +63,7 @@ void gd_gl_area_draw(VirtualConsole *vc)
 if (!qemu_dmabuf_get_draw_submitted(dmabuf)) {
 return;
 } else {
-dmabuf->draw_submitted = false;
+qemu_dmabuf_set_draw_submitted(dmabuf, false);
 }
 }
 #endif
@@ -291,7 +291,7 @@ void gd_gl_area_scanout_flush(DisplayChangeListener *dcl,
 if (vc->gfx.guest_fb.dmabuf &&
 !qemu_dmabuf_get_draw_submitted(vc->gfx.guest_fb.dmabuf)) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
-vc->gfx.guest_fb.dmabuf->draw_submitted = true;
+qemu_dmabuf_set_draw_submitted(vc->gfx.guest_fb.dmabuf, true);
 gtk_gl_area_set_scanout_mode(vc, true);
 }
 gtk_gl_area_queue_render(GTK_GL_AREA(vc->gfx.drawing_area));
diff --git a/ui/gtk.c b/ui/gtk.c
index 237c913b26..3a6832eb1b 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -598,11 +598,11 @@ void gd_hw_gl_flushed(void *vcon)
 QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
 int fence_fd;
 
-if (dmabuf->fence_fd >= 0) {
-fence_fd = qemu_dmabuf_get_fence_fd(dmabuf);
+fence_fd = qemu_dmabuf_get_fence_fd(dmabuf);
+if (fence_fd >= 0) {
 qemu_set_fd_handler(fence_fd, NULL, NULL, NULL);
 close(fence_fd);
-dmabuf->fence_fd = -1;
+qemu_dmabuf_set_fence_fd(dmabuf, -1);
 graphic_hw_gl_block(vc->gfx.dcl.con, false);
 }
 }
-- 
2.34.1




Re: [PATCH] target/i386/translate.c: always write 32-bits for SGDT and SIDT

2024-04-19 Thread Richard Henderson

On 4/19/24 12:51, Mark Cave-Ayland wrote:

The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits
or 32-bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.

In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11
with Win32s to function correctly. Remove the masking applied due to the operand size
for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.

Signed-off-by: Mark Cave-Ayland 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198

--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
---
  target/i386/tcg/translate.c | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76a42c679c..3026eb6774 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
  gen_op_st_v(s, MO_16, s->T0, s->A0);
  gen_add_A0_im(s, 2);
  tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base));
-if (dflag == MO_16) {
-tcg_gen_andi_tl(s->T0, s->T0, 0xff);
-}
+/*
+ * NB: Despite claims to the contrary in Intel CPU documentation,
+ * all 32-bits are written regardless of operand size.
+ */


Current documentation agrees that all 32 bits are written, so I don't think you need this comment:


  IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility Mode *)
THEN
  DEST[0:15] := GDTR(Limit);
  DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *)
  FI;


Anyway,
Reviewed-by: Richard Henderson 


r~



Re: [RFC 1/2] iova_tree: add an id member to DMAMap

2024-04-19 Thread Si-Wei Liu




On 4/19/2024 1:29 AM, Eugenio Perez Martin wrote:

On Thu, Apr 18, 2024 at 10:46 PM Si-Wei Liu  wrote:



On 4/10/2024 3:03 AM, Eugenio Pérez wrote:

IOVA tree is also used to track the mappings of virtio-net shadow
virtqueue.  These mappings may not match the GPA->HVA ones.

This causes a problem when overlapping regions (different GPA but same
translated HVA) exist in the tree, as looking them up by HVA will return
them twice.  To solve this, create an id member so we can assign unique
identifiers (GPA) to the maps.

Signed-off-by: Eugenio Pérez 
---
   include/qemu/iova-tree.h | 5 +++--
   util/iova-tree.c | 3 ++-
   2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h
index 2a10a7052e..34ee230e7d 100644
--- a/include/qemu/iova-tree.h
+++ b/include/qemu/iova-tree.h
@@ -36,6 +36,7 @@ typedef struct DMAMap {
   hwaddr iova;
   hwaddr translated_addr;
   hwaddr size;/* Inclusive */
+uint64_t id;
   IOMMUAccessFlags perm;
   } QEMU_PACKED DMAMap;
   typedef gboolean (*iova_tree_iterator)(DMAMap *map);
@@ -100,8 +101,8 @@ const DMAMap *iova_tree_find(const IOVATree *tree, const DMAMap *map);
* @map: the mapping to search
*
* Search for a mapping in the iova tree that translated_addr overlaps with the
- * mapping range specified.  Only the first found mapping will be
- * returned.
+ * mapping range specified and map->id is equal.  Only the first found
+ * mapping will be returned.
*
* Return: DMAMap pointer if found, or NULL if not found.  Note that
* the returned DMAMap pointer is maintained internally.  User should
diff --git a/util/iova-tree.c b/util/iova-tree.c
index 536789797e..0863e0a3b8 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -97,7 +97,8 @@ static gboolean iova_tree_find_address_iterator(gpointer key, gpointer value,

   needle = args->needle;
   if (map->translated_addr + map->size < needle->translated_addr ||
-needle->translated_addr + needle->size < map->translated_addr) {
+needle->translated_addr + needle->size < map->translated_addr ||
+needle->id != map->id) {

It looks like this iterator can also be invoked by SVQ from
vhost_svq_translate_addr() -> iova_tree_find_iova(), where the guest GPA
space is searched without passing in the ID (GPA), and an exact
match for the same GPA range is not actually needed, unlike the mapping
removal case. Could we create an API variant specifically for the SVQ
lookup case? Or alternatively, add a special flag, say skip_id_match, to
DMAMap, so that the id match check may look like below:

(!needle->skip_id_match && needle->id != map->id)

I think vhost_svq_translate_addr() could just call the API variant or
pass a DMAMap with skip_id_match set to true to svq_iova_tree_find_iova().


I think you're totally right. But I'd really like to not complicate
the API of the iova_tree more.

I think we can look for the hwaddr using memory_region_from_host and
then get the hwaddr. It is another lookup though...

Yeah, that would be another way of doing the translation without having to
complicate the API around iova_tree. I wonder how the lookup through
memory_region_from_host() would perform compared to the iova tree one: the
former looks to be an O(N) linear search on a linked list, while the
latter would be roughly O(log N) on an AVL tree? Of course,
memory_region_from_host() won't search outside the guest memory space.
As this could be on the hot data path I am a little hesitant about
the potential cost or performance regression this change could bring,
but maybe I'm overthinking it...


Thanks,
-Siwei




Thanks,
-Siwei

   return false;
   }






Re: [PATCH 4/5] docs/system/target-sparc: Improve the Sparc documentation

2024-04-19 Thread Brad Smith

On 2024-04-18 4:27 p.m., Mark Cave-Ayland wrote:

On 07/03/2024 17:43, Thomas Huth wrote:


Add some words about how to enable or disable boolean features,
and remove the note about a Linux kernel being available on the
QEMU website (they have been removed long ago already).

Signed-off-by: Thomas Huth 
---
  docs/system/target-sparc.rst | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/system/target-sparc.rst b/docs/system/target-sparc.rst
index 9ec8c90c14..9f418b9d3e 100644
--- a/docs/system/target-sparc.rst
+++ b/docs/system/target-sparc.rst
@@ -27,6 +27,11 @@ architecture machines:
  The emulation is somewhat complete. SMP up to 16 CPUs is supported, but
  Linux limits the number of usable CPUs to 4.

+The list of available CPUs can be viewed by starting QEMU with ``-cpu help``.
+Optional boolean features can be added with a "+" in front of the feature name,
+or disabled with a "-" in front of the name, for example
+``-cpu TI-SuperSparc-II,+float128``.
+
  QEMU emulates the following sun4m peripherals:
    -  IOMMU
@@ -55,8 +60,7 @@ OpenBIOS is a free (GPL v2) portable firmware implementation. The goal
  is to implement a 100% IEEE 1275-1994 (referred to as Open Firmware)
  compliant firmware.

-A sample Linux 2.6 series kernel and ram disk image are available on the
-QEMU web site. There are still issues with NetBSD and OpenBSD, but most
+There are still issues with NetBSD and OpenBSD, but most
  kernel versions work. Please note that currently older Solaris kernels
  don't work probably due to interface issues between OpenBIOS and
  Solaris.


Just curious as to what current issues exist with NetBSD and OpenBSD? 
At least both my NetBSD and OpenBSD test images survive a casual boot 
test here with latest git.


I was just trying OpenBSD/sparc64 with 8.2 recently and found hme(4) does
not work. I tried with the NE2k driver as I remember adding the driver to the
OpenBSD kernel before an hme driver existed and it sort of worked, but there
were still issues.

I'll re-test with 9 now and see what happens.



Issue#414 and qemu_mutex_lock() API Conversion

2024-04-19 Thread Vilhelm Gyda
Hi,

I am new here, so I found some issues suitable for beginners at [1]. I
am currently looking at the API Conversion task of replacing
`qemu_mutex_lock()` and `qemu_mutex_unlock()` with `QEMU_LOCK_GUARD()`.

After reading the macro definition of `QEMU_LOCK_GUARD()` and
`WITH_QEMU_LOCK_GUARD()`, I think I can replace an instance of
`qemu_mutex_(un)lock` at line 1065[2] and 1072[3] with
`WITH_QEMU_LOCK_GUARD(&rs->bitmap_mutex) {}`

Will this be an acceptable patch? Do I need to do anything else before
submitting (other than running the checkpatch script)? How and what should I
test before submitting? This page[4] has some tests, but will running
all of these be too much for a change this trivial?

Also, about this bite-sized issue[5]. @jsnow has commented to get in
touch about it. So, what should be my next step for working on this
issue?

Thanks,
Will

[1]: https://wiki.qemu.org/Contribute/BiteSizedTasks
[2]: https://gitlab.com/qemu-project/qemu/-/blob/master/migration/ram.c?ref_type=heads#L1065
[3]: https://gitlab.com/qemu-project/qemu/-/blob/master/migration/ram.c?ref_type=heads#L1072
[4]: https://wiki.qemu.org/Testing
[5]: https://gitlab.com/qemu-project/qemu/-/issues/414
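
For readers unfamiliar with scope-bound lock guards, the general technique behind such macros can be sketched as follows. This is not QEMU's implementation: DemoMutex, demo_guard_acquire(), demo_guard_release() and WITH_DEMO_LOCK_GUARD() are hypothetical names, the "mutex" is a plain flag so the example stays self-contained, and the sketch assumes a compiler that supports the GCC/Clang cleanup attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy lock: a flag instead of a real mutex. */
typedef struct DemoMutex {
    int held;
} DemoMutex;

static void demo_mutex_lock(DemoMutex *m)   { assert(!m->held); m->held = 1; }
static void demo_mutex_unlock(DemoMutex *m) { assert(m->held);  m->held = 0; }

/* Take the lock and hand back the pointer that the guard variable tracks. */
static DemoMutex *demo_guard_acquire(DemoMutex *m)
{
    demo_mutex_lock(m);
    return m;
}

/* Unlock once; later calls (e.g. from the cleanup attribute) are no-ops. */
static void demo_guard_release(DemoMutex **guard)
{
    if (*guard) {
        demo_mutex_unlock(*guard);
        *guard = NULL;
    }
}

/*
 * Run the following block exactly once with the lock held. The lock is
 * dropped when the block ends, and also on an early return or break,
 * because the guard variable carries a cleanup handler.
 */
#define WITH_DEMO_LOCK_GUARD(m)                                   \
    for (DemoMutex *demo_guard_                                   \
             __attribute__((cleanup(demo_guard_release))) =       \
             demo_guard_acquire(m);                               \
         demo_guard_ != NULL;                                     \
         demo_guard_release(&demo_guard_))

/* Example user: the early return still releases the lock. */
static int demo_counter_add(DemoMutex *m, int *counter, int delta)
{
    WITH_DEMO_LOCK_GUARD(m) {
        if (delta < 0) {
            return -1;
        }
        *counter += delta;
    }
    return 0;
}
```

The point of converting paired lock/unlock calls to this style is that no exit path from the guarded block can forget the unlock.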



[PATCH] target/i386/translate.c: always write 32-bits for SGDT and SIDT

2024-04-19 Thread Mark Cave-Ayland
The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits
or 32-bits depending upon the operand size, but this is incorrect. Not only do
the Intel CPU manuals give contradictory information between processor
revisions, but this information doesn't even match real-life behaviour.

In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT
and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11
with Win32s to function correctly. Remove the masking applied due to the operand size
for SGDT and SIDT so that the TCG behaviour matches the behaviour on real
hardware.

Signed-off-by: Mark Cave-Ayland 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198

--
MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this
patch fixes the issue in WFW 3.11 with Win32s. For more technical information I
highly recommend the excellent write-up at
https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/.
---
 target/i386/tcg/translate.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76a42c679c..3026eb6774 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_op_st_v(s, MO_16, s->T0, s->A0);
 gen_add_A0_im(s, 2);
 tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base));
-if (dflag == MO_16) {
-tcg_gen_andi_tl(s->T0, s->T0, 0xff);
-}
+/*
+ * NB: Despite claims to the contrary in Intel CPU documentation,
+ * all 32-bits are written regardless of operand size.
+ */
 gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0);
 break;
 
@@ -5901,9 +5902,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_op_st_v(s, MO_16, s->T0, s->A0);
 gen_add_A0_im(s, 2);
 tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, idt.base));
-if (dflag == MO_16) {
-tcg_gen_andi_tl(s->T0, s->T0, 0xff);
-}
+/*
+ * NB: Despite claims to the contrary in Intel CPU documentation,
+ * all 32-bits are written regardless of operand size.
+ */
 gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0);
 break;
 
-- 
2.39.2
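
As an illustration of the behaviour the patch above describes for legacy and compatibility mode, the following sketch stores a 16-bit limit followed by the full 32-bit base, with no 24-bit masking. It is a standalone model, not TCG code; demo_store_descriptor() is a hypothetical name and the little-endian byte layout is written out by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write a 6-byte pseudo-descriptor: bytes 0-1 hold the limit and
 * bytes 2-5 hold all 32 bits of the base, regardless of operand size.
 */
static void demo_store_descriptor(uint8_t out[6], uint16_t limit,
                                  uint32_t base)
{
    assert(out != NULL);
    out[0] = limit & 0xff;
    out[1] = (limit >> 8) & 0xff;
    out[2] = base & 0xff;
    out[3] = (base >> 8) & 0xff;
    out[4] = (base >> 16) & 0xff;
    out[5] = (base >> 24) & 0xff;   /* top byte is kept, not masked off */
}
```

A guest that reads back byte 5 after SGDT or SIDT sees the real top byte of the base here, which is the case the commit message says OS/2 Warp and Win32s depend on.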




[PATCH v8 2/6] ui/console: new dmabuf.h and dmabuf.c for QemuDmaBuf struct and helpers

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

New header and source files are added for containing QemuDmaBuf struct
definition and newly introduced helpers for creating/freeing the struct
and accessing its data.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/console.h |  20 +
 include/ui/dmabuf.h  |  81 +
 ui/dmabuf.c  | 206 +++
 ui/meson.build   |   1 +
 4 files changed, 289 insertions(+), 19 deletions(-)
 create mode 100644 include/ui/dmabuf.h
 create mode 100644 ui/dmabuf.c

diff --git a/include/ui/console.h b/include/ui/console.h
index 0bc7a00ac0..a208a68b88 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -7,6 +7,7 @@
 #include "qapi/qapi-types-ui.h"
 #include "ui/input.h"
 #include "ui/surface.h"
+#include "ui/dmabuf.h"
 
 #define TYPE_QEMU_CONSOLE "qemu-console"
 OBJECT_DECLARE_TYPE(QemuConsole, QemuConsoleClass, QEMU_CONSOLE)
@@ -185,25 +186,6 @@ struct QEMUGLParams {
 int minor_ver;
 };
 
-typedef struct QemuDmaBuf {
-int   fd;
-uint32_t  width;
-uint32_t  height;
-uint32_t  stride;
-uint32_t  fourcc;
-uint64_t  modifier;
-uint32_t  texture;
-uint32_t  x;
-uint32_t  y;
-uint32_t  backing_width;
-uint32_t  backing_height;
-bool  y0_top;
-void  *sync;
-int   fence_fd;
-bool  allow_fences;
-bool  draw_submitted;
-} QemuDmaBuf;
-
 enum display_scanout {
 SCANOUT_NONE,
 SCANOUT_SURFACE,
diff --git a/include/ui/dmabuf.h b/include/ui/dmabuf.h
new file mode 100644
index 00..e332958c39
--- /dev/null
+++ b/include/ui/dmabuf.h
@@ -0,0 +1,81 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * QemuDmaBuf struct and helpers used for accessing its data
+ *
+ * Copyright (c) 2024 Dongwon Kim 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef DMABUF_H
+#define DMABUF_H
+
+typedef struct QemuDmaBuf {
+int   fd;
+uint32_t  width;
+uint32_t  height;
+uint32_t  stride;
+uint32_t  fourcc;
+uint64_t  modifier;
+uint32_t  texture;
+uint32_t  x;
+uint32_t  y;
+uint32_t  backing_width;
+uint32_t  backing_height;
+bool  y0_top;
+void  *sync;
+int   fence_fd;
+bool  allow_fences;
+bool  draw_submitted;
+} QemuDmaBuf;
+
+QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
+   uint32_t stride, uint32_t x,
+   uint32_t y, uint32_t backing_width,
+   uint32_t backing_height, uint32_t fourcc,
+   uint64_t modifier, int32_t dmabuf_fd,
+   bool allow_fences, bool y0_top);
+void qemu_dmabuf_free(QemuDmaBuf *dmabuf);
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(QemuDmaBuf, qemu_dmabuf_free);
+
+int32_t qemu_dmabuf_get_fd(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_width(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_height(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_stride(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_fourcc(QemuDmaBuf *dmabuf);
+uint64_t qemu_dmabuf_get_modifier(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_texture(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_x(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_y(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_backing_width(QemuDmaBuf *dmabuf);
+uint32_t qemu_dmabuf_get_backing_height(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_y0_top(QemuDmaBuf *dmabuf);
+void *qemu_dmabuf_get_sync(QemuDmaBuf *dmabuf);
+int32_t qemu_dmabuf_get_fence_fd(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_allow_fences(QemuDmaBuf *dmabuf);
+bool qemu_dmabuf_get_draw_submitted(QemuDmaBuf *dmabuf);
+void qemu_dmabuf_set_texture(QemuDmaBuf *dmabuf, uint32_t texture);
+void qemu_dmabuf_set_fence_fd(QemuDmaBuf *dmabuf, int32_t fence_fd);
+void qemu_

[PATCH v8 5/6] ui/console: Use qemu_dmabuf_new() and free() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit introduces utility functions for the creation and deallocation
of QemuDmaBuf instances. Additionally, it updates all relevant sections
of the codebase to utilize these new utility functions.

v7: remove prefix, "dpy_gl_" from all helpers
qemu_dmabuf_free() returns without doing anything if input is null
(Daniel P. Berrangé )
call G_DEFINE_AUTOPTR_CLEANUP_FUNC for qemu_dmabuf_free()
(Daniel P. Berrangé )

v8: Introduction of helpers was removed as those were already added
by the previous commit

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/hw/vfio/vfio-common.h   |  2 +-
 include/hw/virtio/virtio-gpu.h  |  4 ++--
 hw/display/vhost-user-gpu.c | 32 ++--
 hw/display/virtio-gpu-udmabuf.c | 24 +---
 hw/vfio/display.c   | 26 --
 ui/dbus-listener.c  | 28 
 6 files changed, 54 insertions(+), 62 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index b9da6c08ef..d66e27db02 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -148,7 +148,7 @@ typedef struct VFIOGroup {
 } VFIOGroup;
 
 typedef struct VFIODMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t pos_x, pos_y, pos_updates;
 uint32_t hot_x, hot_y, hot_updates;
 int dmabuf_id;
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index ed44cdad6b..56d6e821bf 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -169,7 +169,7 @@ struct VirtIOGPUBaseClass {
 DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)
 
 typedef struct VGPUDMABuf {
-QemuDmaBuf buf;
+QemuDmaBuf *buf;
 uint32_t scanout_id;
 QTAILQ_ENTRY(VGPUDMABuf) next;
 } VGPUDMABuf;
@@ -238,7 +238,7 @@ struct VhostUserGPU {
 VhostUserBackend *vhost;
 int vhost_gpu_fd; /* closed by the chardev */
 CharBackend vhost_chr;
-QemuDmaBuf dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
+QemuDmaBuf *dmabuf[VIRTIO_GPU_MAX_SCANOUTS];
 bool backend_blocked;
 };
 
diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index ea9a6c5d10..7f8cf38647 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -250,6 +250,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
 int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
 int old_fd;
+uint64_t modifier = 0;
 QemuDmaBuf *dmabuf;
 
 if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
@@ -262,31 +263,34 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 
 g->parent_obj.enable = 1;
 con = g->parent_obj.scanout[m->scanout_id].con;
-dmabuf = &g->dmabuf[m->scanout_id];
-old_fd = qemu_dmabuf_get_fd(dmabuf);
-if (old_fd >= 0) {
-close(old_fd);
-dmabuf->fd = -1;
+dmabuf = g->dmabuf[m->scanout_id];
+if (dmabuf) {
+old_fd = qemu_dmabuf_get_fd(dmabuf);
+if (old_fd >= 0) {
+close(old_fd);
+qemu_dmabuf_set_fd(dmabuf, -1);
+}
 }
 dpy_gl_release_dmabuf(con, dmabuf);
+g_clear_pointer(&dmabuf, qemu_dmabuf_free);
 if (fd == -1) {
 dpy_gl_scanout_disable(con);
 break;
 }
-*dmabuf = (QemuDmaBuf) {
-.fd = fd,
-.width = m->fd_width,
-.height = m->fd_height,
-.stride = m->fd_stride,
-.fourcc = m->fd_drm_fourcc,
-.y0_top = m->fd_flags & VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP,
-};
+
 if (msg->request == VHOST_USER_GPU_DMABUF_SCANOUT2) {
 VhostUserGpuDMABUFScanout2 *m2 = &msg->payload.dmabuf_scanout2;
-dmabuf->modifier = m2->modifier;
+modifier = m2->modifier;
 }
 
+dmabuf = qemu_dmabuf_new(m->fd_width, m->fd_height,
+m->fd_stride, 0, 0, 0, 0,
+m->fd_drm_fourcc, modifier,
+fd, false, m->fd_flags &
+VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP);
+
 dpy_gl_scanout_dmabuf(con, dmabuf);
+g->dmabuf[m->scanout_id] = dmabuf;
 break;
 }
 case VHOST_USER_GPU_DMABUF_UPDATE: {
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
index c90eba281e..edd7886cf2 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gpu-udmabuf.c
@@ -162,7 +162,8 @@ static void virtio_gpu_free_dmabuf(VirtIOGPU *g, VGPUDMABuf *dmabuf)
 struct virtio_gpu_scanout *scanout;
 
 scanout = &g->parent_obj.scanout[dmabuf->scanout_id];

[PATCH v8 1/6] ui/gtk: Check if fence_fd is equal to or greater than 0

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

'fence_fd' must always be validated before it is referenced, and
the passing condition should include '== 0', as 0 is a valid
value for a file descriptor.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 ui/gtk-egl.c |  2 +-
 ui/gtk-gl-area.c |  2 +-
 ui/gtk.c | 10 ++
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 3af5ac5bcf..955234429d 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -99,7 +99,7 @@ void gd_egl_draw(VirtualConsole *vc)
 #ifdef CONFIG_GBM
 if (dmabuf) {
 egl_dmabuf_create_fence(dmabuf);
-if (dmabuf->fence_fd > 0) {
+if (dmabuf->fence_fd >= 0) {
 qemu_set_fd_handler(dmabuf->fence_fd, gd_hw_gl_flushed, NULL, vc);
 return;
 }
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 52dcac161e..7fffd0544e 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -86,7 +86,7 @@ void gd_gl_area_draw(VirtualConsole *vc)
 #ifdef CONFIG_GBM
 if (dmabuf) {
 egl_dmabuf_create_fence(dmabuf);
-if (dmabuf->fence_fd > 0) {
+if (dmabuf->fence_fd >= 0) {
 qemu_set_fd_handler(dmabuf->fence_fd, gd_hw_gl_flushed, NULL, vc);
 return;
 }
diff --git a/ui/gtk.c b/ui/gtk.c
index 810d7fc796..7819a86321 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -597,10 +597,12 @@ void gd_hw_gl_flushed(void *vcon)
 VirtualConsole *vc = vcon;
 QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
 
-qemu_set_fd_handler(dmabuf->fence_fd, NULL, NULL, NULL);
-close(dmabuf->fence_fd);
-dmabuf->fence_fd = -1;
-graphic_hw_gl_block(vc->gfx.dcl.con, false);
+if (dmabuf->fence_fd >= 0) {
+qemu_set_fd_handler(dmabuf->fence_fd, NULL, NULL, NULL);
+close(dmabuf->fence_fd);
+dmabuf->fence_fd = -1;
+graphic_hw_gl_block(vc->gfx.dcl.con, false);
+}
 }
 
 /** DisplayState Callbacks (opengl version) **/
-- 
2.34.1
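
The check this patch introduces can be illustrated outside QEMU with a small sketch. It assumes a POSIX system; DemoFence and demo_release_fence() are hypothetical names, not part of the patch. Every non-negative descriptor, including 0, is treated as valid, and the field is reset to -1 once the descriptor is closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the fence_fd field of QemuDmaBuf. */
typedef struct DemoFence {
    int fence_fd;               /* -1 means "no fence pending" */
} DemoFence;

/*
 * Close the pending fence descriptor, if there is one.
 * Returns true when a descriptor was actually released.
 */
static bool demo_release_fence(DemoFence *f)
{
    assert(f != NULL);

    /* 0 is a legal descriptor, so compare with ">= 0", not "> 0". */
    if (f->fence_fd < 0) {
        return false;
    }
    close(f->fence_fd);
    f->fence_fd = -1;           /* mark as released so a second call is safe */
    return true;
}
```

Calling demo_release_fence() twice on the same object closes the descriptor only once, which mirrors the guard added to gd_hw_gl_flushed().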




[PATCH v8 6/6] ui/console: move QemuDmaBuf struct def to dmabuf.c

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

To complete the process of making QemuDmaBuf private, the QemuDmaBuf
struct definition is moved to dmabuf.c.

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 include/ui/dmabuf.h | 19 +--
 ui/dmabuf.c | 19 +++
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/ui/dmabuf.h b/include/ui/dmabuf.h
index e332958c39..0723846a3a 100644
--- a/include/ui/dmabuf.h
+++ b/include/ui/dmabuf.h
@@ -27,24 +27,7 @@
 #ifndef DMABUF_H
 #define DMABUF_H
 
-typedef struct QemuDmaBuf {
-int   fd;
-uint32_t  width;
-uint32_t  height;
-uint32_t  stride;
-uint32_t  fourcc;
-uint64_t  modifier;
-uint32_t  texture;
-uint32_t  x;
-uint32_t  y;
-uint32_t  backing_width;
-uint32_t  backing_height;
-bool  y0_top;
-void  *sync;
-int   fence_fd;
-bool  allow_fences;
-bool  draw_submitted;
-} QemuDmaBuf;
+typedef struct QemuDmaBuf QemuDmaBuf;
 
 QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
uint32_t stride, uint32_t x,
diff --git a/ui/dmabuf.c b/ui/dmabuf.c
index ef3b07956e..7a919160dc 100644
--- a/ui/dmabuf.c
+++ b/ui/dmabuf.c
@@ -27,6 +27,25 @@
 #include "qemu/osdep.h"
 #include "ui/dmabuf.h"
 
+struct QemuDmaBuf {
+int   fd;
+uint32_t  width;
+uint32_t  height;
+uint32_t  stride;
+uint32_t  fourcc;
+uint64_t  modifier;
+uint32_t  texture;
+uint32_t  x;
+uint32_t  y;
+uint32_t  backing_width;
+uint32_t  backing_height;
+bool  y0_top;
+void  *sync;
+int   fence_fd;
+bool  allow_fences;
+bool  draw_submitted;
+};
+
 QemuDmaBuf *qemu_dmabuf_new(uint32_t width, uint32_t height,
 uint32_t stride, uint32_t x,
 uint32_t y, uint32_t backing_width,
-- 
2.34.1
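
The opaque-struct arrangement this patch arrives at can be sketched in a single file. The names below (DemoBuf and the demo_buf_*() functions) are hypothetical and only mirror the shape of the real helpers; in a real project the first part would live in a header and the second in the .c file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* --- what a public header would expose --- */
typedef struct DemoBuf DemoBuf;     /* incomplete type: fields are hidden */

DemoBuf *demo_buf_new(uint32_t width, uint32_t height);
void demo_buf_free(DemoBuf *buf);
uint32_t demo_buf_get_width(const DemoBuf *buf);
uint32_t demo_buf_get_texture(const DemoBuf *buf);
void demo_buf_set_texture(DemoBuf *buf, uint32_t texture);

/* --- what the implementation file would contain --- */
struct DemoBuf {
    uint32_t width;
    uint32_t height;
    uint32_t texture;
};

DemoBuf *demo_buf_new(uint32_t width, uint32_t height)
{
    DemoBuf *buf = calloc(1, sizeof(*buf));

    if (buf) {
        buf->width = width;
        buf->height = height;
    }
    return buf;
}

void demo_buf_free(DemoBuf *buf)
{
    free(buf);                      /* free(NULL) is a no-op */
}

uint32_t demo_buf_get_width(const DemoBuf *buf)
{
    assert(buf != NULL);
    return buf->width;
}

uint32_t demo_buf_get_texture(const DemoBuf *buf)
{
    assert(buf != NULL);
    return buf->texture;
}

void demo_buf_set_texture(DemoBuf *buf, uint32_t texture)
{
    assert(buf != NULL);
    buf->texture = texture;
}
```

Code that only sees the header part cannot write buf->texture directly, so every access has to go through the helpers. That is also why holders of the struct switch from embedding it by value to keeping a pointer.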




[PATCH v8 0/6] ui/console: Private QemuDmaBuf struct

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This series makes the QemuDmaBuf struct and its contained data private
to bolster security. It accomplishes this by introducing helper
functions for allocating, deallocating, and accessing individual fields
within the struct, and by replacing all direct references to individual
fields in the struct with calls to these helpers throughout the codebase.

This change was made based on a suggestion from Marc-André Lureau


(Resubmitting same patch series with this new cover letter)

v6: fixed some typos in patch
(ui/console: Introduce dpy_gl_qemu_dmabuf_get_..() helpers)

v7: included minor fix (ui/gtk: Check if fence_fd is equal to or greater than 0)
(Marc-André Lureau )

migrated all helpers and QemuDmaBuf struct into dmabuf.c and their prototypes
to dmabuf.h for better encapsulation (ui/dmabuf: New dmabuf.c and dmabuf.h..)
(Daniel P. Berrangé  and
 Marc-André Lureau )

removed 'dpy_gl' from all helpers' names
Defined autoptr clean up function for QemuDmaBuf*
(Daniel P. Berrangé )

Minor corrections

v8: Introduce new dmabuf.c and dmabuf.h and all helper functions in the second
patch in the series (ui/console: new dmabuf.h and dmabuf.c for QemuDma)
(Philippe Mathieu-Daudé )

Move QemuDmaBuf struct definition to dmabuf.c in the last patch in
the series (ui/console: move QemuDmaBuf struct def...) to mitigate
compilation errors encountered at the midpoint of the series.

Dongwon Kim (6):
  ui/gtk: Check if fence_fd is equal to or greater than 0
  ui/console: new dmabuf.h and dmabuf.c for QemuDmaBuf struct and
helpers
  ui/console: Use qemu_dmabuf_get_..() helpers instead
  ui/console: Use qemu_dmabuf_set_..() helpers instead
  ui/console: Use qemu_dmabuf_new() and free() helpers instead
  ui/console: move QemuDmaBuf struct def to dmabuf.c

 include/hw/vfio/vfio-common.h   |   2 +-
 include/hw/virtio/virtio-gpu.h  |   4 +-
 include/ui/console.h|  20 +--
 include/ui/dmabuf.h |  64 +
 hw/display/vhost-user-gpu.c |  32 +++--
 hw/display/virtio-gpu-udmabuf.c |  27 ++--
 hw/vfio/display.c   |  35 ++---
 ui/console.c|   4 +-
 ui/dbus-console.c   |   9 +-
 ui/dbus-listener.c  |  71 +-
 ui/dmabuf.c | 225 
 ui/egl-headless.c   |  23 +++-
 ui/egl-helpers.c|  59 +
 ui/gtk-egl.c|  52 +---
 ui/gtk-gl-area.c|  41 --
 ui/gtk.c|  12 +-
 ui/spice-display.c  |  50 ---
 ui/meson.build  |   1 +
 18 files changed, 539 insertions(+), 192 deletions(-)
 create mode 100644 include/ui/dmabuf.h
 create mode 100644 ui/dmabuf.c

-- 
2.34.1




[PATCH v8 4/6] ui/console: Use qemu_dmabuf_set_..() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit updates all places where these fields were set directly
so that they use the helper functions instead.

v7: removed prefix, "dpy_gl_" from all helpers

v8: Introduction of helpers was removed as those were already added
by the previous commit

Suggested-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 ui/egl-helpers.c | 16 +---
 ui/gtk-egl.c |  4 ++--
 ui/gtk-gl-area.c |  4 ++--
 ui/gtk.c |  6 +++---
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/ui/egl-helpers.c b/ui/egl-helpers.c
index 3f96e63d25..99b2ebbe23 100644
--- a/ui/egl-helpers.c
+++ b/ui/egl-helpers.c
@@ -348,8 +348,8 @@ void egl_dmabuf_import_texture(QemuDmaBuf *dmabuf)
 return;
 }
 
-glGenTextures(1, &dmabuf->texture);
-texture = qemu_dmabuf_get_texture(dmabuf);
+glGenTextures(1, &texture);
+qemu_dmabuf_set_texture(dmabuf, texture);
 glBindTexture(GL_TEXTURE_2D, texture);
 glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
 glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
@@ -368,7 +368,7 @@ void egl_dmabuf_release_texture(QemuDmaBuf *dmabuf)
 }
 
 glDeleteTextures(1, &texture);
-dmabuf->texture = 0;
+qemu_dmabuf_set_texture(dmabuf, 0);
 }
 
 void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
@@ -382,7 +382,7 @@ void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
 sync = eglCreateSyncKHR(qemu_egl_display,
 EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
 if (sync != EGL_NO_SYNC_KHR) {
-dmabuf->sync = sync;
+qemu_dmabuf_set_sync(dmabuf, sync);
 }
 }
 }
@@ -390,12 +390,14 @@ void egl_dmabuf_create_sync(QemuDmaBuf *dmabuf)
 void egl_dmabuf_create_fence(QemuDmaBuf *dmabuf)
 {
 void *sync = qemu_dmabuf_get_sync(dmabuf);
+int fence_fd;
 
 if (sync) {
-dmabuf->fence_fd = eglDupNativeFenceFDANDROID(qemu_egl_display,
-  sync);
+fence_fd = eglDupNativeFenceFDANDROID(qemu_egl_display,
+  sync);
+qemu_dmabuf_set_fence_fd(dmabuf, fence_fd);
 eglDestroySyncKHR(qemu_egl_display, sync);
-dmabuf->sync = NULL;
+qemu_dmabuf_set_sync(dmabuf, NULL);
 }
 }
 
diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 7a45daefa1..ec0bf45482 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -87,7 +87,7 @@ void gd_egl_draw(VirtualConsole *vc)
 if (!qemu_dmabuf_get_draw_submitted(dmabuf)) {
 return;
 } else {
-dmabuf->draw_submitted = false;
+qemu_dmabuf_set_draw_submitted(dmabuf, false);
 }
 }
 #endif
@@ -381,7 +381,7 @@ void gd_egl_flush(DisplayChangeListener *dcl,
 if (vc->gfx.guest_fb.dmabuf &&
 !qemu_dmabuf_get_draw_submitted(vc->gfx.guest_fb.dmabuf)) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
-vc->gfx.guest_fb.dmabuf->draw_submitted = true;
+qemu_dmabuf_set_draw_submitted(vc->gfx.guest_fb.dmabuf, true);
 gtk_egl_set_scanout_mode(vc, true);
 gtk_widget_queue_draw_area(area, x, y, w, h);
 return;
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 2d70280803..9a3f3d0d71 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -63,7 +63,7 @@ void gd_gl_area_draw(VirtualConsole *vc)
 if (!qemu_dmabuf_get_draw_submitted(dmabuf)) {
 return;
 } else {
-dmabuf->draw_submitted = false;
+qemu_dmabuf_set_draw_submitted(dmabuf, false);
 }
 }
 #endif
@@ -291,7 +291,7 @@ void gd_gl_area_scanout_flush(DisplayChangeListener *dcl,
 if (vc->gfx.guest_fb.dmabuf &&
 !qemu_dmabuf_get_draw_submitted(vc->gfx.guest_fb.dmabuf)) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
-vc->gfx.guest_fb.dmabuf->draw_submitted = true;
+qemu_dmabuf_set_draw_submitted(vc->gfx.guest_fb.dmabuf, true);
 gtk_gl_area_set_scanout_mode(vc, true);
 }
 gtk_gl_area_queue_render(GTK_GL_AREA(vc->gfx.drawing_area));
diff --git a/ui/gtk.c b/ui/gtk.c
index 237c913b26..3a6832eb1b 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -598,11 +598,11 @@ void gd_hw_gl_flushed(void *vcon)
 QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
 int fence_fd;
 
-if (dmabuf->fence_fd >= 0) {
-fence_fd = qemu_dmabuf_get_fence_fd(dmabuf);
+fence_fd = qemu_dmabuf_get_fence_fd(dmabuf);
+if (fence_fd >= 0) {
 qemu_set_fd_handler(fence_fd, NULL, NULL, NULL);
 close(fence_fd);
-dmabuf->fence_fd = -1;
+qemu_dmabuf_set_fence_fd(dmabuf, -1);
 graphic_hw_gl_block(vc->gfx.dcl.con, false);
 }
 }
-- 
2.34.1




[PATCH v8 3/6] ui/console: Use qemu_dmabuf_get_..() helpers instead

2024-04-19 Thread dongwon . kim
From: Dongwon Kim 

This commit updates all instances where fields within the QemuDmaBuf
struct are directly accessed, replacing them with calls to these new
helper functions.

v6: fix typos in helper names in ui/spice-display.c

v7: removed prefix, "dpy_gl_" from all helpers

v8: Introduction of helpers was removed as those were already added
by the previous commit

Suggested-by: Marc-André Lureau 
Reviewed-by: Marc-André Lureau 
Cc: Philippe Mathieu-Daudé 
Cc: Daniel P. Berrangé 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 hw/display/vhost-user-gpu.c |  6 ++--
 hw/display/virtio-gpu-udmabuf.c |  7 +++--
 hw/vfio/display.c   | 15 +++---
 ui/console.c|  4 +--
 ui/dbus-console.c   |  9 --
 ui/dbus-listener.c  | 43 +---
 ui/egl-headless.c   | 23 ++-
 ui/egl-helpers.c| 47 ++-
 ui/gtk-egl.c| 48 ---
 ui/gtk-gl-area.c| 37 
 ui/gtk.c|  6 ++--
 ui/spice-display.c  | 50 +++--
 12 files changed, 187 insertions(+), 108 deletions(-)

diff --git a/hw/display/vhost-user-gpu.c b/hw/display/vhost-user-gpu.c
index 709c8a02a1..ea9a6c5d10 100644
--- a/hw/display/vhost-user-gpu.c
+++ b/hw/display/vhost-user-gpu.c
@@ -249,6 +249,7 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 case VHOST_USER_GPU_DMABUF_SCANOUT: {
 VhostUserGpuDMABUFScanout *m = &msg->payload.dmabuf_scanout;
 int fd = qemu_chr_fe_get_msgfd(&g->vhost_chr);
+int old_fd;
 QemuDmaBuf *dmabuf;
 
 if (m->scanout_id >= g->parent_obj.conf.max_outputs) {
@@ -262,8 +263,9 @@ vhost_user_gpu_handle_display(VhostUserGPU *g, VhostUserGpuMsg *msg)
 g->parent_obj.enable = 1;
 con = g->parent_obj.scanout[m->scanout_id].con;
 dmabuf = &g->dmabuf[m->scanout_id];
-if (dmabuf->fd >= 0) {
-close(dmabuf->fd);
+old_fd = qemu_dmabuf_get_fd(dmabuf);
+if (old_fd >= 0) {
+close(old_fd);
 dmabuf->fd = -1;
 }
 dpy_gl_release_dmabuf(con, dmabuf);
diff --git a/hw/display/virtio-gpu-udmabuf.c b/hw/display/virtio-gpu-udmabuf.c
index d51184d658..c90eba281e 100644
--- a/hw/display/virtio-gpu-udmabuf.c
+++ b/hw/display/virtio-gpu-udmabuf.c
@@ -206,6 +206,7 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 {
 struct virtio_gpu_scanout *scanout = &g->parent_obj.scanout[scanout_id];
 VGPUDMABuf *new_primary, *old_primary = NULL;
+uint32_t width, height;
 
 new_primary = virtio_gpu_create_dmabuf(g, scanout_id, res, fb, r);
 if (!new_primary) {
@@ -216,10 +217,10 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
 old_primary = g->dmabuf.primary[scanout_id];
 }
 
+width = qemu_dmabuf_get_width(&new_primary->buf);
+height = qemu_dmabuf_get_height(&new_primary->buf);
 g->dmabuf.primary[scanout_id] = new_primary;
-qemu_console_resize(scanout->con,
-new_primary->buf.width,
-new_primary->buf.height);
+qemu_console_resize(scanout->con, width, height);
 dpy_gl_scanout_dmabuf(scanout->con, &new_primary->buf);
 
 if (old_primary) {
diff --git a/hw/vfio/display.c b/hw/vfio/display.c
index 1aa440c663..4861c8161d 100644
--- a/hw/vfio/display.c
+++ b/hw/vfio/display.c
@@ -259,9 +259,13 @@ static VFIODMABuf *vfio_display_get_dmabuf(VFIOPCIDevice *vdev,
 
 static void vfio_display_free_one_dmabuf(VFIODisplay *dpy, VFIODMABuf *dmabuf)
 {
+int fd;
+
 QTAILQ_REMOVE(&dpy->dmabuf.bufs, dmabuf, next);
+
+fd = qemu_dmabuf_get_fd(&dmabuf->buf);
 dpy_gl_release_dmabuf(dpy->con, &dmabuf->buf);
-close(dmabuf->buf.fd);
+close(fd);
 g_free(dmabuf);
 }
 
@@ -286,6 +290,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 VFIOPCIDevice *vdev = opaque;
 VFIODisplay *dpy = vdev->dpy;
 VFIODMABuf *primary, *cursor;
+uint32_t width, height;
 bool free_bufs = false, new_cursor = false;
 
 primary = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_PRIMARY);
@@ -296,10 +301,12 @@ static void vfio_display_dmabuf_update(void *opaque)
 return;
 }
 
+width = qemu_dmabuf_get_width(&primary->buf);
+height = qemu_dmabuf_get_height(&primary->buf);
+
 if (dpy->dmabuf.primary != primary) {
 dpy->dmabuf.primary = primary;
-qemu_console_resize(dpy->con,
-primary->buf.width, primary->buf.height);
+qemu_console_resize(dpy->con, width, height);
 dpy_gl_scanout_dmabuf(dpy->con, &primary->buf);
 free_bufs = true;
 }
@@ -328,7 +335,7 @@ static void vfio_display_dmabuf_update(void *opaque)
 cursor->pos_updates = 0;
 }
 
-dpy_gl_update(dpy->con, 0, 0, primary->buf.width, primary->buf.height);
+dpy_gl_up

[PATCH 0/3] target/arm: Make the counter frequency default 1GHz for new CPUs, machines

2024-04-19 Thread Peter Maydell
In previous versions of the Arm architecture, the frequency of the
generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
In Armv8.6, the architecture standardized this frequency to 1GHz.
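
The relationship between the two frequencies and their tick periods is
plain arithmetic: period_ns = 10^9 / cntfrq_hz. A minimal sketch follows;
demo_cntfrq_period_ns() is a hypothetical name for illustration, not the
QEMU helper, and clamping to 1ns above 1GHz is an assumption of this sketch.

```c
#include <stdint.h>

#define DEMO_NANOSECONDS_PER_SECOND 1000000000ULL

/*
 * Convert a counter frequency in Hz to a tick period in ns.
 * Returns 0 for a zero frequency (treated here as "not set") and clamps
 * to 1ns for frequencies above 1GHz, where integer division would give 0.
 */
static inline uint64_t demo_cntfrq_period_ns(uint64_t cntfrq_hz)
{
    if (cntfrq_hz == 0) {
        return 0;
    }
    if (cntfrq_hz > DEMO_NANOSECONDS_PER_SECOND) {
        return 1;
    }
    return DEMO_NANOSECONDS_PER_SECOND / cntfrq_hz;
}
```

With the legacy 62500000 Hz value this gives 16ns, and with the Armv8.6
value of 1000000000 Hz it gives 1ns.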

Because there is no ID register feature field that indicates whether a
CPU is v8.6 or that it ought to have this counter frequency, we
implement this by changing our default CNTFRQ value for all CPUs, with
exceptions for backwards compatibility:

 * CPU types which we already implement will retain the old
   default value. None of these are v8.6 CPUs, so this is
   architecturally OK.
 * CPUs used in versioned machine types with a version of 9.0
   or earlier will retain the old default value.

The upshot is that the only CPU type that changes is 'max'; but any
new type we add in future (whether v8.6 or not) will also get the new
1GHz default (assuming we spot in code review any attempts to set
the ARM_FEATURE_BACKCOMPAT_CNTFRQ flag on new CPU types as a result
of cut-n-paste from an older CPU initfn ;-)).

It remains the case that the machine model can override the default
value via the 'cntfrq' QOM property (regardless of the CPU type).

Patch 1 is Paolo's "add the new versioned machine types" patch that
he sent out last month; patch 2 is some preliminary cleanup so that
we set the default cntfrq value in exactly one place, and patch 3
is the mechanics to set the default appropriately for the two
back-compat scenarios.

thanks
-- PMM

Paolo Bonzini (1):
  hw: Add compat machines for 9.1

Peter Maydell (2):
  target/arm: Refactor default generic timer frequency handling
  target/arm: Default to 1GHz cntfrq for 'max' and new CPUs

 include/hw/boards.h|  3 +++
 include/hw/i386/pc.h   |  3 +++
 target/arm/cpu.h   | 11 +
 target/arm/internals.h | 15 +---
 hw/arm/virt.c  | 11 +++--
 hw/core/machine.c  |  5 
 hw/i386/pc.c   |  3 +++
 hw/i386/pc_piix.c  | 17 +++---
 hw/i386/pc_q35.c   | 14 ++--
 hw/m68k/virt.c | 11 +++--
 hw/ppc/spapr.c | 17 +++---
 hw/s390x/s390-virtio-ccw.c | 14 +++-
 target/arm/cpu.c   | 47 ++
 target/arm/cpu64.c |  2 ++
 target/arm/helper.c| 16 ++---
 target/arm/tcg/cpu32.c |  4 
 target/arm/tcg/cpu64.c | 18 +++
 17 files changed, 173 insertions(+), 38 deletions(-)

-- 
2.34.1




[PATCH 1/3] hw: Add compat machines for 9.1

2024-04-19 Thread Peter Maydell
From: Paolo Bonzini 

Add 9.1 machine types for arm/i440fx/m68k/q35/s390x/spapr.
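
The chaining that the hunks below follow (each older machine version first
applies the next newer version's options, then appends its own compat
properties) can be sketched in isolation like this. All Demo* and demo_*
names are hypothetical, and the 9.0 table entry is a placeholder assumption,
since hw_compat_9_0 is empty in this patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PROPS 16
#define DEMO_N_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for GlobalProperty and MachineClass. */
typedef struct DemoProp {
    const char *driver;
    const char *property;
    const char *value;
} DemoProp;

typedef struct DemoMachineClass {
    const DemoProp *props[DEMO_MAX_PROPS];
    size_t nprops;
    const char *alias;
    bool is_default;
} DemoMachineClass;

/* Placeholder entry (assumption): the real 9.0 table starts out empty. */
static const DemoProp demo_compat_9_0[] = {
    { "demo-cpu", "backcompat-cntfrq", "true" },
};

static const DemoProp demo_compat_8_2[] = {
    { "migration", "zero-page-detection", "legacy" },
};

/* Append a compat table to the machine class, newest tables first. */
static void demo_add_compat(DemoMachineClass *mc, const DemoProp *props,
                            size_t len)
{
    for (size_t i = 0; i < len && mc->nprops < DEMO_MAX_PROPS; i++) {
        mc->props[mc->nprops++] = &props[i];
    }
}

/* Latest version: owns the alias and the default flag, no compat props. */
static void demo_machine_9_1_options(DemoMachineClass *mc)
{
    mc->alias = "pc";
    mc->is_default = true;
}

/* Previous version: start from 9.1, drop alias/default, add 9.0 compat. */
static void demo_machine_9_0_options(DemoMachineClass *mc)
{
    demo_machine_9_1_options(mc);
    mc->alias = NULL;
    mc->is_default = false;
    demo_add_compat(mc, demo_compat_9_0, DEMO_N_ELEMENTS(demo_compat_9_0));
}

/* Older version: start from 9.0, then add the 8.2 compat table. */
static void demo_machine_8_2_options(DemoMachineClass *mc)
{
    demo_machine_9_0_options(mc);
    demo_add_compat(mc, demo_compat_8_2, DEMO_N_ELEMENTS(demo_compat_8_2));
}
```

Because only the version directly below the newest one clears the alias and
default flag, adding the next release means moving those two lines down by
one function, which is what the pc_piix.c hunk does.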

Cc: Cornelia Huck 
Cc: Thomas Huth 
Cc: Harsh Prateek Bora 
Cc: Gavin Shan 
Signed-off-by: Paolo Bonzini 
Acked-by: Thomas Huth 
Reviewed-by: Cornelia Huck 
Reviewed-by: Harsh Prateek Bora 
Reviewed-by: Zhao Liu 
[PMM: fixed the typos Zhao Liu found in the s390 changes]
Signed-off-by: Peter Maydell 
---
 include/hw/boards.h|  3 +++
 include/hw/i386/pc.h   |  3 +++
 hw/arm/virt.c  | 11 +--
 hw/core/machine.c  |  3 +++
 hw/i386/pc.c   |  3 +++
 hw/i386/pc_piix.c  | 17 ++---
 hw/i386/pc_q35.c   | 14 --
 hw/m68k/virt.c | 11 +--
 hw/ppc/spapr.c | 17 ++---
 hw/s390x/s390-virtio-ccw.c | 14 +-
 10 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 8b8f6d5c00d..50e0cf4278e 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -425,6 +425,9 @@ struct MachineState {
 } \
 type_init(machine_initfn##_register_types)
 
+extern GlobalProperty hw_compat_9_0[];
+extern const size_t hw_compat_9_0_len;
+
 extern GlobalProperty hw_compat_8_2[];
 extern const size_t hw_compat_8_2_len;
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 27a68071d77..349f79df086 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -198,6 +198,9 @@ void pc_system_parse_ovmf_flash(uint8_t *flash_ptr, size_t flash_size);
 /* sgx.c */
 void pc_machine_init_sgx_epc(PCMachineState *pcms);
 
+extern GlobalProperty pc_compat_9_0[];
+extern const size_t pc_compat_9_0_len;
+
 extern GlobalProperty pc_compat_8_2[];
 extern const size_t pc_compat_8_2_len;
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9a913aeadb..c9119ef3847 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3223,10 +3223,17 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_machine_9_0_options(MachineClass *mc)
+static void virt_machine_9_1_options(MachineClass *mc)
 {
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(9, 0)
+DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
+
+static void virt_machine_9_0_options(MachineClass *mc)
+{
+virt_machine_9_1_options(mc);
+compat_props_add(mc->compat_props, hw_compat_9_0, hw_compat_9_0_len);
+}
+DEFINE_VIRT_MACHINE(9, 0)
 
 static void virt_machine_8_2_options(MachineClass *mc)
 {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 37ede0e7d4f..a92bec23147 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -33,6 +33,9 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "audio/audio.h"
 
+GlobalProperty hw_compat_9_0[] = {};
+const size_t hw_compat_9_0_len = G_N_ELEMENTS(hw_compat_9_0);
+
 GlobalProperty hw_compat_8_2[] = {
 { "migration", "zero-page-detection", "legacy"},
 { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5c21b0c4dbf..b0d818b2094 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -78,6 +78,9 @@
 { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
 { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
 
+GlobalProperty pc_compat_9_0[] = {};
+const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0);
+
 GlobalProperty pc_compat_8_2[] = {};
 const size_t pc_compat_8_2_len = G_N_ELEMENTS(pc_compat_8_2);
 
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 18ba0766092..8850c49c66a 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -513,13 +513,26 @@ static void pc_i440fx_machine_options(MachineClass *m)
  "Use a different south bridge than PIIX3");
 }
 
-static void pc_i440fx_9_0_machine_options(MachineClass *m)
+static void pc_i440fx_9_1_machine_options(MachineClass *m)
 {
 pc_i440fx_machine_options(m);
 m->alias = "pc";
 m->is_default = true;
 }
 
+DEFINE_I440FX_MACHINE(v9_1, "pc-i440fx-9.1", NULL,
+  pc_i440fx_9_1_machine_options);
+
+static void pc_i440fx_9_0_machine_options(MachineClass *m)
+{
+pc_i440fx_9_1_machine_options(m);
+m->alias = NULL;
+m->is_default = false;
+
+compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
+compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
+}
+
 DEFINE_I440FX_MACHINE(v9_0, "pc-i440fx-9.0", NULL,
   pc_i440fx_9_0_machine_options);
 
@@ -528,8 +541,6 @@ static void pc_i440fx_8_2_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 
 pc_i440fx_9_0_machine_options(m);
-m->alias = NULL;
-m->is_default = false;
 
 compat_props_add(m->compat_props, hw_compat_8_2, hw_compat_8_2_len);
 compat_props_add(m->compat_props, pc_compat_8_2, pc_compat_8_2_len);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index c7bc8a2041f..6e1180d4b60 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -365,12 +365,23 @@ static void pc_q35_machine_options(M

[PATCH 2/3] target/arm: Refactor default generic timer frequency handling

2024-04-19 Thread Peter Maydell
The generic timer frequency is settable by board code via a QOM
property "cntfrq", but otherwise defaults to 62.5MHz.  The way this
is done includes some complication resulting from how this was
originally a fixed value with no QOM property.  Clean it up:

 * always set cpu->gt_cntfrq_hz to some sensible value, whether
   the CPU has the generic timer or not, and whether it's system
   or user-only emulation
 * this means we can always use gt_cntfrq_hz, and never need
   the old GTIMER_SCALE define
 * set the default value in exactly one place, in the realize fn

The aim here is to pave the way for handling the ARMv8.6 requirement
that the generic timer frequency is always 1GHz.  We're going to do
that by having old CPU types keep their legacy-in-QEMU behaviour and
having the default for any new CPU types be a 1GHz rather than 62.5MHz
cntfrq, so we want the point where the default is decided to be in
one place, and in code, not in a DEFINE_PROP_UINT64() initializer.

This commit should have no behavioural changes.
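
The "0 means unset, resolve the default in one place" idea can be sketched
in isolation as follows. DemoTimerCPU and demo_timer_realize() are
hypothetical names for illustration; the real logic lives in
arm_cpu_realizefn() as shown in the diff below.

```c
#include <stdint.h>

/* 62.5MHz, i.e. a 16ns tick period. */
#define DEMO_TIMER_DEFAULT_HZ 62500000ULL

/* Hypothetical stand-in for the one ARMCPU field that matters here. */
typedef struct DemoTimerCPU {
    uint64_t gt_cntfrq_hz;   /* 0 = board did not set a value */
} DemoTimerCPU;

/*
 * The property default is 0, so the only place that knows the real
 * default is realize. After this runs the field is always non-zero.
 */
static void demo_timer_realize(DemoTimerCPU *cpu)
{
    if (!cpu->gt_cntfrq_hz) {
        cpu->gt_cntfrq_hz = DEMO_TIMER_DEFAULT_HZ;
    }
}
```

A board that sets its own value before realize keeps it, and everything
after realize can read gt_cntfrq_hz without a fallback path.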

Signed-off-by: Peter Maydell 
---
 target/arm/internals.h |  7 ---
 target/arm/cpu.c   | 31 +--
 target/arm/helper.c| 16 
 3 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index dd3da211a3f..74d4b1b0990 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -59,10 +59,11 @@ static inline bool excp_is_internal(int excp)
 || excp == EXCP_SEMIHOST;
 }
 
-/* Scale factor for generic timers, ie number of ns per tick.
- * This gives a 62.5MHz timer.
+/*
+ * Default frequency for the generic timer, in Hz.
+ * This is 62.5MHz, which gives a 16 ns tick period.
  */
-#define GTIMER_SCALE 16
+#define GTIMER_DEFAULT_HZ 62500000
 
 /* Bit definitions for the v7M CONTROL register */
 FIELD(V7M_CONTROL, NPRIV, 0, 1)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ab8d007a86c..b248b283423 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1381,9 +1381,12 @@ static void arm_cpu_initfn(Object *obj)
 }
 }
 
+/*
+ * 0 means "unset, use the default value". That default might vary depending
+ * on the CPU type, and is set in the realize fn.
+ */
 static Property arm_cpu_gt_cntfrq_property =
-DEFINE_PROP_UINT64("cntfrq", ARMCPU, gt_cntfrq_hz,
-   NANOSECONDS_PER_SECOND / GTIMER_SCALE);
+DEFINE_PROP_UINT64("cntfrq", ARMCPU, gt_cntfrq_hz, 0);
 
 static Property arm_cpu_reset_cbar_property =
 DEFINE_PROP_UINT64("reset-cbar", ARMCPU, reset_cbar, 0);
@@ -1829,6 +1832,17 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 return;
 }
 
+if (!cpu->gt_cntfrq_hz) {
+/*
+ * 0 means "the board didn't set a value, use the default".
+ * The default value of the generic timer frequency (as seen in
+ * CNTFRQ_EL0) is 62.5MHz, which corresponds to a period of 16ns.
+ * This is what you get (a) for a CONFIG_USER_ONLY CPU (b) if the
+ * board doesn't set it.
+ */
+cpu->gt_cntfrq_hz = GTIMER_DEFAULT_HZ;
+}
+
 #ifndef CONFIG_USER_ONLY
 /* The NVIC and M-profile CPU are two halves of a single piece of
  * hardware; trying to use one without the other is a command line
@@ -1877,18 +1891,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 }
 
 {
-uint64_t scale;
-
-if (arm_feature(env, ARM_FEATURE_GENERIC_TIMER)) {
-if (!cpu->gt_cntfrq_hz) {
-error_setg(errp, "Invalid CNTFRQ: %"PRId64"Hz",
-   cpu->gt_cntfrq_hz);
-return;
-}
-scale = gt_cntfrq_period_ns(cpu);
-} else {
-scale = GTIMER_SCALE;
-}
+uint64_t scale = gt_cntfrq_period_ns(cpu);
 
 cpu->gt_timer[GTIMER_PHYS] = timer_new(QEMU_CLOCK_VIRTUAL, scale,
arm_gt_ptimer_cb, cpu);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8bdbb404195..01cf231a861 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2461,6 +2461,13 @@ static const ARMCPRegInfo v6k_cp_reginfo[] = {
   .resetvalue = 0 },
 };
 
+static void arm_gt_cntfrq_reset(CPUARMState *env, const ARMCPRegInfo *opaque)
+{
+ARMCPU *cpu = env_archcpu(env);
+
+cpu->env.cp15.c14_cntfrq = cpu->gt_cntfrq_hz;
+}
+
 #ifndef CONFIG_USER_ONLY
 
 static CPAccessResult gt_cntfrq_access(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3215,13 +3222,6 @@ void arm_gt_hvtimer_cb(void *opaque)
 gt_recalc_timer(cpu, GTIMER_HYPVIRT);
 }
 
-static void arm_gt_cntfrq_reset(CPUARMState *env, const ARMCPRegInfo *opaque)
-{
-ARMCPU *cpu = env_archcpu(env);
-
-cpu->env.cp15.c14_cntfrq = cpu->gt_cntfrq_hz;
-}
-
 static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 /*
  * Note that CNTFRQ is purely reads-as-written for the benefit
@@ -3501,7 +3501,7 @@ static const 

[PATCH 3/3] target/arm: Default to 1GHz cntfrq for 'max' and new CPUs

2024-04-19 Thread Peter Maydell
In previous versions of the Arm architecture, the frequency of the
generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
In Armv8.6, the architecture standardized this frequency to 1GHz.

Because there is no ID register feature field that indicates whether
a CPU is v8.6 or that it ought to have this counter frequency, we
implement this by changing our default CNTFRQ value for all CPUs,
with exceptions for backwards compatibility:

 * CPU types which we already implement will retain the old
   default value. None of these are v8.6 CPUs, so this is
   architecturally OK.
 * CPUs used in versioned machine types with a version of 9.0
   or earlier will retain the old default value.

The upshot is that the only CPU type that changes is 'max'; but any
new type we add in future (whether v8.6 or not) will also get the new
1GHz default.

It remains the case that the machine model can override the default
value via the 'cntfrq' QOM property (regardless of the CPU type).
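
The precedence described above (an explicit cntfrq wins, then either
back-compat trigger selects 62.5MHz, otherwise 1GHz) can be sketched as
below. The DemoCntfrqCPU type and demo_resolve_cntfrq() are hypothetical;
this is not a copy of the realize hunk.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CNTFRQ_DEFAULT_HZ    1000000000ULL  /* 1GHz, Armv8.6 value */
#define DEMO_CNTFRQ_BACKCOMPAT_HZ 62500000ULL    /* 62.5MHz, legacy value */

/* Hypothetical stand-in for the ARMCPU state involved in the decision. */
typedef struct DemoCntfrqCPU {
    uint64_t gt_cntfrq_hz;            /* 0 = board did not set cntfrq */
    bool feature_backcompat_cntfrq;   /* models ARM_FEATURE_BACKCOMPAT_CNTFRQ */
    bool backcompat_cntfrq;           /* models the backcompat-cntfrq property */
} DemoCntfrqCPU;

/* Return the frequency the CPU would end up with after realize. */
static uint64_t demo_resolve_cntfrq(const DemoCntfrqCPU *cpu)
{
    if (cpu->gt_cntfrq_hz) {
        /* The machine model set an explicit value: always honour it. */
        return cpu->gt_cntfrq_hz;
    }
    if (cpu->feature_backcompat_cntfrq || cpu->backcompat_cntfrq) {
        /* Pre-existing CPU type, or machine version 9.0 or earlier. */
        return DEMO_CNTFRQ_BACKCOMPAT_HZ;
    }
    return DEMO_CNTFRQ_DEFAULT_HZ;
}
```

In this model 'max' on a new machine type sets neither flag and gets 1GHz,
while the same CPU on a 9.0 machine picks up the property from
hw_compat_9_0 and keeps 62.5MHz.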

Signed-off-by: Peter Maydell 
---
 target/arm/cpu.h   | 11 +++
 target/arm/internals.h | 12 ++--
 hw/core/machine.c  |  4 +++-
 target/arm/cpu.c   | 28 ++--
 target/arm/cpu64.c |  2 ++
 target/arm/tcg/cpu32.c |  4 
 target/arm/tcg/cpu64.c | 18 ++
 7 files changed, 70 insertions(+), 9 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 20d8257c853..4eeeac3fe94 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -953,6 +953,9 @@ struct ArchCPU {
  */
 bool host_cpu_probe_failed;
 
+/* QOM property to indicate we should use the back-compat CNTFRQ default */
+bool backcompat_cntfrq;
+
 /* Specify the number of cores in this CPU cluster. Used for the L2CTLR
  * register.
  */
@@ -2367,6 +2370,14 @@ enum arm_features {
 ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
 ARM_FEATURE_M_MAIN, /* M profile Main Extension */
 ARM_FEATURE_V8_1M, /* M profile extras only in v8.1M and later */
+/*
+ * ARM_FEATURE_BACKCOMPAT_CNTFRQ makes the CPU default cntfrq be 62.5MHz
+ * if the board doesn't set a value, instead of 1GHz. It is for backwards
+ * compatibility and used only with CPU definitions that were already
+ * in QEMU before we changed the default. It should not be set on any
+ * CPU types added in future.
+ */
+ARM_FEATURE_BACKCOMPAT_CNTFRQ, /* 62.5MHz timer default */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 74d4b1b0990..11d9ff0fc08 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -61,9 +61,17 @@ static inline bool excp_is_internal(int excp)
 
 /*
  * Default frequency for the generic timer, in Hz.
- * This is 62.5MHz, which gives a 16 ns tick period.
+ * ARMv8.6 and later CPUs architecturally must use a 1GHz timer; before
+ * that it was an IMPDEF choice, and QEMU initially picked 62.5MHz,
+ * which gives a 16ns tick period.
+ *
+ * We will use the back-compat value:
+ *  - for QEMU CPU types added before we standardized on 1GHz
+ *  - for versioned machine types with a version of 9.0 or earlier
+ * In any case, the machine model may override via the cntfrq property.
  */
-#define GTIMER_DEFAULT_HZ 62500000
+#define GTIMER_DEFAULT_HZ 1000000000
+#define GTIMER_BACKCOMPAT_HZ 62500000
 
 /* Bit definitions for the v7M CONTROL register */
 FIELD(V7M_CONTROL, NPRIV, 0, 1)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index a92bec23147..bd40483d880 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -33,7 +33,9 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "audio/audio.h"
 
-GlobalProperty hw_compat_9_0[] = {};
+GlobalProperty hw_compat_9_0[] = {
+{"arm-cpu", "backcompat-cntfrq", "true" },
+};
 const size_t hw_compat_9_0_len = G_N_ELEMENTS(hw_compat_9_0);
 
 GlobalProperty hw_compat_8_2[] = {
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index b248b283423..2c8160d6b74 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1388,6 +1388,11 @@ static void arm_cpu_initfn(Object *obj)
 static Property arm_cpu_gt_cntfrq_property =
 DEFINE_PROP_UINT64("cntfrq", ARMCPU, gt_cntfrq_hz, 0);
 
+/* True to default to the backwards-compatibility old CNTFRQ rather than 1GHz */
+static Property arm_cpu_backcompat_cntfrq_property =
+DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU,
+ backcompat_cntfrq, false);
+
 static Property arm_cpu_reset_cbar_property =
 DEFINE_PROP_UINT64("reset-cbar", ARMCPU, reset_cbar, 0);
 
@@ -1709,6 +1714,8 @@ void arm_cpu_post_init(Object *obj)
 qdev_property_add_static(DEVICE(cpu), &arm_cpu_gt_cntfrq_property);
 }
 
+qdev_property_add_static(DEVICE(obj), &arm_cpu_backcompat_cntfrq_property);
+
 if (kvm_enabled()) {
 kvm_arm_add_vcpu_properties(cpu);
 }
@@ -1834,13

Re: [PATCH v7 00/12] Enabling DCD emulation support in Qemu

2024-04-19 Thread fan
On Fri, Apr 19, 2024 at 02:24:36PM -0400, Gregory Price wrote:
> On Thu, Apr 18, 2024 at 04:10:51PM -0700, nifan@gmail.com wrote:
> > A git tree of this series can be found here (with one extra commit on top
> > for printing out accepted/pending extent list): 
> > https://github.com/moking/qemu/tree/dcd-v7
> > 
> > v6->v7:
> > 
> > 1. Fixed the dvsec range register issue mentioned in the cover letter in v6.
> >    Only relevant bits are set to mark the device ready (Patch 6). (Jonathan)
> > 2. Moved the if statement in cxl_setup_memory from Patch 6 to Patch 4. 
> > (Jonathan)
> > 3. Used MIN instead of if statement to get record_count in Patch 7. 
> > (Jonathan)
> > 4. Added "Reviewed-by" tag to Patch 7.
> > 5. Modified cxl_dc_extent_release_dry_run so the updated extent list can be
> >reused in cmd_dcd_release_dyn_cap to simplify the process in Patch 8. 
> > (Jørgen) 
> > 6. Added comments to indicate further "TODO" items in 
> > cmd_dcd_add_dyn_cap_rsp.
> > (Jonathan)
> > 7. Avoided irrelevant code reformat in Patch 8. (Jonathan)
> > 8. Modified QMP interfaces for adding/releasing DC extents to allow passing
> >tags, selection policy, flags in the interface. (Jonathan, Gregory)
> > 9. Redesigned the pending list so extents in the same requests are grouped
> > together. A new data structure is introduced to represent "extent group"
> > in pending list.  (Jonathan)
> > 10. Added support in QMP interface for "More" flag. 
> > 11. Check "Forced removal" flag for release request and not let it pass 
> > through.
> > 12. Removed the dynamic capacity log type from CxlEventLog definition in 
> > cxl.json
> >to avoid the side effect it may introduce to inject error to DC event 
> > log.
> >(Jonathan)
> > 13. Hard coded the event log type to dynamic capacity event log in QMP
> > interfaces. (Jonathan)
> > 14. Adding space in between "-1]". (Jonathan)
> > 15. Some minor comment fixes.
> > 
> > The code is tested with similar setup and has passed similar tests as listed
> > in the cover letter of v5[1] and v6[2].
> > Also, the code is tested with the latest DCD kernel patchset[3].
> > 
> > [1] Qemu DCD patchset v5: 
> > https://lore.kernel.org/linux-cxl/20240304194331.1586191-1-nifan@gmail.com/T/#t
> > [2] Qemu DCD patchset v6: 
> > https://lore.kernel.org/linux-cxl/20240325190339.696686-1-nifan@gmail.com/T/#t
> > [3] DCD kernel patches: 
> > https://lore.kernel.org/linux-cxl/20240324-dcd-type2-upstream-v1-0-b7b00d623...@intel.com/T/#m11c571e21c4fe17c7d04ec5c2c7bc7cbf2cd07e3
> >
> 
> added review to all patches, will hopefully be able to add a Tested-by
> tag early next week, along with a v1 RFC for MHD bit-tracking.
> 
> We've been testing v5/v6 for a bit, so I expect as soon as we get the
> MHD code ported over to v7 i'll ship a tested-by tag pretty quick.
> 
> The super-set release will complicate a few things but this doesn't
> look like a blocker on our end, just a change to how we track bits in a
> shared bit/bytemap.
> 

Hi Gregory,
Thanks for reviewing the patches so quickly.

No pressure, but I look forward to your MHD work. :)

Fan

> > 
> > Fan Ni (12):
> >   hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> > payload of identify memory device command
> >   hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> > and mailbox command support
> >   include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> > type3 memory devices
> >   hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> > devices
> >   hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
> > size instead of mr as argument
> >   hw/mem/cxl_type3: Add host backend and address space handling for DC
> > regions
> >   hw/mem/cxl_type3: Add DC extent list representative and get DC extent
> > list mailbox support
> >   hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
> > dynamic capacity response
> >   hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
> > extents
> >   hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
> >   hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
> >   hw/mem/cxl_type3: Allow to release extent superset in QMP interface
> > 
> >  hw/cxl/cxl-mailbox-utils.c  | 620 ++-
> >  hw/mem/cxl_type3.c  | 633 +---
> >  hw/mem/cxl_type3_stubs.c|  20 ++
> >  include/hw/cxl/cxl_device.h |  81 -
> >  include/hw/cxl/cxl_events.h |  18 +
> >  qapi/cxl.json   |  69 
> >  6 files changed, 1396 insertions(+), 45 deletions(-)
> > 
> > -- 
> > 2.43.0
> > 



[PATCH] target/arm: fix MPIDR value for ARM CPUs with SMT

2024-04-19 Thread Dorjoy Chowdhury
Some ARM CPUs advertise themselves as SMT by having the MT[24] bit set
to 1 in the MPIDR register. These CPUs have the thread id in Aff0[7:0]
bits, CPU id in Aff1[15:8] bits and cluster id in Aff2[23:16] bits in
MPIDR.

On the other hand, ARM CPUs without SMT have the MT[24] bit set to 0,
CPU id in Aff0[7:0] bits and cluster id in Aff1[15:8] bits in MPIDR.

The mpidr_read_val() function always reported non-SMT i.e., MT=0 style
MPIDR value which means it was wrong for the following CPUs with SMT
supported by QEMU:
- cortex-a55
- cortex-a76
- cortex-a710
- neoverse-v1
- neoverse-n1
- neoverse-n2
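
To make the two layouts concrete, here is a minimal sketch of how the
affinity fields could be packed for each case. It is an illustration of the
description above, not the code in this patch: demo_build_mpidr() is a
hypothetical name, the thread id is assumed to be 0 (one thread per core),
clustersz is assumed to be non-zero, and all other MPIDR bits are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MPIDR_MT     (1ULL << 24)  /* MT bit: affinity levels shifted up */
#define DEMO_AFF1_SHIFT   8
#define DEMO_AFF2_SHIFT   16

/*
 * Build the MT and affinity fields for the CPU with linear index idx.
 *
 * Non-SMT (MT=0): Aff0 = CPU id within cluster, Aff1 = cluster id.
 * SMT     (MT=1): Aff0 = thread id (assumed 0 here),
 *                 Aff1 = CPU id within cluster, Aff2 = cluster id.
 */
static uint64_t demo_build_mpidr(bool has_smt, uint32_t idx, uint8_t clustersz)
{
    uint64_t cluster = idx / clustersz;
    uint64_t core = idx % clustersz;

    if (has_smt) {
        return DEMO_MPIDR_MT |
               (cluster << DEMO_AFF2_SHIFT) |
               (core << DEMO_AFF1_SHIFT);
    }
    return (cluster << DEMO_AFF1_SHIFT) | core;
}
```

For example, index 5 with 4 CPUs per cluster is core 1 of cluster 1: the
non-SMT encoding is 0x101, while the SMT encoding moves both fields up one
level and sets bit 24, giving 0x1010100.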

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1608
Signed-off-by: Dorjoy Chowdhury 
---
 hw/arm/npcm7xx.c   |  2 +-
 hw/arm/sbsa-ref.c  | 21 -
 hw/arm/virt.c  | 18 +++---
 target/arm/cpu.c   | 14 --
 target/arm/cpu.h   |  5 -
 target/arm/helper.c|  4 
 target/arm/tcg/cpu64.c | 12 
 7 files changed, 64 insertions(+), 12 deletions(-)

diff --git a/hw/arm/npcm7xx.c b/hw/arm/npcm7xx.c
index cc68b5d8f1..9d5dcf1a3f 100644
--- a/hw/arm/npcm7xx.c
+++ b/hw/arm/npcm7xx.c
@@ -487,7 +487,7 @@ static void npcm7xx_realize(DeviceState *dev, Error **errp)
 /* CPUs */
 for (i = 0; i < nc->num_cpus; i++) {
 object_property_set_int(OBJECT(&s->cpu[i]), "mp-affinity",
-arm_build_mp_affinity(i, NPCM7XX_MAX_NUM_CPUS),
+arm_build_mp_affinity(ARM_CPU(&s->cpu[i]), i, NPCM7XX_MAX_NUM_CPUS),
 &error_abort);
 object_property_set_int(OBJECT(&s->cpu[i]), "reset-cbar",
 NPCM7XX_GIC_CPU_IF_ADDR, &error_abort);
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index f5709d6c14..dd42788f23 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -147,10 +147,10 @@ static const int sbsa_ref_irqmap[] = {
 [SBSA_GWDT_WS0] = 16,
 };
 
-static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
+static uint64_t sbsa_ref_cpu_mp_affinity(ARMCPU *cpu, int idx)
 {
 uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
-return arm_build_mp_affinity(idx, clustersz);
+return arm_build_mp_affinity(cpu, idx, clustersz);
 }
 
 static void sbsa_fdt_add_gic_node(SBSAMachineState *sms)
@@ -254,7 +254,7 @@ static void create_fdt(SBSAMachineState *sms)
 char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
 ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
 CPUState *cs = CPU(armcpu);
-uint64_t mpidr = sbsa_ref_cpu_mp_affinity(sms, cpu);
+uint64_t mpidr = sbsa_ref_cpu_mp_affinity(armcpu, cpu);
 
 qemu_fdt_add_subnode(sms->fdt, nodename);
 qemu_fdt_setprop_u64(sms->fdt, nodename, "reg", mpidr);
@@ -816,8 +816,9 @@ static void sbsa_ref_init(MachineState *machine)
 static const CPUArchIdList *sbsa_ref_possible_cpu_arch_ids(MachineState *ms)
 {
 unsigned int max_cpus = ms->smp.max_cpus;
-SBSAMachineState *sms = SBSA_MACHINE(ms);
 int n;
+Object *cpuobj;
+ARMCPU *armcpu;
 
 if (ms->possible_cpus) {
 assert(ms->possible_cpus->len == max_cpus);
@@ -827,13 +828,23 @@ static const CPUArchIdList *sbsa_ref_possible_cpu_arch_ids(MachineState *ms)
 ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
   sizeof(CPUArchId) * max_cpus);
 ms->possible_cpus->len = max_cpus;
+
+/*
+ * Instantiate a temporary CPU object to build mp_affinity
+ * of the possible CPUs.
+ */
+cpuobj = object_new(ms->cpu_type);
+armcpu = ARM_CPU(cpuobj);
+
 for (n = 0; n < ms->possible_cpus->len; n++) {
 ms->possible_cpus->cpus[n].type = ms->cpu_type;
 ms->possible_cpus->cpus[n].arch_id =
-sbsa_ref_cpu_mp_affinity(sms, n);
+sbsa_ref_cpu_mp_affinity(armcpu, n);
 ms->possible_cpus->cpus[n].props.has_thread_id = true;
 ms->possible_cpus->cpus[n].props.thread_id = n;
 }
+
+object_unref(cpuobj);
 return ms->possible_cpus;
 }
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9a913aead..fe6d13c08f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1703,7 +1703,7 @@ void virt_machine_done(Notifier *notifier, void *data)
 virt_build_smbios(vms);
 }
 
-static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
+static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, ARMCPU *cpu, int idx)
 {
 uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
 VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1723,7 +1723,7 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
 clustersz = GICV3_TARGETLIST_BITS;
 }
 }
-return arm_build_mp_affinity(idx, clustersz);
+return arm_build_mp_affinity(cpu, idx, clustersz);
 }
 
 static inline bool *virt_get_high_memmap_enabled(VirtMachineState *vms,
@@ -2683,6 +2683,8 @@ static const CPUArchId

Re: [PATCH v7 00/12] Enabling DCD emulation support in Qemu

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:51PM -0700, nifan@gmail.com wrote:
> A git tree of this series can be found here (with one extra commit on top
> for printing out accepted/pending extent list): 
> https://github.com/moking/qemu/tree/dcd-v7
> 
> v6->v7:
> 
> 1. Fixed the dvsec range register issue mentioned in the cover letter in 
> v6.
>Only relevant bits are set to mark the device ready (Patch 6). (Jonathan)
> 2. Moved the if statement in cxl_setup_memory from Patch 6 to Patch 4. 
> (Jonathan)
> 3. Used MIN instead of if statement to get record_count in Patch 7. (Jonathan)
> 4. Added "Reviewed-by" tag to Patch 7.
> 5. Modified cxl_dc_extent_release_dry_run so the updated extent list can be
>reused in cmd_dcd_release_dyn_cap to simplify the process in Patch 8. 
> (Jørgen) 
> 6. Added comments to indicate further "TODO" items in cmd_dcd_add_dyn_cap_rsp.
> (Jonathan)
> 7. Avoided irrelevant code reformat in Patch 8. (Jonathan)
> 8. Modified QMP interfaces for adding/releasing DC extents to allow passing
>tags, selection policy, flags in the interface. (Jonathan, Gregory)
> 9. Redesigned the pending list so extents in the same requests are grouped
> together. A new data structure is introduced to represent "extent group"
> in pending list.  (Jonathan)
> 10. Added support in QMP interface for "More" flag. 
> 11. Check "Forced removal" flag for release request and not let it pass 
> through.
> 12. Removed the dynamic capacity log type from CxlEventLog definition in 
> cxl.json
>to avoid the side effect it may introduce to inject error to DC event log.
>(Jonathan)
> 13. Hard coded the event log type to dynamic capacity event log in QMP
> interfaces. (Jonathan)
> 14. Adding space in between "-1]". (Jonathan)
> 15. Some minor comment fixes.
> 
> The code is tested with similar setup and has passed similar tests as listed
> in the cover letter of v5[1] and v6[2].
> Also, the code is tested with the latest DCD kernel patchset[3].
> 
> [1] Qemu DCD patchset v5: 
> https://lore.kernel.org/linux-cxl/20240304194331.1586191-1-nifan@gmail.com/T/#t
> [2] Qemu DCD patchset v6: 
> https://lore.kernel.org/linux-cxl/20240325190339.696686-1-nifan@gmail.com/T/#t
> [3] DCD kernel patches: 
> https://lore.kernel.org/linux-cxl/20240324-dcd-type2-upstream-v1-0-b7b00d623...@intel.com/T/#m11c571e21c4fe17c7d04ec5c2c7bc7cbf2cd07e3
>

Added review to all patches; will hopefully be able to add a Tested-by
tag early next week, along with a v1 RFC for MHD bit-tracking.

We've been testing v5/v6 for a bit, so I expect as soon as we get the
MHD code ported over to v7 I'll ship a Tested-by tag pretty quickly.

The super-set release will complicate a few things but this doesn't
look like a blocker on our end, just a change to how we track bits in a
shared bit/bytemap.

> 
> Fan Ni (12):
>   hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> payload of identify memory device command
>   hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> and mailbox command support
>   include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> type3 memory devices
>   hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> devices
>   hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
> size instead of mr as argument
>   hw/mem/cxl_type3: Add host backend and address space handling for DC
> regions
>   hw/mem/cxl_type3: Add DC extent list representative and get DC extent
> list mailbox support
>   hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
> dynamic capacity response
>   hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
> extents
>   hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
>   hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
>   hw/mem/cxl_type3: Allow to release extent superset in QMP interface
> 
>  hw/cxl/cxl-mailbox-utils.c  | 620 ++-
>  hw/mem/cxl_type3.c  | 633 +---
>  hw/mem/cxl_type3_stubs.c|  20 ++
>  include/hw/cxl/cxl_device.h |  81 -
>  include/hw/cxl/cxl_events.h |  18 +
>  qapi/cxl.json   |  69 
>  6 files changed, 1396 insertions(+), 45 deletions(-)
> 
> -- 
> 2.43.0
> 



Re: [PATCH v2 3/4] docs/system/target-sparc: Improve the Sparc documentation

2024-04-19 Thread Peter Maydell
On Fri, 19 Apr 2024 at 09:49, Thomas Huth  wrote:
>
> Add some words about how to enable or disable boolean features,
> and remove the note about a Linux kernel being available on the
> QEMU website (they have been removed long ago already), and the
> note about NetBSD and OpenBSD still having issues (they should
> work fine nowadays).
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2141
> Signed-off-by: Thomas Huth 
> ---
>  docs/system/target-sparc.rst | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/docs/system/target-sparc.rst b/docs/system/target-sparc.rst
> index 9ec8c90c14..54bd8b6ead 100644
> --- a/docs/system/target-sparc.rst
> +++ b/docs/system/target-sparc.rst
> @@ -27,6 +27,11 @@ architecture machines:
>  The emulation is somewhat complete. SMP up to 16 CPUs is supported, but
>  Linux limits the number of usable CPUs to 4.
>
> +The list of available CPUs can be viewed by starting QEMU with ``-cpu help``.
> +Optional boolean features can be added with a "+" in front of the feature 
> name,
> +or disabled with a "-" in front of the name, for example
> +``-cpu TI-SuperSparc-II,+float128``.
> +
>  QEMU emulates the following sun4m peripherals:
>
>  -  IOMMU
> @@ -55,8 +60,5 @@ OpenBIOS is a free (GPL v2) portable firmware 
> implementation. The goal
>  is to implement a 100% IEEE 1275-1994 (referred to as Open Firmware)
>  compliant firmware.
>
> -A sample Linux 2.6 series kernel and ram disk image are available on the
> -QEMU web site. There are still issues with NetBSD and OpenBSD, but most
> -kernel versions work. Please note that currently older Solaris kernels
> -don't work probably due to interface issues between OpenBIOS and
> -Solaris.
> +Please note that currently older Solaris kernels don't work probably due
> +to interface issues between OpenBIOS and Solaris.

If we're touching this text anyway I guess we could clean up the
grammar: "don't work; this is probably due to".

thanks
-- PMM



Re: [PATCH v7 12/12] hw/mem/cxl_type3: Allow to release extent superset in QMP interface

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:11:03PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Before the change, the QMP interface used for add/release DC extents
> only allows to release an extent whose DPA range is contained by a single
> accepted extent in the device.
> 
> With the change, we relax the constraints.  As long as the DPA range of
> the extent is covered by accepted extents, we allow the release.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/mem/cxl_type3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Gregory Price 

> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index a3e1a5de25..9e725647f1 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1941,7 +1941,7 @@ static void 
> qmp_cxl_process_dynamic_capacity_prescriptive(const char *path,
> "cannot release extent with pending DPA range");
>  return;
>  }
> -if (!cxl_extents_contains_dpa_range(&dcd->dc.extents, dpa, len)) 
> {
> +if (!ct3_test_region_block_backed(dcd, dpa, len)) {
>  error_setg(errp,
> "cannot release extent with non-existing DPA 
> range");
>  return;
> -- 
> 2.43.0
> 



Re: [PATCH v7 11/12] hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:11:02PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> With the change, we extend the extent release mailbox command processing
> to allow more flexible release. As long as the DPA range of the extent to
> release is covered by accepted extent(s) in the device, the release can be
> performed.
> 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c | 21 -
>  1 file changed, 8 insertions(+), 13 deletions(-)
>

Hmmm.  This will complicate MHD accounting, but it looks ok to me as-is.

Reviewed-by: Gregory Price 
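
For anyone following along, the rule this patch relies on (a release is fine as long as every block of the DPA range is backed by some accepted extent, regardless of how many extents that takes) can be sketched with a per-block map. This is a standalone toy, not the QEMU code: TOY_BLOCK_SIZE, struct toy_region and the toy_* helpers are all made up for illustration, and the patch itself calls ct3_test_region_block_backed() for the real check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 0x1000ULL  /* hypothetical block granularity */
#define TOY_NUM_BLOCKS 64         /* hypothetical region size in blocks */

struct toy_region {
    uint64_t base;                /* first DPA of the region */
    bool backed[TOY_NUM_BLOCKS];  /* true if an accepted extent covers the block */
};

/* Translate a DPA range into block indices; false if unaligned or out of range. */
static bool toy_range_to_blocks(const struct toy_region *r, uint64_t dpa,
                                uint64_t len, uint64_t *first, uint64_t *count)
{
    if (len == 0 || dpa < r->base) {
        return false;
    }
    if ((dpa - r->base) % TOY_BLOCK_SIZE || len % TOY_BLOCK_SIZE) {
        return false;
    }
    *first = (dpa - r->base) / TOY_BLOCK_SIZE;
    *count = len / TOY_BLOCK_SIZE;
    return *first < TOY_NUM_BLOCKS && *count <= TOY_NUM_BLOCKS - *first;
}

/* Mark every block of an accepted extent as backed. */
static void toy_accept_extent(struct toy_region *r, uint64_t dpa, uint64_t len)
{
    uint64_t first = 0, count = 0;
    bool ok = toy_range_to_blocks(r, dpa, len, &first, &count);

    assert(ok);
    (void)ok;
    for (uint64_t i = 0; i < count; i++) {
        r->backed[first + i] = true;
    }
}

/* A release range is acceptable iff every block in it is backed. */
static bool toy_range_backed(const struct toy_region *r, uint64_t dpa,
                             uint64_t len)
{
    uint64_t first = 0, count = 0;

    if (!toy_range_to_blocks(r, dpa, len, &first, &count)) {
        return false;
    }
    for (uint64_t i = 0; i < count; i++) {
        if (!r->backed[first + i]) {
            return false;
        }
    }
    return true;
}
```

With two adjacent accepted extents, a release range that straddles both passes this check, which is exactly the case the old single-extent rule rejected.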

> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 57f1ce9cce..89f0ab8116 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -1704,6 +1704,13 @@ static CXLRetCode 
> cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
>  dpa = in->updated_entries[i].start_dpa;
>  len = in->updated_entries[i].len;
>  
> +/* Check if the DPA range is not fully backed with valid extents */
> +if (!ct3_test_region_block_backed(ct3d, dpa, len)) {
> +ret = CXL_MBOX_INVALID_PA;
> +goto free_and_exit;
> +}
> +
> +/* After this point, extent overflow is the only error can happen */
>  while (len > 0) {
>  QTAILQ_FOREACH(ent, updated_list, node) {
>  range_init_nofail(&range, ent->start_dpa, ent->len);
> @@ -1718,14 +1725,7 @@ static CXLRetCode 
> cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
>  if (range_contains(&range, dpa + len - 1)) {
>  len2 = ent_start_dpa + ent_len - dpa - len;
>  } else {
> -/*
> - * TODO: we reject the attempt to remove an extent
> - * that overlaps with multiple extents in the device
> - * for now. We will allow it once superset release
> - * support is added.
> - */
> -ret = CXL_MBOX_INVALID_PA;
> -goto free_and_exit;
> +dpa = ent_start_dpa + ent_len;
>  }
>  len_done = ent_len - len1 - len2;
>  
> @@ -1752,14 +1752,9 @@ static CXLRetCode 
> cxl_dc_extent_release_dry_run(CXLType3Dev *ct3d,
>  }
>  
>  len -= len_done;
> -/* len == 0 here until superset release is added */
>  break;
>  }
>  }
> -if (len) {
> -ret = CXL_MBOX_INVALID_PA;
> -goto free_and_exit;
> -}
>  }
>  }
>  free_and_exit:
> -- 
> 2.43.0
> 



Re: [PATCH v7 08/12] hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release dynamic capacity response

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:59PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Per CXL spec 3.1, two mailbox commands are implemented:
> Add Dynamic Capacity Response (Opcode 4802h) 8.2.9.9.9.3, and
> Release Dynamic Capacity (Opcode 4803h) 8.2.9.9.9.4.
> 
> For the process of the above two commands, we use two-pass approach.
> Pass 1: Check whether the input payload is valid or not; if not, skip
> Pass 2 and return mailbox process error.
> Pass 2: Do the real work--add or release extents, respectively.
> 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 394 
>  hw/mem/cxl_type3.c  |  11 +
>  include/hw/cxl/cxl_device.h |   4 +
>  3 files changed, 409 insertions(+)
> 

Reviewed-by: Gregory Price 



Re: [PATCH v7 09/12] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:11:00PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> To simulate FM functionalities for initiating Dynamic Capacity Add
> (Opcode 5604h) and Dynamic Capacity Release (Opcode 5605h) as in CXL spec
> r3.1 7.6.7.6.5 and 7.6.7.6.6, we implemented two QMP interfaces to issue
> add/release dynamic capacity extents requests.
> 
> With the change, we allow to release an extent only when its DPA range
> is contained by a single accepted extent in the device. That is to say,
> extent superset release is not supported yet.
> 
...
> 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  |  62 +--
>  hw/mem/cxl_type3.c  | 311 +++-
>  hw/mem/cxl_type3_stubs.c|  20 +++
>  include/hw/cxl/cxl_device.h |  22 +++
>  include/hw/cxl/cxl_events.h |  18 +++
>  qapi/cxl.json   |  69 
>  6 files changed, 489 insertions(+), 13 deletions(-)
> 

Reviewed-by: Gregory Price 



Re: [PATCH v2 3/4] docs/system/target-sparc: Improve the Sparc documentation

2024-04-19 Thread Mark Cave-Ayland

On 19/04/2024 09:48, Thomas Huth wrote:


Add some words about how to enable or disable boolean features,
and remove the note about a Linux kernel being available on the
QEMU website (they have been removed long ago already), and the
note about NetBSD and OpenBSD still having issues (they should
work fine nowadays).

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2141
Signed-off-by: Thomas Huth 
---
  docs/system/target-sparc.rst | 12 +++-
  1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/system/target-sparc.rst b/docs/system/target-sparc.rst
index 9ec8c90c14..54bd8b6ead 100644
--- a/docs/system/target-sparc.rst
+++ b/docs/system/target-sparc.rst
@@ -27,6 +27,11 @@ architecture machines:
  The emulation is somewhat complete. SMP up to 16 CPUs is supported, but
  Linux limits the number of usable CPUs to 4.
  
+The list of available CPUs can be viewed by starting QEMU with ``-cpu help``.

+Optional boolean features can be added with a "+" in front of the feature name,
+or disabled with a "-" in front of the name, for example
+``-cpu TI-SuperSparc-II,+float128``.
+
  QEMU emulates the following sun4m peripherals:
  
  -  IOMMU

@@ -55,8 +60,5 @@ OpenBIOS is a free (GPL v2) portable firmware implementation. 
The goal
  is to implement a 100% IEEE 1275-1994 (referred to as Open Firmware)
  compliant firmware.
  
-A sample Linux 2.6 series kernel and ram disk image are available on the

-QEMU web site. There are still issues with NetBSD and OpenBSD, but most
-kernel versions work. Please note that currently older Solaris kernels
-don't work probably due to interface issues between OpenBIOS and
-Solaris.
+Please note that currently older Solaris kernels don't work probably due
+to interface issues between OpenBIOS and Solaris.


Reviewed-by: Mark Cave-Ayland 


ATB,

Mark.




Re: [edk2-devel] [PATCH v3 5/6] target/arm: Do memory type alignment check when translation disabled

2024-04-19 Thread Ard Biesheuvel
On Fri, 19 Apr 2024 at 18:36, Ard Biesheuvel  wrote:
>
> On Fri, 19 Apr 2024 at 18:09, Jonathan Cameron via groups.io
>  wrote:
> >
> > On Fri, 19 Apr 2024 13:52:07 +0200
> > Gerd Hoffmann  wrote:
> >
> > >   Hi,
> > >
> > > > Gerd, any ideas?  Maybe I need something subtly different in my
> > > > edk2 build?  I've not looked at this bit of the qemu infrastructure
> > > > before - is there a document on how that image is built?
> > >
> > > There is roms/Makefile for that.
> > >
> > > make -C roms help
> > > make -C roms efi
> > >
> > > So easiest would be to just update the edk2 submodule to what you
> > > need, then rebuild.
> > >
> > > The build is handled by the roms/edk2-build.py script,
> > > with the build configuration being in roms/edk2-build.config.
> > > That is usable outside the qemu source tree too, i.e. like this:
> > >
> > >   python3 /path/to/qemu.git/roms/edk2-build.py \
> > > --config /path/to/qemu.git/roms/edk2-build.config \
> > > --core /path/to/edk2.git \
> > > --match armvirt \
> > > --silent --no-logs
> > >
> > > That'll try to place the images build in "../pc-bios", so maybe better
> > > work with a copy of the config file where you adjust this.
> > >
> > > HTH,
> > >   Gerd
> > >
> >
> > Thanks Gerd!
> >
> > So the builds are very similar via the two methods...
> > However - the QEMU build sets -D CAVIUM_ERRATUM_27456=TRUE
> >
> > And that's the difference - with that set for my other builds the alignment
> > problems go away...
> >
> > Any idea why we have that set in roms/edk2-build.config?
> > Superficially it seems rather unlikely anyone cares about thunderx1
> > (if they do we need to get them some new hardware with fresh bugs)
> > bugs now and this config file was only added last year.
> >
> >
> > However, the last comment in Ard's commit message below seems
> > highly likely to be relevant!
> >
> > Chasing through Ard's patch it has the side effect of dropping
> > an override of a requirement for strict alignment.
> > So without the erratum
> > DEFINE GCC_AARCH64_CC_XIPFLAGS = -mstrict-align -mgeneral-regs-only
> > is replaced with
> >  [BuildOptions]
> > +!if $(CAVIUM_ERRATUM_27456) == TRUE^M
> > +  GCC:*_*_AARCH64_PP_FLAGS = -DCAVIUM_ERRATUM_27456^M
> > +!else^M
> >GCC:*_*_AARCH64_CC_XIPFLAGS ==
> > +!endif^M
> >
> > The edk2 commit that added this was the following +CC Ard.
> >
> > Given I wasn't sure of the syntax of that file I set it
> > manually to the original value and indeed it works.
> >
> >
> > commit ec54ce1f1ab41b92782b37ae59e752fff0ef9c41
> > Author: Ard Biesheuvel 
> > Date:   Wed Jan 4 16:51:35 2023 +0100
> >
> > ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
> >
> > The early ID map used by ArmVirtQemu uses ASID scoped non-global
> > mappings, as this allows us to switch to the permanent ID map seamlessly
> > without the need for explicit TLB maintenance.
> >
> > However, this triggers a known erratum on ThunderX, which does not
> > tolerate non-global mappings that are executable at EL1, as this appears
> > to result in I-cache corruption. (Linux disables the KPTI based Meltdown
> > mitigation on ThunderX for the same reason)
> >
> > So work around this, by detecting the CPU implementor and part number,
> > and proceeding without the early ID map if a ThunderX CPU is detected.
> >
> > Note that this requires the C code to be built with strict alignment
> > again, as we may end up executing it with the MMU and caches off.
> >
> > Signed-off-by: Ard Biesheuvel 
> > Acked-by: Laszlo Ersek 
> > Tested-by: dann frazier 
> >
> > Test case is
> > qemu-system-aarch64 -M virt,virtualization=true, -m 4g -cpu cortex-a76 \
> > -bios QEMU_EFI.fd -d int
> >
> > Which gets alignment faults since:
> > https://lore.kernel.org/all/20240301204110.656742-6-richard.hender...@linaro.org/
> >
> > So my feeling here is EDK2 should either have yet another config for QEMU 
> > as a host
> > or should always set the alignment without needing to pick the CAVIUM 27456 
> > errata
> > which I suspect will get dropped soonish anyway if anyone ever cleans up
> > old errata.
> >
>
> This code was never really intended for execution at EL2, but it
> happened to work, partially because of TCG's lack of strict alignment
> checking when the MMU is off.
>
> Those assumptions no longer hold, so yes, let's get this fixed properly.
>
> Given VHE and nested virt (which will likely imply VHE in practice), I
> would like to extend this functionality (i.e., the use of preliminary
> page tables in NOR flash) to EL2 as well, but with VHE enabled. This
> means we can still elide TLB maintenance (and BBM checks) by using
> different ASIDs, and otherwise, fall back to entering with the MMU off
> if VHE is not available. In that case, we should enforce strict
> alignment too, so that needs to be fixed regardless.
>
> I'll try to code something up and send it round. In the mean time,
> feel free to propose a minimal pa

Re: [PATCH 1/5] docs/system/arm/emulation.rst: Add missing implemented features

2024-04-19 Thread Peter Maydell
On Thu, 18 Apr 2024 at 16:20, Peter Maydell  wrote:
>
> As of version DDI0487K.a of the Arm ARM, some architectural features
> which previously didn't have official names have been named.  Add
> these to the list of features which QEMU's TCG emulation supports.
> Mostly these are features which we thought of as part of baseline 8.0
> support.  For SVE and SVE2, the names have been brought into line
> with the FEAT_* naming convention of other extensions, and some
> sub-components split into separate FEAT_ items.  In a few cases (eg
> FEAT_CCIDX, FEAT_DPB2) the omission from our list was just an oversight.
>
> Signed-off-by: Peter Maydell 
> ---
>  docs/system/arm/emulation.rst | 37 +--
>  1 file changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
> index 2a7bbb82dc4..9388c7dd553 100644
> --- a/docs/system/arm/emulation.rst
> +++ b/docs/system/arm/emulation.rst
> @@ -8,13 +8,25 @@ Armv8 versions of the A-profile architecture. It also has 
> support for
>  the following architecture extensions:
>
>  - FEAT_AA32BF16 (AArch32 BFloat16 instructions)
> +- FEAT_AA32EL0 (Support for AArch32 at EL0)
> +- FEAT_AA32EL1 (Support for AArch32 at EL1)
> +- FEAT_AA32EL2 (Support for AArch32 at EL2)
> +- FEAT_AA32EL3 (Support for AArch32 at EL3)
>  - FEAT_AA32HPD (AArch32 hierarchical permission disables)
>  - FEAT_AA32I8MM (AArch32 Int8 matrix multiplication instructions)
> +- FEAT_AA64EL0 (Support for AArch64 at EL0)
> +- FEAT_AA64EL1 (Support for AArch64 at EL1)
> +- FEAT_AA64EL2 (Support for AArch64 at EL2)
> +- FEAT_AA64EL3 (Support for AArch64 at EL3)
> +- FEAT_AdvSIMD (Advanced SIMD Extension)
>  - FEAT_AES (AESD and AESE instructions)
> +- FEAT_ASID16 (16 bit ASID)
>  - FEAT_BBM at level 2 (Translation table break-before-make levels)
>  - FEAT_BF16 (AArch64 BFloat16 instructions)
>  - FEAT_BTI (Branch Target Identification)
> +- FEAT_CCIDX (Extended cache index)
>  - FEAT_CRC32 (CRC32 instructions)
> +- FEAT_Crypto (Cryptographic Extension)

I missed one here: we can also add
FEAT_Armv9_Crypto (Armv9 Cryptographic Extension)

(Like FEAT_Crypto, this is an "umbrella" feature naming the
combination of various other crypto related features, all of which
we already implement.)

-- PMM



Re: [PATCH v5 0/3] Add support for the RAPL MSRs series

2024-04-19 Thread Paolo Bonzini
On Wed, Apr 17, 2024 at 7:58 PM Daniel P. Berrangé  wrote:
> > > However, one question remains unanswered pointing the issue with the
> > > location of "/var/local/run/qemu-vmsr-helper.sock", created by
> > > compute_default_paths(). QEMU is not allowed to reach the socket here.
> >
> > If I understand correctly the question, that is expected. This is a
> > privileged functionality and therefore it requires manual intervention
> > to change the owner of the socket and allow QEMU to access it.
>
> In the systemd case, it will set the owner and mode, but in the
> non-systemd case, I wonder if it is worth making this helper program
> have "--socket-owner" and "--socket-mode" args, so it can create
> the socket with the right mode/owner immediately, rather than
> expecting the admin to manually chmod+chown after starting the
> helper

I think a better idea would be to contribute them to
systemd-socket-activate, and just launch the helper that way. It's
mostly a testing tool, but tbh if you're not using systemd you're on
your own. If you write an init script for example, that would be the
place where you put the chmod/chown.

Paolo




Re: [PATCH v7 06/12] hw/mem/cxl_type3: Add host backend and address space handling for DC regions

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:57PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Add (file/memory backed) host backend for DCD. All the dynamic capacity
> regions will share a single, large enough host backend. Set up address
> space for DC regions to support read/write operations to dynamic capacity
> for DCD.
> 
> With the change, the following support is added:
> 1. Add a new property to type3 device "volatile-dc-memdev" to point to host
>memory backend for dynamic capacity. Currently, all DC regions share one
>host backend;
> 2. Add namespace for dynamic capacity for read/write support;
> 3. Create cdat entries for each dynamic capacity region.
> 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  |  16 ++--
>  hw/mem/cxl_type3.c  | 172 +---
>  include/hw/cxl/cxl_device.h |   8 ++
>  3 files changed, 160 insertions(+), 36 deletions(-)
> 

A couple general comments in line for discussion, but patch looks good
otherwise. Notes are mostly on improvements we could make that should
not block this patch.

Reviewed-by: Gregory Price 

>  
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index a1fe268560..ac87398089 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -45,7 +45,8 @@ enum {
>  
>  static void ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
>int dsmad_handle, uint64_t size,
> -  bool is_pmem, uint64_t dpa_base)
> +  bool is_pmem, bool is_dynamic,
> +  uint64_t dpa_base)

We should probably change the is_* fields into a flags field and do some
error checking on the combination of flags.

>  {
>  CDATDsmas *dsmas;
>  CDATDslbis *dslbis0;
> @@ -61,7 +62,8 @@ static void ct3_build_cdat_entries_for_mr(CDATSubHeader 
> **cdat_table,
>  .length = sizeof(*dsmas),
>  },
>  .DSMADhandle = dsmad_handle,
> -.flags = is_pmem ? CDAT_DSMAS_FLAG_NV : 0,
> +.flags = (is_pmem ? CDAT_DSMAS_FLAG_NV : 0) |
> + (is_dynamic ? CDAT_DSMAS_FLAG_DYNAMIC_CAP : 0),

For example, as noted elsewhere in the code, is_pmem+is_dynamic is not
presently supported, so this shouldn't even be allowed in this function.
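
Roughly what I have in mind, as a standalone sketch rather than a proposed patch. All of the TOY_* names and the bit values are placeholders I made up for illustration; the real code would use the existing CDAT_DSMAS_FLAG_* definitions and whatever flag names we settle on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical caller-facing flags replacing the is_pmem/is_dynamic bools. */
enum toy_mr_flags {
    TOY_MR_PMEM    = 1u << 0,
    TOY_MR_DYNAMIC = 1u << 1,
};
#define TOY_MR_ALL (TOY_MR_PMEM | TOY_MR_DYNAMIC)

/* Placeholder values standing in for the CDAT DSMAS flag bits. */
#define TOY_DSMAS_FLAG_NV          (1u << 2)
#define TOY_DSMAS_FLAG_DYNAMIC_CAP (1u << 5)

/*
 * Validate the flag combination and translate it into DSMAS flags.
 * Returns false for unknown bits and for pmem+dynamic, which is not
 * supported at present.
 */
static bool toy_mr_flags_to_dsmas(uint32_t mr_flags, uint8_t *dsmas_flags)
{
    if (mr_flags & ~(uint32_t)TOY_MR_ALL) {
        return false;
    }
    if ((mr_flags & TOY_MR_PMEM) && (mr_flags & TOY_MR_DYNAMIC)) {
        return false;
    }
    *dsmas_flags = (uint8_t)(
        ((mr_flags & TOY_MR_PMEM) ? TOY_DSMAS_FLAG_NV : 0) |
        ((mr_flags & TOY_MR_DYNAMIC) ? TOY_DSMAS_FLAG_DYNAMIC_CAP : 0));
    return true;
}
```

The point is only that the unsupported combination gets rejected in one place instead of being silently encoded.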

> +if (dc_mr) {
> +int i;
> +uint64_t region_base = vmr_size + pmr_size;
> +
> +/*
> + * TODO: we assume the dynamic capacity to be volatile for now.
> + * Non-volatile dynamic capacity will be added if needed in the
> + * future.
> + */

Probably don't need to mark this TODO, can just leave it as a note.

Non-volatile dynamic capacity will coincide with shared memory, so it'll
end up handled.  So this isn't really a TODO for this current work, and
should read more like:

"Dynamic Capacity is always volatile, until shared memory is
implemented"

> +} else if (ct3d->hostpmem) {
>  range1_size_hi = ct3d->hostpmem->size >> 32;
>  range1_size_lo = (2 << 5) | (2 << 2) | 0x3 |
>   (ct3d->hostpmem->size & 0xF000);
> +} else {
> +/*
> + * For DCD with no static memory, set memory active, memory class 
> bits.
> + * No range is set.
> + */
> +range1_size_lo = (2 << 5) | (2 << 2) | 0x3;

We should probably add defs for these fields at some point. Can be
tabled for later work though.
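
Something along these lines, as a sketch only. The macro names are invented, and the field layout (info valid at bit 0, active at bit 1, a 3-bit field at bit 2 and another at bit 5) is my reading of the constants and the comment above; it should be checked against the DVSEC range size register definition in the spec before anything like this is merged.

```c
#include <stdint.h>

/* Hypothetical names for the low bits of the DVSEC range size low register. */
#define TOY_RANGE_SIZE_LO_INFO_VALID    (1u << 0)
#define TOY_RANGE_SIZE_LO_ACTIVE        (1u << 1)
#define TOY_RANGE_SIZE_LO_MEDIA_TYPE(x) (((uint32_t)(x) & 0x7u) << 2)
#define TOY_RANGE_SIZE_LO_MEM_CLASS(x)  (((uint32_t)(x) & 0x7u) << 5)

/* Same value as the open-coded (2 << 5) | (2 << 2) | 0x3 in the patch. */
static uint32_t toy_range_size_lo_flags(void)
{
    return TOY_RANGE_SIZE_LO_MEM_CLASS(2) |
           TOY_RANGE_SIZE_LO_MEDIA_TYPE(2) |
           TOY_RANGE_SIZE_LO_ACTIVE |
           TOY_RANGE_SIZE_LO_INFO_VALID;
}
```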

> +/*
> + * TODO: set dc as volatile for now, non-volatile support can be 
> added
> + * in the future if needed.
> + */
> +memory_region_set_nonvolatile(dc_mr, false);

Again can probably drop the TODO and just leave a statement.

~Gregory



Re: Add 'info pg' command to monitor

2024-04-19 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On Tue, 16 Apr 2024 at 19:11, Don Porter  wrote:
> >
> > On 4/16/24 13:03, Peter Maydell wrote:
> > > On Tue, 16 Apr 2024 at 17:53, Don Porter  wrote:
> > >> There is still a lot I am learning about the code base, but it seems
> > >> that qemu_get_guest_memory_mapping() does most of what one would need.
> > >> It currently only returns the "leaves" of the page table tree in a list.
> > >>
> > >> What if I extend this function with an optional argument to either
> > >> 1) return the interior nodes of the page table in additional lists (and
> > >> then parse+print in the monitor code), or
> > >> 2) inline the monitor printing in the arch-specific hook, and pass a
> > >> flag to get_guest_memory_mapping() that turns on/off the statements that
> > >> pretty print the page tables?
> > >>
> > >> It looks like most CPUs implement this function as part of checkpointing.
> > > As far as I can see only x86 implements the get_memory_mapping
> > > function, so once again somebody has added some bit of
> > > functionality that does a walk of the page tables that is
> > > x86 only and that shares no code with any of the other
> > > page table walking code :-(
> >
> > My mistake - get_memory_mappings() is only implemented in x86.
> >
> > In doing some searching of the code, many architectures implement
> > mmu_translate() and
> > get_physical_address() functions, but they are not standardized. I also
> > see your larger point
> > about replicating page walking code in x86.
> >
> > I imagine you have something in mind that abstracts things like the
> > height of the radix tree,
> > entries per node, checking permissions, printing the contents, etc.
> >
> > Perhaps I should start by trying to merge the x86 page walking code into
> > one set of common
> > helper functions, get more feedback (perhaps on a new patch thread?),
> > and then consider
> > how to abstract across architectures after getting feedback on this?
> 
> I think the cross-architecture abstraction is probably the
> trickiest part. I would actually be happy for us to drop
> 'info tlb' and 'info mem' entirely if we have a cross-arch
> command that gives basically the same information -- we don't
> IMHO need more than one command for this, and we only have
> multiple commands for basically legacy reasons. And for the
> human monitor (HMP) we don't need to keep things around
> for backwards compatibility.

I'm not sure what happens for architectures (MIPS/SPARC?) where it's not
a traditional table hierarchy.

The other thing you might want (and I'm not sure how it interacts
with any of this) is to specify the root of the MMU tree (i.e. CR3
value for those in Intel thinking) to dump different processes etc.
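
To make that concrete, here is a toy sketch of a walker that takes the root as a parameter and keeps the tree height and entries-per-node as constants, so the same loop could serve different layouts. None of this is QEMU code: struct toy_node, toy_walk() and the TOY_* constants are invented for illustration, and a real version would read guest memory rather than follow host pointers.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_LEVELS       2                      /* height of the radix tree */
#define TOY_BITS_PER_LVL 2                      /* log2(entries per node) */
#define TOY_ENTRIES      (1u << TOY_BITS_PER_LVL)
#define TOY_PAGE_SHIFT   12
#define TOY_PRESENT      1u

struct toy_node {
    uint32_t flags[TOY_ENTRIES];
    struct toy_node *child[TOY_ENTRIES];        /* unused at the leaf level */
};

typedef void (*toy_visit_fn)(int level, uint64_t vaddr, uint32_t flags,
                             void *opaque);

/*
 * Visit every present entry, interior nodes included, starting from an
 * arbitrary root (the moral equivalent of passing in a CR3 value).
 */
static void toy_walk(const struct toy_node *node, int level, uint64_t vbase,
                     toy_visit_fn visit, void *opaque)
{
    int shift = TOY_PAGE_SHIFT + TOY_BITS_PER_LVL * (TOY_LEVELS - 1 - level);

    for (unsigned i = 0; i < TOY_ENTRIES; i++) {
        if (!(node->flags[i] & TOY_PRESENT)) {
            continue;
        }
        uint64_t vaddr = vbase | ((uint64_t)i << shift);

        visit(level, vaddr, node->flags[i], opaque);
        if (level + 1 < TOY_LEVELS && node->child[i] != NULL) {
            toy_walk(node->child[i], level + 1, vaddr, visit, opaque);
        }
    }
}

/* Example visitor: count leaf mappings only. */
static void toy_count_leaves(int level, uint64_t vaddr, uint32_t flags,
                             void *opaque)
{
    (void)vaddr;
    (void)flags;
    if (level == TOY_LEVELS - 1) {
        (*(unsigned *)opaque)++;
    }
}
```

A monitor command would plug in a printing visitor instead of the counter; whether non-radix MMUs can be squeezed into this shape is the open question.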

Dave

> thanks
> -- PMM
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/



Re: [PATCH 24/27] docs/qapi-domain: add type cross-refs to field lists

2024-04-19 Thread John Snow
On Fri, Apr 19, 2024 at 12:38 AM John Snow  wrote:
>
> This commit, finally, adds cross-referencing support to various field
> lists; modeled tightly after Sphinx's own Python domain code.
>
> Cross-referencing support is added to type names provided to :arg:,
> :memb:, :returns: and :choice:.
>
> :feat:, :error: and :value:, which do not take type names, do not
> support this syntax.
>
> The general syntax is simple:
>
> :arg TypeName ArgName: Lorem Ipsum ...
>
> The domain will transform TypeName into :qapi:type:`TypeName` in this
> basic case, and also apply the ``literal`` decoration to indicate that
> this is a type cross-reference.
>
> For Optional arguments, the special "?" suffix is used. Because "*" has
> special meaning in ReST that would cause parsing errors, we elect to use
> "?" instead. The special syntax processing in QAPIXrefMixin strips this
> character from the end of any type name argument and will append ",
> Optional" to the rendered output, applying the cross-reference only to
> the actual type name.
>
> The intent here is that the actual syntax in doc-blocks need not change;
> but e.g. qapidoc.py will need to process and transform "@arg foo lorem
> ipsum" into ":arg type? foo: lorem ipsum" based on the schema
> information. Therefore, nobody should ever actually witness this
> intermediate syntax unless they are writing manual documentation or the
> doc transmogrifier breaks.
>
> For array arguments, type names can similarly be surrounded by "[]",
> which are stripped off and then re-appended outside of the
> cross-reference.
>
> Note: The mixin pattern here (borrowed from Sphinx) confuses mypy
> because it cannot tell that it will be mixed into a descendent of
> Field. Doing that instead causes more errors, because many versions of
> Sphinx erroneously did not mark various arguments as Optional, so we're
> a bit hosed either way. Do the simpler thing.
>
> Signed-off-by: John Snow 
> ---
>  docs/qapi/index.rst|  34 
>  docs/sphinx/qapi-domain.py | 110 +++--
>  2 files changed, 138 insertions(+), 6 deletions(-)
>
> diff --git a/docs/qapi/index.rst b/docs/qapi/index.rst
> index 8352a27d4a5..6e85ea5280d 100644
> --- a/docs/qapi/index.rst
> +++ b/docs/qapi/index.rst
> @@ -105,6 +105,11 @@ Explicit cross-referencing syntax for QAPI modules is available with
> :arg str bar: Another normal parameter description.
> :arg baz: Missing a type.
> :arg no-descr:
> +   :arg int? oof: Testing optional argument parsing.
> +   :arg [XDbgBlockGraphNode] rab: Testing array argument parsing.
> +   :arg [BitmapSyncMode]? zab: Testing optional array argument parsing,
> +  even though Markus said this should never happen. I believe him,
> +  but I didn't *forbid* the syntax either.
> :arg BitmapSyncMode discrim: How about branches in commands?
>
> .. qapi:branch:: discrim on-success
> @@ -261,3 +266,32 @@ Explicit cross-referencing syntax for QAPI modules is available with
>
>:memb str key-secret: ID of a QCryptoSecret object providing a
>   passphrase for unlocking the encryption
> +
> +.. qapi:command:: x-debug-query-block-graph
> +   :since: 4.0
> +   :unstable:
> +
> +   Get the block graph.
> +
> +   :feat unstable: This command is meant for debugging.
> +   :return XDbgBlockGraph: lorem ipsum ...
> +
> +.. qapi:struct:: XDbgBlockGraph
> +   :since: 4.0
> +
> +   Block Graph - list of nodes and list of edges.
> +
> +   :memb [XDbgBlockGraphNode] nodes:
> +   :memb [XDbgBlockGraphEdge] edges:
> +
> +.. qapi:struct:: XDbgBlockGraphNode
> +   :since: 4.0
> +
> +   :memb uint64 id: Block graph node identifier.  This @id is generated only for
> +  x-debug-query-block-graph and does not relate to any other
> +  identifiers in Qemu.
> +   :memb XDbgBlockGraphNodeType type: Type of graph node.  Can be one of
> +  block-backend, block-job or block-driver-state.
> +   :memb str name: Human readable name of the node.  Corresponds to
> +  node-name for block-driver-state nodes; is not guaranteed to be
> +  unique in the whole graph (with block-jobs and block-backends).
> diff --git a/docs/sphinx/qapi-domain.py b/docs/sphinx/qapi-domain.py
> index bf8bb933345..074453193ce 100644
> --- a/docs/sphinx/qapi-domain.py
> +++ b/docs/sphinx/qapi-domain.py
> @@ -50,11 +50,12 @@
>
>  if TYPE_CHECKING:
>  from docutils.nodes import Element, Node
> +from docutils.parsers.rst.states import Inliner
>
>  from sphinx.application import Sphinx
>  from sphinx.builders import Builder
>  from sphinx.environment import BuildEnvironment
> -from sphinx.util.typing import OptionSpec
> +from sphinx.util.typing import OptionSpec, TextlikeNode
>
>  logger = logging.getLogger(__name__)
>
> @@ -68,6 +69,90 @@ class ObjectEntry(NamedTuple):
>  aliased: bool
>
>
> +class QAPIXrefMixin:
> +def make_xref(
> +self,
> +rolename: str,
> +domain: str,
> +target
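
The type-name handling described in the commit message above ("?" for optional, "[]" for arrays) can be sketched in isolation. This is only an illustration of the described behaviour, not the code from the patch: the helper names `split_type_spec` and `render_type` are hypothetical, and the real mixin builds docutils nodes rather than strings.

```python
from typing import Tuple


def split_type_spec(spec: str) -> Tuple[str, bool, bool]:
    """Split a field type such as "[Foo]?" into (name, is_array, is_optional)."""
    is_optional = spec.endswith("?")
    if is_optional:
        spec = spec[:-1]  # strip the optional marker
    is_array = spec.startswith("[") and spec.endswith("]")
    if is_array:
        spec = spec[1:-1]  # strip the array brackets
    return spec, is_array, is_optional


def render_type(spec: str) -> str:
    """Cross-reference only the bare type name; decorations stay outside."""
    name, is_array, is_optional = split_type_spec(spec)
    text = f":qapi:type:`{name}`"
    if is_array:
        text = f"[{text}]"  # brackets re-appended outside the xref
    if is_optional:
        text += ", Optional"
    return text


for spec in ("str", "int?", "[XDbgBlockGraphNode]", "[BitmapSyncMode]?"):
    print(f"{spec:24} -> {render_type(spec)}")
```

The optional marker is stripped before the brackets, so the combined form "[BitmapSyncMode]?" from the test document is handled as well.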

Re: [PATCH v7 10/12] hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:11:01PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> All DPA ranges in the DC regions are invalid to access until an extent
> covering the range has been successfully accepted by the host. A bitmap
> is added to each region to record whether a DC block in the region has
> been backed by a DC extent. Each bit in the bitmap represents a DC block.
> When a DC extent is accepted, all the bits representing the blocks in the
> extent are set, which will be cleared when the extent is released.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  |  3 ++
>  hw/mem/cxl_type3.c  | 76 +
>  include/hw/cxl/cxl_device.h |  7 
>  3 files changed, 86 insertions(+)
> 

Reviewed-by: Gregory Price 
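
For readers following along, the per-region block bitmap described in the commit message can be modelled roughly as below. This is a toy Python model, not the QEMU C implementation; the class name `DCRegion` and its methods are made up for illustration.

```python
class DCRegion:
    """Toy model of one dynamic capacity region (all names are illustrative)."""

    def __init__(self, base: int, length: int, block_size: int) -> None:
        assert length % block_size == 0
        self.base = base
        self.length = length
        self.block_size = block_size
        # One entry per DC block; True means "backed by an accepted extent".
        self.blk_bitmap = [False] * (length // block_size)

    def _in_region(self, dpa: int, size: int) -> bool:
        return size > 0 and self.base <= dpa and dpa + size <= self.base + self.length

    def _blocks(self, dpa: int, size: int) -> range:
        """Indices of every block touched by [dpa, dpa + size)."""
        first = (dpa - self.base) // self.block_size
        last = (dpa + size - 1 - self.base) // self.block_size
        return range(first, last + 1)

    def accept_extent(self, dpa: int, size: int) -> None:
        if not self._in_region(dpa, size):
            raise ValueError("extent is outside the region")
        for i in self._blocks(dpa, size):
            self.blk_bitmap[i] = True

    def release_extent(self, dpa: int, size: int) -> None:
        if not self._in_region(dpa, size):
            raise ValueError("extent is outside the region")
        for i in self._blocks(dpa, size):
            self.blk_bitmap[i] = False

    def is_backed(self, dpa: int, size: int) -> bool:
        """An access is valid only if every block it touches is backed."""
        if not self._in_region(dpa, size):
            return False
        return all(self.blk_bitmap[i] for i in self._blocks(dpa, size))
```

An access that straddles a backed and an unbacked block is rejected, which is the point of validating the whole DPA range rather than only its first byte.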




Re: [PATCH v7 07/12] hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:58PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Add dynamic capacity extent list representative to the definition of
> CXLType3Dev and implement get DC extent list mailbox command per
> CXL spec r3.1, section 8.2.9.9.9.2.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 73 -
>  hw/mem/cxl_type3.c  |  1 +
>  include/hw/cxl/cxl_device.h | 22 +++
>  3 files changed, 95 insertions(+), 1 deletion(-)
> 

Reviewed-by: Gregory Price 



Re: [PATCH v7 04/12] hw/mem/cxl_type3: Add support to create DC regions to type3 memory devices

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:55PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> With the change, when setting up memory for type3 memory device, we can
> create DC regions.
> A property 'num-dc-regions' is added to ct3_props to allow users to pass the
> number of DC regions to create. To make it easier, other region parameters
> like region base, length, and block size are hard coded. If needed,
> these parameters can be added easily.
> 
> With the change, we can create DC regions with proper kernel side
> support like below:
> 
> region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region)
> echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region
> echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
> echo 1 > /sys/bus/cxl/devices/$region/interleave_ways
> 
> echo "dc0" >/sys/bus/cxl/devices/decoder2.0/mode
> echo 0x4000 >/sys/bus/cxl/devices/decoder2.0/dpa_size
> 
> echo 0x4000 > /sys/bus/cxl/devices/$region/size
> echo  "decoder2.0" > /sys/bus/cxl/devices/$region/target0
> echo 1 > /sys/bus/cxl/devices/$region/commit
> echo $region > /sys/bus/cxl/drivers/cxl_region/bind
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/mem/cxl_type3.c | 49 ++
>  1 file changed, 49 insertions(+)
> 

Reviewed-by: Gregory Price 



Re: [PATCH v7 03/12] include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for type3 memory devices

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:54PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Rename mem_size as static_mem_size for type3 memdev to cover static RAM and
> pmem capacity, preparing for the introduction of dynamic capacity to support
> dynamic capacity devices.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 4 ++--
>  hw/mem/cxl_type3.c  | 8 
>  include/hw/cxl/cxl_device.h | 2 +-
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 

Reviewed-by: Gregory Price 




Re: [PATCH v7 02/12] hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative and mailbox command support

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:53PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Per cxl spec r3.1, add dynamic capacity region representative based on
> Table 8-165 and extend the cxl type3 device definition to include DC region
> information. Also, based on info in 8.2.9.9.9.1, add 'Get Dynamic Capacity
> Configuration' mailbox support.
> 
> Note: we store region decode length as byte-wise length on the device, which
> should be divided by 256 * MiB before being returned to the host
> for "Get Dynamic Capacity Configuration" mailbox command per
> specification.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 96 +
>  include/hw/cxl/cxl_device.h | 16 +++
>  2 files changed, 112 insertions(+)
> 

Reviewed-by: Gregory Price 
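
The unit conversion mentioned in the note above is small enough to show as a worked example. The function name is hypothetical, and rejecting lengths that are not a multiple of 256 MiB is an assumption of this sketch, not something stated in the patch.

```python
MiB = 1 << 20
DECODE_LEN_UNIT = 256 * MiB  # the payload field counts 256 MiB units


def decode_len_to_payload(decode_len_bytes: int) -> int:
    """Convert the byte-wise length stored on the device to payload units."""
    if decode_len_bytes % DECODE_LEN_UNIT:
        # Assumption of this sketch: unaligned lengths are treated as an error.
        raise ValueError("decode length must be a multiple of 256 MiB")
    return decode_len_bytes // DECODE_LEN_UNIT


# A 2 GiB region is reported to the host as 8 units of 256 MiB.
print(decode_len_to_payload(2 * 1024 * MiB))
```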



Re: [PATCH v7 01/12] hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output payload of identify memory device command

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:52PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> Based on CXL spec r3.1 Table 8-127 (Identify Memory Device Output
> Payload), dynamic capacity event log size should be part of
> output of the Identify command.
> Add dc_event_log_size to the output payload for the host to get the info.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/cxl/cxl-mailbox-utils.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 

Reviewed-by: Gregory Price 



Re: [PATCH v7 05/12] hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr size instead of mr as argument

2024-04-19 Thread Gregory Price
On Thu, Apr 18, 2024 at 04:10:56PM -0700, nifan@gmail.com wrote:
> From: Fan Ni 
> 
> The function ct3_build_cdat_entries_for_mr only uses size of the passed
> memory region argument, refactor the function definition to make the passed
> arguments more specific.
> 
> Reviewed-by: Jonathan Cameron 
> Signed-off-by: Fan Ni 
> ---
>  hw/mem/cxl_type3.c | 15 +--
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 5ceed0ab4c..a1fe268560 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -44,7 +44,7 @@ enum {
>  };

Reviewed-by: Gregory Price 



Re: [edk2-devel] [PATCH v3 5/6] target/arm: Do memory type alignment check when translation disabled

2024-04-19 Thread Ard Biesheuvel
On Fri, 19 Apr 2024 at 18:09, Jonathan Cameron via groups.io
 wrote:
>
> On Fri, 19 Apr 2024 13:52:07 +0200
> Gerd Hoffmann  wrote:
>
> >   Hi,
> >
> > > Gerd, any ideas?  Maybe I need something subtly different in my
> > > edk2 build?  I've not looked at this bit of the qemu infrastructure
> > > before - is there a document on how that image is built?
> >
> > There is roms/Makefile for that.
> >
> > make -C roms help
> > make -C roms efi
> >
> > So easiest would be to just update the edk2 submodule to what you
> > need, then rebuild.
> >
> > The build is handled by the roms/edk2-build.py script,
> > with the build configuration being in roms/edk2-build.config.
> > That is usable outside the qemu source tree too, i.e. like this:
> >
> >   python3 /path/to/qemu.git/roms/edk2-build.py \
> > --config /path/to/qemu.git/roms/edk2-build.config \
> > --core /path/to/edk2.git \
> > --match armvirt \
> > --silent --no-logs
> >
> > That'll try to place the images build in "../pc-bios", so maybe better
> > work with a copy of the config file where you adjust this.
> >
> > HTH,
> >   Gerd
> >
>
> Thanks Gerd!
>
> So the builds are very similar via the two methods...
> However - the QEMU build sets -D CAVIUM_ERRATUM_27456=TRUE
>
> And that's the difference - with that set for my other builds the alignment
> problems go away...
>
> Any idea why we have that set in roms/edk2-build.config?
> Superficially it seems rather unlikely anyone cares about thunderx1
> (if they do we need to get them some new hardware with fresh bugs)
> bugs now and this config file was only added last year.
>
>
> However, the last comment in Ard's commit message below seems
> highly likely to be relevant!
>
> Chasing through Ard's patch it has the side effect of dropping
> an override of a requirement for strict alignment.
> So without the erratum
> DEFINE GCC_AARCH64_CC_XIPFLAGS = -mstrict-align -mgeneral-regs-only
> is replaced with
>  [BuildOptions]
> +!if $(CAVIUM_ERRATUM_27456) == TRUE
> +  GCC:*_*_AARCH64_PP_FLAGS = -DCAVIUM_ERRATUM_27456
> +!else
>    GCC:*_*_AARCH64_CC_XIPFLAGS ==
> +!endif
>
> The edk2 commit that added this was the following +CC Ard.
>
> Given I wasn't sure of the syntax of that file I set it
> manually to the original value and indeed it works.
>
>
> commit ec54ce1f1ab41b92782b37ae59e752fff0ef9c41
> Author: Ard Biesheuvel 
> Date:   Wed Jan 4 16:51:35 2023 +0100
>
> ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX
>
> The early ID map used by ArmVirtQemu uses ASID scoped non-global
> mappings, as this allows us to switch to the permanent ID map seamlessly
> without the need for explicit TLB maintenance.
>
> However, this triggers a known erratum on ThunderX, which does not
> tolerate non-global mappings that are executable at EL1, as this appears
> to result in I-cache corruption. (Linux disables the KPTI based Meltdown
> mitigation on ThunderX for the same reason)
>
> So work around this, by detecting the CPU implementor and part number,
> and proceeding without the early ID map if a ThunderX CPU is detected.
>
> Note that this requires the C code to be built with strict alignment
> again, as we may end up executing it with the MMU and caches off.
>
> Signed-off-by: Ard Biesheuvel 
> Acked-by: Laszlo Ersek 
> Tested-by: dann frazier 
>
> Test case is
> qemu-system-aarch64 -M virt,virtualization=true, -m 4g -cpu cortex-a76 \
> -bios QEMU_EFI.fd -d int
>
> Which gets alignment faults since:
> https://lore.kernel.org/all/20240301204110.656742-6-richard.hender...@linaro.org/
>
> So my feeling here is EDK2 should either have yet another config for QEMU
> as a host or should always set the alignment without needing to pick the
> CAVIUM 27456 errata, which I suspect will get dropped soonish anyway if
> anyone ever cleans up old errata.
>

This code was never really intended for execution at EL2, but it
happened to work, partially because of TCG's lack of strict alignment
checking when the MMU is off.

Those assumptions no longer hold, so yes, let's get this fixed properly.

Given VHE and nested virt (which will likely imply VHE in practice), I
would like to extend this functionality (i.e., the use of preliminary
page tables in NOR flash) to EL2 as well, but with VHE enabled. This
means we can still elide TLB maintenance (and BBM checks) by using
different ASIDs, and otherwise, fall back to entering with the MMU off
if VHE is not available. In that case, we should enforce strict
alignment too, so that needs to be fixed regardless.

I'll try to code something up and send it round. In the meantime,
feel free to propose a minimal patch that reinstates the strict
alignment if you are pressed for time, and I'll merge it right away.



Re: [PATCH 00/27] Add qapi-domain Sphinx extension

2024-04-19 Thread John Snow
On Fri, Apr 19, 2024, 10:45 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > This series adds a new qapi-domain extension for Sphinx, which adds a
> > series of custom directives for documenting QAPI definitions.
> >
> > GitLab CI: https://gitlab.com/jsnow/qemu/-/pipelines/1259566476
> >
> > (Link to a demo HTML page at the end of this cover letter, but I want
> > you to read the cover letter first to explain what you're seeing.)
> >
> > This adds a new QAPI Index page, cross-references for QMP commands,
> > events, and data types, and improves the aesthetics of the QAPI/QMP
> > documentation.
>
> Cross-references alone will be a *massive* improvement!  I'm sure
> readers will appreciate better looks and an index, too.
>
> > This series adds only the new ReST syntax, *not* the autogenerator. The
> > ReST syntax used in this series is, in general, not intended for anyone
> > to actually write by hand. This mimics how Sphinx's own autodoc
> > extension generates Python domain directives, which are then re-parsed
> > to produce the final result.
> >
> > I have prototyped such a generator, but it isn't ready for inclusion
> > yet. (Rest assured: error context reporting is preserved down to the
> > line, even in generated ReST. There is no loss in usability for this
> > approach. It will likely either supplant qapidoc.py or heavily alter
> > it.) The generator requires only extremely minor changes to
> > scripts/qapi/parser.py to preserve nested indentation and provide more
> > accurate line information. It is less invasive than you may
> > fear. Relying on a secondary ReST parse phase eliminates much of the
> > complexity of qapidoc.py. Sleep soundly.
>
> I'm a Sphinx noob.  Let me paraphrase you to make sure I understand.
>
> You propose to generate formatted documentation in two steps:
>
> • First, the QAPI generator generates .rst from the QAPI schema.  The
>   generated .rst makes use of custom directives.
>

Yes, but this .rst file is built in-memory and never makes it to disk, like
Sphinx's autodoc for Python.

(We can add a debug knob to log it or save it out to disk if needed.)


> • Second, Sphinx turns the .rst into formatted documentation.  A Sphinx
>   qapi-domain extension implements the custom directives.
>

Yes.


> This mirrors how Sphinx works for Python docs.  Which is its original
> use case.
>
> Your series demonstrates the second step, with test input you wrote
> manually.
>
> You have code for the first step, but you'd prefer to show it later.
>

Right, it's not fully finished, although I have events, commands, and
objects working. Unions, Alternates and Events need work.


> Fair?
>

Bingo!


> > The purpose of sending this series in its current form is largely to
> > solicit feedback on general aesthetics, layout, and features. Sphinx is
> > a wily beast, and feedback at this stage will dictate how and where
> > certain features are implemented.
>
> I'd appreciate help with that.  Opinions?


> > A goal for this syntax (and the generator) is to fully in-line all
> > command/event/object members, inherited or local, boxed or not, branched
> > or not. This should provide a boon to using these docs as a reference,
> > because users will not have to grep around the page looking for various
> > types, branches, or inherited members. Any argument types will be
> > hyperlinked to their definition, further aiding usability. Commands can
> > be hotlinked from anywhere else in the manual, and will provide a
> > complete reference directly on the first screenful of information.
>
> Let me elaborate a bit here.
>
> A command's arguments can be specified inline, like so:
>
> { 'command': 'job-cancel', 'data': { 'id': 'str' } }
>
> The arguments are then documented right with the command.
>
> But they can also be specified by referencing an object type, like so:
>
> { 'command': 'block-dirty-bitmap-remove',
>   'data': 'BlockDirtyBitmap' }
>
> Reasons for doing it this way:
>
> • Several commands take the same arguments, and you don't want to repeat
>   yourself.
>
> • You want generated C to take a single struct argument ('boxed': true).
>
> • The arguments are a union (which requires 'boxed': true).
>
> Drawback: the arguments are then documented elsewhere.  Not nice.
>
> Bug: the generated documentation fails to point there.
>
> You're proposing to inline the argument documentation, so it appears
> right with the command.
>
> An event's data is just like a command's argument.
>
> A command's return value can only be specified by referencing a type.
> Same doc usability issue.
>
> Similarly, a union type's base can be specified inline or by referencing
> a struct type, and a union's branches must be specified by referencing a
> struct type.  Same doc usability issue.
>
> At least, the generated documentation does point to the referenced
> types.
>
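
A rough sketch of what "inlining" means for the cases listed above: whether the arguments are given inline or by type reference, the documentation ends up with one flat member list. The schema table, the type names `ExampleBase` and `ExampleArgs`, and the function `inline_members` are all hypothetical; the actual generator works on the parsed QAPI schema objects.

```python
from typing import Dict, Union

Members = Dict[str, str]

# Hypothetical mini-schema: each struct has an optional base and local members.
TYPES = {
    "ExampleBase": {"base": None, "members": {"id": "str"}},
    "ExampleArgs": {"base": "ExampleBase", "members": {"force": "bool"}},
}


def inline_members(data: Union[str, Members], types=TYPES) -> Members:
    """Return the flat member list for inline or referenced 'data'."""
    if isinstance(data, dict):
        return dict(data)  # already inline, e.g. { 'id': 'str' }
    entry = types[data]
    members: Members = {}
    if entry["base"] is not None:
        # Inherited members come first, recursively.
        members.update(inline_members(entry["base"], types))
    members.update(entry["members"])
    return members


print(inline_members({"id": "str"}))
print(inline_members("ExampleArgs"))
```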

Right. My proposal is to recursively inline referenced bases for the
top-level members so that this manual is useful as a user reference,

Re: [PATCH] hw/core/clock: always iterate through childs in clock_propagate_period

2024-04-19 Thread Philippe Mathieu-Daudé

On 19/4/24 18:08, Raphael Poggi wrote:
> Hi Peter,
>
> On Fri, 19 Apr 2024 at 16:08, Peter Maydell wrote:
>>
>> On Thu, 18 Apr 2024 at 21:39, Raphael Poggi wrote:
>>>
>>> Hi Philippe,
>>>
>>> On Thu, 18 Apr 2024 at 20:43, Philippe Mathieu-Daudé wrote:
>>>>
>>>> Hi Raphael,
>>>>
>>>> On 18/4/24 21:16, Raphael Poggi wrote:
>>>>> When dealing with few clocks depending with each others, sometimes
>>>>> we might only want to update the multiplier/diviser on a specific clock
>>>>> (cf clockB in drawing below) and call "clock_propagate(clockA)" to
>>>>> update the childs period according to the potential new
>>>>> multiplier/diviser values.
>>>>>
>>>>> +--------+     +--------+      +--------+
>>>>> | clockA | --> | clockB |  --> | clockC |
>>>>> +--------+     +--------+      +--------+
>>>>>
>>>>> The actual code would not allow that because, since we cannot call
>>>>> "clock_propagate" directly on a child, it would exit on the
>>>>> first child has the period has not changed for clockB, only clockC is
>>>>
>>>> Typo "as the period has not changed"?
>>>
>>> That's a typo indeed, thanks!
>>>
>>>>
>>>> Why can't you call clock_propagate() on a child?
>>>
>>> There is an assert "assert(clk->source == NULL);" in clock_propagate().
>>> If I am not wrong, clk->source is set when the clock has a parent.
>>
>> I think that assertion is probably there because we didn't
>> originally have the idea of a clock having a multiplier/divider
>> setting. So the idea was that calling clock_propagate() on a
>> clock with a parent would always be wrong, because the only
>> reason for its period to change would be if the parent had
>> changed, and if the parent changes then clock_propagate()
>> should be called on the parent.
>>
>> We added mul/div later, and we (I) didn't think through all
>> the consequences. If you change the mul/div settings on
>> clockB in this example then you need to call clock_propagate()
>> on it, so we should remove that assert(). Then when you change
>> the mul/div on clockB you can directly clock_propagate(clockB),
>> and I don't think you need this patch at that point.
>
> Alright, that makes sense, is that OK if I send a patch removing the assert ?


Sure, that is welcomed :)

Regards,

Phil.



[PATCH] hw/core/clock: remove assert in clock_propagate

2024-04-19 Thread Raphael Poggi
This commit allows child clocks to propagate their new frequency,
for example after setting a new multiplier/divider.

Signed-off-by: Raphael Poggi 
---
 hw/core/clock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/core/clock.c b/hw/core/clock.c
index 85421f8b55..174c8be095 100644
--- a/hw/core/clock.c
+++ b/hw/core/clock.c
@@ -109,7 +109,6 @@ static void clock_propagate_period(Clock *clk, bool call_callbacks)
 
 void clock_propagate(Clock *clk)
 {
-assert(clk->source == NULL);
 trace_clock_propagate(CLOCK_PATH(clk));
 clock_propagate_period(clk, true);
 }
-- 
2.44.0
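
To illustrate why the assert gets in the way, here is a simplified Python model of the clock tree from the discussion. It is a sketch under assumptions, not the QEMU code: the `Clock` class and `propagate` function are stand-ins, and the child period is modelled as `period * multiplier // divider`.

```python
class Clock:
    """Simplified stand-in for a clock with a multiplier/divider."""

    def __init__(self, name, period=0, multiplier=1, divider=1):
        self.name = name
        self.period = period
        self.multiplier = multiplier
        self.divider = divider
        self.source = None
        self.children = []

    def set_source(self, parent):
        self.source = parent
        parent.children.append(self)

    def child_period(self):
        # Period seen by children: own period scaled by mul/div.
        return self.period * self.multiplier // self.divider


def propagate(clk):
    """Push clk's (possibly rescaled) period down to its descendants."""
    for child in clk.children:
        new_period = clk.child_period()
        if child.period != new_period:
            child.period = new_period
            propagate(child)  # only recurse when the child actually changed


clock_a = Clock("clockA", period=10)
clock_b = Clock("clockB")
clock_c = Clock("clockC")
clock_b.set_source(clock_a)
clock_c.set_source(clock_b)
propagate(clock_a)

clock_b.multiplier = 2
propagate(clock_a)   # stops at clockB: its own period did not change
propagate(clock_b)   # updates clockC from clockB's new multiplier
print(clock_c.period)
```

In this model, propagating from clockA after changing clockB's multiplier leaves clockC stale, while propagating from clockB directly updates it. That is the call the removed assert used to forbid.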




Re: [PATCH v2 02/22] target/arm: Add confidential guest support

2024-04-19 Thread Daniel P . Berrangé
On Fri, Apr 19, 2024 at 04:56:50PM +0100, Jean-Philippe Brucker wrote:
> Add a new RmeGuest object, inheriting from ConfidentialGuestSupport, to
> support the Arm Realm Management Extension (RME). It is instantiated by
> passing on the command-line:
> 
>   -M virt,confidential-guest-support=
>   -object guest-rme,id=[,options...]

How about either "arm-rme" or merely "rme" as the object name?

> 
> This is only the skeleton. Support will be added in following patches.
> 
> Cc: Eric Blake 
> Cc: Markus Armbruster 
> Cc: Daniel P. Berrangé 
> Cc: Eduardo Habkost 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Jean-Philippe Brucker 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] hw/core/clock: always iterate through childs in clock_propagate_period

2024-04-19 Thread Peter Maydell
On Fri, 19 Apr 2024 at 17:09, Raphael Poggi
 wrote:
>
> Hi Peter,
>
> On Fri, 19 Apr 2024 at 16:08, Peter Maydell wrote:
> >
> > On Thu, 18 Apr 2024 at 21:39, Raphael Poggi
> >  wrote:
> > > There is an assert "assert(clk->source == NULL);" in clock_propagate().
> > > If I am not wrong, clk->source is set when the clock has a parent.
> >
> > I think that assertion is probably there because we didn't
> > originally have the idea of a clock having a multiplier/divider
> > setting. So the idea was that calling clock_propagate() on a
> > clock with a parent would always be wrong, because the only
> > reason for its period to change would be if the parent had
> > changed, and if the parent changes then clock_propagate()
> > should be called on the parent.
> >
> > We added mul/div later, and we (I) didn't think through all
> > the consequences. If you change the mul/div settings on
> > clockB in this example then you need to call clock_propagate()
> > on it, so we should remove that assert(). Then when you change
> > the mul/div on clockB you can directly clock_propagate(clockB),
> > and I don't think you need this patch at that point.
>
> Alright, that makes sense, is that OK if I send a patch removing the assert ?

Yes, please do.

-- PMM



Re: [edk2-devel] [PATCH v3 5/6] target/arm: Do memory type alignment check when translation disabled

2024-04-19 Thread Jonathan Cameron via
On Fri, 19 Apr 2024 13:52:07 +0200
Gerd Hoffmann  wrote:

>   Hi,
> 
> > Gerd, any ideas?  Maybe I need something subtly different in my
> > edk2 build?  I've not looked at this bit of the qemu infrastructure
> > before - is there a document on how that image is built?  
> 
> There is roms/Makefile for that.
> 
> make -C roms help
> make -C roms efi
> 
> So easiest would be to just update the edk2 submodule to what you
> need, then rebuild.
> 
> The build is handled by the roms/edk2-build.py script,
> with the build configuration being in roms/edk2-build.config.
> That is usable outside the qemu source tree too, i.e. like this:
> 
>   python3 /path/to/qemu.git/roms/edk2-build.py \
> --config /path/to/qemu.git/roms/edk2-build.config \
> --core /path/to/edk2.git \
> --match armvirt \
> --silent --no-logs
> 
> That'll try to place the images build in "../pc-bios", so maybe better
> work with a copy of the config file where you adjust this.
> 
> HTH,
>   Gerd
> 

Thanks Gerd!

So the builds are very similar via the two methods...
However - the QEMU build sets -D CAVIUM_ERRATUM_27456=TRUE

And that's the difference - with that set for my other builds the alignment
problems go away...

Any idea why we have that set in roms/edk2-build.config?
Superficially it seems rather unlikely anyone cares about thunderx1
(if they do we need to get them some new hardware with fresh bugs)
bugs now and this config file was only added last year.


However, the last comment in Ard's commit message below seems
highly likely to be relevant!

Chasing through Ard's patch it has the side effect of dropping
an override of a requirement for strict alignment. 
So without the erratum
DEFINE GCC_AARCH64_CC_XIPFLAGS = -mstrict-align -mgeneral-regs-only
is replaced with
 [BuildOptions]
+!if $(CAVIUM_ERRATUM_27456) == TRUE
+  GCC:*_*_AARCH64_PP_FLAGS = -DCAVIUM_ERRATUM_27456
+!else
   GCC:*_*_AARCH64_CC_XIPFLAGS ==
+!endif

The edk2 commit that added this was the following +CC Ard.

Given I wasn't sure of the syntax of that file I set it
manually to the original value and indeed it works.


commit ec54ce1f1ab41b92782b37ae59e752fff0ef9c41
Author: Ard Biesheuvel 
Date:   Wed Jan 4 16:51:35 2023 +0100

ArmVirtPkg/ArmVirtQemu: Avoid early ID map on ThunderX

The early ID map used by ArmVirtQemu uses ASID scoped non-global
mappings, as this allows us to switch to the permanent ID map seamlessly
without the need for explicit TLB maintenance.

However, this triggers a known erratum on ThunderX, which does not
tolerate non-global mappings that are executable at EL1, as this appears
to result in I-cache corruption. (Linux disables the KPTI based Meltdown
mitigation on ThunderX for the same reason)

So work around this, by detecting the CPU implementor and part number,
and proceeding without the early ID map if a ThunderX CPU is detected.

Note that this requires the C code to be built with strict alignment
again, as we may end up executing it with the MMU and caches off.

Signed-off-by: Ard Biesheuvel 
Acked-by: Laszlo Ersek 
Tested-by: dann frazier 

Test case is
qemu-system-aarch64 -M virt,virtualization=true, -m 4g -cpu cortex-a76 \
-bios QEMU_EFI.fd -d int

Which gets alignment faults since:
https://lore.kernel.org/all/20240301204110.656742-6-richard.hender...@linaro.org/

So my feeling here is EDK2 should either have yet another config for QEMU
as a host or should always set the alignment without needing to pick the
CAVIUM 27456 errata, which I suspect will get dropped soonish anyway if
anyone ever cleans up old errata.

Jonathan






Re: [PATCH] hw/core/clock: always iterate through childs in clock_propagate_period

2024-04-19 Thread Raphael Poggi
Hi Peter,

On Fri, 19 Apr 2024 at 16:08, Peter Maydell wrote:
>
> On Thu, 18 Apr 2024 at 21:39, Raphael Poggi
>  wrote:
> >
> > Hi Philippe,
> >
> > On Thu, 18 Apr 2024 at 20:43, Philippe Mathieu-Daudé wrote:
> > >
> > > Hi Raphael,
> > >
> > > On 18/4/24 21:16, Raphael Poggi wrote:
> > > > When dealing with few clocks depending with each others, sometimes
> > > > we might only want to update the multiplier/diviser on a specific clock
> > > > (cf clockB in drawing below) and call "clock_propagate(clockA)" to
> > > > update the childs period according to the potential new 
> > > > multiplier/diviser values.
> > > >
> > > > +--------+     +--------+      +--------+
> > > > | clockA | --> | clockB |  --> | clockC |
> > > > +--------+     +--------+      +--------+
> > > >
> > > > The actual code would not allow that because, since we cannot call
> > > > "clock_propagate" directly on a child, it would exit on the
> > > > first child has the period has not changed for clockB, only clockC is
> > >
> > > Typo "as the period has not changed"?
> >
> > That's a typo indeed, thanks!
> >
> > >
> > > Why can't you call clock_propagate() on a child?
> >
> > There is an assert "assert(clk->source == NULL);" in clock_propagate().
> > If I am not wrong, clk->source is set when the clock has a parent.
>
> I think that assertion is probably there because we didn't
> originally have the idea of a clock having a multiplier/divider
> setting. So the idea was that calling clock_propagate() on a
> clock with a parent would always be wrong, because the only
> reason for its period to change would be if the parent had
> changed, and if the parent changes then clock_propagate()
> should be called on the parent.
>
> We added mul/div later, and we (I) didn't think through all
> the consequences. If you change the mul/div settings on
> clockB in this example then you need to call clock_propagate()
> on it, so we should remove that assert(). Then when you change
> the mul/div on clockB you can directly clock_propagate(clockB),
> and I don't think you need this patch at that point.

Alright, that makes sense, is that OK if I send a patch removing the assert ?

Thanks,
>
> thanks
> -- PMM



[PATCH v2 06/22] hw/arm/virt: Disable DTB randomness for confidential VMs

2024-04-19 Thread Jean-Philippe Brucker
The dtb-randomness feature, which adds random seeds to the DTB, isn't
really compatible with confidential VMs since it randomizes the Realm
Initial Measurement. Enabling it is not an error, but it prevents
attestation. It also isn't useful to a Realm, which doesn't trust host
input.

Currently the feature is automatically enabled, unless the user disables
it on the command-line. Change it to OnOffAuto, and automatically
disable it for confidential VMs, unless the user explicitly enables it.
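
The resulting tri-state behaviour can be summarised with a small Python model. It mirrors the decision described above, with hypothetical names; the real code is C and uses QEMU's OnOffAuto type.

```python
from enum import Enum


class OnOffAuto(Enum):
    AUTO = "auto"
    ON = "on"
    OFF = "off"


def dtb_randomness_enabled(setting: OnOffAuto, confidential: bool) -> bool:
    """Decide whether random seeds are added to the DTB."""
    if setting is OnOffAuto.OFF:
        return False
    if setting is OnOffAuto.AUTO and confidential:
        # Default for confidential VMs: keep the initial measurement stable.
        return False
    return True  # explicit "on", or "auto" on a non-confidential VM


for setting in OnOffAuto:
    for confidential in (False, True):
        enabled = dtb_randomness_enabled(setting, confidential)
        print(f"{setting.value:4} confidential={confidential!s:5} -> {enabled}")
```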

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: separate patch, use OnOffAuto
---
 docs/system/arm/virt.rst |  9 +
 include/hw/arm/virt.h|  2 +-
 hw/arm/virt.c| 41 +---
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/docs/system/arm/virt.rst b/docs/system/arm/virt.rst
index 26fcba00b7..e4bbfec662 100644
--- a/docs/system/arm/virt.rst
+++ b/docs/system/arm/virt.rst
@@ -172,10 +172,11 @@ dtb-randomness
   rng-seed and kaslr-seed nodes (in both "/chosen" and
   "/secure-chosen") to use for features like the random number
   generator and address space randomisation. The default is
-  ``on``. You will want to disable it if your trusted boot chain
-  will verify the DTB it is passed, since this option causes the
-  DTB to be non-deterministic. It would be the responsibility of
-  the firmware to come up with a seed and pass it on if it wants to.
+  ``off`` for confidential VMs, and ``on`` otherwise. You will want
+  to disable it if your trusted boot chain will verify the DTB it is
+  passed, since this option causes the DTB to be non-deterministic.
+  It would be the responsibility of the firmware to come up with a
+  seed and pass it on if it wants to.
 
 dtb-kaslr-seed
   A deprecated synonym for dtb-randomness.
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index bb486d36b1..90a148dac2 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -150,7 +150,7 @@ struct VirtMachineState {
 bool virt;
 bool ras;
 bool mte;
-bool dtb_randomness;
+OnOffAuto dtb_randomness;
 OnOffAuto acpi;
 VirtGICType gic_version;
 VirtIOMMUType iommu;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 07ad31876e..f300f100b5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -259,6 +259,7 @@ static bool ns_el2_virt_timer_present(void)
 
 static void create_fdt(VirtMachineState *vms)
 {
+bool dtb_randomness = true;
 MachineState *ms = MACHINE(vms);
 int nb_numa_nodes = ms->numa_state->num_nodes;
 void *fdt = create_device_tree(&vms->fdt_size);
@@ -268,6 +269,16 @@ static void create_fdt(VirtMachineState *vms)
 exit(1);
 }
 
+/*
+ * Including random data in the DTB causes random initial measurement on CCA,
+ * so disable it for confidential VMs.
+ */
+if (vms->dtb_randomness == ON_OFF_AUTO_OFF ||
+(vms->dtb_randomness == ON_OFF_AUTO_AUTO &&
+ virt_machine_is_confidential(vms))) {
+dtb_randomness = false;
+}
+
 ms->fdt = fdt;
 
 /* Header */
@@ -278,13 +289,13 @@ static void create_fdt(VirtMachineState *vms)
 
 /* /chosen must exist for load_dtb to fill in necessary properties later */
 qemu_fdt_add_subnode(fdt, "/chosen");
-if (vms->dtb_randomness) {
+if (dtb_randomness) {
 create_randomness(ms, "/chosen");
 }
 
 if (vms->secure) {
 qemu_fdt_add_subnode(fdt, "/secure-chosen");
-if (vms->dtb_randomness) {
+if (dtb_randomness) {
 create_randomness(ms, "/secure-chosen");
 }
 }
@@ -2474,18 +2485,21 @@ static void virt_set_its(Object *obj, bool value, Error **errp)
 vms->its = value;
 }
 
-static bool virt_get_dtb_randomness(Object *obj, Error **errp)
+static void virt_get_dtb_randomness(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
 {
 VirtMachineState *vms = VIRT_MACHINE(obj);
+OnOffAuto dtb_randomness = vms->dtb_randomness;
 
-return vms->dtb_randomness;
+visit_type_OnOffAuto(v, name, &dtb_randomness, errp);
 }
 
-static void virt_set_dtb_randomness(Object *obj, bool value, Error **errp)
+static void virt_set_dtb_randomness(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
 {
 VirtMachineState *vms = VIRT_MACHINE(obj);
 
-vms->dtb_randomness = value;
+visit_type_OnOffAuto(v, name, &vms->dtb_randomness, errp);
 }
 
 static char *virt_get_oem_id(Object *obj, Error **errp)
@@ -3123,16 +3137,16 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
   "Set on/off to enable/disable "
   "ITS instantiation");
 
-object_class_property_add_bool(oc, "dtb-randomness",
-   virt_get_dtb_randomness,
-   virt_set_dtb_randomness);
+object_class_property_add(oc,

[PATCH v2 18/22] target/arm/kvm: Disable Realm reboot

2024-04-19 Thread Jean-Philippe Brucker
A Realm cannot be reset; it must be recreated from scratch. The RMM
specification defines states of a Realm as NEW -> ACTIVE -> SYSTEM_OFF,
after which the Realm can only be destroyed. A PSCI_SYSTEM_RESET call,
which normally reboots the system, puts the Realm in SYSTEM_OFF state.

QEMU does not support recreating a VM. Normally, a reboot request by the
guest causes all devices to reset, which cannot work for a Realm.
Indeed, loading images into Realm memory and changing the PC is only
allowed for a Realm in NEW state. Resetting the images for a Realm in
SYSTEM_OFF state will cause QEMU to crash with a bus error.

Handle reboot requests by the guest more gracefully, by indicating to
runstate.c that the vCPUs of a Realm are not resettable, and that QEMU
should exit.
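
As a rough sketch of the lifecycle described above (not the RMM's or
KVM's actual interface), the state machine can be modelled as follows.
The event names are hypothetical, and the assumption that SYSTEM_RESET
is issued from the ACTIVE state is mine:

```c
#include <stdbool.h>

/* Realm lifecycle states named in the commit message. */
typedef enum { REALM_NEW, REALM_ACTIVE, REALM_SYSTEM_OFF } RealmState;

/* Hypothetical event names, for illustration only. */
typedef enum { EV_LOAD_IMAGE, EV_ACTIVATE, EV_SYSTEM_RESET } RealmEvent;

/*
 * Apply an event. Returns true (and updates *state where needed) if the
 * event is permitted in the current state, false otherwise.
 */
bool realm_apply(RealmState *state, RealmEvent ev)
{
    switch (ev) {
    case EV_LOAD_IMAGE:
        /* Loading images and setting the PC is only allowed in NEW. */
        return *state == REALM_NEW;
    case EV_ACTIVATE:
        if (*state != REALM_NEW) {
            return false;
        }
        *state = REALM_ACTIVE;
        return true;
    case EV_SYSTEM_RESET:
        if (*state != REALM_ACTIVE) {
            return false;
        }
        /* There is no way back to NEW: the Realm can only be destroyed. */
        *state = REALM_SYSTEM_OFF;
        return true;
    }
    return false;
}
```

Once the model reaches REALM_SYSTEM_OFF, every event is rejected, which
is why a device reset that reloads images cannot work and QEMU exits
instead.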

Reviewed-by: Richard Henderson 
Signed-off-by: Jean-Philippe Brucker 
---
 target/arm/kvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 9855cadb1b..60c2ef9388 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1694,7 +1694,8 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
 
 bool kvm_arch_cpu_check_are_resettable(void)
 {
-return true;
+/* A Realm cannot be reset */
+return !kvm_arm_rme_enabled();
 }
 
 static void kvm_arch_get_eager_split_size(Object *obj, Visitor *v,
-- 
2.44.0




[PATCH v2 19/22] target/arm/cpu: Inform about reading confidential CPU registers

2024-04-19 Thread Jean-Philippe Brucker
The host cannot access registers of a Realm. Instead of showing all
registers as zero in "info registers", display a message about this
restriction.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/cpu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ab8d007a86..18d1b88e2f 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1070,6 +1070,11 @@ static void aarch64_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 const char *ns_status;
 bool sve;
 
+if (cpu->kvm_rme) {
+qemu_fprintf(f, "the CPU registers are confidential to the realm\n");
+return;
+}
+
 qemu_fprintf(f, " PC=%016" PRIx64 " ", env->pc);
 for (i = 0; i < 32; i++) {
 if (i == 31) {
-- 
2.44.0




[PATCH v2 13/22] hw/arm/boot: Register Linux BSS section for confidential guests

2024-04-19 Thread Jean-Philippe Brucker
Although the BSS section is not currently part of the kernel blob, it
needs to be registered as guest RAM for confidential guest support,
because the kernel needs to access it before it is able to set up its RAM
regions.

It would be tempting to simply add the BSS as part of the ROM blob (i.e.
pass kernel_size as max_len argument to rom_add_blob()) and let the ROM
loader notifier deal with the full image size generically, but that
would add zero-initialization of the BSS region by the loader, which
adds a significant overhead. For a 40MB kernel with a 17MB BSS, I
measured an average boot time regression of 2.8ms on a fast desktop
(5.7% of the QEMU setup time). On a slower host, the regression could be
much larger.

Instead, add a special case to initialize the kernel's BSS IPA range.
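
A worked sketch of the range arithmetic, assuming kernel_size covers the
file contents plus the BSS. The macros, struct and function names here
are local to the example and are not QEMU's:

```c
#include <stdint.h>

/* Local alignment helpers; m must be non-zero. */
#define SKETCH_ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define SKETCH_ALIGN_UP(n, m)   SKETCH_ALIGN_DOWN((n) + (m) - 1, (m))

typedef struct {
    uint64_t base;  /* first page made up purely of BSS */
    uint64_t len;   /* whole pages of BSS to register */
} BssRange;

/*
 * entry:       guest address where the image file is loaded
 * file_size:   number of bytes present in the image file
 * kernel_size: in-memory size of the kernel (file contents plus BSS)
 */
BssRange bss_range(uint64_t entry, uint64_t file_size,
                   uint64_t kernel_size, uint64_t page_size)
{
    BssRange r;

    /* Skip the partial page that still holds file data. */
    r.base = SKETCH_ALIGN_UP(entry + file_size, page_size);
    /* Round the remaining length down to whole pages. */
    r.len = SKETCH_ALIGN_DOWN(kernel_size - file_size, page_size);
    return r;
}
```

For example, with entry 0x40200000, a 0x1234-byte file, kernel_size
0x5000 and 4KiB pages, the range starts at 0x40202000 and is 0x3000
bytes long, ending exactly at entry + kernel_size.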

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/kvm_arm.h |  5 +
 hw/arm/boot.c| 11 +++
 target/arm/kvm-rme.c | 10 ++
 3 files changed, 26 insertions(+)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 4386b0..4b787dd628 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -218,6 +218,7 @@ int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
 int kvm_arm_rme_init(MachineState *ms);
 int kvm_arm_rme_vm_type(MachineState *ms);
+void kvm_arm_rme_init_guest_ram(hwaddr base, size_t size);
 
 bool kvm_arm_rme_enabled(void);
 int kvm_arm_rme_vcpu_init(CPUState *cs);
@@ -243,6 +244,10 @@ static inline bool kvm_arm_sve_supported(void)
 return false;
 }
 
+static inline void kvm_arm_rme_init_guest_ram(hwaddr base, size_t size)
+{
+}
+
 /*
  * These functions should never actually be called without KVM support.
  */
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 84ea6a807a..9f522e332b 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -26,6 +26,7 @@
 #include "qemu/config-file.h"
 #include "qemu/option.h"
 #include "qemu/units.h"
+#include "kvm_arm.h"
 
 /* Kernel boot protocol is specified in the kernel docs
  * Documentation/arm/Booting and Documentation/arm64/booting.txt
@@ -850,6 +851,7 @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
 {
 hwaddr kernel_load_offset = KERNEL64_LOAD_ADDR;
 uint64_t kernel_size = 0;
+uint64_t page_size;
 uint8_t *buffer;
 int size;
 
@@ -916,6 +918,15 @@ static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
 *entry = mem_base + kernel_load_offset;
 rom_add_blob_fixed_as(filename, buffer, size, *entry, as);
 
+/*
+ * Register the kernel BSS as realm resource, so the kernel can use it right
+ * away. Align up to skip the last page, which still contains kernel
+ * data.
+ */
+page_size = qemu_real_host_page_size();
+kvm_arm_rme_init_guest_ram(QEMU_ALIGN_UP(*entry + size, page_size),
+   QEMU_ALIGN_DOWN(kernel_size - size, page_size));
+
 g_free(buffer);
 
 return kernel_size;
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index bee6694d6d..b2ad10ef6d 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -203,6 +203,16 @@ int kvm_arm_rme_init(MachineState *ms)
 return 0;
 }
 
+/*
+ * kvm_arm_rme_init_guest_ram - Initialize a Realm IPA range
+ */
+void kvm_arm_rme_init_guest_ram(hwaddr base, size_t size)
+{
+if (rme_guest) {
+rme_add_ram_region(base, size, /* populate */ false);
+}
+}
+
 int kvm_arm_rme_vcpu_init(CPUState *cs)
 {
 ARMCPU *cpu = ARM_CPU(cs);
-- 
2.44.0




[PATCH v2 15/22] target/arm/kvm-rme: Add measurement algorithm property

2024-04-19 Thread Jean-Philippe Brucker
This option selects which measurement algorithm to use for attestation.
Supported values are SHA256 and SHA512. Default to SHA512 arbitrarily.

SHA512 is generally faster on 64-bit architectures. On a few arm64 CPUs
I tested, SHA256 is much faster, but that's most likely because they only
support acceleration via FEAT_SHA256 (Armv8.0) and not FEAT_SHA512
(Armv8.2). Future CPUs supporting RME are likely to also support
FEAT_SHA512.

Cc: Eric Blake 
Cc: Markus Armbruster 
Cc: Daniel P. Berrangé 
Cc: Eduardo Habkost 
Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: use enum, pick default
---
 qapi/qom.json| 18 +-
 target/arm/kvm-rme.c | 39 ++-
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 91654aa267..84dce666b2 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -931,18 +931,34 @@
   'data': { '*cpu-affinity': ['uint16'],
 '*node-affinity': ['uint16'] } }
 
+##
+# @RmeGuestMeasurementAlgo:
+#
+# @sha256: Use the SHA256 algorithm
+# @sha512: Use the SHA512 algorithm
+#
+# Algorithm to use for realm measurements
+#
+# Since: FIXME
+##
+{ 'enum': 'RmeGuestMeasurementAlgo',
+  'data': ['sha256', 'sha512'] }
+
 ##
 # @RmeGuestProperties:
 #
 # Properties for rme-guest objects.
 #
+# @measurement-algo: Realm measurement algorithm (default: sha512)
+#
 # @personalization-value: Realm personalization value, as a 64-byte hex string
 # (default: 0)
 #
 # Since: FIXME
 ##
 { 'struct': 'RmeGuestProperties',
-  'data': { '*personalization-value': 'str' } }
+  'data': { '*personalization-value': 'str',
+'*measurement-algo': 'RmeGuestMeasurementAlgo' } }
 
 ##
 # @ObjectType:
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index cb5c3f7a22..8f39e54aaa 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -23,13 +23,14 @@ OBJECT_DECLARE_SIMPLE_TYPE(RmeGuest, RME_GUEST)
 
 #define RME_PAGE_SIZE qemu_real_host_page_size()
 
-#define RME_MAX_CFG 1
+#define RME_MAX_CFG 2
 
 struct RmeGuest {
 ConfidentialGuestSupport parent_obj;
 Notifier rom_load_notifier;
 GSList *ram_regions;
 uint8_t *personalization_value;
+RmeGuestMeasurementAlgo measurement_algo;
 };
 
 typedef struct {
@@ -73,6 +74,19 @@ static int rme_configure_one(RmeGuest *guest, uint32_t cfg, Error **errp)
 memcpy(args.rpv, guest->personalization_value, KVM_CAP_ARM_RME_RPV_SIZE);
 cfg_str = "personalization value";
 break;
+case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
+switch (guest->measurement_algo) {
+case RME_GUEST_MEASUREMENT_ALGO_SHA256:
+args.hash_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256;
+break;
+case RME_GUEST_MEASUREMENT_ALGO_SHA512:
+args.hash_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512;
+break;
+default:
+g_assert_not_reached();
+}
+cfg_str = "hash algorithm";
+break;
 default:
 g_assert_not_reached();
 }
@@ -338,12 +352,34 @@ static void rme_set_rpv(Object *obj, const char *value, Error **errp)
 }
 }
 
+static int rme_get_measurement_algo(Object *obj, Error **errp)
+{
+RmeGuest *guest = RME_GUEST(obj);
+
+return guest->measurement_algo;
+}
+
+static void rme_set_measurement_algo(Object *obj, int algo, Error **errp)
+{
+RmeGuest *guest = RME_GUEST(obj);
+
+guest->measurement_algo = algo;
+}
+
 static void rme_guest_class_init(ObjectClass *oc, void *data)
 {
 object_class_property_add_str(oc, "personalization-value", rme_get_rpv,
   rme_set_rpv);
 object_class_property_set_description(oc, "personalization-value",
 "Realm personalization value (512-bit hexadecimal number)");
+
+object_class_property_add_enum(oc, "measurement-algo",
+   "RmeGuestMeasurementAlgo",
+   &RmeGuestMeasurementAlgo_lookup,
+   rme_get_measurement_algo,
+   rme_set_measurement_algo);
+object_class_property_set_description(oc, "measurement-algo",
+"Realm measurement algorithm ('sha256', 'sha512')");
 }
 
 static void rme_guest_instance_init(Object *obj)
@@ -353,6 +389,7 @@ static void rme_guest_instance_init(Object *obj)
 exit(1);
 }
 rme_guest = RME_GUEST(obj);
+rme_guest->measurement_algo = RME_GUEST_MEASUREMENT_ALGO_SHA512;
 }
 
 static const TypeInfo rme_guest_info = {
-- 
2.44.0




[PATCH v2 04/22] target/arm/kvm-rme: Initialize realm

2024-04-19 Thread Jean-Philippe Brucker
The machine code calls kvm_arm_rme_vm_type() to get the VM flag and KVM
calls kvm_arm_rme_init() to issue KVM hypercalls:

* create the realm descriptor,
* load images into Realm RAM (in another patch),
* finalize the REC (vCPU) after the registers are reset,
* activate the realm at the end, at which point the realm is sealed.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2:
* Use g_assert_not_reached() in stubs
* Init from kvm_arch_init() rather than hw/arm/virt
* Cache rme_guest
---
 target/arm/kvm_arm.h |  16 +++
 target/arm/kvm-rme.c | 101 +++
 target/arm/kvm.c |   7 ++-
 3 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index cfaa0d9bc7..8e2d90c265 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -203,6 +203,8 @@ int kvm_arm_vgic_probe(void);
 void kvm_arm_pmu_init(ARMCPU *cpu);
 void kvm_arm_pmu_set_irq(ARMCPU *cpu, int irq);
 
+int kvm_arm_vcpu_finalize(ARMCPU *cpu, int feature);
+
 /**
  * kvm_arm_pvtime_init:
  * @cpu: ARMCPU
@@ -214,6 +216,11 @@ void kvm_arm_pvtime_init(ARMCPU *cpu, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+int kvm_arm_rme_init(MachineState *ms);
+int kvm_arm_rme_vm_type(MachineState *ms);
+
+bool kvm_arm_rme_enabled(void);
+
 #else
 
 /*
@@ -283,6 +290,15 @@ static inline uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
 g_assert_not_reached();
 }
 
+static inline int kvm_arm_rme_init(MachineState *ms)
+{
+g_assert_not_reached();
+}
+
+static inline int kvm_arm_rme_vm_type(MachineState *ms)
+{
+g_assert_not_reached();
+}
 #endif
 
 #endif
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index 960dd75608..23ac2d32d4 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -23,14 +23,115 @@ struct RmeGuest {
 ConfidentialGuestSupport parent_obj;
 };
 
+static RmeGuest *rme_guest;
+
+bool kvm_arm_rme_enabled(void)
+{
+return !!rme_guest;
+}
+
+static int rme_create_rd(Error **errp)
+{
+int ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_RME, 0,
+KVM_CAP_ARM_RME_CREATE_RD);
+
+if (ret) {
+error_setg_errno(errp, -ret, "RME: failed to create Realm Descriptor");
+}
+return ret;
+}
+
+static void rme_vm_state_change(void *opaque, bool running, RunState state)
+{
+int ret;
+CPUState *cs;
+
+if (!running) {
+return;
+}
+
+ret = rme_create_rd(&error_abort);
+if (ret) {
+return;
+}
+
+/*
+ * Now that do_cpu_reset() initialized the boot PC and
+ * kvm_cpu_synchronize_post_reset() registered it, we can finalize the REC.
+ */
+CPU_FOREACH(cs) {
+ret = kvm_arm_vcpu_finalize(ARM_CPU(cs), KVM_ARM_VCPU_REC);
+if (ret) {
+error_report("RME: failed to finalize vCPU: %s", strerror(-ret));
+exit(1);
+}
+}
+
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_RME, 0,
+KVM_CAP_ARM_RME_ACTIVATE_REALM);
+if (ret) {
+error_report("RME: failed to activate realm: %s", strerror(-ret));
+exit(1);
+}
+}
+
+int kvm_arm_rme_init(MachineState *ms)
+{
+static Error *rme_mig_blocker;
+ConfidentialGuestSupport *cgs = ms->cgs;
+
+if (!rme_guest) {
+return 0;
+}
+
+if (!cgs) {
+error_report("missing -machine confidential-guest-support parameter");
+return -EINVAL;
+}
+
+if (!kvm_check_extension(kvm_state, KVM_CAP_ARM_RME)) {
+return -ENODEV;
+}
+
+error_setg(&rme_mig_blocker, "RME: migration is not implemented");
+migrate_add_blocker(&rme_mig_blocker, &error_fatal);
+
+/*
+ * The realm activation is done last, when the VM starts, after all images
+ * have been loaded and all vcpus finalized.
+ */
+qemu_add_vm_change_state_handler(rme_vm_state_change, NULL);
+
+cgs->ready = true;
+return 0;
+}
+
+int kvm_arm_rme_vm_type(MachineState *ms)
+{
+if (rme_guest) {
+return KVM_VM_TYPE_ARM_REALM;
+}
+return 0;
+}
+
 static void rme_guest_class_init(ObjectClass *oc, void *data)
 {
 }
 
+static void rme_guest_instance_init(Object *obj)
+{
+if (rme_guest) {
+error_report("a single instance of RmeGuest is supported");
+exit(1);
+}
+rme_guest = RME_GUEST(obj);
+}
+
 static const TypeInfo rme_guest_info = {
 .parent = TYPE_CONFIDENTIAL_GUEST_SUPPORT,
 .name = TYPE_RME_GUEST,
 .instance_size = sizeof(struct RmeGuest),
+.instance_init = rme_guest_instance_init,
 .class_init = rme_guest_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_USER_CREATABLE },
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index a5673241e5..b00077c1a5 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -93,7 +93,7 @@ static int kvm_arm_vcpu_init(ARMCPU *cpu)
  *
  * Returns: 0 if success else < 0 error code
  */
-static int kvm_arm_vcpu_finalize(ARMCPU *cpu, int fe

[PATCH v2 09/22] target/arm/kvm-rme: Initialize vCPU

2024-04-19 Thread Jean-Philippe Brucker
The target code calls kvm_arm_vcpu_init() to mark the vCPU as part of a
Realm. For a Realm vCPU, only x0-x7 can be set at runtime. Before boot,
the PC can also be set, and is ignored at runtime. KVM also accepts a
few system register changes during initial configuration, as returned by
KVM_GET_REG_LIST.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: only do the GP regs, since they are sync'd explicitly. Other
  registers use the existing reglist facility.
---
 target/arm/cpu.h |  3 +++
 target/arm/kvm_arm.h |  1 +
 target/arm/kvm-rme.c | 10 
 target/arm/kvm.c | 61 
 4 files changed, 75 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bc0c84873f..d3ff1b4a31 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -945,6 +945,9 @@ struct ArchCPU {
 OnOffAuto kvm_steal_time;
 #endif /* CONFIG_KVM */
 
+/* Realm Management Extension */
+bool kvm_rme;
+
 /* Uniprocessor system with MP extensions */
 bool mp_is_up;
 
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 8e2d90c265..4386b0 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -220,6 +220,7 @@ int kvm_arm_rme_init(MachineState *ms);
 int kvm_arm_rme_vm_type(MachineState *ms);
 
 bool kvm_arm_rme_enabled(void);
+int kvm_arm_rme_vcpu_init(CPUState *cs);
 
 #else
 
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index 23ac2d32d4..aa9c3b5551 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -106,6 +106,16 @@ int kvm_arm_rme_init(MachineState *ms)
 return 0;
 }
 
+int kvm_arm_rme_vcpu_init(CPUState *cs)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+
+if (rme_guest) {
+cpu->kvm_rme = true;
+}
+return 0;
+}
+
 int kvm_arm_rme_vm_type(MachineState *ms)
 {
 if (rme_guest) {
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 3504276822..3a2233ec73 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1920,6 +1920,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 return ret;
 }
 
+ret = kvm_arm_rme_vcpu_init(cs);
+if (ret) {
+return ret;
+}
+
 if (cpu_isar_feature(aa64_sve, cpu)) {
 ret = kvm_arm_sve_set_vls(cpu);
 if (ret) {
@@ -2056,6 +2061,35 @@ static int kvm_arch_put_sve(CPUState *cs)
 return 0;
 }
 
+static int kvm_arm_rme_put_core_regs(CPUState *cs)
+{
+int i, ret;
+struct kvm_one_reg reg;
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+
+/*
+ * The RME ABI only allows us to set 8 GPRs and the PC
+ */
+for (i = 0; i < 8; i++) {
+reg.id = AARCH64_CORE_REG(regs.regs[i]);
+reg.addr = (uintptr_t) &env->xregs[i];
+ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+if (ret) {
+return ret;
+}
+}
+
+reg.id = AARCH64_CORE_REG(regs.pc);
+reg.addr = (uintptr_t) &env->pc;
+ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
+if (ret) {
+return ret;
+}
+
+return 0;
+}
+
 static int kvm_arm_put_core_regs(CPUState *cs, int level)
 {
 uint64_t val;
@@ -2066,6 +2100,10 @@ static int kvm_arm_put_core_regs(CPUState *cs, int level)
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
 
+if (cpu->kvm_rme) {
+return kvm_arm_rme_put_core_regs(cs);
+}
+
 /* If we are in AArch32 mode then we need to copy the AArch32 regs to the
  * AArch64 registers before pushing them out to 64-bit KVM.
  */
@@ -2253,6 +2291,25 @@ static int kvm_arch_get_sve(CPUState *cs)
 return 0;
 }
 
+static int kvm_arm_rme_get_core_regs(CPUState *cs)
+{
+int i, ret;
+struct kvm_one_reg reg;
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+
+for (i = 0; i < 8; i++) {
+reg.id = AARCH64_CORE_REG(regs.regs[i]);
+reg.addr = (uintptr_t) &env->xregs[i];
+ret = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
+if (ret) {
+return ret;
+}
+}
+
+return 0;
+}
+
 static int kvm_arm_get_core_regs(CPUState *cs)
 {
 uint64_t val;
@@ -2263,6 +2320,10 @@ static int kvm_arm_get_core_regs(CPUState *cs)
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
 
+if (cpu->kvm_rme) {
+return kvm_arm_rme_get_core_regs(cs);
+}
+
 for (i = 0; i < 31; i++) {
 ret = kvm_get_one_reg(cs, AARCH64_CORE_REG(regs.regs[i]),
   &env->xregs[i]);
-- 
2.44.0




[PATCH v2 16/22] target/arm/cpu: Set number of breakpoints and watchpoints in KVM

2024-04-19 Thread Jean-Philippe Brucker
Add "num-breakpoints" and "num-watchpoints" CPU parameters to configure
the debug features that KVM presents to the guest. The KVM vCPU
configuration is modified by calling SET_ONE_REG on the ID register.

This is needed for Realm VMs, whose parameters include the number of
breakpoints and watchpoints, which influences the Realm Initial Measurement.
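
To illustrate the ID register encoding this relies on: in
ID_AA64DFR0_EL1 the BRPs field occupies bits [15:12] and the WRPs field
bits [23:20], and each holds the count minus one. The sketch below uses
helper names of my own choosing; the lower bound of 2 mirrors the check
in this patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64DFR0_EL1 field positions: BRPs [15:12], WRPs [23:20]. */
#define DFR0_BRPS_SHIFT 12
#define DFR0_WRPS_SHIFT 20
#define DFR0_FIELD_MASK 0xfULL

/* The fields encode "number of breakpoints/watchpoints minus one". */
unsigned dfr0_get_count(uint64_t dfr0, unsigned shift)
{
    return (unsigned)((dfr0 >> shift) & DFR0_FIELD_MASK) + 1;
}

/*
 * Lower the count stored in one field. Returns false and leaves *dfr0
 * untouched if count is below 2 or above what the register advertises.
 */
bool dfr0_set_count(uint64_t *dfr0, unsigned shift, unsigned count)
{
    unsigned max = dfr0_get_count(*dfr0, shift);

    if (count < 2 || count > max) {
        return false;
    }
    *dfr0 &= ~(DFR0_FIELD_MASK << shift);
    *dfr0 |= (uint64_t)(count - 1) << shift;
    return true;
}
```

A host advertising 6 breakpoints (field value 5) can thus be reduced to
any count between 2 and 6, but never raised above 6.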

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/cpu.h  |  4 ++
 target/arm/kvm_arm.h  |  2 +
 target/arm/arm-qmp-cmds.c |  1 +
 target/arm/cpu64.c| 77 +++
 target/arm/kvm.c  | 56 +++-
 5 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d3ff1b4a31..24080da2b7 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1089,6 +1089,10 @@ struct ArchCPU {
 
 /* Generic timer counter frequency, in Hz */
 uint64_t gt_cntfrq_hz;
+
+/* Allows to override the default configuration */
+uint8_t num_bps;
+uint8_t num_wps;
 };
 
 typedef struct ARMCPUInfo {
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 4b787dd628..b040686eab 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -16,6 +16,8 @@
 #define KVM_ARM_VGIC_V2   (1 << 0)
 #define KVM_ARM_VGIC_V3   (1 << 1)
 
+#define KVM_REG_ARM_ID_AA64DFR0_EL1 ARM64_SYS_REG(3, 0, 0, 5, 0)
+
 /**
  * kvm_arm_register_device:
  * @mr: memory region for this device
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index 3cc8cc738b..0f574bb1dd 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -95,6 +95,7 @@ static const char *cpu_model_advertised_features[] = {
 "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
 "kvm-no-adjvtime", "kvm-steal-time",
 "pauth", "pauth-impdef", "pauth-qarma3",
+"num-breakpoints", "num-watchpoints",
 NULL
 };
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 985b1efe16..9ca74eb019 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -571,6 +571,82 @@ void aarch64_add_pauth_properties(Object *obj)
 }
 }
 
+#if defined(CONFIG_KVM)
+static void arm_cpu_get_num_wps(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+
+val = cpu->num_wps;
+if (val == 0) {
+val = FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
+}
+
+visit_type_uint8(v, name, &val, errp);
+}
+
+static void arm_cpu_set_num_wps(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+uint8_t max_wps = FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, WRPS) + 1;
+
+if (!visit_type_uint8(v, name, &val, errp)) {
+return;
+}
+
+if (val < 2 || val > max_wps) {
+error_setg(errp, "invalid number of watchpoints");
+return;
+}
+
+cpu->num_wps = val;
+}
+
+static void arm_cpu_get_num_bps(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+
+val = cpu->num_bps;
+if (val == 0) {
+val = FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
+}
+
+visit_type_uint8(v, name, &val, errp);
+}
+
+static void arm_cpu_set_num_bps(Object *obj, Visitor *v, const char *name,
+void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+uint8_t max_bps = FIELD_EX64(cpu->isar.id_aa64dfr0, ID_AA64DFR0, BRPS) + 1;
+
+if (!visit_type_uint8(v, name, &val, errp)) {
+return;
+}
+
+if (val < 2 || val > max_bps) {
+error_setg(errp, "invalid number of breakpoints");
+return;
+}
+
+cpu->num_bps = val;
+}
+
+static void aarch64_add_kvm_writable_properties(Object *obj)
+{
+object_property_add(obj, "num-breakpoints", "uint8", arm_cpu_get_num_bps,
+arm_cpu_set_num_bps, NULL, NULL);
+object_property_add(obj, "num-watchpoints", "uint8", arm_cpu_get_num_wps,
+arm_cpu_set_num_wps, NULL, NULL);
+}
+#endif /* CONFIG_KVM */
+
 void arm_cpu_lpa2_finalize(ARMCPU *cpu, Error **errp)
 {
 uint64_t t;
@@ -713,6 +789,7 @@ static void aarch64_host_initfn(Object *obj)
 if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
 aarch64_add_sve_properties(obj);
 aarch64_add_pauth_properties(obj);
+aarch64_add_kvm_writable_properties(obj);
 }
 #elif defined(CONFIG_HVF)
 ARMCPU *cpu = ARM_CPU(obj);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 6d368bf442..623980a25b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -318,7 +318,7 @@ static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
 err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64smfr0,
   ARM6

[PATCH v2 11/22] hw/core/loader: Add ROM loader notifier

2024-04-19 Thread Jean-Philippe Brucker
Add a function to register a notifier that is invoked after a ROM gets
loaded into guest memory.

It will be used by Arm confidential guest support, in order to register
all blobs loaded into memory with KVM, so that their content is part of
the initial VM measurement and contributes to the guest attestation.
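
The notifier pattern itself can be sketched in plain C as below. This is
a simplified stand-in for QEMU's Notifier/NotifierList, with type and
function names invented for the example; only the addr/len/max_len
fields follow the RomLoaderNotify structure added by this patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a notifier: a callback plus a list link. */
typedef struct LoadNotifier LoadNotifier;
struct LoadNotifier {
    void (*notify)(LoadNotifier *n, void *data);
    LoadNotifier *next;
};

/* Payload handed to each notifier for every loaded blob. */
typedef struct {
    uint64_t addr;
    size_t len;
    size_t max_len;
} LoadEvent;

static LoadNotifier *load_notifiers;

/* Register a notifier at the head of the list. */
void load_notifier_add(LoadNotifier *n)
{
    n->next = load_notifiers;
    load_notifiers = n;
}

/* Invoke every registered notifier with the same event. */
void load_notifiers_fire(LoadEvent *ev)
{
    LoadNotifier *n;

    for (n = load_notifiers; n; n = n->next) {
        n->notify(n, ev);
    }
}

/* Example subscriber: embeds the notifier as its first member. */
typedef struct {
    LoadNotifier n;
    uint64_t total;
    unsigned calls;
} ByteCounter;

void byte_counter_notify(LoadNotifier *n, void *data)
{
    ByteCounter *c = (ByteCounter *)n; /* valid: n is the first member */
    LoadEvent *ev = data;

    c->total += ev->max_len;
    c->calls++;
}
```

The loader does not need to know who is listening; a subscriber such as
the confidential guest code only registers a callback and receives the
address and sizes of each blob after it is written.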

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 include/hw/loader.h | 15 +++
 hw/core/loader.c| 15 +++
 2 files changed, 30 insertions(+)

diff --git a/include/hw/loader.h b/include/hw/loader.h
index 8685e27334..79fab25dd9 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -356,6 +356,21 @@ void hmp_info_roms(Monitor *mon, const QDict *qdict);
 ssize_t rom_add_vga(const char *file);
 ssize_t rom_add_option(const char *file, int32_t bootindex);
 
+typedef struct RomLoaderNotify {
+/* Parameters passed to rom_add_blob() */
+hwaddr addr;
+size_t len;
+size_t max_len;
+} RomLoaderNotify;
+
+/**
+ * rom_add_load_notifier - Add a notifier for loaded images
+ *
+ * Add a notifier that will be invoked with a RomLoaderNotify structure for each
+ * blob loaded into guest memory, after the blob is loaded.
+ */
+void rom_add_load_notifier(Notifier *notifier);
+
 /* This is the usual maximum in uboot, so if a uImage overflows this, it would
  * overflow on real hardware too. */
 #define UBOOT_MAX_GUNZIP_BYTES (64 << 20)
diff --git a/hw/core/loader.c b/hw/core/loader.c
index b8e52f3fb0..4bd236cf89 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -67,6 +67,8 @@
 #include 
 
 static int roms_loaded;
+static NotifierList rom_loader_notifier =
+NOTIFIER_LIST_INITIALIZER(rom_loader_notifier);
 
 /* return the size or -1 if error */
 int64_t get_image_size(const char *filename)
@@ -1209,6 +1211,11 @@ MemoryRegion *rom_add_blob(const char *name, const void *blob, size_t len,
 return mr;
 }
 
+void rom_add_load_notifier(Notifier *notifier)
+{
+notifier_list_add(&rom_loader_notifier, notifier);
+}
+
 /* This function is specific for elf program because we don't need to allocate
  * all the rom. We just allocate the first part and the rest is just zeros. This
  * is why romsize and datasize are different. Also, this function takes its own
@@ -1250,6 +1257,7 @@ ssize_t rom_add_option(const char *file, int32_t bootindex)
 static void rom_reset(void *unused)
 {
 Rom *rom;
+RomLoaderNotify notify;
 
 QTAILQ_FOREACH(rom, &roms, next) {
 if (rom->fw_file) {
@@ -1298,6 +1306,13 @@ static void rom_reset(void *unused)
 cpu_flush_icache_range(rom->addr, rom->datasize);
 
 trace_loader_write_rom(rom->name, rom->addr, rom->datasize, rom->isrom);
+
+notify = (RomLoaderNotify) {
+.addr = rom->addr,
+.len = rom->datasize,
+.max_len = rom->romsize,
+};
+notifier_list_notify(&rom_loader_notifier, &notify);
 }
 }
 
-- 
2.44.0




[PATCH v2 21/22] hw/arm/virt: Move virt_flash_create() to machvirt_init()

2024-04-19 Thread Jean-Philippe Brucker
For confidential VMs we'll want to skip flash device creation.
Unfortunately, in virt_instance_init() the machine->cgs member has not
yet been initialized, so we cannot check whether confidential guest
support is enabled. Move virt_flash_create() to machvirt_init(), where
we can access the machine->cgs member.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 hw/arm/virt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index eca9a96b5a..bed19d0b79 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2071,6 +2071,8 @@ static void machvirt_init(MachineState *machine)
 unsigned int smp_cpus = machine->smp.cpus;
 unsigned int max_cpus = machine->smp.max_cpus;
 
+virt_flash_create(vms);
+
 possible_cpus = mc->possible_cpu_arch_ids(machine);
 
 /*
@@ -3229,8 +3231,6 @@ static void virt_instance_init(Object *obj)
 
 vms->irqmap = a15irqmap;
 
-virt_flash_create(vms);
-
 vms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
 vms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
 }
-- 
2.44.0




[PATCH v2 12/22] target/arm/kvm-rme: Populate Realm memory

2024-04-19 Thread Jean-Philippe Brucker
Collect the images copied into guest RAM into a sorted list, and issue
POPULATE_REALM KVM ioctls once we've created the Realm Descriptor. The
images are part of the Realm Initial Measurement.
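
A minimal sketch of the sorted collection, using a hand-rolled linked
list instead of GSList and a fixed 4KiB page size (both assumptions made
for the example; the names are not from the patch):

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed page size for the example; the patch uses the host page size. */
#define SKETCH_PAGE_SIZE 4096ULL

typedef struct Region {
    uint64_t base;
    uint64_t len;
    struct Region *next;
} Region;

/*
 * Insert a page-aligned region into a list kept sorted by base address.
 * Returns the (possibly new) head; on allocation failure the list is
 * returned unchanged.
 */
Region *region_insert_sorted(Region *head, uint64_t base, uint64_t len)
{
    Region *r = malloc(sizeof(*r));
    Region **link = &head;

    if (!r) {
        return head;
    }
    /* Align the base down and the length up to whole pages. */
    r->base = base / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
    r->len = (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;

    /* Walk to the first entry whose base is not below the new one. */
    while (*link && (*link)->base < r->base) {
        link = &(*link)->next;
    }
    r->next = *link;
    *link = r;
    return head;
}
```

Keeping the list ordered by address means the populate calls are issued
in the same order regardless of the order in which images were loaded,
so a verifier can reproduce the measurement independently.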

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: Use a ROM loader notifier
---
 target/arm/kvm-rme.c | 97 
 1 file changed, 97 insertions(+)

diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index aa9c3b5551..bee6694d6d 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -9,9 +9,11 @@
 #include "exec/confidential-guest-support.h"
 #include "hw/boards.h"
 #include "hw/core/cpu.h"
+#include "hw/loader.h"
 #include "kvm_arm.h"
 #include "migration/blocker.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/kvm.h"
 #include "sysemu/runstate.h"
@@ -19,10 +21,21 @@
 #define TYPE_RME_GUEST "rme-guest"
 OBJECT_DECLARE_SIMPLE_TYPE(RmeGuest, RME_GUEST)
 
+#define RME_PAGE_SIZE qemu_real_host_page_size()
+
 struct RmeGuest {
 ConfidentialGuestSupport parent_obj;
+Notifier rom_load_notifier;
+GSList *ram_regions;
 };
 
+typedef struct {
+hwaddr base;
+hwaddr len;
+/* Populate guest RAM with data, or only initialize the IPA range */
+bool populate;
+} RmeRamRegion;
+
 static RmeGuest *rme_guest;
 
 bool kvm_arm_rme_enabled(void)
@@ -41,6 +54,41 @@ static int rme_create_rd(Error **errp)
 return ret;
 }
 
+static void rme_populate_realm(gpointer data, gpointer unused)
+{
+int ret;
+const RmeRamRegion *region = data;
+
+if (region->populate) {
+struct kvm_cap_arm_rme_populate_realm_args populate_args = {
+.populate_ipa_base = region->base,
+.populate_ipa_size = region->len,
+.flags = KVM_ARM_RME_POPULATE_FLAGS_MEASURE,
+};
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_RME, 0,
+KVM_CAP_ARM_RME_POPULATE_REALM,
+(intptr_t)&populate_args);
+if (ret) {
+error_report("RME: failed to populate realm (0x%"HWADDR_PRIx", 0x%"HWADDR_PRIx"): %s",
+ region->base, region->len, strerror(-ret));
+exit(1);
+}
+} else {
+struct kvm_cap_arm_rme_init_ipa_args init_args = {
+.init_ipa_base = region->base,
+.init_ipa_size = region->len,
+};
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_RME, 0,
+KVM_CAP_ARM_RME_INIT_IPA_REALM,
+(intptr_t)&init_args);
+if (ret) {
+error_report("RME: failed to initialize GPA range (0x%"HWADDR_PRIx", 0x%"HWADDR_PRIx"): %s",
+ region->base, region->len, strerror(-ret));
+exit(1);
+}
+}
+}
+
 static void rme_vm_state_change(void *opaque, bool running, RunState state)
 {
 int ret;
@@ -55,6 +103,9 @@ static void rme_vm_state_change(void *opaque, bool running, RunState state)
 return;
 }
 
+g_slist_foreach(rme_guest->ram_regions, rme_populate_realm, NULL);
+g_slist_free_full(g_steal_pointer(&rme_guest->ram_regions), g_free);
+
 /*
  * Now that do_cpu_reset() initialized the boot PC and
  * kvm_cpu_synchronize_post_reset() registered it, we can finalize the REC.
@@ -75,6 +126,49 @@ static void rme_vm_state_change(void *opaque, bool running, RunState state)
 }
 }
 
+static gint rme_compare_ram_regions(gconstpointer a, gconstpointer b)
+{
+const RmeRamRegion *ra = a;
+const RmeRamRegion *rb = b;
+
+g_assert(ra->base != rb->base);
+return ra->base < rb->base ? -1 : 1;
+}
+
+static void rme_add_ram_region(hwaddr base, hwaddr len, bool populate)
+{
+RmeRamRegion *region;
+
+region = g_new0(RmeRamRegion, 1);
+region->base = QEMU_ALIGN_DOWN(base, RME_PAGE_SIZE);
+region->len = QEMU_ALIGN_UP(len, RME_PAGE_SIZE);
+region->populate = populate;
+
+/*
+ * The Realm Initial Measurement (RIM) depends on the order in which we
+ * initialize and populate the RAM regions. To help a verifier
+ * independently calculate the RIM, sort regions by GPA.
+ */
+rme_guest->ram_regions = g_slist_insert_sorted(rme_guest->ram_regions,
+   region,
+   rme_compare_ram_regions);
+}
+
+static void rme_rom_load_notify(Notifier *notifier, void *data)
+{
+RomLoaderNotify *rom = data;
+
+if (rom->addr == -1) {
+/*
+ * These blobs (ACPI tables) are not loaded into guest RAM at reset.
+ * Instead the firmware will load them via fw_cfg and measure them
+ * itself.
+ */
+return;
+}
+rme_add_ram_region(rom->addr, rom->max_len, /* populate */ true);
+}
+
 int kvm_arm_rme_init(MachineState *ms)
 {
 static Error *rme_mig_blocker;
@@ -102,6 +196,9 @@ int kvm_arm
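
The region handling in the patch above (page-align each region, then keep the list sorted by GPA so that a verifier can reproduce the RIM) can be sketched in plain C without GLib. This is only an illustration: SketchRegion, sketch_region_make(), sketch_sort_regions() and the fixed 4 KiB page size are assumptions made for this sketch, not QEMU code. The sketch derives the length from the aligned end address, so a region with an unaligned base is still fully covered.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB host pages */

typedef struct {
    uint64_t base;
    uint64_t len;
} SketchRegion;

/* Expand [base, base + len) outward to page boundaries. */
static SketchRegion sketch_region_make(uint64_t base, uint64_t len)
{
    uint64_t start = base & ~(SKETCH_PAGE_SIZE - 1);
    uint64_t end = (base + len + SKETCH_PAGE_SIZE - 1) &
                   ~(SKETCH_PAGE_SIZE - 1);
    SketchRegion r = { start, end - start };

    return r;
}

/* Order regions by ascending base address. */
static int sketch_region_cmp(const void *a, const void *b)
{
    const SketchRegion *ra = a;
    const SketchRegion *rb = b;

    return (ra->base > rb->base) - (ra->base < rb->base);
}

/* Sort in place so regions are always processed in the same GPA order. */
static void sketch_sort_regions(SketchRegion *regions, size_t n)
{
    qsort(regions, n, sizeof(*regions), sketch_region_cmp);
}
```

With these helpers, a blob at 0x1800 of length 0x1000 becomes the region [0x1000, 0x3000), and the regions are visited in the same order regardless of the order in which they were registered.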

[PATCH v2 17/22] target/arm/cpu: Set number of PMU counters in KVM

2024-04-19 Thread Jean-Philippe Brucker
Add a "num-pmu-counters" CPU parameter to configure the number of
counters that KVM presents to the guest. This is needed for Realm VMs,
whose parameters include the number of PMU counters and influence the
Realm Initial Measurement.
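
As a rough illustration of the register field involved, the counter count lives in PMCR_EL0.N, which the patch declares as FIELD(PMCR, N, 11, 5). The helpers below are a standalone sketch of that extract/deposit arithmetic; the names pmcr_get_n() and pmcr_set_n() are assumptions for this example and stand in for QEMU's FIELD_EX64()/FIELD_DP64() macros.

```c
#include <stdint.h>

#define PMCR_N_SHIFT  11
#define PMCR_N_LENGTH 5
#define PMCR_N_MASK   (((1ULL << PMCR_N_LENGTH) - 1) << PMCR_N_SHIFT)

/* Read the number of event counters from a PMCR_EL0 value. */
static inline uint8_t pmcr_get_n(uint64_t pmcr)
{
    return (uint8_t)((pmcr & PMCR_N_MASK) >> PMCR_N_SHIFT);
}

/* Return a copy of pmcr with the N field replaced, other bits untouched. */
static inline uint64_t pmcr_set_n(uint64_t pmcr, uint8_t n)
{
    return (pmcr & ~PMCR_N_MASK) |
           (((uint64_t)n << PMCR_N_SHIFT) & PMCR_N_MASK);
}
```

A request is only meaningful when the new value does not exceed the N reported by the host, which is the bound the property setter in this patch enforces.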

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/cpu.h          |  3 +++
 target/arm/kvm_arm.h      |  1 +
 target/arm/arm-qmp-cmds.c |  2 +-
 target/arm/cpu64.c        | 41 +++
 target/arm/kvm.c          | 34 +++-
 5 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 24080da2b7..84f3a67dab 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1093,6 +1093,7 @@ struct ArchCPU {
 /* Allows to override the default configuration */
 uint8_t num_bps;
 uint8_t num_wps;
+int8_t num_pmu_ctrs;
 };
 
 typedef struct ARMCPUInfo {
@@ -2312,6 +2313,8 @@ FIELD(MFAR, FPA, 12, 40)
 FIELD(MFAR, NSE, 62, 1)
 FIELD(MFAR, NS, 63, 1)
 
+FIELD(PMCR, N, 11, 5)
+
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= R_V7M_CSSELR_INDEX_MASK);
 
 /* If adding a feature bit which corresponds to a Linux ELF
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index b040686eab..62e39e7184 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -17,6 +17,7 @@
 #define KVM_ARM_VGIC_V3   (1 << 1)
 
 #define KVM_REG_ARM_ID_AA64DFR0_EL1 ARM64_SYS_REG(3, 0, 0, 5, 0)
+#define KVM_REG_ARM_PMCR_EL0        ARM64_SYS_REG(3, 3, 9, 12, 0)
 
 /**
  * kvm_arm_register_device:
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index 0f574bb1dd..985d4270b8 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -95,7 +95,7 @@ static const char *cpu_model_advertised_features[] = {
 "sve1408", "sve1536", "sve1664", "sve1792", "sve1920", "sve2048",
 "kvm-no-adjvtime", "kvm-steal-time",
 "pauth", "pauth-impdef", "pauth-qarma3",
-"num-breakpoints", "num-watchpoints",
+"num-breakpoints", "num-watchpoints", "num-pmu-counters",
 NULL
 };
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 9ca74eb019..6c2b922d93 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -638,12 +638,53 @@ static void arm_cpu_set_num_bps(Object *obj, Visitor *v, const char *name,
 cpu->num_bps = val;
 }
 
+static void arm_cpu_get_num_pmu_ctrs(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+
+if (cpu->num_pmu_ctrs == -1) {
+val = FIELD_EX64(cpu->isar.reset_pmcr_el0, PMCR, N);
+} else {
+val = cpu->num_pmu_ctrs;
+}
+
+visit_type_uint8(v, name, &val, errp);
+}
+
+static void arm_cpu_set_num_pmu_ctrs(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
+{
+uint8_t val;
+ARMCPU *cpu = ARM_CPU(obj);
+uint8_t max_ctrs = FIELD_EX64(cpu->isar.reset_pmcr_el0, PMCR, N);
+
+if (!visit_type_uint8(v, name, &val, errp)) {
+return;
+}
+
+if (val > max_ctrs) {
+error_setg(errp, "invalid number of PMU counters");
+return;
+}
+
+cpu->num_pmu_ctrs = val;
+}
+
 static void aarch64_add_kvm_writable_properties(Object *obj)
 {
+ARMCPU *cpu = ARM_CPU(obj);
+
 object_property_add(obj, "num-breakpoints", "uint8", arm_cpu_get_num_bps,
 arm_cpu_set_num_bps, NULL, NULL);
 object_property_add(obj, "num-watchpoints", "uint8", arm_cpu_get_num_wps,
 arm_cpu_set_num_wps, NULL, NULL);
+
+cpu->num_pmu_ctrs = -1;
+object_property_add(obj, "num-pmu-counters", "uint8",
+arm_cpu_get_num_pmu_ctrs, arm_cpu_set_num_pmu_ctrs,
+NULL, NULL);
 }
 #endif /* CONFIG_KVM */
 
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 623980a25b..9855cadb1b 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -418,7 +418,7 @@ static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
 if (pmu_supported) {
 /* PMCR_EL0 is only accessible if the vCPU has feature PMU_V3 */
 err |= read_sys_reg64(fdarray[2], &ahcf->isar.reset_pmcr_el0,
-  ARM64_SYS_REG(3, 3, 9, 12, 0));
+  KVM_REG_ARM_PMCR_EL0);
 }
 
 if (sve_supported) {
@@ -919,9 +919,41 @@ static void kvm_arm_configure_aa64dfr0(ARMCPU *cpu)
 }
 }
 
+static void kvm_arm_configure_pmcr(ARMCPU *cpu)
+{
+int ret;
+uint64_t val, newval;
+CPUState *cs = CPU(cpu);
+
+if (cpu->num_pmu_ctrs == -1) {
+return;
+}
+
+newval = FIELD_DP64(cpu->isar.reset_pmcr_el0, PMCR, N, cpu->num_pmu_ctrs);
+ret = kvm_set_one_reg(cs, KVM_REG_ARM_PMCR_EL0, &newval);
+if (ret) {
+error_report("Failed to set KVM_REG_ARM_PMCR_EL0");
+return;
+}
+
+/*
+ * Check if the write su

[PATCH v2 20/22] target/arm/kvm-rme: Enable guest memfd

2024-04-19 Thread Jean-Philippe Brucker
Request that RAM blocks use KVM guest memfd to allocate guest
memory. With RME, guest memory is not accessible by the host, and using
guest memfd ensures that the host kernel is aware of this and doesn't
attempt to access guest pages.

Done in a separate patch because ms->require_guest_memfd is not yet
merged.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/kvm-rme.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index 8f39e54aaa..71cc1d4147 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -263,6 +263,7 @@ int kvm_arm_rme_init(MachineState *ms)
 rme_guest->rom_load_notifier.notify = rme_rom_load_notify;
 rom_add_load_notifier(&rme_guest->rom_load_notifier);
 
+ms->require_guest_memfd = true;
 cgs->ready = true;
 return 0;
 }
-- 
2.44.0




[PATCH v2 22/22] hw/arm/virt: Use RAM instead of flash for confidential guest firmware

2024-04-19 Thread Jean-Philippe Brucker
The flash device that holds firmware code relies on read-only stage-2
mappings. Read accesses behave as RAM and write accesses as MMIO. Since
the RMM does not support read-only mappings we cannot use the flash
device as-is.

That isn't a problem because the firmware does not want to disclose any
information to the host, hence will not store its variables in clear
persistent memory. We can therefore replace the flash device with RAM,
and load the firmware there.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 include/hw/arm/boot.h |  9 +
 hw/arm/boot.c         | 34 ++---
 hw/arm/virt.c         | 44 +++
 3 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index 80c492d742..d91cfc6942 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -112,6 +112,10 @@ struct arm_boot_info {
  */
 bool firmware_loaded;
 
+/* Used when loading firmware into RAM */
+hwaddr firmware_base;
+hwaddr firmware_max_size;
+
 /* Address at which board specific loader/setup code exists. If enabled,
  * this code-blob will run before anything else. It must return to the
  * caller via the link register. There is no stack set up. Enabled by
@@ -132,6 +136,11 @@ struct arm_boot_info {
 bool secure_board_setup;
 
 arm_endianness endianness;
+
+/*
+ * Confidential guest boot loads everything into RAM so it can be measured.
+ */
+bool confidential;
 };
 
 /**
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 9f522e332b..26c6334d52 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -1154,7 +1154,31 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
 }
 }
 
-static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info)
+static void arm_setup_confidential_firmware_boot(ARMCPU *cpu,
+ struct arm_boot_info *info,
+ const char *firmware_filename)
+{
+ssize_t fw_size;
+g_autofree char *fname = NULL;
+AddressSpace *as = arm_boot_address_space(cpu, info);
+
+fname = qemu_find_file(QEMU_FILE_TYPE_BIOS, firmware_filename);
+if (!fname) {
+error_report("Could not find firmware image '%s'", firmware_filename);
+exit(1);
+}
+
+fw_size = load_image_targphys_as(fname,
+ info->firmware_base,
+ info->firmware_max_size, as);
+if (fw_size <= 0) {
+error_report("could not load firmware '%s'", firmware_filename);
+exit(1);
+}
+}
+
+static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info,
+const char *firmware_filename)
 {
 /* Set up for booting firmware (which might load a kernel via fw_cfg) */
 
@@ -1205,6 +1229,10 @@ static void arm_setup_firmware_boot(ARMCPU *cpu, struct arm_boot_info *info)
 }
 }
 
+if (info->confidential) {
+arm_setup_confidential_firmware_boot(cpu, info, firmware_filename);
+}
+
 /*
  * We will start from address 0 (typically a boot ROM image) in the
  * same way as hardware. Leave env->boot_info NULL, so that
@@ -1243,9 +1271,9 @@ void arm_load_kernel(ARMCPU *cpu, MachineState *ms, struct arm_boot_info *info)
 info->dtb_filename = ms->dtb;
 info->dtb_limit = 0;
 
-/* Load the kernel.  */
+/* Load the kernel and/or firmware. */
 if (!info->kernel_filename || info->firmware_loaded) {
-arm_setup_firmware_boot(cpu, info);
+arm_setup_firmware_boot(cpu, info, ms->firmware);
 } else {
 arm_setup_direct_kernel_boot(cpu, info);
 }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index bed19d0b79..4a6281fc89 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1178,6 +1178,10 @@ static PFlashCFI01 *virt_flash_create1(VirtMachineState *vms,
 
 static void virt_flash_create(VirtMachineState *vms)
 {
+if (virt_machine_is_confidential(vms)) {
+return;
+}
+
 vms->flash[0] = virt_flash_create1(vms, "virt.flash0", "pflash0");
 vms->flash[1] = virt_flash_create1(vms, "virt.flash1", "pflash1");
 }
@@ -1213,6 +1217,10 @@ static void virt_flash_map(VirtMachineState *vms,
 hwaddr flashsize = vms->memmap[VIRT_FLASH].size / 2;
 hwaddr flashbase = vms->memmap[VIRT_FLASH].base;
 
+if (virt_machine_is_confidential(vms)) {
+return;
+}
+
 virt_flash_map1(vms->flash[0], flashbase, flashsize,
 secure_sysmem);
 virt_flash_map1(vms->flash[1], flashbase + flashsize, flashsize,
@@ -1228,6 +1236,10 @@ static void virt_flash_fdt(VirtMachineState *vms,
 MachineState *ms = MACHINE(vms);
 char *nodename;
 
+if (virt_machine_is_confidential(vms)) {
+return;
+}
+
 if (sysmem == secure_sysmem) {
 /* Report both flash devices as a single node in the DT */

[PATCH v2 00/22] arm: Run CCA VMs with KVM

2024-04-19 Thread Jean-Philippe Brucker
These patches enable launching a confidential guest with QEMU KVM on
Arm. The KVM changes for CCA have now been posted as v2 [1]. Launching a
confidential VM requires two additional command-line parameters:

-M confidential-guest-support=rme0
-object rme-guest,id=rme0

Since the RFC [2] I tried to address all review comments, and added a
few features:

* Enabled support for guest memfd by Xiaoyao Li and Chao Peng [3].
  Guest memfd is mandatory for CCA.

* Support firmware boot (edk2).

* Use CPU command-line arguments for Realm parameters. SVE vector length
  uses the existing sve -cpu parameters, while breakpoints, watchpoints
  and PMU counters use new CPU parameters.

The full series based on the memfd patches is at:
https://git.codelinaro.org/linaro/dcap/qemu.git branch cca/v2

Please find instructions for building and running the whole CCA stack at:
https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+RME+stack+for+QEMU

[1] https://lore.kernel.org/kvm/20240412084056.1733704-1-steven.pr...@arm.com/
[2] https://lore.kernel.org/all/20230127150727.612594-1-jean-phili...@linaro.org/
[3] https://lore.kernel.org/qemu-devel/20240322181116.1228416-1-pbonz...@redhat.com/

Jean-Philippe Brucker (22):
  kvm: Merge kvm_check_extension() and kvm_vm_check_extension()
  target/arm: Add confidential guest support
  target/arm/kvm: Return immediately on error in kvm_arch_init()
  target/arm/kvm-rme: Initialize realm
  hw/arm/virt: Add support for Arm RME
  hw/arm/virt: Disable DTB randomness for confidential VMs
  hw/arm/virt: Reserve one bit of guest-physical address for RME
  target/arm/kvm: Split kvm_arch_get/put_registers
  target/arm/kvm-rme: Initialize vCPU
  target/arm/kvm: Create scratch VM as Realm if necessary
  hw/core/loader: Add ROM loader notifier
  target/arm/kvm-rme: Populate Realm memory
  hw/arm/boot: Register Linux BSS section for confidential guests
  target/arm/kvm-rme: Add Realm Personalization Value parameter
  target/arm/kvm-rme: Add measurement algorithm property
  target/arm/cpu: Set number of breakpoints and watchpoints in KVM
  target/arm/cpu: Set number of PMU counters in KVM
  target/arm/kvm: Disable Realm reboot
  target/arm/cpu: Inform about reading confidential CPU registers
  target/arm/kvm-rme: Enable guest memfd
  hw/arm/virt: Move virt_flash_create() to machvirt_init()
  hw/arm/virt: Use RAM instead of flash for confidential guest firmware

 docs/system/arm/virt.rst                   |   9 +-
 docs/system/confidential-guest-support.rst |   1 +
 qapi/qom.json                              |  34 +-
 include/hw/arm/boot.h                      |   9 +
 include/hw/arm/virt.h                      |   2 +-
 include/hw/loader.h                        |  15 +
 include/sysemu/kvm.h                       |   2 -
 include/sysemu/kvm_int.h                   |   1 +
 target/arm/cpu.h                           |  10 +
 target/arm/kvm_arm.h                       |  25 ++
 accel/kvm/kvm-all.c                        |  34 +-
 hw/arm/boot.c                              |  45 ++-
 hw/arm/virt.c                              | 118 --
 hw/core/loader.c                           |  15 +
 target/arm/arm-qmp-cmds.c                  |   1 +
 target/arm/cpu.c                           |   5 +
 target/arm/cpu64.c                         | 118 ++
 target/arm/kvm-rme.c                       | 413 +
 target/arm/kvm.c                           | 200 +-
 target/i386/kvm/kvm.c                      |   6 +-
 target/ppc/kvm.c                           |  36 +-
 target/arm/meson.build                     |   7 +-
 22 files changed, 1023 insertions(+), 83 deletions(-)
 create mode 100644 target/arm/kvm-rme.c

-- 
2.44.0




[PATCH v2 03/22] target/arm/kvm: Return immediately on error in kvm_arch_init()

2024-04-19 Thread Jean-Philippe Brucker
Returning an error to kvm_init() is fatal anyway, no need to continue
the initialization.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: new
---
 target/arm/kvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 3371ffa401..a5673241e5 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -566,7 +566,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 !kvm_check_extension(s, KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)) {
 error_report("Using more than 256 vcpus requires a host kernel "
  "with KVM_CAP_ARM_IRQ_LINE_LAYOUT_2");
-ret = -EINVAL;
+return -EINVAL;
 }
 
 if (kvm_check_extension(s, KVM_CAP_ARM_NISV_TO_USER)) {
@@ -595,6 +595,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 if (ret < 0) {
 error_report("Enabling of Eager Page Split failed: %s",
  strerror(-ret));
+return ret;
 }
 }
 }
-- 
2.44.0




[PATCH v2 01/22] kvm: Merge kvm_check_extension() and kvm_vm_check_extension()

2024-04-19 Thread Jean-Philippe Brucker
The KVM_CHECK_EXTENSION ioctl can be issued either on the global fd
(/dev/kvm), or on the VM fd obtained with KVM_CREATE_VM. For most
extensions, KVM returns the same value with either method, but for some
of them it can refine the returned value depending on the VM type. The
KVM documentation [1] advises using the VM fd:

  Based on their initialization different VMs may have different
  capabilities. It is thus encouraged to use the vm ioctl to query for
  capabilities (available with KVM_CAP_CHECK_EXTENSION_VM on the vm fd)

Ongoing work on Arm confidential VMs confirms this, as some capabilities
become unavailable to confidential VMs, requiring changes in QEMU to use
kvm_vm_check_extension() instead of kvm_check_extension() [2]. Rather
than changing each check one by one, change kvm_check_extension() to
always issue the ioctl on the VM fd when available, and remove
kvm_vm_check_extension().

Fall back to the global fd when the VM check is unavailable:

* Ancient kernels do not support KVM_CHECK_EXTENSION on the VM fd, since
  it was added by commit 92b591a4c46b ("KVM: Allow KVM_CHECK_EXTENSION
  on the vm fd") in Linux 3.17 [3]. Support for Linux 3.16 ended in June
  2020, but there may still be old images around.

* A couple of calls must be issued before the VM fd is available, since
  they determine the VM type: KVM_CAP_MIPS_VZ and KVM_CAP_ARM_VM_IPA_SIZE

Does any user actually depend on the check being done on the global fd
instead of the VM fd?  I surveyed all cases where KVM presently returns
different values depending on the query method. Luckily QEMU already
calls kvm_vm_check_extension() for most of those. Only three of them are
ambiguous, because they are currently checked on the global fd:

* KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_VCPU_ID on Arm change value if the
  user requests a vGIC different from the default. But QEMU queries this
  before vGIC configuration, so the reported value will be the same.

* KVM_CAP_SW_TLB on PPC. When issued on the global fd, returns false if
  the kvm-hv module is loaded; when issued on the VM fd, returns false
  only if the VM type is HV instead of PR. If this returns false, then
  QEMU will fail to initialize a BOOKE206 MMU model.

  So this patch supposedly improves things, as it allows running this
  type of vCPU even when both KVM modules are loaded.

* KVM_CAP_PPC_SECURE_GUEST. Similarly, doing this check on a VM fd
  refines the returned value, and ensures that SVM is actually
  supported. Since QEMU follows the check with kvm_vm_enable_cap(), this
  patch should only provide better error reporting.
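
The fd selection described above can be sketched as follows. To keep the example self-contained and runnable without /dev/kvm, the ioctl is modelled by a callback: check_ext_fn, SketchKvmState and sketch_fake_check() are hypothetical names invented for this illustration, and the fake backend returns made-up values. Only the selection and the "negative means unsupported" convention mirror the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for ioctl(fd, KVM_CHECK_EXTENSION, extension). */
typedef int (*check_ext_fn)(int fd, unsigned int extension);

typedef struct {
    int kvm_fd;              /* global fd, normally opened from /dev/kvm */
    int vm_fd;               /* VM fd, or -1 before the VM is created */
    bool check_extension_vm; /* true if the VM fd accepts the query */
    check_ext_fn check;
} SketchKvmState;

/* Prefer the VM fd when usable, otherwise fall back to the global fd. */
static int sketch_check_extension(const SketchKvmState *s, unsigned int ext)
{
    int fd = (s->check_extension_vm && s->vm_fd >= 0) ? s->vm_fd : s->kvm_fd;
    int ret = s->check(fd, ext);

    /* Treat errors as "extension not available". */
    return ret < 0 ? 0 : ret;
}

/* Fake backend for demonstration: fd 4 plays the role of the VM fd. */
static int sketch_fake_check(int fd, unsigned int ext)
{
    if (fd == 4) {
        if (ext == 99) {
            return -1;            /* simulated ioctl failure */
        }
        return ext == 42 ? 0 : 1; /* the VM refines capability 42 away */
    }
    return 1;                     /* the global fd reports everything */
}
```

In this model, capability 42 is reported as available until the VM fd can be queried, which is the kind of refinement the commit message describes for confidential VMs.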

[1] https://www.kernel.org/doc/html/latest/virt/kvm/api.html#kvm-check-extension
[2] https://lore.kernel.org/kvm/875ybi0ytc@redhat.com/
[3] https://github.com/torvalds/linux/commit/92b591a4c46b

Cc: Marcelo Tosatti 
Cc: Nicholas Piggin 
Cc: Daniel Henrique Barboza 
Cc: qemu-...@nongnu.org
Suggested-by: Cornelia Huck 
Signed-off-by: Jean-Philippe Brucker 
---
v1: https://lore.kernel.org/qemu-devel/20230421163822.839167-1-jean-phili...@linaro.org/
v1->v2: Initialize check_extension_vm using kvm_vm_ioctl() as suggested
---
 include/sysemu/kvm.h     |  2 --
 include/sysemu/kvm_int.h |  1 +
 accel/kvm/kvm-all.c      | 34 +++---
 target/arm/kvm.c         |  2 +-
 target/i386/kvm/kvm.c    |  6 +++---
 target/ppc/kvm.c         | 36 ++--
 6 files changed, 38 insertions(+), 43 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index c6f34d4794..df97077434 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -404,8 +404,6 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cpu);
 
 int kvm_check_extension(KVMState *s, unsigned int extension);
 
-int kvm_vm_check_extension(KVMState *s, unsigned int extension);
-
 #define kvm_vm_enable_cap(s, capability, cap_flags, ...) \
 ({   \
 struct kvm_enable_cap cap = {\
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index cad763e240..fa4c9aeb96 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -123,6 +123,7 @@ struct KVMState
 uint16_t xen_gnttab_max_frames;
 uint16_t xen_evtchn_max_pirq;
 char *device;
+bool check_extension_vm;
 };
 
 void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index e08dd04164..3d9fbc8a98 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1128,7 +1128,11 @@ int kvm_check_extension(KVMState *s, unsigned int extension)
 {
 int ret;
 
-ret = kvm_ioctl(s, KVM_CHECK_EXTENSION, extension);
+if (!s->check_extension_vm) {
+ret = kvm_ioctl(s, KVM_CHECK_EXTENSION, extension);
+} else {
+ret = kvm_vm_ioctl(s, KVM_CHECK_EXTENSION, extension);
+}
 if (ret < 0) {
 ret = 0;
 }
@@ -1136,19 +1140,6 @@ int kvm_check_extension(KVMSta

[PATCH v2 14/22] target/arm/kvm-rme: Add Realm Personalization Value parameter

2024-04-19 Thread Jean-Philippe Brucker
The Realm Personalization Value (RPV) is provided by the user to
distinguish Realms that have the same initial measurement.

The user provides up to 64 hexadecimal bytes. They are stored into the
RPV in the same order, zero-padded on the right.
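
The parsing rule above (bytes kept in input order, zero-padded on the right, an odd-length string giving a single-digit first byte) can be sketched without sscanf(). This is an illustration under assumptions: SKETCH_RPV_SIZE mirrors the 64-byte size the patch takes from KVM_CAP_ARM_RME_RPV_SIZE, and sketch_parse_rpv() is a hypothetical helper, not the function added by the patch. Like the patch, it rejects an empty string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RPV_SIZE 64 /* assumption: same size as the RPV in the patch */

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int sketch_hex_nibble(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    return -1;
}

/* Parse a hex string into out[]; returns 0 on success, -1 on bad input. */
static int sketch_parse_rpv(const char *str, uint8_t out[SKETCH_RPV_SIZE])
{
    size_t len = strlen(str);
    size_t i = 0, o = 0;

    if (len == 0 || len > SKETCH_RPV_SIZE * 2) {
        return -1;
    }
    memset(out, 0, SKETCH_RPV_SIZE); /* zero-pad on the right */

    if (len % 2) {
        /* Odd length: the first byte is made of a single digit. */
        int n = sketch_hex_nibble(str[i++]);
        if (n < 0) {
            return -1;
        }
        out[o++] = (uint8_t)n;
    }
    for (; i < len; i += 2) {
        int hi = sketch_hex_nibble(str[i]);
        int lo = sketch_hex_nibble(str[i + 1]);
        if (hi < 0 || lo < 0) {
            return -1;
        }
        out[o++] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}
```

For example, the input "abc" yields the bytes 0x0a, 0xbc followed by 62 zero bytes.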

Cc: Eric Blake 
Cc: Markus Armbruster 
Cc: Daniel P. Berrangé 
Cc: Eduardo Habkost 
Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: Move parsing early, store as-is rather than reversed
---
 qapi/qom.json        |  15 +-
 target/arm/kvm-rme.c | 111 +++
 2 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 623ec8071f..91654aa267 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -931,6 +931,18 @@
   'data': { '*cpu-affinity': ['uint16'],
 '*node-affinity': ['uint16'] } }
 
+##
+# @RmeGuestProperties:
+#
+# Properties for rme-guest objects.
+#
+# @personalization-value: Realm personalization value, as a 64-byte hex string
+# (default: 0)
+#
+# Since: FIXME
+##
+{ 'struct': 'RmeGuestProperties',
+  'data': { '*personalization-value': 'str' } }
 
 ##
 # @ObjectType:
@@ -1066,7 +1078,8 @@
   'tls-creds-x509': 'TlsCredsX509Properties',
   'tls-cipher-suites':  'TlsCredsProperties',
   'x-remote-object':'RemoteObjectProperties',
-  'x-vfio-user-server': 'VfioUserServerProperties'
+  'x-vfio-user-server': 'VfioUserServerProperties',
+  'rme-guest':  'RmeGuestProperties'
   } }
 
 ##
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
index b2ad10ef6d..cb5c3f7a22 100644
--- a/target/arm/kvm-rme.c
+++ b/target/arm/kvm-rme.c
@@ -23,10 +23,13 @@ OBJECT_DECLARE_SIMPLE_TYPE(RmeGuest, RME_GUEST)
 
 #define RME_PAGE_SIZE qemu_real_host_page_size()
 
+#define RME_MAX_CFG 1
+
 struct RmeGuest {
 ConfidentialGuestSupport parent_obj;
 Notifier rom_load_notifier;
 GSList *ram_regions;
+uint8_t *personalization_value;
 };
 
 typedef struct {
@@ -54,6 +57,48 @@ static int rme_create_rd(Error **errp)
 return ret;
 }
 
+static int rme_configure_one(RmeGuest *guest, uint32_t cfg, Error **errp)
+{
+int ret;
+const char *cfg_str;
+struct kvm_cap_arm_rme_config_item args = {
+.cfg = cfg,
+};
+
+switch (cfg) {
+case KVM_CAP_ARM_RME_CFG_RPV:
+if (!guest->personalization_value) {
+return 0;
+}
+memcpy(args.rpv, guest->personalization_value, KVM_CAP_ARM_RME_RPV_SIZE);
+cfg_str = "personalization value";
+break;
+default:
+g_assert_not_reached();
+}
+
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_RME, 0,
+KVM_CAP_ARM_RME_CONFIG_REALM, (intptr_t)&args);
+if (ret) {
+error_setg_errno(errp, -ret, "RME: failed to configure %s", cfg_str);
+}
+return ret;
+}
+
+static int rme_configure(void)
+{
+int ret;
+int cfg;
+
+for (cfg = 0; cfg < RME_MAX_CFG; cfg++) {
+ret = rme_configure_one(rme_guest, cfg, &error_abort);
+if (ret) {
+return ret;
+}
+}
+return 0;
+}
+
 static void rme_populate_realm(gpointer data, gpointer unused)
 {
 int ret;
@@ -98,6 +143,11 @@ static void rme_vm_state_change(void *opaque, bool running, RunState state)
 return;
 }
 
+ret = rme_configure();
+if (ret) {
+return;
+}
+
 ret = rme_create_rd(&error_abort);
 if (ret) {
 return;
@@ -231,8 +281,69 @@ int kvm_arm_rme_vm_type(MachineState *ms)
 return 0;
 }
 
+static char *rme_get_rpv(Object *obj, Error **errp)
+{
+RmeGuest *guest = RME_GUEST(obj);
+GString *s;
+int i;
+
+if (!guest->personalization_value) {
+return NULL;
+}
+
+s = g_string_sized_new(KVM_CAP_ARM_RME_RPV_SIZE * 2 + 1);
+
+for (i = 0; i < KVM_CAP_ARM_RME_RPV_SIZE; i++) {
+g_string_append_printf(s, "%02x", guest->personalization_value[i]);
+}
+
+return g_string_free(s, /* free_segment */ false);
+}
+
+static void rme_set_rpv(Object *obj, const char *value, Error **errp)
+{
+RmeGuest *guest = RME_GUEST(obj);
+size_t len = strlen(value);
+uint8_t *out;
+int i = 1;
+int ret;
+
+g_free(guest->personalization_value);
+guest->personalization_value = out = g_malloc0(KVM_CAP_ARM_RME_RPV_SIZE);
+
+/* Two chars per byte */
+if (len > KVM_CAP_ARM_RME_RPV_SIZE * 2) {
+error_setg(errp, "Realm Personalization Value is too large");
+return;
+}
+
+/* First byte may have a single char */
+if (len % 2) {
+ret = sscanf(value, "%1hhx", out++);
+} else {
+ret = sscanf(value, "%2hhx", out++);
+i++;
+}
+if (ret != 1) {
+error_setg(errp, "Invalid Realm Personalization Value");
+return;
+}
+
+for (; i < len; i += 2) {
+ret = sscanf(value + i, "%2hhx", out++);
+if (ret != 1) {
+error_setg(errp, "Invalid Realm Pers

[PATCH v2 05/22] hw/arm/virt: Add support for Arm RME

2024-04-19 Thread Jean-Philippe Brucker
When confidential-guest-support is enabled for the virt machine, call
the RME init function, and add the RME flag to the VM type.

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2:
* Don't explicitly disable steal_time, it's now done through KVM capabilities
* Split patch
---
 hw/arm/virt.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9a913aead..07ad31876e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -224,6 +224,11 @@ static const int a15irqmap[] = {
 [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
+static bool virt_machine_is_confidential(VirtMachineState *vms)
+{
+return MACHINE(vms)->cgs;
+}
+
 static void create_randomness(MachineState *ms, const char *node)
 {
 struct {
@@ -2111,10 +2116,11 @@ static void machvirt_init(MachineState *machine)
  * if the guest has EL2 then we will use SMC as the conduit,
  * and otherwise we will use HVC (for backwards compatibility and
  * because if we're using KVM then we must use HVC).
+ * Realm guests must also use SMC.
  */
 if (vms->secure && firmware_loaded) {
 vms->psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
-} else if (vms->virt) {
+} else if (vms->virt || virt_machine_is_confidential(vms)) {
 vms->psci_conduit = QEMU_PSCI_CONDUIT_SMC;
 } else {
 vms->psci_conduit = QEMU_PSCI_CONDUIT_HVC;
@@ -2917,6 +2923,7 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
 static int virt_kvm_type(MachineState *ms, const char *type_str)
 {
 VirtMachineState *vms = VIRT_MACHINE(ms);
+int rme_vm_type = kvm_arm_rme_vm_type(ms);
 int max_vm_pa_size, requested_pa_size;
 bool fixed_ipa;
 
@@ -2946,7 +2953,11 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
  * the implicit legacy 40b IPA setting, in which case the kvm_type
  * must be 0.
  */
-return fixed_ipa ? 0 : requested_pa_size;
+if (fixed_ipa) {
+return 0;
+}
+
+return requested_pa_size | rme_vm_type;
 }
 
 static void virt_machine_class_init(ObjectClass *oc, void *data)
-- 
2.44.0




[PATCH v2 07/22] hw/arm/virt: Reserve one bit of guest-physical address for RME

2024-04-19 Thread Jean-Philippe Brucker
When RME is enabled, the upper GPA bit is used to distinguish protected
from unprotected addresses. Reserve it when setting up the guest memory
map.
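
The bit accounting can be sketched as below: one bit is subtracted from the maximum IPA size before laying out the memory map, and added back to the size requested from KVM. The function names are assumptions for this illustration, and __builtin_clzll() is the GCC/Clang builtin used here in place of QEMU's clz64().

```c
#include <stdint.h>

/* Number of address bits needed to reach highest_gpa (0 if it is zero). */
static int sketch_bits_for_gpa(uint64_t highest_gpa)
{
    if (highest_gpa == 0) {
        return 0;
    }
    return 64 - __builtin_clzll((unsigned long long)highest_gpa);
}

/* Bits available to the memory map once the Realm bit is set aside. */
static int sketch_usable_pa_bits(int max_ipa_bits, int is_realm)
{
    return max_ipa_bits - (is_realm ? 1 : 0);
}

/* IPA size to request from KVM, including the reserved upper bit. */
static int sketch_requested_ipa_bits(uint64_t highest_gpa, int is_realm)
{
    return sketch_bits_for_gpa(highest_gpa) + (is_realm ? 1 : 0);
}
```

For instance, with a 48-bit maximum a Realm gets a 47-bit memory map, and a map whose highest GPA needs 39 bits results in a 40-bit request.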

Signed-off-by: Jean-Philippe Brucker 
---
v1->v2: separate patch
---
 hw/arm/virt.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f300f100b5..eca9a96b5a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2939,14 +2939,24 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
 VirtMachineState *vms = VIRT_MACHINE(ms);
 int rme_vm_type = kvm_arm_rme_vm_type(ms);
 int max_vm_pa_size, requested_pa_size;
+int rme_reserve_bit = 0;
 bool fixed_ipa;
 
-max_vm_pa_size = kvm_arm_get_max_vm_ipa_size(ms, &fixed_ipa);
+if (rme_vm_type) {
+/*
+ * With RME, the upper GPA bit differentiates Realm from NS memory.
+ * Reserve the upper bit to ensure that highmem devices will fit.
+ */
+rme_reserve_bit = 1;
+}
+
+max_vm_pa_size = kvm_arm_get_max_vm_ipa_size(ms, &fixed_ipa) -
+ rme_reserve_bit;
 
 /* we freeze the memory map to compute the highest gpa */
 virt_set_memmap(vms, max_vm_pa_size);
 
-requested_pa_size = 64 - clz64(vms->highest_gpa);
+requested_pa_size = 64 - clz64(vms->highest_gpa) + rme_reserve_bit;
 
 /*
  * KVM requires the IPA size to be at least 32 bits.
-- 
2.44.0




[PATCH v2 10/22] target/arm/kvm: Create scratch VM as Realm if necessary

2024-04-19 Thread Jean-Philippe Brucker
Some ID registers have a different value for a Realm VM, for example
ID_AA64DFR0_EL1 contains the number of breakpoints/watchpoints
implemented by RMM instead of the hardware.

Even though RMM is in charge of setting up most Realm registers, KVM
still provides GET_ONE_REG interface on a Realm VM to probe the VM's
capabilities.

KVM only reports the maximum IPA it supports, but RMM may support
smaller sizes. If the VM creation fails with the value returned by KVM,
then retry with the smaller working address. This needs a better
solution.

Signed-off-by: Jean-Philippe Brucker 
---
 target/arm/kvm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 3a2233ec73..6d368bf442 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -104,6 +104,7 @@ bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
 {
 int ret = 0, kvmfd = -1, vmfd = -1, cpufd = -1;
 int max_vm_pa_size;
+int vm_type;
 
 kvmfd = qemu_open_old("/dev/kvm", O_RDWR);
 if (kvmfd < 0) {
@@ -113,8 +114,9 @@ bool kvm_arm_create_scratch_host_vcpu(const uint32_t *cpus_to_try,
 if (max_vm_pa_size < 0) {
 max_vm_pa_size = 0;
 }
+vm_type = kvm_arm_rme_vm_type(MACHINE(qdev_get_machine()));
 do {
-vmfd = ioctl(kvmfd, KVM_CREATE_VM, max_vm_pa_size);
+vmfd = ioctl(kvmfd, KVM_CREATE_VM, max_vm_pa_size | vm_type);
 } while (vmfd == -1 && errno == EINTR);
 if (vmfd < 0) {
 goto err;
-- 
2.44.0




[PATCH v2 08/22] target/arm/kvm: Split kvm_arch_get/put_registers

2024-04-19 Thread Jean-Philippe Brucker
The confidential guest support in KVM limits the number of registers
that we can read and write. Split the get/put_registers function to
prepare for it.

Signed-off-by: Jean-Philippe Brucker 
---
 target/arm/kvm.c | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index b00077c1a5..3504276822 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2056,7 +2056,7 @@ static int kvm_arch_put_sve(CPUState *cs)
 return 0;
 }
 
-int kvm_arch_put_registers(CPUState *cs, int level)
+static int kvm_arm_put_core_regs(CPUState *cs, int level)
 {
 uint64_t val;
 uint32_t fpr;
@@ -2159,6 +2159,19 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 return ret;
 }
 
+return 0;
+}
+
+int kvm_arch_put_registers(CPUState *cs, int level)
+{
+int ret;
+ARMCPU *cpu = ARM_CPU(cs);
+
+ret = kvm_arm_put_core_regs(cs, level);
+if (ret) {
+return ret;
+}
+
 write_cpustate_to_list(cpu, true);
 
 if (!write_list_to_kvmstate(cpu, level)) {
@@ -2240,7 +2253,7 @@ static int kvm_arch_get_sve(CPUState *cs)
 return 0;
 }
 
-int kvm_arch_get_registers(CPUState *cs)
+static int kvm_arm_get_core_regs(CPUState *cs)
 {
 uint64_t val;
 unsigned int el;
@@ -2343,6 +2356,19 @@ int kvm_arch_get_registers(CPUState *cs)
 }
 vfp_set_fpcr(env, fpr);
 
+return 0;
+}
+
+int kvm_arch_get_registers(CPUState *cs)
+{
+int ret;
+ARMCPU *cpu = ARM_CPU(cs);
+
+ret = kvm_arm_get_core_regs(cs);
+if (ret) {
+return ret;
+}
+
 ret = kvm_get_vcpu_events(cpu);
 if (ret) {
 return ret;
-- 
2.44.0




[PATCH v2 02/22] target/arm: Add confidential guest support

2024-04-19 Thread Jean-Philippe Brucker
Add a new RmeGuest object, inheriting from ConfidentialGuestSupport, to
support the Arm Realm Management Extension (RME). It is instantiated by
passing on the command-line:

  -M virt,confidential-guest-support=<id>
  -object rme-guest,id=<id>[,options...]

This is only the skeleton. Support will be added in following patches.

Cc: Eric Blake 
Cc: Markus Armbruster 
Cc: Daniel P. Berrangé 
Cc: Eduardo Habkost 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Jean-Philippe Brucker 
---
 docs/system/confidential-guest-support.rst |  1 +
 qapi/qom.json  |  3 +-
 target/arm/kvm-rme.c   | 46 ++
 target/arm/meson.build |  7 +++-
 4 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 target/arm/kvm-rme.c

diff --git a/docs/system/confidential-guest-support.rst b/docs/system/confidential-guest-support.rst
index 0c490dbda2..acf46d8856 100644
--- a/docs/system/confidential-guest-support.rst
+++ b/docs/system/confidential-guest-support.rst
@@ -40,5 +40,6 @@ Currently supported confidential guest mechanisms are:
 * AMD Secure Encrypted Virtualization (SEV) (see :doc:`i386/amd-memory-encryption`)
 * POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
 * s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)
+* Arm Realm Management Extension (RME)
 
 Other mechanisms may be supported in future.
diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a..623ec8071f 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -996,7 +996,8 @@
 'tls-creds-x509',
 'tls-cipher-suites',
 { 'name': 'x-remote-object', 'features': [ 'unstable' ] },
-{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }
+{ 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] },
+'rme-guest'
   ] }
 
 ##
diff --git a/target/arm/kvm-rme.c b/target/arm/kvm-rme.c
new file mode 100644
index 00..960dd75608
--- /dev/null
+++ b/target/arm/kvm-rme.c
@@ -0,0 +1,46 @@
+/*
+ * QEMU Arm RME support
+ *
+ * Copyright Linaro 2024
+ */
+
+#include "qemu/osdep.h"
+
+#include "exec/confidential-guest-support.h"
+#include "hw/boards.h"
+#include "hw/core/cpu.h"
+#include "kvm_arm.h"
+#include "migration/blocker.h"
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/kvm.h"
+#include "sysemu/runstate.h"
+
+#define TYPE_RME_GUEST "rme-guest"
+OBJECT_DECLARE_SIMPLE_TYPE(RmeGuest, RME_GUEST)
+
+struct RmeGuest {
+ConfidentialGuestSupport parent_obj;
+};
+
+static void rme_guest_class_init(ObjectClass *oc, void *data)
+{
+}
+
+static const TypeInfo rme_guest_info = {
+.parent = TYPE_CONFIDENTIAL_GUEST_SUPPORT,
+.name = TYPE_RME_GUEST,
+.instance_size = sizeof(struct RmeGuest),
+.class_init = rme_guest_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_USER_CREATABLE },
+{ }
+}
+};
+
+static void rme_register_types(void)
+{
+type_register_static(&rme_guest_info);
+}
+
+type_init(rme_register_types);
diff --git a/target/arm/meson.build b/target/arm/meson.build
index 2e10464dbb..c610c078f7 100644
--- a/target/arm/meson.build
+++ b/target/arm/meson.build
@@ -8,7 +8,12 @@ arm_ss.add(files(
 ))
 arm_ss.add(zlib)
 
-arm_ss.add(when: 'CONFIG_KVM', if_true: files('hyp_gdbstub.c', 'kvm.c'), if_false: files('kvm-stub.c'))
+arm_ss.add(when: 'CONFIG_KVM',
+  if_true: files(
+'hyp_gdbstub.c',
+'kvm.c',
+'kvm-rme.c'),
+  if_false: files('kvm-stub.c'))
 arm_ss.add(when: 'CONFIG_HVF', if_true: files('hyp_gdbstub.c'))
 
 arm_ss.add(when: 'TARGET_AARCH64', if_true: files(
-- 
2.44.0




Re: [PATCH 0/3] hw/cxl/cxl-cdat: Make cxl_doe_cdat_init() return boolean

2024-04-19 Thread Philippe Mathieu-Daudé

On 18/4/24 12:04, Zhao Liu wrote:

From: Zhao Liu 




---
Zhao Liu (3):
   hw/cxl/cxl-cdat: Make ct3_load_cdat() return boolean
   hw/cxl/cxl-cdat: Make ct3_build_cdat() return boolean
   hw/cxl/cxl-cdat: Make cxl_doe_cdat_init() return boolean


Since Jonathan Ack'ed the series, I'm queuing it via my hw-misc tree.




Re: [PATCH v5 3/3] Add support for RAPL MSRs in KVM/Qemu

2024-04-19 Thread Zhao Liu
Hi Anthony,

On Thu, Apr 18, 2024 at 12:52:14PM +0200, Anthony Harivel wrote:
> Date: Thu, 18 Apr 2024 12:52:14 +0200
> From: Anthony Harivel 
> Subject: Re: [PATCH v5 3/3] Add support for RAPL MSRs in KVM/Qemu
> 
> > The package energy consumption includes core part and uncore part, where
> > uncore part consumption may not be able to be scaled based on vCPU
> > runtime ratio.
> >
> > When the uncore part consumption is small, the error in this part is
> > small, but if it is large, then the error generated by scaling by vCPU
> > runtime will be large.
> >
> 
> So far we can only work with what Intel is giving us i.e Package power 
> plane and DRAM power plane on server, which is the main target of 
> this feature. Maybe in the future, Intel will expand the core power 
> plane and the uncore power plane to server class CPU ?

I'm not asking about future features; I'd like to illustrate the impact
of the uncore part (iGPU/NPU on client, or various accelerators on server)
on this algorithm, because the consumption of the uncore part is complex
and not necessarily linearly related to the vCPU task running time.

It might be worth stating the potential impact of the uncore part on
accuracy in the doc (I doubt that heavy uncore consumption will even affect
the consistency of the energy trend as you said).

Anyway, clearer scenarios help this feature get used.
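
To make the accuracy concern concrete, here is a minimal sketch of
runtime-ratio scaling. It is not code from the patch: the function name
and parameters are hypothetical, and it assumes the energy delta and the
tick counts were sampled over the same interval. The point it shows is
that the whole package delta, uncore included, is multiplied by the same
vCPU ratio.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: attribute a share of the
 * package energy delta to one vCPU thread, proportional to the ticks it ran.
 */
uint64_t scale_energy_by_ticks(uint64_t pkg_energy_delta,
                               uint64_t vcpu_ticks,
                               uint64_t total_ticks)
{
    if (total_ticks == 0) {
        return 0; /* nothing ran in this interval */
    }
    if (vcpu_ticks > total_ticks) {
        vcpu_ticks = total_ticks; /* clamp against sampling skew */
    }
    /* Multiply first to keep precision; assumes the product fits in 64 bits. */
    return pkg_energy_delta * vcpu_ticks / total_ticks;
}
```

If a large part of pkg_energy_delta comes from an accelerator that the
vCPU never used, the result still credits the vCPU with a proportional
share of it, which is the error discussed above.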

> > May I ask what your usage scenario is? Is there significant uncore
> > consumption (e.g. GPU)?
> >
> 
> Same answer as above: uncore/graphics power plane is only available on 
> client class CPU. 

Yes, iGPU is, but server may have other accelerators, e.g., DSA/IAA/QAT
on SPR.

> > Also, I think of a generic question is whether the error in this
> > calculation is measurable? Like comparing the RAPL status of the same
> > workload on Guest and bare metal to check the error.
> >
> > IIUC, this calculation is highly affected by native/sibling Guests,
> > especially in cloud scenarios where there are multiple Guests, the
> > accuracy of this algorithm needs to be checked.
> >
> 
> Indeed, depending on where your vCPUs are running within the package (on 
> the native or sibling CPU), you might observe different power 
> consumption levels. However, I don't consider this to be a problem, as 
> the ratio calculation takes into account the vCPU's location.
> 
> We also need to approach the measurement differently. Due to the 
> complexity of factors influencing power consumption, we must compare 
> what is comparable. If you require precise power consumption data, 
> use a power meter on the PSU of the server. It will provide the
> ultimate judgment. However, if you need an estimation to optimize 
> software workloads in a guest, then this feature could be useful. All my 
> tests have consistently shown reproducible output in terms of power 
> consumption, which has convinced me that we can effectively work with 
> it.

Thanks, another mail in which you illustrated that the trend is
consistent.

[snip]

> >
> > In addition, RAPL is basically a CPU feature, I think it would be more
> > appropriate to make it as a x86 CPU's property.
> >
> > Your RAPL support actually provides a framework for assisting KVM
> > emulation in userspace, so this informs other feature support (maybe model
> > specific, or architectural) in the future. Enabling/disabling CPU features
> > via -cpu looks more natural.
> 
> This is totally dependent on KVM because it uses the KVM MSR Filtering
> to access userspace when a specific MSR is required.

Yes, but in other words, other KVM-based features (fully hardware
virtualized) are also configured by -cpu. RAPL is still a CPU
feature and just needs KVM's help.
 
> I can try to find a way to use -cpu for this feature and check if KVM is 
> activated or not. 
>

[snip]

> >
> > I understand tick would ignore frequency changes, e.g., HWP's auto-pilot
> > or turbo boost. All these CPU frequency change would impact on core energy
> > consumption.
> >
> > I think the better way would be to use APERF counter, but unfortunately it
> > lacks virtualization support (for Intel).
> >
> > Due to such considerations, it may be more worthwhile to evaluate the
> > accuracy of this tick-based algorithm.
> >
> 
> I've evaluated such things with another tool called Kepler [1]. This 
> tool calculate the power ratio with metrics from RAPL and uses either 
> eBPF or the tick based systems for time metrics.

Thanks for this information! I understand that the current tick-based
algorithm is a common approximation in the industry (like Kepler), right?

> The eBPF part [2] is 
> triggered on each 'finish_task_switch' of Thread and calculate the delta 
> of cpu cycle, cache miss, cpu time, etc. Very complex. My tests showed 
> that the difference between using eBPF and tick based ratio is really 
> not that important. Maybe on some special cases, using eBPF would show 
> a way better accuracy but I'm not aware of that.

Good to know!

Just curious, so for using Kepler in Guest to optim

Re: [PATCH RFC 00/26] Multifd 🔀 device state transfer support with VFIO consumer

2024-04-19 Thread Peter Xu
On Fri, Apr 19, 2024 at 11:07:21AM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 18, 2024 at 04:02:49PM -0400, Peter Xu wrote:
> > On Thu, Apr 18, 2024 at 08:14:15PM +0200, Maciej S. Szmigiero wrote:
> > > I think one of the reasons for these results is that mixed (RAM + device
> > > state) multifd channels participate in the RAM sync process
> > > (MULTIFD_FLAG_SYNC) whereas device state dedicated channels don't.
> > 
> > Firstly, I'm wondering whether we can have better names for these new
> > hooks.  Currently (only comment on the async* stuff):
> > 
> >   - complete_precopy_async
> >   - complete_precopy
> >   - complete_precopy_async_wait
> > 
> > But perhaps better:
> > 
> >   - complete_precopy_begin
> >   - complete_precopy
> >   - complete_precopy_end
> > 
> > ?
> > 
> > As I don't see why the device must do something with async in such hook.
> > To me it's more like you're splitting one process into multiple, then
> > begin/end sounds more generic.
> > 
> > Then, if with that in mind, IIUC we can already split ram_save_complete()
> > into >1 phases too. For example, I would be curious whether the performance
> > will go back to normal if we offloading multifd_send_sync_main() into the
> > complete_precopy_end(), because we really only need one shot of that, and I
> > am quite surprised it already greatly affects VFIO dumping its own things.
> > 
> > I would even ask one step further as what Dan was asking: have you thought
> > about dumping VFIO states via multifd even during iterations?  Would that
> > help even more than this series (which IIUC only helps during the blackout
> > phase)?
> 
> To dump during RAM iteration, the VFIO device will need to have
> dirty tracking and iterate on its state, because the guest CPUs
> will still be running potentially changing VFIO state. That seems
> impractical in the general case.

We already do such iterations in vfio_save_iterate()?

My understanding is that the recent VFIO work is based on the fact that the
VFIO device can track device state changes more or less (besides being able
to save/load full states).  E.g. I still remember that in our QE tests some
old devices reported many more dirty pages than expected during the
iterations, back when we were looking into an issue where a huge number of
dirty pages was reported.  But newer models seem to have fixed that and
report far fewer.

That issue was about GPU not NICs, though, and IIUC a major portion of such
tracking used to be for GPU vRAMs.  So maybe I was mixing up these, and
maybe they work differently.

Thanks,

-- 
Peter Xu




Re: [PATCH 00/27] Add qapi-domain Sphinx extension

2024-04-19 Thread Markus Armbruster
Markus Armbruster  writes:

[...]

>> The purpose of sending this series in its current form is largely to
>> solicit feedback on general aesthetics, layout, and features. Sphinx is
>> a wily beast, and feedback at this stage will dictate how and where
>> certain features are implemented.
>
> I'd appreciate help with that.  Opinions?

Less than clear, let me try again: I'm soliciting opinions on the new
look.  Check it out...

[...]

>> This RFC series includes a "sandbox" .rst document that showcases the
>> features of this domain by writing QAPI directives by hand; this
>> document will be removed from the series before final inclusion. It's
>> here to serve as a convenient test-bed for y'all to give feedback.

... here:

>> All that said, here's the sandbox document fully rendered:
>> https://jsnow.gitlab.io/qemu/qapi/index.html
>>
>> And here's the new QAPI index page created by that sandbox document:
>> https://jsnow.gitlab.io/qemu/qapi-index.html

[...]




Re: [PATCH] hw/core/clock: always iterate through childs in clock_propagate_period

2024-04-19 Thread Peter Maydell
On Thu, 18 Apr 2024 at 21:39, Raphael Poggi
 wrote:
>
> Hi Philippe,
>
> Le jeu. 18 avr. 2024 à 20:43, Philippe Mathieu-Daudé
>  a écrit :
> >
> > Hi Raphael,
> >
> > On 18/4/24 21:16, Raphael Poggi wrote:
> > > When dealing with few clocks depending with each others, sometimes
> > > we might only want to update the multiplier/diviser on a specific clock
> > > (cf clockB in drawing below) and call "clock_propagate(clockA)" to
> > > update the childs period according to the potential new 
> > > multiplier/diviser values.
> > >
> > > ++ ++  ++
> > > | clockA | --> | clockB |  --> | clockC |
> > > ++ ++  ++
> > >
> > > The actual code would not allow that because, since we cannot call
> > > "clock_propagate" directly on a child, it would exit on the
> > > first child has the period has not changed for clockB, only clockC is
> >
> > Typo "as the period has not changed"?
>
> That's a typo indeed, thanks!
>
> >
> > Why can't you call clock_propagate() on a child?
>
> There is an assert "assert(clk->source == NULL);" in clock_propagate().
> If I am not wrong, clk->source is set when the clock has a parent.

I think that assertion is probably there because we didn't
originally have the idea of a clock having a multiplier/divider
setting. So the idea was that calling clock_propagate() on a
clock with a parent would always be wrong, because the only
reason for its period to change would be if the parent had
changed, and if the parent changes then clock_propagate()
should be called on the parent.

We added mul/div later, and we (I) didn't think through all
the consequences. If you change the mul/div settings on
clockB in this example then you need to call clock_propagate()
on it, so we should remove that assert(). Then when you change
the mul/div on clockB you can directly clock_propagate(clockB),
and I don't think you need this patch at that point.

thanks
-- PMM
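
The suggestion above can be sketched with a toy model. This is not
QEMU's Clock API: ToyClock, toy_child_period() and toy_propagate() are
illustrative names, and the model assumes a simple chain with one child
per clock. It only shows why propagation has to be allowed to start at a
clock that has a parent once that clock carries its own mul/div setting.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these names are illustrative, not QEMU's Clock API. */
typedef struct ToyClock {
    uint64_t period;          /* this clock's own period */
    uint32_t multiplier;      /* applied to the period seen by children */
    uint32_t divider;         /* must be non-zero */
    struct ToyClock *source;  /* parent, NULL for a root clock */
    struct ToyClock *child;   /* single child, enough for a chain */
} ToyClock;

/* Period that children of clk should run at. */
static uint64_t toy_child_period(const ToyClock *clk)
{
    return clk->period * clk->multiplier / clk->divider;
}

/*
 * Push clk's (possibly rescaled) period down the chain. Note that there is
 * no "clk->source == NULL" requirement, so it may start at a child clock.
 */
void toy_propagate(ToyClock *clk)
{
    for (ToyClock *c = clk->child; c != NULL; c = c->child) {
        c->period = toy_child_period(c->source);
    }
}
```

With a chain clockA -> clockB -> clockC, changing clockB's multiplier and
calling toy_propagate() on clockB updates clockC while leaving clockB's
own period untouched, which is the case the assert currently forbids.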



Re: hw/usb/hcd-ohci: Fix #1510, #303: pid not IN or OUT

2024-04-19 Thread Cord Amfmgm
Hi Michael,

This just got lost somehow. It is still an issue (see
https://gitlab.com/qemu-project/qemu/-/issues/1510 ). I believe this change
fixes the issue.

On Thu, Apr 18, 2024 at 10:43 AM Michael Tokarev  wrote:

> 06.02.2024 10:13, Cord Amfmgm wrote:
> > This changes the ohci validation to not assert if invalid
> > data is fed to the ohci controller. The poc suggested in
> > https://bugs.launchpad.net/qemu/+bug/1907042
> > and then migrated to bug #303 does the following to
> > feed it a SETUP pid and EndPt of 1:
> >
> >  uint32_t MaxPacket = 64;
> >  uint32_t TDFormat = 0;
> >  uint32_t Skip = 0;
> >  uint32_t Speed = 0;
> >  uint32_t Direction = 0;  /* #define OHCI_TD_DIR_SETUP 0 */
> >  uint32_t EndPt = 1;
> >  uint32_t FuncAddress = 0;
> >  ed->attr = (MaxPacket << 16) | (TDFormat << 15) | (Skip << 14)
> > | (Speed << 13) | (Direction << 11) | (EndPt << 7)
> > | FuncAddress;
> >  ed->tailp = /*TDQTailPntr= */ 0;
> >  ed->headp = ((/*TDQHeadPntr= */ &td[0]) & 0xfff0)
> > | (/* ToggleCarry= */ 0 << 1);
> >  ed->next_ed = (/* NextED= */ 0 & 0xfff0)
> >
> > qemu-fuzz also caught the same issue in #1510. They are
> > both fixed by this patch.
> >
> > The if (td.cbp > td.be) logic in ohci_service_td() causes an
> > ohci_die(). My understanding of the OHCI spec 4.3.1.2
> > Table 4-2 allows td.cbp to be one byte more than td.be to
> > signal the buffer has zero length. The new check in qemu
> > appears to have been added since qemu-4.2. This patch
> > includes both fixes since they are located very close
> > together.
> >
> > Signed-off-by: David Hubbard 
>
> Wonder if this got lost somehow.  Or is it not needed?
>
> Thanks,
>
> /mjt
>
> > diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> > index d73b53f33c..a53808126f 100644
> > --- a/hw/usb/hcd-ohci.c
> > +++ b/hw/usb/hcd-ohci.c
> > @@ -927,6 +927,11 @@ static int ohci_service_td(OHCIState *ohci,
> > struct ohci_ed *ed)
> >   case OHCI_TD_DIR_SETUP:
> >   str = "setup";
> >   pid = USB_TOKEN_SETUP;
> > +if (OHCI_BM(ed->flags, ED_EN) > 0) {  /* setup only allowed to
> ep 0 */
> > +trace_usb_ohci_td_bad_pid(str, ed->flags, td.flags);
> > +ohci_die(ohci);
> > +return 1;
> > +}
> >   break;
> >   default:
> >   trace_usb_ohci_td_bad_direction(dir);
> > @@ -936,8 +941,8 @@ static int ohci_service_td(OHCIState *ohci, struct
> > ohci_ed *ed)
> >   if ((td.cbp & 0xf000) != (td.be & 0xf000)) {
> >   len = (td.be & 0xfff) + 0x1001 - (td.cbp & 0xfff);
> >   } else {
> > -if (td.cbp > td.be) {
> > -trace_usb_ohci_iso_td_bad_cc_overrun(td.cbp, td.be);
> > +if (td.cbp > td.be + 1) {
> > +trace_usb_ohci_td_bad_buf(td.cbp, td.be);
> >   ohci_die(ohci);
> >   return 1;
> >   }
> > diff --git a/hw/usb/trace-events b/hw/usb/trace-events
> > index ed7dc210d3..b47d082fa3 100644
> > --- a/hw/usb/trace-events
> > +++ b/hw/usb/trace-events
> > @@ -28,6 +28,8 @@ usb_ohci_iso_td_data_overrun(int ret, ssize_t len)
> > "DataOverrun %d > %zu"
> >   usb_ohci_iso_td_data_underrun(int ret) "DataUnderrun %d"
> >   usb_ohci_iso_td_nak(int ret) "got NAK/STALL %d"
> >   usb_ohci_iso_td_bad_response(int ret) "Bad device response %d"
> > +usb_ohci_td_bad_buf(uint32_t cbp, uint32_t be) "Bad cbp = 0x%x > be =
> 0x%x"
> > +usb_ohci_td_bad_pid(const char *s, uint32_t edf, uint32_t tdf) "Bad
> > pid %s: ed.flags 0x%x td.flags 0x%x"
> >   usb_ohci_port_attach(int index) "port #%d"
> >   usb_ohci_port_detach(int index) "port #%d"
> >   usb_ohci_port_wakeup(int index) "port #%d"
> >
>
>
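
For reference, the relaxed buffer check described in the quoted commit
message can be sketched on its own. This is a minimal illustration under
the reading of the OHCI spec given there (cbp one byte past be means a
zero-length buffer); td_same_page_len() is a hypothetical name, it covers
only the branch where both pointers are in the same page, and unlike the
patch it widens to 64 bits so that be + 1 cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: mirrors the same-page branch discussed above, not the
 * full ohci_service_td() logic. cbp is the current buffer pointer and be
 * the address of the last byte of the buffer.
 * Returns false when cbp lies more than one byte past be.
 */
bool td_same_page_len(uint32_t cbp, uint32_t be, uint32_t *len)
{
    /* Widen to 64 bits so be + 1 cannot wrap when be is 0xffffffff. */
    if ((uint64_t)cbp > (uint64_t)be + 1) {
        return false;
    }
    /* cbp == be + 1 yields 0: a zero-length buffer, which is accepted. */
    *len = (uint32_t)((uint64_t)be + 1 - cbp);
    return true;
}
```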


Re: Add 'info pg' command to monitor

2024-04-19 Thread Peter Maydell
On Tue, 16 Apr 2024 at 19:11, Don Porter  wrote:
>
> On 4/16/24 13:03, Peter Maydell wrote:
> > On Tue, 16 Apr 2024 at 17:53, Don Porter  wrote:
> >> There is still a lot I am learning about the code base, but it seems
> >> that qemu_get_guest_memory_mapping() does most of what one would need.
> >> It currently only returns the "leaves" of the page table tree in a list.
> >>
> >> What if I extend this function with an optional argument to either
> >> 1) return the interior nodes of the page table in additional lists (and
> >> then parse+print in the monitor code), or
> >> 2) inline the monitor printing in the arch-specific hook, and pass a
> >> flag to get_guest_memory_mapping() that turns on/off the statements that
> >> pretty print the page tables?
> >>
> >> It looks like most CPUs implement this function as part of checkpointing.
> > As far as I can see only x86 implements the get_memory_mapping
> > function, so once again somebody has added some bit of
> > functionality that does a walk of the page tables that is
> > x86 only and that shares no code with any of the other
> > page table walking code :-(
>
> My mistake - get_memory_mappings() is only implemented in x86.
>
> In doing some searching of the code, many architectures implement
> mmu_translate() and
> get_physical_address() functions, but they are not standardized. I also
> see your larger point
> about replicating page walking code in x86.
>
> I imagine you have something in mind that abstracts things like the
> height of the radix tree,
> entries per node, checking permissions, printing the contents, etc.
>
> Perhaps I should start by trying to merge the x86 page walking code into
> one set of common
> helper functions, get more feedback (perhaps on a new patch thread?),
> and then consider
> how to abstract across architectures after getting feedback on this?

I think the cross-architecture abstraction is probably the
trickiest part. I would actually be happy for us to drop
'info tlb' and 'info mem' entirely if we have a cross-arch
command that gives basically the same information -- we don't
IMHO need more than one command for this, and we only have
multiple commands for basically legacy reasons. And for the
human monitor (HMP) we don't need to keep things around
for backwards compatibility.

thanks
-- PMM



Re: [PATCH 00/27] Add qapi-domain Sphinx extension

2024-04-19 Thread Markus Armbruster
John Snow  writes:

> This series adds a new qapi-domain extension for Sphinx, which adds a
> series of custom directives for documenting QAPI definitions.
>
> GitLab CI: https://gitlab.com/jsnow/qemu/-/pipelines/1259566476
>
> (Link to a demo HTML page at the end of this cover letter, but I want
> you to read the cover letter first to explain what you're seeing.)
>
> This adds a new QAPI Index page, cross-references for QMP commands,
> events, and data types, and improves the aesthetics of the QAPI/QMP
> documentation.

Cross-references alone will be a *massive* improvement!  I'm sure
readers will appreciate better looks and an index, too.

> This series adds only the new ReST syntax, *not* the autogenerator. The
> ReST syntax used in this series is, in general, not intended for anyone
> to actually write by hand. This mimics how Sphinx's own autodoc
> extension generates Python domain directives, which are then re-parsed
> to produce the final result.
>
> I have prototyped such a generator, but it isn't ready for inclusion
> yet. (Rest assured: error context reporting is preserved down to the
> line, even in generated ReST. There is no loss in usability for this
> approach. It will likely either supplant qapidoc.py or heavily alter
> it.) The generator requires only extremely minor changes to
> scripts/qapi/parser.py to preserve nested indentation and provide more
> accurate line information. It is less invasive than you may
> fear. Relying on a secondary ReST parse phase eliminates much of the
> complexity of qapidoc.py. Sleep soundly.

I'm a Sphinx noob.  Let me paraphrase you to make sure I understand.

You propose to generate formatted documentation in two steps:

• First, the QAPI generator generates .rst from the QAPI schema.  The
  generated .rst makes use of custom directives.

• Second, Sphinx turns the .rst into formatted documentation.  A Sphinx
  qapi-domain extension implements the custom directives.

This mirrors how Sphinx works for Python docs.  Which is its original
use case.

Your series demonstrates the second step, with test input you wrote
manually.

You have code for the first step, but you'd prefer to show it later.

Fair?

> The purpose of sending this series in its current form is largely to
> solicit feedback on general aesthetics, layout, and features. Sphinx is
> a wily beast, and feedback at this stage will dictate how and where
> certain features are implemented.

I'd appreciate help with that.  Opinions?

> A goal for this syntax (and the generator) is to fully in-line all
> command/event/object members, inherited or local, boxed or not, branched
> or not. This should provide a boon to using these docs as a reference,
> because users will not have to grep around the page looking for various
> types, branches, or inherited members. Any arguments types will be
> hyperlinked to their definition, further aiding usability. Commands can
> be hotlinked from anywhere else in the manual, and will provide a
> complete reference directly on the first screenful of information.

Let me elaborate a bit here.

A command's arguments can be specified inline, like so:

{ 'command': 'job-cancel', 'data': { 'id': 'str' } }

The arguments are then documented right with the command.

But they can also be specified by referencing an object type, like so:

{ 'command': 'block-dirty-bitmap-remove',
  'data': 'BlockDirtyBitmap' }

Reasons for doing it this way:

• Several commands take the same arguments, and you don't want to repeat
  yourself.

• You want the generated C to take a single struct argument ('boxed': true).

• The arguments are a union (which requires 'boxed': true).

Drawback: the arguments are then documented elsewhere.  Not nice.

Bug: the generated documentation fails to point there.

You're proposing to inline the argument documentation, so it appears
right with the command.

An event's data is just like a command's argument.

A command's return value can only be specified by referencing a type.  Same
doc usability issue.

Similarly, a union type's base can be specified inline or by referencing a
struct type, and a union's branches must be specified by referencing a
struct type.  Same doc usability issue.

At least, the generated documentation does point to the referenced
types.

> (Okay, maybe two screenfuls for commands with a ton of
> arguments... There's only so much I can do!)

*cough* blockdev-add *cough*

> This RFC series includes a "sandbox" .rst document that showcases the
> features of this domain by writing QAPI directives by hand; this
> document will be removed from the series before final inclusion. It's
> here to serve as a convenient test-bed for y'all to give feedback.
>
> All that said, here's the sandbox document fully rendered:
> https://jsnow.gitlab.io/qemu/qapi/index.html
>
> And here's the new QAPI index page created by that sandbox document:
> https://jsnow.gitlab.io/qemu/qapi-index.html
>
> Known issues / points of interest:
>
> - The formatting upsets 

Re: [PATCH v3 0/5] hw/char: Implement the STM32L4x5 USART, UART and LPUART

2024-04-19 Thread Peter Maydell
On Fri, 5 Apr 2024 at 15:27, Peter Maydell  wrote:
>
> On Fri, 29 Mar 2024 at 17:44, Arnaud Minier
>  wrote:
> >
> > This patch adds the STM32L4x5 USART
> > (Universal Synchronous/Asynchronous Receiver/Transmitter)
> > device and is part of a series implementing the
> > STM32L4x5 with a few peripherals.
> >
> > It implements the necessary functionalities to receive/send
> > characters over the serial port, which are useful to
> > communicate with the program currently running.
>
> All the patches here are reviewed, so once we reopen
> for the 9.1 release (in another two or three weeks) it
> should be ready to go in. I'll keep it on my list to apply
> once I start collecting 9.1 patchsets.

I've now applied this to target-arm.next for 9.1; thanks.

-- PMM



Re: [PATCH] hw/dma: prevent overflow in soc_dma_set_request

2024-04-19 Thread Peter Maydell
On Tue, 9 Apr 2024 at 14:38, Peter Maydell  wrote:
>
> On Tue, 9 Apr 2024 at 14:32, Anastasia Belova  wrote:
> >
> >
> >
> > 09/04/24 15:02, Peter Maydell пишет:
> > > On Tue, 9 Apr 2024 at 12:54, Anastasia Belova  
> > > wrote:
> > >> ch->num can reach values up to 31. Add casting to
> > >> a larger type before performing left shift to
> > >> prevent integer overflow.
> > > If ch->num can only reach up to 31, then 1 << ch->num
> > > is fine, because QEMU can assume that integers are 32 bits,
> > > and we compile with -fwrapv so there isn't a problem with
> > > shifting into the sign bit.
> >
> > Right, thanks for your comments.
> > I didn't know about this flag before. It became more clear for me now.
>
> Yep; if you're using a static analyser you probably want to
> configure it to accept the behaviours that are
> undefined-in-standard-C and which get defined behaviour
> with -fwrapv.
>
> This code is definitely a bit dubious, though, because
> ch_enable_mask is a uint64_t, so the intention was clearly
> to allow up to 64 channels. So I think we should take this
> patch anyway, with a slightly adjusted commit message.
>
> All the soc_dma.c code will probably be removed in the
> 9.2 release, because it's only used by the OMAP board models
> which we've just deprecated, so it doesn't seem worth spending
> too much time on cleaning up the code, but in this case you've
> already written the patch.
>
> I'll put this patch on my list to apply after we've made the
> 9.0 release and restarted development for 9.1.

Now applied to target-arm.next for 9.1 (with adjustments
to the commit message); thanks.

-- PMM
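
As a small illustration of the cast discussed in this thread, the sketch
below builds one bit of a 64-bit channel mask. channel_mask_bit() is a
hypothetical helper, not the function in soc_dma.c; it only shows the
widening that lets the mask cover all 64 channels its uint64_t type
suggests.

```c
#include <stdint.h>

/* Hypothetical helper: build a 64-bit mask bit for DMA channel num (0..63). */
uint64_t channel_mask_bit(unsigned num)
{
    /*
     * Widen before shifting: a plain "1 << num" is an int shift, which
     * cannot represent bits 32..63 of a uint64_t mask.
     */
    return (uint64_t)1 << num;
}
```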



Re: [PATCH v11 2/2] memory tier: create CPUless memory tiers after obtaining HMAT info

2024-04-19 Thread Jonathan Cameron via
On Fri,  5 Apr 2024 00:07:06 +
"Ho-Ren (Jack) Chuang"  wrote:

> The current implementation treats emulated memory devices, such as
> CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
> (E820_TYPE_RAM). However, these emulated devices have different
> characteristics than traditional DRAM, making it important to
> distinguish them. Thus, we modify the tiered memory initialization process
> to introduce a delay specifically for CPUless NUMA nodes. This delay
> ensures that the memory tier initialization for these nodes is deferred
> until HMAT information is obtained during the boot process. Finally,
> demotion tables are recalculated at the end.
> 
> * late_initcall(memory_tier_late_init);
> Some device drivers may have initialized memory tiers between
> `memory_tier_init()` and `memory_tier_late_init()`, potentially bringing
> online memory nodes and configuring memory tiers. They should be excluded
> in the late init.
> 
> * Handle cases where there is no HMAT when creating memory tiers
> There is a scenario where a CPUless node does not provide HMAT information.
> If no HMAT is specified, it falls back to using the default DRAM tier.
> 
> * Introduce another new lock `default_dram_perf_lock` for adist calculation
> In the current implementation, iterating through CPUlist nodes requires
> holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
> trying to acquire the same lock, leading to a potential deadlock.
> Therefore, we propose introducing a standalone `default_dram_perf_lock` to
> protect `default_dram_perf_*`. This approach not only avoids deadlock
> but also prevents holding a large lock simultaneously.
> 
> * Upgrade `set_node_memory_tier` to support additional cases, including
>   default DRAM, late CPUless, and hot-plugged initializations.
> To cover hot-plugged memory nodes, `mt_calc_adistance()` and
> `mt_find_alloc_memory_type()` are moved into `set_node_memory_tier()` to
> handle cases where memtype is not initialized and where HMAT information is
> available.
> 
> * Introduce `default_memory_types` for those memory types that are not
>   initialized by device drivers.
> Because late initialized memory and default DRAM memory need to be managed,
> a default memory type is created for storing all memory types that are
> not initialized by device drivers and as a fallback.
> 
> Signed-off-by: Ho-Ren (Jack) Chuang 
> Signed-off-by: Hao Xiang 
> Reviewed-by: "Huang, Ying" 
Reviewed-by: Jonathan Cameron 



Re: [PATCH v3 07/16] aspeed/smc: fix dma moving incorrect data length issue

2024-04-19 Thread Cédric Le Goater

On 4/16/24 11:18, Jamin Lin wrote:

DMA length is from 1 byte to 32MB for AST2600 and AST10x0
and DMA length is from 4 bytes to 32MB for AST2500.

In other words, if "R_DMA_LEN" is 0, it should move at least 1 byte
data for AST2600 and AST10x0 and 4 bytes data for AST2500.

To support all ASPEED SoCs, add a dma_start_length parameter to store
the start length, add a helper routine to compute the DMA length,
and update the DMA_LENGTH mask to "1FF" to fix the incorrect DMA
data length.


OK. There are two problems to address: the "zero" length transfer and
the DMA length unit, which is missing today. Newer SoCs use 1 bit / byte
and older ones, AST2400 and AST2500, use 1 bit / 4 bytes.

We can introduce an AspeedSMCClass::dma_len_unit and rework the loop to:

do {

  

    if (s->regs[R_DMA_LEN]) {
        s->regs[R_DMA_LEN] -= 4 / asc->dma_len_unit;
    }
} while (s->regs[R_DMA_LEN]);

It should fix the current implementation.

I don't think it is necessary to add a Fixes tag because the problem
has been there for ages and no one reported it. Probably because the
only place DMA transfers are used is in U-Boot and transfers have a
non-zero length.


Currently, only supports dma length 4 bytes aligned.


This looks like a third topic. So the minimum value R_DMA_LEN should
have on the AST2600 SoC and above is '3'. I would opt to replace the
DMA_LENGTH macro with a dma_length_sanitize() helper to fix the software
input of R_DMA_LEN.


Thanks,

C.


 

Signed-off-by: Troy Lee 
Signed-off-by: Jamin Lin 
---
  hw/ssi/aspeed_smc.c | 52 -
  include/hw/ssi/aspeed_smc.h |  1 +
  2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 8a8d77b480..71abc7a2d8 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -178,13 +178,17 @@
   * DMA flash addresses should be 4 bytes aligned and the valid address
   * range is 0x20000000 - 0x2FFFFFFF.
   *
- * DMA length is from 4 bytes to 32MB
+ * DMA length is from 4 bytes to 32MB (AST2500)
   *   0: 4 bytes
   *   0x7FFFFF: 32M bytes
+ *
+ * DMA length is from 1 byte to 32MB (AST2600, AST10x0)
+ *   0: 1 byte
+ *   0x1FFFFFF: 32M bytes
   */
  #define DMA_DRAM_ADDR(asc, val)   ((val) & (asc)->dma_dram_mask)
  #define DMA_FLASH_ADDR(asc, val)  ((val) & (asc)->dma_flash_mask)
-#define DMA_LENGTH(val) ((val) & 0x01FFFFFC)
+#define DMA_LENGTH(val) ((val) & 0x01FFFFFF)
  
  /* Flash opcodes. */

  #define SPI_OP_READ   0x03/* Read data bytes (low frequency) */
@@ -843,6 +847,24 @@ static bool aspeed_smc_inject_read_failure(AspeedSMCState *s)
  }
  }
  
+static uint32_t aspeed_smc_dma_len(AspeedSMCState *s)
+{
+AspeedSMCClass *asc = ASPEED_SMC_GET_CLASS(s);
+uint32_t dma_len;
+uint32_t extra;
+
+dma_len = s->regs[R_DMA_LEN] + asc->dma_start_length;
+
+/* dma length 4 bytes aligned */
+extra = dma_len % 4;
+
+if (extra != 0) {
+dma_len += 4 - extra;
+}
+
+return dma_len;
+}
+
  /*
   * Accumulate the result of the reads to provide a checksum that will
   * be used to validate the read timing settings.
@@ -850,6 +872,7 @@ static bool aspeed_smc_inject_read_failure(AspeedSMCState *s)
  static void aspeed_smc_dma_checksum(AspeedSMCState *s)
  {
  MemTxResult result;
+uint32_t dma_len;
  uint32_t data;
  
  if (s->regs[R_DMA_CTRL] & DMA_CTRL_WRITE) {

@@ -861,7 +884,9 @@ static void aspeed_smc_dma_checksum(AspeedSMCState *s)
  aspeed_smc_dma_calibration(s);
  }
  
-while (s->regs[R_DMA_LEN]) {
+dma_len = aspeed_smc_dma_len(s);
+
+while (dma_len) {
  data = address_space_ldl_le(&s->flash_as, s->regs[R_DMA_FLASH_ADDR],
  MEMTXATTRS_UNSPECIFIED, &result);
  if (result != MEMTX_OK) {
@@ -877,7 +902,8 @@ static void aspeed_smc_dma_checksum(AspeedSMCState *s)
   */
  s->regs[R_DMA_CHECKSUM] += data;
  s->regs[R_DMA_FLASH_ADDR] += 4;
-s->regs[R_DMA_LEN] -= 4;
+dma_len -= 4;
+s->regs[R_DMA_LEN] = dma_len;
  }
  
  if (s->inject_failure && aspeed_smc_inject_read_failure(s)) {

@@ -889,14 +915,17 @@ static void aspeed_smc_dma_checksum(AspeedSMCState *s)
  static void aspeed_smc_dma_rw(AspeedSMCState *s)
  {
  MemTxResult result;
+uint32_t dma_len;
  uint32_t data;
  
+dma_len = aspeed_smc_dma_len(s);
+
  trace_aspeed_smc_dma_rw(s->regs[R_DMA_CTRL] & DMA_CTRL_WRITE ?
  "write" : "read",
  s->regs[R_DMA_FLASH_ADDR],
  s->regs[R_DMA_DRAM_ADDR],
-s->regs[R_DMA_LEN]);
-while (s->regs[R_DMA_LEN]) {
+dma_len);
+while (dma_len) {
  if (s->regs[R_DMA_CTRL] & DMA_CTRL_WRITE) {
  data = address_space_ldl_le(&s->dram_as, s->regs[R_DMA_DRAM_ADDR],

Re: [PATCH v13 00/24] target/arm: Implement FEAT_NMI and FEAT_GICv3_NMI

2024-04-19 Thread Peter Maydell
On Sun, 7 Apr 2024 at 09:19, Jinjie Ruan  wrote:
>
> This patch set implements FEAT_NMI and FEAT_GICv3_NMI for ARMv8. These
> introduce support for a new category of interrupts in the architecture
> which we can use to provide NMI like functionality.

I had one last loose end I wanted to tidy up, and I got round
to working through reading the spec about it today. This is
the question of what the "is NMI enabled?" test should be
in the code in arm_gicv3_cpuif.c.

The spec wording isn't always super clear, but there are several
things here:

 * FEAT_NMI : the changes to the CPU proper which implement
   superpriority for IRQ and FIQ, PSTATE.ALLINT, etc etc.
 * FEAT_GICv3_NMI : the changes to the CPU interface for
   GICv3 NMI handling. Any CPU with FEAT_NMI and FEAT_GICv3
   must have this.
 * NMI support in the IRI (Interrupt Routing Infrastructure,
   i.e. all the bits of the GIC that aren't the cpuif; the
   distributor and redistributors). Table 3-1 in the GIC spec
   says that you can have an IRI without NMI support connected
   to a CPU which does have NMI support. This is what the ID
   register bit GICD_TYPER.NMI reports.

At the moment this patchset conflates FEAT_GICv3_NMI and
the NMI support in the IRI. The effect of this is that we
allow a machine model to create a CPU with FEAT_NMI but
without FEAT_GICv3_NMI in the cpuif, and we don't allow
a setup where the CPU and cpuif have NMI support but the
IRI does not. (This will actually happen with this patchset
with the sbsa-ref machine and -cpu max, because we haven't
(yet) made sbsa-ref enable NMI in the GIC device when the
CPU has NMI support.)

For a Linux guest this doesn't make much difference, because
Linux will only enable NMI support if it finds it in both
the IRI and the CPU, but I think it would be better to
get the enable-tests right as these can be awkward to change
after the fact in a backwards-compatible way.
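
To make the three-way distinction above concrete, here is a minimal stand-alone model in C. It is a sketch under stated assumptions rather than QEMU code: `NmiModel`, `cpuif_nmi_support()` and `guest_uses_nmi()` are hypothetical names, and the only behaviour encoded is what the text describes, namely that the cpuif capability (FEAT_GICv3_NMI) follows the CPU's FEAT_NMI, the IRI capability is independent, and a guest such as Linux enables NMIs only when both are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not QEMU's real structures. */
typedef struct {
    bool cpu_has_feat_nmi;  /* CPU proper implements FEAT_NMI */
    bool iri_nmi_support;   /* what GICD_TYPER.NMI would report */
} NmiModel;

/* FEAT_GICv3_NMI in the CPU interface follows the CPU, not the IRI. */
static inline bool cpuif_nmi_support(const NmiModel *m)
{
    assert(m != NULL);
    return m->cpu_has_feat_nmi;
}

/* A guest like Linux only enables NMIs when both sides support them. */
static inline bool guest_uses_nmi(const NmiModel *m)
{
    return cpuif_nmi_support(m) && m->iri_nmi_support;
}
```

In this model the sbsa-ref case mentioned above corresponds to `cpu_has_feat_nmi = true` with `iri_nmi_support = false`: the cpuif reports NMI support, but the guest does not end up using it.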

I think this is easy to fix -- we can add a new bool field
GICv3CPUState::nmi_support which we initialize in
gicv3_init_cpuif() if the CPU has FEAT_NMI, and make the
checks in arm_gicv3_cpuif.c check cs->nmi_support instead
of cs->gic->nmi_support. That looks like this squashed into
patch 18:

diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index 88533749ebb..cd09bee3bc4 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -225,6 +225,13 @@ struct GICv3CPUState {

 /* This is temporary working state, to avoid a malloc in gicv3_update() */
 bool seenbetter;
+
+/*
+ * Whether the CPU interface has NMI support (FEAT_GICv3_NMI). The
+ * CPU interface may support NMIs even when the GIC proper (what the
+ * spec calls the IRI; the redistributors and distributor) does not.
+ */
+bool nmi_support;
 };

 /*
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 2457b7bca23..715909d0f7d 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -21,6 +21,7 @@
 #include "hw/irq.h"
 #include "cpu.h"
 #include "target/arm/cpregs.h"
+#include "target/arm/cpu-features.h"
 #include "sysemu/tcg.h"
 #include "sysemu/qtest.h"

@@ -839,7 +840,7 @@ static int icc_highest_active_prio(GICv3CPUState *cs)
  */
 int i;

-if (cs->gic->nmi_support) {
+if (cs->nmi_support) {
 /*
  * If an NMI is active this takes precedence over anything else
  * for priority purposes; the NMI bit is only in the AP1R0 bit.
@@ -1285,7 +1286,7 @@ static void icc_drop_prio(GICv3CPUState *cs, int grp)
 continue;
 }

-if (i == 0 && cs->gic->nmi_support && (*papr & ICC_AP1R_EL1_NMI)) {
+if (i == 0 && cs->nmi_support && (*papr & ICC_AP1R_EL1_NMI)) {
 *papr &= (~ICC_AP1R_EL1_NMI);
 break;
 }
@@ -1324,7 +1325,7 @@ static int icc_highest_active_group(GICv3CPUState *cs)
  */
 int i;

-if (cs->gic->nmi_support) {
+if (cs->nmi_support) {
 if (cs->icc_apr[GICV3_G1][0] & ICC_AP1R_EL1_NMI) {
 return GICV3_G1;
 }
@@ -1787,7 +1788,7 @@ static void icc_ap_write(CPUARMState *env, const ARMCPRegInfo *ri,
 return;
 }

-if (cs->gic->nmi_support) {
+if (cs->nmi_support) {
 cs->icc_apr[grp][regno] = value & (0xFFFFFFFFU | ICC_AP1R_EL1_NMI);
 } else {
 cs->icc_apr[grp][regno] = value & 0xFFFFFFFFU;
@@ -1901,7 +1902,7 @@ static uint64_t icc_rpr_read(CPUARMState *env, const ARMCPRegInfo *ri)
 }
 }

-if (cs->gic->nmi_support) {
+if (cs->nmi_support) {
 /* NMI info is reported in the high bits of RPR */
 if (arm_feature(env, ARM_FEATURE_EL3) && !arm_is_secure(env)) {
 if (cs->icc_apr[GICV3_G1NS][0] & ICC_AP1R_EL1_NMI) {
@@ -2961,7 +2962,16 @@ void gicv3_init_cpuif(GICv3State *s)
  */
 define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);

-if (s->nmi_support) {
+/*
+ * If the CPU implements FEAT_NMI 

Re: [PATCH v3 09/13] block/gluster: Use URI parsing code from glib

2024-04-19 Thread Eric Blake
On Thu, Apr 18, 2024 at 12:10:52PM +0200, Thomas Huth wrote:
> Since version 2.66, glib has useful URI parsing functions, too.
> Use those instead of the QEMU-internal ones to be finally able
> to get rid of the latter.
> 
> Since g_uri_get_path() returns a const pointer, we also need to
> tweak the parameter of parse_volume_options() (where we use the
> result of g_uri_get_path() as input).
> 
> Reviewed-by: Eric Blake 
> Reviewed-by: Daniel P. Berrangé 
> Signed-off-by: Thomas Huth 
> ---
>  block/gluster.c | 71 -
>  1 file changed, 35 insertions(+), 36 deletions(-)
> 

> @@ -364,57 +363,57 @@ static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
>  QAPI_LIST_PREPEND(gconf->server, gsconf);
>  
>  /* transport */
> -if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
> +uri_scheme = g_uri_get_scheme(uri);
> +if (!uri_scheme || !strcmp(uri_scheme, "gluster")) {
>  gsconf->type = SOCKET_ADDRESS_TYPE_INET;

It may be worth a mention in the commit message that we are aware that
this provides a positive user-visible change as a side-effect: namely,
by virtue of using glib's parser (which normalizes the scheme to
lowercase) instead of our own (which did not), we now accept
GLUSTER:// URIs in addition to the usual gluster:// spelling.  Similar
comments to all the other affected patches in the series.
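
The effect can be illustrated with a small stand-alone C sketch. It does not call glib or the old QEMU parser; `is_gluster_verbatim()` and `is_gluster_normalized()` are hypothetical stand-ins that only contrast a case-sensitive `strcmp()` on the scheme as typed with a comparison made after lowercasing the scheme.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Case-sensitive check, as with a parser that keeps the scheme verbatim. */
static inline bool is_gluster_verbatim(const char *scheme)
{
    return strcmp(scheme, "gluster") == 0;
}

/* Stand-in for a parser that lowercases the scheme before comparing. */
static inline bool is_gluster_normalized(const char *scheme)
{
    const char *want = "gluster";
    size_t i;

    assert(scheme != NULL);
    for (i = 0; scheme[i] != '\0' && want[i] != '\0'; i++) {
        if (tolower((unsigned char)scheme[i]) != want[i]) {
            return false;
        }
    }
    /* Both strings must end at the same position. */
    return scheme[i] == '\0' && want[i] == '\0';
}
```

With these helpers, "GLUSTER" fails the verbatim check but passes the normalized one, which is the user-visible difference described above.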

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH v3 04/13] tests: Update our CI to use CentOS Stream 9 instead of 8

2024-04-19 Thread Eric Blake
On Thu, Apr 18, 2024 at 12:10:47PM +0200, Thomas Huth wrote:
> RHEL 9 (and thus also the derivatives) are available since two years
> now, so according to QEMU's support policy, we can drop the active

Grammar suggestion:

RHEL 9 (and thus also the derivatives) have been available for two years now,

> support for the previous major version 8 now.
> 
> Another reason for doing this is that Centos Stream 8 will go EOL soon:
> 
> https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/
> 
>   "After May 31, 2024, CentOS Stream 8 will be archived
>and no further updates will be provided."
> 
> Thus upgrade our CentOS Stream container to major version 9 now.
> 
> Reviewed-by: Daniel P. Berrangé 
> Signed-off-by: Thomas Huth 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH] tests/unit: Remove debug statements in test-nested-aio-poll.c

2024-04-19 Thread Eric Blake
On Fri, Apr 19, 2024 at 10:58:19AM +0200, Philippe Mathieu-Daudé wrote:
> We are running this test since almost a year; it is

Grammar suggestion:

We have been running this test for almost a year;

> safe to remove its debug statements, which clutter
> CI jobs output:
> 
>   ▶  88/100 /nested-aio-poll  OK
>   io_read 0x16bb26158
>   io_poll_true 0x16bb26158
>   > io_poll_ready
>   io_read 0x16bb26164
>   < io_poll_ready
>   io_poll_true 0x16bb26158
>   io_poll_false 0x16bb26164
>   > io_poll_ready
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_read 0x16bb26164
>   < io_poll_ready
>   88/100 qemu:unit / test-nested-aio-pollOK
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/unit/test-nested-aio-poll.c | 7 ---
>  1 file changed, 7 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH v3 01/13] tests: Remove Ubuntu 20.04 container

2024-04-19 Thread Eric Blake
On Thu, Apr 18, 2024 at 12:10:44PM +0200, Thomas Huth wrote:
> Since Ubuntu 22.04 is now available since two years, we can stop

Grammar suggestion:

Since Ubuntu 22.04 has now been available for more than two years,

> actively supporting the previous LTS version of Ubuntu now.
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Thomas Huth 
> ---

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: Deprecation/removal of nios2 target support

2024-04-19 Thread Dinh Nguyen




On 4/18/24 13:41, Arnd Bergmann wrote:
> On Thu, Apr 18, 2024, at 17:44, Joseph Myers wrote:
>> On Wed, 17 Apr 2024, Sandra Loosemore wrote:
>>
>>> Therefore I'd like to mark Nios II as obsolete in GCC 14 now, and remove
>>> support from all toolchain components after the release is made.  I'm not sure
>>> there is an established process for obsoleting/removing support in other
>>> components; besides binutils, GDB, and GLIBC, there's QEMU, newlib/libgloss,
>>> and the Linux kernel.  But, we need to get the ball rolling somewhere.
>>
>> CC:ing Arnd Bergmann regarding the obsolescence in the Linux kernel.
>
> We have not yet marked nios2 as deprecated in the kernel, but that
> is mostly because the implementation does not get in the way too
> much and Dinh Nguyen is still around as a maintainer and merging
> bugfixes.
>
> Almost all nios2 kernel changes I see in the past decade have been
> done blindly without testing on hardware, either for treewide
> changes, or by code inspection. The only notable exceptions I could
> find are from Andreas Oetken and Bernd Weiberg at Siemens and
> from Marek Vasut (all added to Cc in case they have something to add).
>
> We should probably remove nios2 from the kernel in the near future,
> but even if we decide not to, I think deprecating it from gcc is the
> right idea: If there are a few remaining users that still plan
> to update their kernels, gcc-14 will still be able to build new
> kernels for several years.



I'm planning to do this soon.

Dinh


