date:20231218

[PATCH v6 00/11] Support blob memory and venus on qemu

2023-12-18 Thread Huang Rui

Hi all,

Sorry for the delay with V6; I was occupied by other work for the last two
months and am now resuming the submission.

Antonio Caggiano got venus working with QEMU on the KVM platform last
September[1]. This series is inherited from his original work to support
the features of context init, hostmem, resource uuid, and blob resources
for venus.
In March of this year, we sent out the V1 version[2] for review. But
that series included both xen and virtio gpu changes. We would now like
to divide it into two parts: one continues Antonio's work to upstream
virtio-gpu support for blob memory and venus, and the other upstreams the
xen specific patches. This series focuses on virtio-gpu, so we
marked it as the V4 version to continue Antonio's patches[1]. We will
send the xen specific patches separately, because they are hypervisor specific.
Besides QEMU, this support also involves virglrenderer[3][4] and
mesa[5][6]. The virglrenderer and mesa parts have all been
accepted upstream. In this qemu version, we try to address the concerns
around improper cleanup during blob resource unmap and unref. We would
appreciate any comments.

[1] https://lore.kernel.org/qemu-devel/20220926142422.22325-1-antonio.caggi...@collabora.com/
[2] V1: https://lore.kernel.org/qemu-devel/20230312092244.451465-1-ray.hu...@amd.com
[3] https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1068
[4] https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1180
[5] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22108
[6] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23680

Please note that the first 4 patches (1 -> 4) are included in this series
because the series depends on them, not because we want them to be reviewed;
they are already under review through the "rutabaga_gfx + gfxstream" series.
- https://lore.kernel.org/qemu-devel/20230829003629.410-1-gurchetansi...@chromium.org/

V4: https://lore.kernel.org/qemu-devel/20230831093252.2461282-1-ray.hu...@amd.com
V5: https://lore.kernel.org/qemu-devel/2023091530.24064-1-ray.hu...@amd.com

Changes from V5 to V6

- Move macros configurations under virgl.found() and rename
  HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS.

- Handle the case when context_init is disabled.

- Enable context_init by default.

- Move virtio_gpu_virgl_resource_unmap() into
  virgl_cmd_resource_unmap_blob().

- Introduce new struct virgl_gpu_resource to store virgl specific members.

- Remove error handling of g_new0, because glib will abort() on OOM.

- Set resource uuid as option.

- Implement optional subsection of vmstate_virtio_gpu_resource_uuid_state
  for virtio live migration.

- Use g_int_hash/g_int_equal instead of the default

- Add scanout_blob function for virtio-gpu-virgl

- Resolve the memory leak on virtio-gpu-virgl

- Remove the unstable API flags check because virglrenderer is already 1.0

- Squash the render server flag support into "Initialize Venus"

Changes from V4 (virtio gpu V4) to V5

- Swapped patches 5 and 6 because HAVE_VIRGL_CONTEXT_INIT should be
  configured first.

- Validate owner of memory region to avoid slowing down DMA.

- Use memory_region_init_ram_ptr() instead of
  memory_region_init_ram_device_ptr().

- Adjust sequence to allocate gpu resource before virglrenderer resource
  creation

- Add virtio migration handling for uuid.

- Send kernel patch to define VIRTIO_GPU_CAPSET_VENUS.
  https://lore.kernel.org/lkml/20230915105918.3763061-1-ray.hu...@amd.com/

- Add meson check to make sure unstable APIs are defined from 0.9.0.

Changes from V1 to V2 (virtio gpu V4)

- Remove unused #include "hw/virtio/virtio-iommu.h"

- Add a local function, called virgl_resource_destroy(), that is used
  to release a vgpu resource on error paths and in resource_unref.

- Remove virtio_gpu_virgl_resource_unmap from
  virtio_gpu_cleanup_mapping(),
  since this function won't be called on blob resources and also because
  blob resources are unmapped via virgl_cmd_resource_unmap_blob().

- In virgl_cmd_resource_create_blob(), do proper cleanup in error paths
  and move QTAILQ_INSERT_HEAD(&g->reslist, res, next) after the resource
  has been fully initialized.

- Memory region has a different life-cycle from virtio gpu resources
  i.e. cannot be released synchronously along with the vgpu resource.
  So, here the field "region" was changed to a pointer and is allocated
  dynamically when the blob is mapped.
  Also, since the pointer can be used to indicate whether the blob
  is mapped, the explicit field "mapped" was removed.

- In virgl_cmd_resource_map_blob(), add a check on the value of
  res->region, to prevent it being called twice on the same resource.

- Add a patch to enable automatic deallocation of memory regions to resolve
  use-after-free memory corruption with a reference.

References

Demo with Venus:
- https://static.sched.com/hosted_files/xen2023/3f/xen_summit_2023_virtgpu_demo.mp4
QEMU repository:
- htt

[PATCH v6 11/11] virtio-gpu: make blob scanout use dmabuf fd

2023-12-18 Thread Huang Rui

From: Robert Beckett 

This relies on a virglrenderer change to include the dmabuf fd when
returning resource info.

Signed-off-by: Robert Beckett 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Add scanout_blob function for virtio-gpu-virgl.
- Update for new virgl_gpu_resource.

 hw/display/virtio-gpu-virgl.c  | 104 +
 hw/display/virtio-gpu.c|   4 +-
 include/hw/virtio/virtio-gpu.h |   6 ++
 3 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index c523a6717a..c384225a98 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -18,6 +18,7 @@
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-gpu.h"
 #include "hw/virtio/virtio-gpu-bswap.h"
+#include "hw/virtio/virtio-gpu-pixman.h"
 
 #include "ui/egl-helpers.h"
 
@@ -726,6 +727,106 @@ static void virgl_cmd_resource_unmap_blob(VirtIOGPU *g,
 object_unparent(OBJECT(mr));
 }
 
+static void virgl_cmd_set_scanout_blob(VirtIOGPU *g,
+   struct virtio_gpu_ctrl_command *cmd)
+{
+struct virgl_gpu_resource *vres;
+struct virtio_gpu_framebuffer fb = { 0 };
+struct virtio_gpu_set_scanout_blob ss;
+struct virgl_renderer_resource_info info;
+uint64_t fbend;
+
+VIRTIO_GPU_FILL_CMD(ss);
+virtio_gpu_scanout_blob_bswap(&ss);
+trace_virtio_gpu_cmd_set_scanout_blob(ss.scanout_id, ss.resource_id,
+  ss.r.width, ss.r.height, ss.r.x,
+  ss.r.y);
+
+if (ss.scanout_id >= g->parent_obj.conf.max_outputs) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: illegal scanout id specified %d",
+  __func__, ss.scanout_id);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_SCANOUT_ID;
+return;
+}
+
+if (ss.resource_id == 0) {
+virtio_gpu_disable_scanout(g, ss.scanout_id);
+return;
+}
+
+if (ss.width < 16 ||
+ss.height < 16 ||
+ss.r.x + ss.r.width > ss.width ||
+ss.r.y + ss.r.height > ss.height) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: illegal scanout %d bounds for"
+  " resource %d, rect (%d,%d)+%d,%d, fb %d %d\n",
+  __func__, ss.scanout_id, ss.resource_id,
+  ss.r.x, ss.r.y, ss.r.width, ss.r.height,
+  ss.width, ss.height);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
+return;
+}
+
+if (!console_has_gl(g->parent_obj.scanout[ss.scanout_id].con)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: unable to scanout blob without GL!\n", __func__);
+return;
+}
+
+vres = virgl_gpu_find_resource(g, ss.resource_id);
+if (!vres) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: illegal resource specified %d\n",
+  __func__, ss.resource_id);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
+}
+if (virgl_renderer_resource_get_info(ss.resource_id, &info)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: illegal virgl resource specified %d\n",
+  __func__, ss.resource_id);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
+}
+if (!vres->res.dmabuf_fd && info.fd)
+vres->res.dmabuf_fd = info.fd;
+
+fb.format = virtio_gpu_get_pixman_format(ss.format);
+if (!fb.format) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: host couldn't handle guest format %d\n",
+  __func__, ss.format);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
+return;
+}
+
+fb.bytes_pp = DIV_ROUND_UP(PIXMAN_FORMAT_BPP(fb.format), 8);
+fb.width = ss.width;
+fb.height = ss.height;
+fb.stride = ss.strides[0];
+fb.offset = ss.offsets[0] + ss.r.x * fb.bytes_pp + ss.r.y * fb.stride;
+
+fbend = fb.offset;
+fbend += fb.stride * (ss.r.height - 1);
+fbend += fb.bytes_pp * ss.r.width;
+if (fbend > vres->res.blob_size) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: fb end out of range\n",
+  __func__);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
+return;
+}
+
+g->parent_obj.enable = 1;
+if (virtio_gpu_update_dmabuf(g, ss.scanout_id, &vres->res,
+ &fb, &ss.r)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: failed to update dmabuf\n", __func__);
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
+return;
+}
+virtio_gpu_update_scanout(g, ss.scanout_id, &vres->res, &ss.r);
+}
+
 #endif /* HAVE_VIRGL_RESOURCE_BLOB */
 
 void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
@@ -807,6 +908,9 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
 case VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB:
 virgl_cmd_resource_unmap_b
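
The bounds check that virgl_cmd_set_scanout_blob() performs before handing the
blob to the display can be restated in isolation. The following is only a
sketch: ToyRect and toy_fb_fits() are hypothetical names, not QEMU code, and
the widening to 64 bits plus the zero-size guard are assumptions added here for
the example rather than something the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scanout rectangle (ss.r in the patch). */
typedef struct ToyRect {
    uint32_t x, y, width, height;
} ToyRect;

/*
 * Returns true when the last byte touched by the scanout rectangle lies
 * inside a blob of blob_size bytes. base_offset, stride and bytes_pp play
 * the roles of ss.offsets[0], ss.strides[0] and fb.bytes_pp.
 */
static bool toy_fb_fits(uint32_t base_offset, uint32_t stride,
                        uint32_t bytes_pp, const ToyRect *r,
                        uint64_t blob_size)
{
    uint64_t offset, end;

    /* Assumption of this sketch: reject empty rectangles up front. */
    if (r->width == 0 || r->height == 0) {
        return false;
    }

    /* Byte offset of the rectangle's top-left pixel. */
    offset = (uint64_t)base_offset + (uint64_t)r->x * bytes_pp +
             (uint64_t)r->y * stride;

    /* All full rows except the last, plus the used part of the last row. */
    end = offset;
    end += (uint64_t)stride * (r->height - 1);
    end += (uint64_t)bytes_pp * r->width;

    return end <= blob_size;
}
```

For a 64x64 scanout at 4 bytes per pixel and a stride of 256 bytes, the end
offset is 256 * 63 + 4 * 64 = 16384, so a blob of exactly 16384 bytes passes
and anything smaller is rejected.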

[PATCH v6 10/11] virtio-gpu: Initialize Venus

2023-12-18 Thread Huang Rui

From: Antonio Caggiano 

Request Venus when initializing VirGL.

Signed-off-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Remove the unstable API flags check because virglrenderer is already 1.0.
- Squash the render server flag support into "Initialize Venus".

 hw/display/virtio-gpu-virgl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index f35a751824..c523a6717a 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -964,6 +964,10 @@ int virtio_gpu_virgl_init(VirtIOGPU *g)
 }
 #endif
 
+#ifdef VIRGL_RENDERER_VENUS
+flags |= VIRGL_RENDERER_VENUS | VIRGL_RENDERER_RENDER_SERVER;
+#endif
+
 ret = virgl_renderer_init(g, flags, &virtio_gpu_3d_cbs);
 if (ret != 0) {
 error_report("virgl could not be initialized: %d", ret);
-- 
2.25.1

[PATCH v6 08/11] virtio-gpu: Resource UUID

2023-12-18 Thread Huang Rui

From: Antonio Caggiano 

Enable resource UUID feature and implement command resource assign UUID.
This is done by introducing a hash table to map resource IDs to their
UUIDs.

Signed-off-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Set resource uuid as option.
- Implement optional subsection of vmstate_virtio_gpu_resource_uuid_state
  for virtio live migration.
- Use g_int_hash/g_int_equal instead of the default.
- Move virtio_vgpu_simple_resource initialization in the earlier new patch
  "virtio-gpu: Introduce virgl_gpu_resource structure"

 hw/display/trace-events|   1 +
 hw/display/virtio-gpu-base.c   |   4 ++
 hw/display/virtio-gpu-virgl.c  |   3 +
 hw/display/virtio-gpu.c| 119 +
 include/hw/virtio/virtio-gpu.h |   7 ++
 5 files changed, 134 insertions(+)

diff --git a/hw/display/trace-events b/hw/display/trace-events
index 2336a0ca15..54d6894c59 100644
--- a/hw/display/trace-events
+++ b/hw/display/trace-events
@@ -41,6 +41,7 @@ virtio_gpu_cmd_res_create_blob(uint32_t res, uint64_t size) "res 0x%x, size %" P
 virtio_gpu_cmd_res_unref(uint32_t res) "res 0x%x"
 virtio_gpu_cmd_res_back_attach(uint32_t res) "res 0x%x"
 virtio_gpu_cmd_res_back_detach(uint32_t res) "res 0x%x"
+virtio_gpu_cmd_res_assign_uuid(uint32_t res) "res 0x%x"
 virtio_gpu_cmd_res_xfer_toh_2d(uint32_t res) "res 0x%x"
 virtio_gpu_cmd_res_xfer_toh_3d(uint32_t res) "res 0x%x"
 virtio_gpu_cmd_res_xfer_fromh_3d(uint32_t res) "res 0x%x"
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index 37af256219..6bcee3882f 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -236,6 +236,10 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, uint64_t features,
 features |= (1 << VIRTIO_GPU_F_CONTEXT_INIT);
 }
 
+if (virtio_gpu_resource_uuid_enabled(g->conf)) {
+features |= (1 << VIRTIO_GPU_F_RESOURCE_UUID);
+}
+
 return features;
 }
 
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 5a3a292f79..be9da6e780 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -777,6 +777,9 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
 /* TODO add security */
 virgl_cmd_ctx_detach_resource(g, cmd);
 break;
+case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
+virtio_gpu_resource_assign_uuid(g, cmd);
+break;
 case VIRTIO_GPU_CMD_GET_CAPSET_INFO:
 virgl_cmd_get_capset_info(g, cmd);
 break;
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 8189c392dc..466debb256 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -958,6 +958,37 @@ virtio_gpu_resource_detach_backing(VirtIOGPU *g,
 virtio_gpu_cleanup_mapping(g, res);
 }
 
+void virtio_gpu_resource_assign_uuid(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+struct virtio_gpu_simple_resource *res;
+struct virtio_gpu_resource_assign_uuid assign;
+struct virtio_gpu_resp_resource_uuid resp;
+QemuUUID *uuid;
+
+VIRTIO_GPU_FILL_CMD(assign);
+virtio_gpu_bswap_32(&assign, sizeof(assign));
+trace_virtio_gpu_cmd_res_assign_uuid(assign.resource_id);
+
+res = virtio_gpu_find_check_resource(g, assign.resource_id, false, __func__, &cmd->error);
+if (!res) {
+return;
+}
+
+memset(&resp, 0, sizeof(resp));
+resp.hdr.type = VIRTIO_GPU_RESP_OK_RESOURCE_UUID;
+
+uuid = g_hash_table_lookup(g->resource_uuids, &assign.resource_id);
+if (!uuid) {
+uuid = g_new(QemuUUID, 1);
+qemu_uuid_generate(uuid);
+g_hash_table_insert(g->resource_uuids, &assign.resource_id, uuid);
+}
+
+memcpy(resp.uuid, uuid, sizeof(QemuUUID));
+virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
+}
+
 void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
struct virtio_gpu_ctrl_command *cmd)
 {
@@ -1006,6 +1037,9 @@ void virtio_gpu_simple_process_cmd(VirtIOGPU *g,
 case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
 virtio_gpu_resource_detach_backing(g, cmd);
 break;
+case VIRTIO_GPU_CMD_RESOURCE_ASSIGN_UUID:
+virtio_gpu_resource_assign_uuid(g, cmd);
+break;
 default:
 cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
 break;
@@ -1400,6 +1434,57 @@ static int virtio_gpu_blob_load(QEMUFile *f, void *opaque, size_t size,
 return 0;
 }
 
+static int virtio_gpu_resource_uuid_save(QEMUFile *f, void *opaque, size_t size,
+ const VMStateField *field,
+ JSONWriter *vmdesc)
+{
+VirtIOGPU *g = opaque;
+struct virtio_gpu_simple_resource *res;
+QemuUUID *uuid;
+
+/* in 2d mode we should never find unprocessed commands here */
+assert(QTAILQ_EMPTY(&g->cmdq));
+
+QTAILQ_FOREACH(res, &g->reslist, next) {
+qemu_put_be32(f, res->resource_id);
+

[PATCH v6 06/11] softmmu/memory: enable automatic deallocation of memory regions

2023-12-18 Thread Huang Rui

From: Xenia Ragiadakou 

When a memory region has a different life-cycle from that of its parent,
it could be released automatically, via the object's free callback, once it
has been unparented and all of its references have gone away.

However, currently, the address space subsystem keeps references to the
memory region without first incrementing its object's reference count.
As a result, the automatic deallocation of the object, not taking into
account those references, results in use-after-free memory corruption.

More specifically, references to the memory region are kept in flatview
ranges. If the reference count of the memory region is not incremented,
flatview_destroy(), which is asynchronous, may be called after the memory
region's destruction. If the reference count of the memory region is
incremented, the memory region's destruction will take place after
flatview_destroy() has released its references.

This patch increases the reference count of an owned memory region object
on each memory_region_ref() and decreases it on each memory_region_unref().

Signed-off-by: Xenia Ragiadakou 
Signed-off-by: Huang Rui 
---

Changes in v6:
- remove in-code comment because it is confusing and explain the issue,
  that the patch attempts to fix, with more details in commit message

 system/memory.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/system/memory.c b/system/memory.c
index 304fa843ea..4d5e7e7a4c 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1824,6 +1824,7 @@ void memory_region_ref(MemoryRegion *mr)
  * we do not ref/unref them because it slows down DMA sensibly.
  */
 if (mr && mr->owner) {
+object_ref(OBJECT(mr));
 object_ref(mr->owner);
 }
 }
@@ -1832,6 +1833,7 @@ void memory_region_unref(MemoryRegion *mr)
 {
 if (mr && mr->owner) {
 object_unref(mr->owner);
+object_unref(OBJECT(mr));
 }
 }
 
-- 
2.25.1
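
The effect of the two added object_ref()/object_unref() calls can be
illustrated with a toy reference-counting model. This is a sketch under stated
assumptions: ToyObject, ToyRegion and the toy_* helpers are hypothetical
stand-ins for QOM objects and the memory API, and "finalized" merely marks the
point where a real object would be freed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a QOM object: a count and a "would be freed" marker. */
typedef struct ToyObject {
    int refs;
    bool finalized;
} ToyObject;

/* Toy stand-in for a memory region that is itself an object with an owner. */
typedef struct ToyRegion {
    ToyObject obj;
    ToyObject *owner;   /* NULL for unowned regions */
} ToyRegion;

static void toy_object_ref(ToyObject *o)
{
    o->refs++;
}

static void toy_object_unref(ToyObject *o)
{
    if (--o->refs == 0) {
        o->finalized = true;   /* a real object would be released here */
    }
}

/* Mirrors the patched memory_region_ref(): pin the region, then the owner. */
static void toy_region_ref(ToyRegion *mr)
{
    if (mr && mr->owner) {
        toy_object_ref(&mr->obj);
        toy_object_ref(mr->owner);
    }
}

/* Mirrors the patched memory_region_unref(): release in the reverse order. */
static void toy_region_unref(ToyRegion *mr)
{
    if (mr && mr->owner) {
        toy_object_unref(mr->owner);
        toy_object_unref(&mr->obj);
    }
}
```

In this model, a user that called toy_region_ref() (for example, something
playing the role of a flat view range) keeps the region alive even after the
creation reference is dropped; the region is only finalized when that last
user calls toy_region_unref().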

[PATCH v6 07/11] virtio-gpu: Handle resource blob commands

2023-12-18 Thread Huang Rui

From: Antonio Caggiano 

Support BLOB resources creation, mapping and unmapping by calling the
new stable virglrenderer 0.10 interface. Only enabled when available and
via the blob config. E.g. -device virtio-vga-gl,blob=true

Signed-off-by: Antonio Caggiano 
Signed-off-by: Dmitry Osipenko 
Signed-off-by: Xenia Ragiadakou 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Use new struct virgl_gpu_resource.
- Unmap, unref and destroy the resource only after the memory region
  has been completely removed.
- In unref check whether the resource is still mapped.
- In unmap_blob check whether the resource has been already unmapped.
- Fix coding style

 hw/display/virtio-gpu-virgl.c | 274 +-
 hw/display/virtio-gpu.c   |   4 +-
 meson.build   |   4 +
 3 files changed, 276 insertions(+), 6 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index faab374336..5a3a292f79 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -17,6 +17,7 @@
 #include "trace.h"
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-gpu.h"
+#include "hw/virtio/virtio-gpu-bswap.h"
 
 #include "ui/egl-helpers.h"
 
@@ -24,8 +25,62 @@
 
 struct virgl_gpu_resource {
 struct virtio_gpu_simple_resource res;
+uint32_t ref;
+VirtIOGPU *g;
+
+#ifdef HAVE_VIRGL_RESOURCE_BLOB
+/* only blob resource needs this region to be mapped as guest mmio */
+MemoryRegion *region;
+#endif
 };
 
+static void vres_get_ref(struct virgl_gpu_resource *vres)
+{
+uint32_t ref;
+
+ref = qatomic_fetch_inc(&vres->ref);
+g_assert(ref < INT_MAX);
+}
+
+static void virgl_resource_destroy(struct virgl_gpu_resource *vres)
+{
+struct virtio_gpu_simple_resource *res;
+VirtIOGPU *g;
+
+if (!vres) {
+return;
+}
+
+g = vres->g;
+res = &vres->res;
+QTAILQ_REMOVE(&g->reslist, res, next);
+virtio_gpu_cleanup_mapping(g, res);
+g_free(vres);
+}
+
+static void virgl_resource_unref(struct virgl_gpu_resource *vres)
+{
+struct virtio_gpu_simple_resource *res;
+
+if (!vres) {
+return;
+}
+
+res = &vres->res;
+virgl_renderer_resource_detach_iov(res->resource_id, NULL, NULL);
+virgl_renderer_resource_unref(res->resource_id);
+}
+
+static void vres_put_ref(struct virgl_gpu_resource *vres)
+{
+g_assert(vres->ref > 0);
+
+if (qatomic_fetch_dec(&vres->ref) == 1) {
+virgl_resource_unref(vres);
+virgl_resource_destroy(vres);
+}
+}
+
 static struct virgl_gpu_resource *
 virgl_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
 {
@@ -59,6 +114,8 @@ static void virgl_cmd_create_resource_2d(VirtIOGPU *g,
c2d.width, c2d.height);
 
 vres = g_new0(struct virgl_gpu_resource, 1);
+vres_get_ref(vres);
+vres->g = g;
 vres->res.width = c2d.width;
 vres->res.height = c2d.height;
 vres->res.format = c2d.format;
@@ -91,6 +148,8 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
c3d.width, c3d.height, c3d.depth);
 
 vres = g_new0(struct virgl_gpu_resource, 1);
+vres_get_ref(vres);
+vres->g = g;
 vres->res.width = c3d.width;
 vres->res.height = c3d.height;
 vres->res.format = c3d.format;
@@ -126,12 +185,21 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
 return;
 }
 
-virgl_renderer_resource_detach_iov(unref.resource_id, NULL, NULL);
-virgl_renderer_resource_unref(unref.resource_id);
+#ifdef HAVE_VIRGL_RESOURCE_BLOB
+if (vres->region) {
+VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
+MemoryRegion *mr = vres->region;
+
+warn_report("%s: blob resource %d not unmapped",
+__func__, unref.resource_id);
+vres->region = NULL;
+memory_region_set_enabled(mr, false);
+memory_region_del_subregion(&b->hostmem, mr);
+object_unparent(OBJECT(mr));
+}
+#endif /* HAVE_VIRGL_RESOURCE_BLOB */
 
-QTAILQ_REMOVE(&g->reslist, &vres->res, next);
-virtio_gpu_cleanup_mapping(g, &vres->res);
-g_free(vres);
+vres_put_ref(vres);
 }
 
 static void virgl_cmd_context_create(VirtIOGPU *g,
@@ -470,6 +538,191 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
 g_free(resp);
 }
 
+#ifdef HAVE_VIRGL_RESOURCE_BLOB
+
+static void virgl_resource_unmap(struct virgl_gpu_resource *vres)
+{
+if (!vres) {
+return;
+}
+
+virgl_renderer_resource_unmap(vres->res.resource_id);
+
+vres_put_ref(vres);
+}
+
+static void virgl_resource_blob_async_unmap(void *obj)
+{
+MemoryRegion *mr = MEMORY_REGION(obj);
+struct virgl_gpu_resource *vres = mr->opaque;
+
+virgl_resource_unmap(vres);
+
+g_free(obj);
+}
+
+static void virgl_cmd_resource_create_blob(VirtIOGPU *g,
+   struct virtio_gpu_ctrl_command *cmd)
+{
+struct virgl_gpu_resource *vres;
+struct virtio_gpu_resource_create_blob

[PATCH v6 09/11] virtio-gpu: Support Venus capset

2023-12-18 Thread Huang Rui

From: Antonio Caggiano 

Add support for the Venus capset, which enables Vulkan support through
the Venus Vulkan driver for virtio-gpu.

Signed-off-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

No change in v6.

 hw/display/virtio-gpu-virgl.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index be9da6e780..f35a751824 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -506,6 +506,11 @@ static void virgl_cmd_get_capset_info(VirtIOGPU *g,
 virgl_renderer_get_cap_set(resp.capset_id,
&resp.capset_max_version,
&resp.capset_max_size);
+} else if (info.capset_index == 2) {
+resp.capset_id = VIRTIO_GPU_CAPSET_VENUS;
+virgl_renderer_get_cap_set(resp.capset_id,
+   &resp.capset_max_version,
+   &resp.capset_max_size);
 } else {
 resp.capset_max_version = 0;
 resp.capset_max_size = 0;
@@ -978,10 +983,18 @@ int virtio_gpu_virgl_init(VirtIOGPU *g)
 
 int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g)
 {
-uint32_t capset2_max_ver, capset2_max_size;
+uint32_t capset2_max_ver, capset2_max_size, num_capsets;
+num_capsets = 1;
+
 virgl_renderer_get_cap_set(VIRTIO_GPU_CAPSET_VIRGL2,
-  &capset2_max_ver,
-  &capset2_max_size);
+   &capset2_max_ver,
+   &capset2_max_size);
+num_capsets += capset2_max_ver ? 1 : 0;
+
+virgl_renderer_get_cap_set(VIRTIO_GPU_CAPSET_VENUS,
+   &capset2_max_ver,
+   &capset2_max_size);
+num_capsets += capset2_max_size ? 1 : 0;
 
-return capset2_max_ver ? 2 : 1;
+return num_capsets;
 }
-- 
2.25.1
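
The counting logic of the reworked virtio_gpu_virgl_get_num_capsets() can be
restated as a standalone sketch. FakeRenderer and fake_get_cap_set() are
hypothetical stubs standing in for virgl_renderer_get_cap_set(), and the values
they report are made up for illustration; only the capset ids and the
"version for VIRGL2, size for Venus" tests are taken from the patches in this
series.

```c
#include <stdint.h>

#define CAPSET_VIRGL  1
#define CAPSET_VIRGL2 2
#define CAPSET_VENUS  4

/* Hypothetical renderer state used only by this sketch. */
typedef struct FakeRenderer {
    uint32_t virgl2_max_ver;   /* 0 means VIRGL2 is not offered */
    uint32_t venus_max_size;   /* 0 means Venus is not offered */
} FakeRenderer;

/* Stub for the capability query: fills in max version and max size. */
static void fake_get_cap_set(const FakeRenderer *r, uint32_t id,
                             uint32_t *max_ver, uint32_t *max_size)
{
    *max_ver = 0;
    *max_size = 0;
    if (id == CAPSET_VIRGL2) {
        *max_ver = r->virgl2_max_ver;
        *max_size = r->virgl2_max_ver ? 1024 : 0;   /* arbitrary size */
    } else if (id == CAPSET_VENUS) {
        *max_size = r->venus_max_size;
    }
}

/* Same shape as the patched function: VIRGL is always counted. */
static uint32_t count_capsets(const FakeRenderer *r)
{
    uint32_t max_ver, max_size, num_capsets = 1;

    fake_get_cap_set(r, CAPSET_VIRGL2, &max_ver, &max_size);
    num_capsets += max_ver ? 1 : 0;

    fake_get_cap_set(r, CAPSET_VENUS, &max_ver, &max_size);
    num_capsets += max_size ? 1 : 0;

    return num_capsets;
}
```

With neither optional capset reported the count is 1, with both it is 3. When
only Venus is reported the count is 2, while virgl_cmd_get_capset_info() in the
patch answers for Venus at capset_index 2, so that combination may deserve a
closer look during review.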

[PATCH v6 04/11] virtio-gpu: Don't require udmabuf when blobs and virgl are enabled

2023-12-18 Thread Huang Rui

From: Dmitry Osipenko 

The udmabuf usage is mandatory when virgl is disabled and the blobs feature
is enabled in the Qemu machine configuration. If virgl and blobs are both
enabled, udmabuf is optional. Since udmabuf isn't widely supported
by popular Linux distros today, let's relax the udmabuf requirement for
blobs=on,virgl=on. Now, full-featured virtio-gpu acceleration is
available to Qemu users without needing udmabuf to be available in the
system.

Reviewed-by: Antonio Caggiano 
Signed-off-by: Dmitry Osipenko 
Signed-off-by: Huang Rui 
---

No change in v6.

 hw/display/virtio-gpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 8b2f4c6be3..4c3ec9d0ea 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1443,6 +1443,7 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
 
 if (virtio_gpu_blob_enabled(g->parent_obj.conf)) {
 if (!virtio_gpu_rutabaga_enabled(g->parent_obj.conf) &&
+!virtio_gpu_virgl_enabled(g->parent_obj.conf) &&
 !virtio_gpu_have_udmabuf()) {
 error_setg(errp, "need rutabaga or udmabuf for blob resources");
 return;
-- 
2.25.1
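
The relaxed realize-time condition reduces to a small predicate. The sketch
below is illustrative only: ToyGpuConf and toy_blob_config_ok() are
hypothetical names, and the three booleans stand in for the results of
virtio_gpu_blob_enabled(), virtio_gpu_rutabaga_enabled() and
virtio_gpu_virgl_enabled().

```c
#include <stdbool.h>

/* Hypothetical flattened view of the device configuration. */
typedef struct ToyGpuConf {
    bool blob;       /* blob=on */
    bool rutabaga;   /* rutabaga backend in use */
    bool virgl;      /* virgl backend in use */
} ToyGpuConf;

/*
 * Returns true when the configuration is acceptable: blob resources need
 * at least one of rutabaga, virgl, or host udmabuf support.
 */
static bool toy_blob_config_ok(const ToyGpuConf *c, bool have_udmabuf)
{
    if (!c->blob) {
        return true;   /* nothing to check without blob resources */
    }
    return c->rutabaga || c->virgl || have_udmabuf;
}
```

Before this patch the virgl term was absent, so blob=on with virgl and no
udmabuf would have been rejected; with it, only the plain 2D case still
depends on udmabuf.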

[PATCH v6 05/11] virtio-gpu: Introduce virgl_gpu_resource structure

2023-12-18 Thread Huang Rui

Introduce a new virgl_gpu_resource data structure and helper functions
for virgl. It will be used to add new virgl-specific members in the
following patches for blob memory support.

Signed-off-by: Huang Rui 
---

New patch:
- Introduce new struct virgl_gpu_resource to store virgl specific members.
- Move resource initialization from patch "virtio-gpu: Resource UUID" here.
- Remove error handling of g_new0, because glib will abort() on OOM.
- Set iov and iov_cnt in struct virtio_gpu_simple_resource for all types
  of resources.

 hw/display/virtio-gpu-virgl.c | 84 ++-
 1 file changed, 64 insertions(+), 20 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 5bbc8071b2..faab374336 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -22,6 +22,23 @@
 
 #include <virglrenderer.h>
 
+struct virgl_gpu_resource {
+struct virtio_gpu_simple_resource res;
+};
+
+static struct virgl_gpu_resource *
+virgl_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
+{
+struct virtio_gpu_simple_resource *res;
+
+res = virtio_gpu_find_resource(g, resource_id);
+if (!res) {
+return NULL;
+}
+
+return container_of(res, struct virgl_gpu_resource, res);
+}
+
 #if VIRGL_RENDERER_CALLBACKS_VERSION >= 4
 static void *
 virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
@@ -35,11 +52,19 @@ static void virgl_cmd_create_resource_2d(VirtIOGPU *g,
 {
 struct virtio_gpu_resource_create_2d c2d;
 struct virgl_renderer_resource_create_args args;
+struct virgl_gpu_resource *vres;
 
 VIRTIO_GPU_FILL_CMD(c2d);
 trace_virtio_gpu_cmd_res_create_2d(c2d.resource_id, c2d.format,
c2d.width, c2d.height);
 
+vres = g_new0(struct virgl_gpu_resource, 1);
+vres->res.width = c2d.width;
+vres->res.height = c2d.height;
+vres->res.format = c2d.format;
+vres->res.resource_id = c2d.resource_id;
+QTAILQ_INSERT_HEAD(&g->reslist, &vres->res, next);
+
 args.handle = c2d.resource_id;
 args.target = 2;
 args.format = c2d.format;
@@ -59,11 +84,19 @@ static void virgl_cmd_create_resource_3d(VirtIOGPU *g,
 {
 struct virtio_gpu_resource_create_3d c3d;
 struct virgl_renderer_resource_create_args args;
+struct virgl_gpu_resource *vres;
 
 VIRTIO_GPU_FILL_CMD(c3d);
 trace_virtio_gpu_cmd_res_create_3d(c3d.resource_id, c3d.format,
c3d.width, c3d.height, c3d.depth);
 
+vres = g_new0(struct virgl_gpu_resource, 1);
+vres->res.width = c3d.width;
+vres->res.height = c3d.height;
+vres->res.format = c3d.format;
+vres->res.resource_id = c3d.resource_id;
+QTAILQ_INSERT_HEAD(&g->reslist, &vres->res, next);
+
 args.handle = c3d.resource_id;
 args.target = c3d.target;
 args.format = c3d.format;
@@ -82,19 +115,23 @@ static void virgl_cmd_resource_unref(VirtIOGPU *g,
  struct virtio_gpu_ctrl_command *cmd)
 {
 struct virtio_gpu_resource_unref unref;
-struct iovec *res_iovs = NULL;
-int num_iovs = 0;
+struct virgl_gpu_resource *vres;
 
 VIRTIO_GPU_FILL_CMD(unref);
 trace_virtio_gpu_cmd_res_unref(unref.resource_id);
 
-virgl_renderer_resource_detach_iov(unref.resource_id,
-   &res_iovs,
-   &num_iovs);
-if (res_iovs != NULL && num_iovs != 0) {
-virtio_gpu_cleanup_mapping_iov(g, res_iovs, num_iovs);
+vres = virgl_gpu_find_resource(g, unref.resource_id);
+if (!vres) {
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
 }
+
+virgl_renderer_resource_detach_iov(unref.resource_id, NULL, NULL);
 virgl_renderer_resource_unref(unref.resource_id);
+
+QTAILQ_REMOVE(&g->reslist, &vres->res, next);
+virtio_gpu_cleanup_mapping(g, &vres->res);
+g_free(vres);
 }
 
 static void virgl_cmd_context_create(VirtIOGPU *g,
@@ -310,44 +347,51 @@ static void virgl_resource_attach_backing(VirtIOGPU *g,
   struct virtio_gpu_ctrl_command *cmd)
 {
 struct virtio_gpu_resource_attach_backing att_rb;
-struct iovec *res_iovs;
-uint32_t res_niov;
+struct virgl_gpu_resource *vres;
 int ret;
 
 VIRTIO_GPU_FILL_CMD(att_rb);
 trace_virtio_gpu_cmd_res_back_attach(att_rb.resource_id);
 
+vres = virgl_gpu_find_resource(g, att_rb.resource_id);
+if (!vres) {
+cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID;
+return;
+}
+
 ret = virtio_gpu_create_mapping_iov(g, att_rb.nr_entries, sizeof(att_rb),
-cmd, NULL, &res_iovs, &res_niov);
+cmd, NULL, &vres->res.iov,
+&vres->res.iov_cnt);
 if (ret != 0) {
 cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
 return;
 }
 
 ret = virgl_renderer_resource

[PATCH v6 03/11] virtio-gpu: Support context init feature with virglrenderer

2023-12-18 Thread Huang Rui

Patch "virtio-gpu: CONTEXT_INIT feature" has added the context_init
feature flags.
We would like to enable the feature with virglrenderer, so create the
virgl renderer context with flags, using context_id, when context_init is
set and the feature is enabled.

Originally-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Handle the case when context_init is disabled.
- Enable context_init by default.

 hw/display/virtio-gpu-virgl.c | 13 +++--
 hw/display/virtio-gpu.c   |  4 
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8bb7a2c21f..5bbc8071b2 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -106,8 +106,17 @@ static void virgl_cmd_context_create(VirtIOGPU *g,
 trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
 cc.debug_name);
 
-virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen,
-  cc.debug_name);
+#ifdef HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS
+if (cc.context_init && virtio_gpu_context_init_enabled(g->parent_obj.conf)) {
+virgl_renderer_context_create_with_flags(cc.hdr.ctx_id,
+ cc.context_init,
+ cc.nlen,
+ cc.debug_name);
+return;
+}
+#endif
+
+virgl_renderer_context_create(cc.hdr.ctx_id, cc.nlen, cc.debug_name);
 }
 
 static void virgl_cmd_context_destroy(VirtIOGPU *g,
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index b016d3bac8..8b2f4c6be3 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1619,6 +1619,10 @@ static Property virtio_gpu_properties[] = {
 DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
 VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
 DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
+#ifdef HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS
+DEFINE_PROP_BIT("context_init", VirtIOGPU, parent_obj.conf.flags,
+VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED, true),
+#endif
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.25.1

[PATCH v6 02/11] virtio-gpu: Configure new feature flag context_create_with_flags for virglrenderer

2023-12-18 Thread Huang Rui

Configure a new feature flag (context_create_with_flags) for
virglrenderer.

Originally-by: Antonio Caggiano 
Signed-off-by: Huang Rui 
---

Changes in v6:
- Move macros configurations under virgl.found() and rename
  HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS.

 meson.build | 4 
 1 file changed, 4 insertions(+)

diff --git a/meson.build b/meson.build
index ec01f8b138..ea52ef1b9c 100644
--- a/meson.build
+++ b/meson.build
@@ -1050,6 +1050,10 @@ if not get_option('virglrenderer').auto() or have_system or have_vhost_user_gpu
                          cc.has_member('struct virgl_renderer_resource_info_ext', 'd3d_tex2d',
                                        prefix: '#include <virglrenderer.h>',
                                        dependencies: virgl))
+    config_host_data.set('HAVE_VIRGL_CONTEXT_CREATE_WITH_FLAGS',
+                         cc.has_function('virgl_renderer_context_create_with_flags',
+                                         prefix: '#include <virglrenderer.h>',
+                                         dependencies: virgl))
   endif
 endif
 rutabaga = not_found
-- 
2.25.1

[PATCH v6 01/11] linux-headers: Update to kernel headers to add venus capset

2023-12-18 Thread Huang Rui

Sync up the kernel headers to pick up the venus capset macro until it is
merged into mainline.

Signed-off-by: Huang Rui 
---

Changes in v6:
- Venus capset is applied in kernel, so update it in qemu for future use.

https://lore.kernel.org/lkml/b79dcf75-c9e8-490e-644f-3b97d95f7...@collabora.com/
https://cgit.freedesktop.org/drm-misc/commit/?id=216d86b9a430f3280e5b631c51e6fd1a7774cfa0

 include/standard-headers/linux/virtio_gpu.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/standard-headers/linux/virtio_gpu.h 
b/include/standard-headers/linux/virtio_gpu.h
index 2da48d3d4c..2db643ed8f 100644
--- a/include/standard-headers/linux/virtio_gpu.h
+++ b/include/standard-headers/linux/virtio_gpu.h
@@ -309,6 +309,8 @@ struct virtio_gpu_cmd_submit {
 
 #define VIRTIO_GPU_CAPSET_VIRGL 1
 #define VIRTIO_GPU_CAPSET_VIRGL2 2
+/* 3 is reserved for gfxstream */
+#define VIRTIO_GPU_CAPSET_VENUS 4
 
 /* VIRTIO_GPU_CMD_GET_CAPSET_INFO */
 struct virtio_gpu_get_capset_info {
-- 
2.25.1

Re: [PATCH v2 1/2] qdev: add IOThreadVirtQueueMappingList property type

2023-12-18 Thread Markus Armbruster

Stefan Hajnoczi  writes:

> On Mon, Dec 11, 2023 at 04:32:06PM +0100, Markus Armbruster wrote:
>> Kevin Wolf  writes:
>> 
>> > Am 18.09.2023 um 18:16 hat Stefan Hajnoczi geschrieben:
>> >> virtio-blk and virtio-scsi devices will need a way to specify the
>> >> mapping between IOThreads and virtqueues. At the moment all virtqueues
>> >> are assigned to a single IOThread or the main loop. This single thread
>> >> can be a CPU bottleneck, so it is necessary to allow finer-grained
>> >> assignment to spread the load.
>> >> 
>> >> Introduce DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST() so devices can take a
>> >> parameter that maps virtqueues to IOThreads. The command-line syntax for
>> >> this new property is as follows:
>> >> 
>> >>   --device 
>> >> '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
>> >> 
>> >> IOThreads are specified by name and virtqueues are specified by 0-based
>> >> index.
>> >> 
>> >> It will be common to simply assign virtqueues round-robin across a set
>> >> of IOThreads. A convenient syntax that does not require specifying
>> >> individual virtqueue indices is available:
>> >> 
>> >>   --device 
>> >> '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
>> >> 
>> >> Signed-off-by: Stefan Hajnoczi 
>> >
>> > When testing this, Qing Wang noticed that "info qtree" crashes. This is
>> > because the string output visitor doesn't support structs. I suppose
>> > IOThreadVirtQueueMapping is the first struct type that is used in a qdev
>> > property type.
>> >
>> > So we'll probably have to add some kind of struct support to the string
>> > output visitor before we can apply this. Even if it's as stupid as just
>> > printing "" without actually displaying
>> > the value.
>> 
>> The string visitors have been nothing but trouble.
>> 
>> For input, we can now use keyval_parse() and the QObject input visitor
>> instead.  Comes with restrictions, but I'd argue it's a more solid base
>> than the string input visitor.
>> 
>> Perhaps we can do something similar for output: create a suitable
>> formatter for use with the QObject output visitor, replacing the
>> string output visitor.
>
> I sent an initial patch that just shows "" but would like to
> work on a proper solution with your input.
>
> From what I've seen StringOutputVisitor is used in several places in
> QEMU. "info qtree" calls it through object_property_print() to print
> individual qdev properties. I don't understand the requirements of the
> other callers, but object_property_print() wants to return a single
> string without newlines.

string_output_visitor_new():

* hmp_info_migrate(): format a list of integers, then print it like

monitor_printf(mon, "postcopy vcpu blocktime: %s\n", str);

  One element per vCPU; can produce a long line.

* netfilter_print_info(): format the property values of a
  NetFilterState, then print each like

monitor_printf(mon, ",%s=%s", prop->name, str);

* object_property_print(): format a property value of an object, return
  the string.

  Function is misnamed.  object_property_format() or
  object_property_to_string() would be better.

  Just one caller: qdev_print_props(), helper for hmp_info_qtree().
  Prints the string like

qdev_printf("%s = %s\n", props->name,
*value ? value : "");

  where qdev_printf() is a macro wrapping monitor_printf().

  This one passes human=true, unlike the others.  More on that below.

* hmp_info_memdev(): format a list of integers, then print it like

monitor_printf(mon, "  policy: %s\n",
   HostMemPolicy_str(m->value->policy));

  One element per "host node", whatever that may be; might produce a
  long line.

* Tests; not relevant here.

hmp_info_migrate() and hmp_info_memdev() use the visitor as a (somewhat
cumbersome) helper for printing uint32List and uint16List, respectively.
Could do without.

The other two display all properties in HMP.  Both kind of assume the
string visitor produces no newlines.  I think we could instead use the
QObject output visitor, then format the QObject in human-readable form.
Might be less efficient, because we create a temporary QObject.  Perhaps
factor out a single helper first.
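
To make the "no newlines" requirement concrete, here is a minimal sketch of a single-line formatter for a list of structs. It is plain C and does not use QEMU's visitor or QObject APIs; MappingEntry, append() and format_mapping_list() are hypothetical names invented for illustration, and the output syntax is an assumption, not an existing HMP format.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one iothread-vq-mapping entry. */
typedef struct {
    const char *iothread;   /* IOThread name */
    const int *vqs;         /* virtqueue indices, NULL when nvqs == 0 */
    size_t nvqs;
} MappingEntry;

/* Append to a NUL-terminated buffer; 0 on success, -1 if truncated. */
static int append(char *buf, size_t size, const char *fmt, ...)
{
    size_t used = strlen(buf);
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + used, size - used, fmt, ap);
    va_end(ap);
    return (n < 0 || (size_t)n >= size - used) ? -1 : 0;
}

/* Format the whole list on one line; 0 on success, -1 if truncated. */
int format_mapping_list(const MappingEntry *list, size_t n,
                        char *buf, size_t size)
{
    int rc = 0;

    if (size == 0) {
        return -1;
    }
    buf[0] = '\0';
    rc |= append(buf, size, "[");
    for (size_t i = 0; i < n; i++) {
        rc |= append(buf, size, "%s{iothread=%s", i ? "," : "",
                     list[i].iothread);
        if (list[i].nvqs) {
            rc |= append(buf, size, ",vqs=[");
            for (size_t j = 0; j < list[i].nvqs; j++) {
                rc |= append(buf, size, "%s%d", j ? "," : "",
                             list[i].vqs[j]);
            }
            rc |= append(buf, size, "]");
        }
        rc |= append(buf, size, "}");
    }
    rc |= append(buf, size, "]");
    return rc ? -1 : 0;
}
```

For a list holding iothread0 with virtqueues 0, 1, 2 and iothread1 with none, the buffer ends up as `[{iothread=iothread0,vqs=[0,1,2]},{iothread=iothread1}]`, which fits the "%s = %s\n" style of printing used by the callers listed above.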

string_input_visitor_new(), for good measure:

* hmp_migrate_set_parameter(): parse an uint8_t, uint32, size_t, bool,
  str, or QAPI enum from a string.

* object_property_parse(): parse a property value from a string, and
  assign it to the property.

  Calling this object_property_set_from_string() would be better.

  Callers:

  - object_apply_global_props(): applying compatibility properties
(defined in C) and defaults set with -global (given by user).

  - object_set_propv(): helper for convenience functions to set multiple
properties in C.

  - hmp_qom_set(): set the property value when not JSON (-j is off).

  - object_parse_property_opt(), for accelerator_set_property(), which
processes the arg

[PATCH for-9.0 v2 05/10] vfio/container: Introduce a VFIOIOMMU legacy QOM interface

2023-12-18 Thread Cédric Le Goater

Convert the legacy VFIOIOMMUOps struct to the new VFIOIOMMU QOM
interface. The set of operations for this backend can be referenced
with a literal typename instead of a C struct. This will simplify
support of multiple backends.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 v2: - Removed class_size initialization
     - Removed NULL initialization of vioc
 
 include/hw/vfio/vfio-common.h |  1 -
 include/hw/vfio/vfio-container-base.h |  1 +
 hw/vfio/common.c  |  6 ++-
 hw/vfio/container.c   | 58 ++-
 4 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 
b8aa8a549532442a31c8e85ce385c992d84f6bd5..14c497b6b0a79466e8f567aceed384ec2c75ea90
 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -210,7 +210,6 @@ typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
 typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList;
 extern VFIOGroupList vfio_group_list;
 extern VFIODeviceList vfio_device_list;
-extern const VFIOIOMMUOps vfio_legacy_ops;
 extern const VFIOIOMMUOps vfio_iommufd_ops;
 extern const MemoryListener vfio_memory_listener;
 extern int vfio_kvm_device_fd;
diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
d6147b4aeef26b6075c88579108e566720f58ebb..c60370fc5ebe65474816dbf2b065aa0912de1a3c
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -94,6 +94,7 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer);
 
 
 #define TYPE_VFIO_IOMMU "vfio-iommu"
+#define TYPE_VFIO_IOMMU_LEGACY TYPE_VFIO_IOMMU "-legacy"
 
 /*
  * VFIOContainerBase is not an abstract QOM object because it felt
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 
49dab41566f07ba7be1100fed1973e028d34467c..2329d0efc8c1d617f0bfee5283e82b295d2d477d
 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1503,13 +1503,17 @@ retry:
 int vfio_attach_device(char *name, VFIODevice *vbasedev,
AddressSpace *as, Error **errp)
 {
-const VFIOIOMMUClass *ops = &vfio_legacy_ops;
+const VFIOIOMMUClass *ops =
+VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
 
 #ifdef CONFIG_IOMMUFD
 if (vbasedev->iommufd) {
 ops = &vfio_iommufd_ops;
 }
 #endif
+
+assert(ops);
+
 return ops->attach_device(name, vbasedev, as, errp);
 }
 
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
f4a0434a5239bfb6a17b91c8879cb98e686afccc..220e838a917f9a135af1e040a450cb52064428cf
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -369,10 +369,30 @@ static int vfio_get_iommu_type(VFIOContainer *container,
 return -EINVAL;
 }
 
+/*
+ * vfio_get_iommu_class - get a VFIOIOMMUClass associated with a type
+ */
+static const VFIOIOMMUClass *vfio_get_iommu_class(int iommu_type, Error **errp)
+{
+ObjectClass *klass = NULL;
+
+switch (iommu_type) {
+case VFIO_TYPE1v2_IOMMU:
+case VFIO_TYPE1_IOMMU:
+klass = object_class_by_name(TYPE_VFIO_IOMMU_LEGACY);
+break;
+default:
+g_assert_not_reached();
+};
+
+return VFIO_IOMMU_CLASS(klass);
+}
+
 static int vfio_init_container(VFIOContainer *container, int group_fd,
VFIOAddressSpace *space, Error **errp)
 {
 int iommu_type, ret;
+const VFIOIOMMUClass *vioc;
 
 iommu_type = vfio_get_iommu_type(container, errp);
 if (iommu_type < 0) {
@@ -401,7 +421,14 @@ static int vfio_init_container(VFIOContainer *container, 
int group_fd,
 }
 
 container->iommu_type = iommu_type;
-vfio_container_init(&container->bcontainer, space, &vfio_legacy_ops);
+
+vioc = vfio_get_iommu_class(iommu_type, errp);
+if (!vioc) {
+error_setg(errp, "No available IOMMU models");
+return -EINVAL;
+}
+
+vfio_container_init(&container->bcontainer, space, vioc);
 return 0;
 }
 
@@ -1098,12 +1125,25 @@ out_single:
 return ret;
 }
 
-const VFIOIOMMUOps vfio_legacy_ops = {
-.dma_map = vfio_legacy_dma_map,
-.dma_unmap = vfio_legacy_dma_unmap,
-.attach_device = vfio_legacy_attach_device,
-.detach_device = vfio_legacy_detach_device,
-.set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking,
-.query_dirty_bitmap = vfio_legacy_query_dirty_bitmap,
-.pci_hot_reset = vfio_legacy_pci_hot_reset,
+static void vfio_iommu_legacy_class_init(ObjectClass *klass, void *data)
+{
+VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+
+vioc->dma_map = vfio_legacy_dma_map;
+vioc->dma_unmap = vfio_legacy_dma_unmap;
+vioc->attach_device = vfio_legacy_attach_device;
+vioc->detach_device = vfio_legacy_detach_device;
+vioc->set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking;
+vioc->query_dirty_bitmap = vfio_legacy_query_dirty_bitmap;
+vioc->pci_hot_reset = vfio_legacy_pci_hot_re

[PATCH for-9.0 v2 00/10] vfio: Introduce a VFIOIOMMUClass

2023-12-18 Thread Cédric Le Goater

Hello,

The VFIO object hierarchy has some constraints because each VFIO type
has a dual nature: a VFIO nature for passthrough support and a bus
nature (PCI, AP, CCW, Platform) for its initial presentation. This
seemed the best approach because multiple inheritance is not
feasible with QOM and both aspects of the VFIO object, passthrough and
bus, require state. A QOM interface in that case is not sufficient.

One aspect of passthrough is interaction with the IOMMU. IOMMUFD
support was recently added and for this purpose, we introduced an
IOMMU backend framework simply based on a VFIOIOMMUOps struct. We
didn't want to use QOM again because it would have exposed the various
low-level backend objects to the QEMU machine and human interface which
felt unnecessary at the time.

The changes of this series introduce a VFIO_IOMMU QOM interface and
its VFIOIOMMUClass to replace the current VFIOIOMMUOps. This provides
better code abstraction for the type1 and sPAPR IOMMU backends and
allows us to improve the vfio_connect_container() implementation.
Also, QOM interfaces are not exposed at the QEMU interface level. Most
importantly, we can now avoid compiling the sPAPR IOMMU support on
targets not needing it. This saves some text in QEMU.
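
As a rough illustration of why referencing a backend by type name helps, the sketch below looks up a handler table by name and treats a missing entry as "backend not built in". It is self-contained C and does not use QOM; BackendClass, registry, backend_class_by_name() and attach_device() are hypothetical stand-ins. Only the type name strings follow the TYPE_VFIO_IOMMU defines from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler table, loosely modelled on VFIOIOMMUClass. */
typedef struct {
    const char *name;
    int (*attach_device)(const char *dev);
} BackendClass;

static int legacy_attach(const char *dev)  { (void)dev; return 0; }
static int iommufd_attach(const char *dev) { (void)dev; return 1; }

/* Only backends that were compiled in are listed; sPAPR is left out. */
static const BackendClass registry[] = {
    { "vfio-iommu-legacy",  legacy_attach  },
    { "vfio-iommu-iommufd", iommufd_attach },
};

/* Return the class registered under 'name', or NULL if there is none. */
const BackendClass *backend_class_by_name(const char *name)
{
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i].name, name) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}

/* Pick the backend at runtime; -1 means the backend is unavailable. */
int attach_device(const char *dev, int use_iommufd)
{
    const BackendClass *ops = backend_class_by_name(
        use_iommufd ? "vfio-iommu-iommufd" : "vfio-iommu-legacy");

    if (!ops) {
        return -1;
    }
    return ops->attach_device(dev);
}
```

In this model, a caller never names a C symbol of a backend, so leaving a backend out of the build only changes what the lookup returns.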

Applies on vfio-next.

Thanks,

C.

Changes in v2:
 - Removed superfluous define and struct definitions
 - Improved comments and commit log
 - Removed NULL initialization of vioc
 - Removed class_size initialization

Cédric Le Goater (10):
  vfio/spapr: Extend VFIOIOMMUOps with a release handler
  vfio/container: Introduce vfio_legacy_setup() for further cleanups
  vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()
  vfio/container: Introduce a VFIOIOMMU QOM interface
  vfio/container: Introduce a VFIOIOMMU legacy QOM interface
  vfio/container: Introduce a new VFIOIOMMUClass::setup handler
  vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface
  vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface
  vfio/spapr: Only compile sPAPR IOMMU support when needed
  vfio/iommufd: Remove CONFIG_IOMMUFD usage

 include/hw/vfio/vfio-common.h |   2 -
 include/hw/vfio/vfio-container-base.h |  27 -
 hw/vfio/common.c  |  11 +-
 hw/vfio/container-base.c  |  12 ++-
 hw/vfio/container.c   | 146 --
 hw/vfio/iommufd.c |  35 --
 hw/vfio/pci.c |   2 +-
 hw/vfio/spapr.c   |  60 ++-
 hw/vfio/meson.build   |   2 +-
 9 files changed, 197 insertions(+), 100 deletions(-)

-- 
2.43.0

[PATCH for-9.0 v2 07/10] vfio/spapr: Introduce a sPAPR VFIOIOMMU QOM interface

2023-12-18 Thread Cédric Le Goater

Move vfio_spapr_container_setup() to a VFIOIOMMUClass::setup handler
and convert the sPAPR VFIOIOMMUOps struct to a QOM interface. The
sPAPR QOM interface inherits from the legacy QOM interface because
both have the same basic needs. The sPAPR interface is then
extended with the handlers specific to the sPAPR IOMMU.

This allows reuse and provides better abstraction of the backends. It
will be useful to avoid compiling the sPAPR IOMMU backend on targets
not supporting it.
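
The inheritance described above can be pictured with a small stand-alone model: the derived class starts from the parent's handlers, overrides or adds its own, and the caller dispatches through the setup handler instead of switching on the IOMMU type. This is a sketch in plain C, not QOM code; IommuClass, the *_class_init() functions and container_setup() are hypothetical, and the explicit call to the parent initializer merely stands in for whatever QOM does when a type names a parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler table with a few of the handlers named above. */
typedef struct {
    int (*setup)(void);
    int (*dma_map)(void);
    void (*release)(void);   /* optional, may be NULL */
} IommuClass;

int legacy_setup(void)   { return 1; }
int legacy_dma_map(void) { return 10; }
int spapr_setup(void)    { return 2; }
void spapr_release(void) { }

void legacy_class_init(IommuClass *k)
{
    k->setup = legacy_setup;
    k->dma_map = legacy_dma_map;
    k->release = NULL;
}

void spapr_class_init(IommuClass *k)
{
    legacy_class_init(k);        /* assumption: begin with parent handlers */
    k->setup = spapr_setup;      /* override */
    k->release = spapr_release;  /* extend */
}

/* Dispatch through the handler; no switch on the IOMMU type is needed. */
int container_setup(const IommuClass *ops)
{
    assert(ops->setup);
    return ops->setup();
}
```

Here the derived table keeps legacy_dma_map untouched while container_setup() returns a different value for each class, which is the property the switch removal relies on.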

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 v2: - Removed class_size initialization
 
 include/hw/vfio/vfio-container-base.h |  1 +
 hw/vfio/container.c   | 18 +
 hw/vfio/spapr.c   | 39 ---
 3 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
ce8b1fba88c145135adc20e96591bafd6050d5f1..9e21d7811f3810ca2c63d9f28bdcc9aa6f75f9ad
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -95,6 +95,7 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer);
 
 #define TYPE_VFIO_IOMMU "vfio-iommu"
 #define TYPE_VFIO_IOMMU_LEGACY TYPE_VFIO_IOMMU "-legacy"
+#define TYPE_VFIO_IOMMU_SPAPR TYPE_VFIO_IOMMU "-spapr"
 
 /*
  * VFIOContainerBase is not an abstract QOM object because it felt
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
c22bdd321677026e52c7cdffce853523ef679cd0..688cf23bab88f85246378bc5a7da3c51ea6b79d9
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -381,6 +381,10 @@ static const VFIOIOMMUClass *vfio_get_iommu_class(int 
iommu_type, Error **errp)
 case VFIO_TYPE1_IOMMU:
 klass = object_class_by_name(TYPE_VFIO_IOMMU_LEGACY);
 break;
+case VFIO_SPAPR_TCE_v2_IOMMU:
+case VFIO_SPAPR_TCE_IOMMU:
+klass = object_class_by_name(TYPE_VFIO_IOMMU_SPAPR);
+break;
 default:
 g_assert_not_reached();
 };
@@ -623,19 +627,9 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 goto free_container_exit;
 }
 
-switch (container->iommu_type) {
-case VFIO_TYPE1v2_IOMMU:
-case VFIO_TYPE1_IOMMU:
-ret = vfio_legacy_setup(bcontainer, errp);
-break;
-case VFIO_SPAPR_TCE_v2_IOMMU:
-case VFIO_SPAPR_TCE_IOMMU:
-ret = vfio_spapr_container_init(container, errp);
-break;
-default:
-g_assert_not_reached();
-}
+assert(bcontainer->ops->setup);
 
+ret = bcontainer->ops->setup(bcontainer, errp);
 if (ret) {
 goto enable_discards_exit;
 }
diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 
44617dfc6b5f1a2a3a1c37436b76042aebda8b63..0d949bb728212534a7e2296e491aa8d95f45945d
 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -458,20 +458,11 @@ static void 
vfio_spapr_container_release(VFIOContainerBase *bcontainer)
 }
 }
 
-static VFIOIOMMUOps vfio_iommu_spapr_ops;
-
-static void setup_spapr_ops(VFIOContainerBase *bcontainer)
-{
-vfio_iommu_spapr_ops = *bcontainer->ops;
-vfio_iommu_spapr_ops.add_window = vfio_spapr_container_add_section_window;
-vfio_iommu_spapr_ops.del_window = vfio_spapr_container_del_section_window;
-vfio_iommu_spapr_ops.release = vfio_spapr_container_release;
-bcontainer->ops = &vfio_iommu_spapr_ops;
-}
-
-int vfio_spapr_container_init(VFIOContainer *container, Error **errp)
+static int vfio_spapr_container_setup(VFIOContainerBase *bcontainer,
+  Error **errp)
 {
-VFIOContainerBase *bcontainer = &container->bcontainer;
+VFIOContainer *container = container_of(bcontainer, VFIOContainer,
+bcontainer);
 VFIOSpaprContainer *scontainer = container_of(container, 
VFIOSpaprContainer,
   container);
 struct vfio_iommu_spapr_tce_info info;
@@ -536,8 +527,6 @@ int vfio_spapr_container_init(VFIOContainer *container, 
Error **errp)
   0x1000);
 }
 
-setup_spapr_ops(bcontainer);
-
 return 0;
 
 listener_unregister_exit:
@@ -546,3 +535,23 @@ listener_unregister_exit:
 }
 return ret;
 }
+
+static void vfio_iommu_spapr_class_init(ObjectClass *klass, void *data)
+{
+VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+
+vioc->add_window = vfio_spapr_container_add_section_window;
+vioc->del_window = vfio_spapr_container_del_section_window;
+vioc->release = vfio_spapr_container_release;
+vioc->setup = vfio_spapr_container_setup;
+};
+
+static const TypeInfo types[] = {
+{
+.name = TYPE_VFIO_IOMMU_SPAPR,
+.parent = TYPE_VFIO_IOMMU_LEGACY,
+.class_init = vfio_iommu_spapr_class_init,
+},
+};
+
+DEFINE_TYPES(types)
-- 
2.43.0

[PATCH for-9.0 v2 01/10] vfio/spapr: Extend VFIOIOMMUOps with a release handler

2023-12-18 Thread Cédric Le Goater

This abstracts the sPAPR IOMMU support a bit further in the
legacy IOMMU backend.
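
The pattern is an optional callback: generic teardown code calls the handler only when the backend provides one, so it no longer needs to know which IOMMU types require extra cleanup. A minimal sketch in plain C follows; Container, ContainerOps, the two ops tables and container_teardown() are hypothetical names, not the QEMU structures.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Container Container;

/* Hypothetical ops table with an optional release handler. */
typedef struct {
    void (*release)(Container *c);   /* NULL when nothing to clean up */
} ContainerOps;

struct Container {
    const ContainerOps *ops;
    int windows;                     /* stand-in for backend state */
};

void spapr_like_release(Container *c)
{
    c->windows = 0;                  /* drop backend-specific state */
}

const ContainerOps plain_ops = { .release = NULL };
const ContainerOps spapr_like_ops = { .release = spapr_like_release };

/* Generic teardown: no check on the IOMMU type, only on the handler. */
void container_teardown(Container *c)
{
    if (c->ops->release) {
        c->ops->release(c);
    }
}
```

A container using plain_ops is left untouched by container_teardown(), while one using spapr_like_ops has its state cleared.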

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 include/hw/vfio/vfio-container-base.h |  1 +
 hw/vfio/container.c   | 10 +++-
 hw/vfio/spapr.c   | 35 +++
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
2ae297ccda93fd97986c852a8329b390fa1ab91f..5c9594b6c77681e5593236e711e7e391e5f2bdff
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -117,5 +117,6 @@ struct VFIOIOMMUOps {
   Error **errp);
 void (*del_window)(VFIOContainerBase *bcontainer,
MemoryRegionSection *section);
+void (*release)(VFIOContainerBase *bcontainer);
 };
 #endif /* HW_VFIO_VFIO_CONTAINER_BASE_H */
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
b22feb8ded0a0d9ed98d6e206b78c0c6e2554d5c..1e77a2929e90ed1d2ee84062549c477ae651c5a8
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -632,9 +632,8 @@ listener_release_exit:
 QLIST_REMOVE(bcontainer, next);
 vfio_kvm_device_del_group(group);
 memory_listener_unregister(&bcontainer->listener);
-if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU ||
-container->iommu_type == VFIO_SPAPR_TCE_IOMMU) {
-vfio_spapr_container_deinit(container);
+if (bcontainer->ops->release) {
+bcontainer->ops->release(bcontainer);
 }
 
 enable_discards_exit:
@@ -667,9 +666,8 @@ static void vfio_disconnect_container(VFIOGroup *group)
  */
 if (QLIST_EMPTY(&container->group_list)) {
 memory_listener_unregister(&bcontainer->listener);
-if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU ||
-container->iommu_type == VFIO_SPAPR_TCE_IOMMU) {
-vfio_spapr_container_deinit(container);
+if (bcontainer->ops->release) {
+bcontainer->ops->release(bcontainer);
 }
 }
 
diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 
5c6426e6973bec606667ebcaca5b0585b184a214..44617dfc6b5f1a2a3a1c37436b76042aebda8b63
 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -440,6 +440,24 @@ vfio_spapr_container_del_section_window(VFIOContainerBase 
*bcontainer,
 }
 }
 
+static void vfio_spapr_container_release(VFIOContainerBase *bcontainer)
+{
+VFIOContainer *container = container_of(bcontainer, VFIOContainer,
+bcontainer);
+VFIOSpaprContainer *scontainer = container_of(container, 
VFIOSpaprContainer,
+  container);
+VFIOHostDMAWindow *hostwin, *next;
+
+if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
+memory_listener_unregister(&scontainer->prereg_listener);
+}
+QLIST_FOREACH_SAFE(hostwin, &scontainer->hostwin_list, hostwin_next,
+   next) {
+QLIST_REMOVE(hostwin, hostwin_next);
+g_free(hostwin);
+}
+}
+
 static VFIOIOMMUOps vfio_iommu_spapr_ops;
 
 static void setup_spapr_ops(VFIOContainerBase *bcontainer)
@@ -447,6 +465,7 @@ static void setup_spapr_ops(VFIOContainerBase *bcontainer)
 vfio_iommu_spapr_ops = *bcontainer->ops;
 vfio_iommu_spapr_ops.add_window = vfio_spapr_container_add_section_window;
 vfio_iommu_spapr_ops.del_window = vfio_spapr_container_del_section_window;
+vfio_iommu_spapr_ops.release = vfio_spapr_container_release;
 bcontainer->ops = &vfio_iommu_spapr_ops;
 }
 
@@ -527,19 +546,3 @@ listener_unregister_exit:
 }
 return ret;
 }
-
-void vfio_spapr_container_deinit(VFIOContainer *container)
-{
-VFIOSpaprContainer *scontainer = container_of(container, 
VFIOSpaprContainer,
-  container);
-VFIOHostDMAWindow *hostwin, *next;
-
-if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) {
-memory_listener_unregister(&scontainer->prereg_listener);
-}
-QLIST_FOREACH_SAFE(hostwin, &scontainer->hostwin_list, hostwin_next,
-   next) {
-QLIST_REMOVE(hostwin, hostwin_next);
-g_free(hostwin);
-}
-}
-- 
2.43.0

[PATCH for-9.0 v2 02/10] vfio/container: Introduce vfio_legacy_setup() for further cleanups

2023-12-18 Thread Cédric Le Goater

This will help subsequent patches to unify the initialization of type1
and sPAPR IOMMU backends.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 hw/vfio/container.c | 63 +
 1 file changed, 35 insertions(+), 28 deletions(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
1e77a2929e90ed1d2ee84062549c477ae651c5a8..afcfe8048805c58291d1104ff0ef20bdc457f99c
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -474,6 +474,35 @@ static void vfio_get_iommu_info_migration(VFIOContainer 
*container,
 }
 }
 
+static int vfio_legacy_setup(VFIOContainerBase *bcontainer, Error **errp)
+{
+VFIOContainer *container = container_of(bcontainer, VFIOContainer,
+bcontainer);
+g_autofree struct vfio_iommu_type1_info *info = NULL;
+int ret;
+
+ret = vfio_get_iommu_info(container, &info);
+if (ret) {
+error_setg_errno(errp, -ret, "Failed to get VFIO IOMMU info");
+return ret;
+}
+
+if (info->flags & VFIO_IOMMU_INFO_PGSIZES) {
+bcontainer->pgsizes = info->iova_pgsizes;
+} else {
+bcontainer->pgsizes = qemu_real_host_page_size();
+}
+
+if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) {
+bcontainer->dma_max_mappings = 65535;
+}
+
+vfio_get_info_iova_range(info, bcontainer);
+
+vfio_get_iommu_info_migration(container, info);
+return 0;
+}
+
 static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
   Error **errp)
 {
@@ -570,40 +599,18 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 switch (container->iommu_type) {
 case VFIO_TYPE1v2_IOMMU:
 case VFIO_TYPE1_IOMMU:
-{
-struct vfio_iommu_type1_info *info;
-
-ret = vfio_get_iommu_info(container, &info);
-if (ret) {
-error_setg_errno(errp, -ret, "Failed to get VFIO IOMMU info");
-goto enable_discards_exit;
-}
-
-if (info->flags & VFIO_IOMMU_INFO_PGSIZES) {
-bcontainer->pgsizes = info->iova_pgsizes;
-} else {
-bcontainer->pgsizes = qemu_real_host_page_size();
-}
-
-if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) {
-bcontainer->dma_max_mappings = 65535;
-}
-
-vfio_get_info_iova_range(info, bcontainer);
-
-vfio_get_iommu_info_migration(container, info);
-g_free(info);
+ret = vfio_legacy_setup(bcontainer, errp);
 break;
-}
 case VFIO_SPAPR_TCE_v2_IOMMU:
 case VFIO_SPAPR_TCE_IOMMU:
-{
 ret = vfio_spapr_container_init(container, errp);
-if (ret) {
-goto enable_discards_exit;
-}
 break;
+default:
+g_assert_not_reached();
 }
+
+if (ret) {
+goto enable_discards_exit;
 }
 
 vfio_kvm_device_add_group(group);
-- 
2.43.0

[PATCH for-9.0 v2 09/10] vfio/spapr: Only compile sPAPR IOMMU support when needed

2023-12-18 Thread Cédric Le Goater

sPAPR IOMMU support is only needed for pseries machines. Compile out
support when CONFIG_PSERIES is not set. This saves ~7K of text.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 hw/vfio/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
index 
e5d98b6adc223061f6b0c3e1a7db3ba93d4eef16..bb98493b53e858c53181e224f9cb46892838a8be
 100644
--- a/hw/vfio/meson.build
+++ b/hw/vfio/meson.build
@@ -4,9 +4,9 @@ vfio_ss.add(files(
   'common.c',
   'container-base.c',
   'container.c',
-  'spapr.c',
   'migration.c',
 ))
+vfio_ss.add(when: 'CONFIG_PSERIES', if_true: files('spapr.c'))
 vfio_ss.add(when: 'CONFIG_IOMMUFD', if_true: files(
   'iommufd.c',
 ))
-- 
2.43.0

[PATCH for-9.0 v2 04/10] vfio/container: Introduce a VFIOIOMMU QOM interface

2023-12-18 Thread Cédric Le Goater

VFIOContainerBase was not introduced as an abstract QOM object because
it felt unnecessary to expose all the IOMMU backends to the QEMU
machine and human interface. However, we can still abstract the IOMMU
backend handlers using a QOM interface class. This provides more
flexibility when referencing the various implementations.

Simply transform the VFIOIOMMUOps struct into an InterfaceClass and do
some initial name replacements. Next changes will start converting
VFIOIOMMUOps.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 v2: - Removed superfluous define and struct definitions
     - Improved comments and commit log
 
 include/hw/vfio/vfio-container-base.h | 23 +++
 hw/vfio/common.c  |  2 +-
 hw/vfio/container-base.c  | 12 +++-
 hw/vfio/pci.c |  2 +-
 4 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
5c9594b6c77681e5593236e711e7e391e5f2bdff..d6147b4aeef26b6075c88579108e566720f58ebb
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -16,7 +16,8 @@
 #include "exec/memory.h"
 
 typedef struct VFIODevice VFIODevice;
-typedef struct VFIOIOMMUOps VFIOIOMMUOps;
+typedef struct VFIOIOMMUClass VFIOIOMMUClass;
+#define VFIOIOMMUOps VFIOIOMMUClass /* To remove */
 
 typedef struct {
 unsigned long *bitmap;
@@ -34,7 +35,7 @@ typedef struct VFIOAddressSpace {
  * This is the base object for vfio container backends
  */
 typedef struct VFIOContainerBase {
-const VFIOIOMMUOps *ops;
+const VFIOIOMMUClass *ops;
 VFIOAddressSpace *space;
 MemoryListener listener;
 Error *error;
@@ -88,10 +89,24 @@ int vfio_container_query_dirty_bitmap(const 
VFIOContainerBase *bcontainer,
 
 void vfio_container_init(VFIOContainerBase *bcontainer,
  VFIOAddressSpace *space,
- const VFIOIOMMUOps *ops);
+ const VFIOIOMMUClass *ops);
 void vfio_container_destroy(VFIOContainerBase *bcontainer);
 
-struct VFIOIOMMUOps {
+
+#define TYPE_VFIO_IOMMU "vfio-iommu"
+
+/*
+ * VFIOContainerBase is not an abstract QOM object because it felt
+ * unnecessary to expose all the IOMMU backends to the QEMU machine
+ * and human interface. However, we can still abstract the IOMMU
+ * backend handlers using a QOM interface class. This provides more
+ * flexibility when referencing the various implementations.
+ */
+DECLARE_CLASS_CHECKERS(VFIOIOMMUClass, VFIO_IOMMU, TYPE_VFIO_IOMMU)
+
+struct VFIOIOMMUClass {
+InterfaceClass parent_class;
+
 /* basic feature */
 int (*dma_map)(const VFIOContainerBase *bcontainer,
hwaddr iova, ram_addr_t size,
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 
08a3e576725b1fc9f2f7e425375df3b827c4fe56..49dab41566f07ba7be1100fed1973e028d34467c
 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1503,7 +1503,7 @@ retry:
 int vfio_attach_device(char *name, VFIODevice *vbasedev,
AddressSpace *as, Error **errp)
 {
-const VFIOIOMMUOps *ops = &vfio_legacy_ops;
+const VFIOIOMMUClass *ops = &vfio_legacy_ops;
 
 #ifdef CONFIG_IOMMUFD
 if (vbasedev->iommufd) {
diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 
1ffd25bbfa8bd3d404e43b96357273b95f5a0031..913ae49077c4f09b7b27517c1231cfbe4befb7fb
 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -72,7 +72,7 @@ int vfio_container_query_dirty_bitmap(const VFIOContainerBase 
*bcontainer,
 }
 
 void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace 
*space,
- const VFIOIOMMUOps *ops)
+ const VFIOIOMMUClass *ops)
 {
 bcontainer->ops = ops;
 bcontainer->space = space;
@@ -99,3 +99,13 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer)
 
 g_list_free_full(bcontainer->iova_ranges, g_free);
 }
+
+static const TypeInfo types[] = {
+{
+.name = TYPE_VFIO_IOMMU,
+.parent = TYPE_INTERFACE,
+.class_size = sizeof(VFIOIOMMUClass),
+},
+};
+
+DEFINE_TYPES(types)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 
1874ec1aba987cac6cb83f86650e7a5e1968c327..d84a9e73a65de4e4c1cdaf65619a700bd8d6b802
 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2488,7 +2488,7 @@ int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev,
 static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
 {
 VFIODevice *vbasedev = &vdev->vbasedev;
-const VFIOIOMMUOps *ops = vbasedev->bcontainer->ops;
+const VFIOIOMMUClass *ops = vbasedev->bcontainer->ops;
 
 return ops->pci_hot_reset(vbasedev, single);
 }
-- 
2.43.0

[PATCH for-9.0 v2 10/10] vfio/iommufd: Remove CONFIG_IOMMUFD usage

2023-12-18 Thread Cédric Le Goater

Availability of the IOMMUFD backend can now be fully determined at
runtime, so the ifdef check, which was a build-time protection (mostly
for PPC, which does not support it), can be removed.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 hw/vfio/common.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 
89ff1c7aeda14d20b2e24f8bc251db0a71d4527c..0d4d8b8416c6a4770677e1ebe5e1fc7dbaaef004
 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -19,7 +19,6 @@
  */
 
 #include "qemu/osdep.h"
-#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include 
 #ifdef CONFIG_KVM
 #include 
@@ -1506,11 +1505,9 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
 const VFIOIOMMUClass *ops =
 VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
 
-#ifdef CONFIG_IOMMUFD
 if (vbasedev->iommufd) {
 ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
 }
-#endif
 
 assert(ops);
 
-- 
2.43.0

[PATCH for-9.0 v2 06/10] vfio/container: Introduce a new VFIOIOMMUClass::setup handler

2023-12-18 Thread Cédric Le Goater

This will help in converting the sPAPR IOMMU backend to a QOM interface.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 include/hw/vfio/vfio-container-base.h | 1 +
 hw/vfio/container.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
c60370fc5ebe65474816dbf2b065aa0912de1a3c..ce8b1fba88c145135adc20e96591bafd6050d5f1
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -109,6 +109,7 @@ struct VFIOIOMMUClass {
 InterfaceClass parent_class;
 
 /* basic feature */
+int (*setup)(VFIOContainerBase *bcontainer, Error **errp);
 int (*dma_map)(const VFIOContainerBase *bcontainer,
hwaddr iova, ram_addr_t size,
void *vaddr, bool readonly);
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
220e838a917f9a135af1e040a450cb52064428cf..c22bdd321677026e52c7cdffce853523ef679cd0
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -1129,6 +1129,7 @@ static void vfio_iommu_legacy_class_init(ObjectClass 
*klass, void *data)
 {
 VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
 
+vioc->setup = vfio_legacy_setup;
 vioc->dma_map = vfio_legacy_dma_map;
 vioc->dma_unmap = vfio_legacy_dma_unmap;
 vioc->attach_device = vfio_legacy_attach_device;
-- 
2.43.0

[PATCH for-9.0 v2 08/10] vfio/iommufd: Introduce a VFIOIOMMU iommufd QOM interface

2023-12-18 Thread Cédric Le Goater

As previously done for the sPAPR and legacy IOMMU backends, convert
the VFIOIOMMUOps struct to a QOM interface. The set of operations
for this backend can be referenced with a literal typename instead of
a C struct.
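
As a rough illustration of what "referenced with a literal typename" means, here is a minimal standalone sketch. It does not use QOM: IommuClass, registry and class_by_name() are hypothetical stand-ins, and only the two type-name strings are derived from the TYPE_VFIO_IOMMU macros in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an IOMMU backend class: a named callback table. */
typedef struct IommuClass {
    const char *type_name;
    int (*dma_map)(unsigned long iova, unsigned long size);
} IommuClass;

static int legacy_dma_map(unsigned long iova, unsigned long size)
{
    (void)iova; (void)size;
    return 1;   /* marker value so callers can tell the backends apart */
}

static int iommufd_dma_map(unsigned long iova, unsigned long size)
{
    (void)iova; (void)size;
    return 2;
}

/* Toy registry; in QEMU the QOM type system plays this role. */
static const IommuClass registry[] = {
    { "vfio-iommu-legacy",  legacy_dma_map },
    { "vfio-iommu-iommufd", iommufd_dma_map },
};

/* Look a class up by its type name; returns NULL if it is not registered. */
static const IommuClass *class_by_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++) {
        if (strcmp(registry[i].type_name, name) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}
```

Because every lookup of the same name yields the same pointer, comparisons of the form "ops != class" keep working after the exported struct is removed, which is the pattern the iommufd.c hunks rely on.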

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 v2: - Removed class_size initialization
 
 include/hw/vfio/vfio-common.h |  1 -
 include/hw/vfio/vfio-container-base.h |  2 +-
 hw/vfio/common.c  |  2 +-
 hw/vfio/iommufd.c | 35 ---
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 
14c497b6b0a79466e8f567aceed384ec2c75ea90..9b7ef7d02b5a0ad5266bcc4d06cd6874178978e4
 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -210,7 +210,6 @@ typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
 typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList;
 extern VFIOGroupList vfio_group_list;
 extern VFIODeviceList vfio_device_list;
-extern const VFIOIOMMUOps vfio_iommufd_ops;
 extern const MemoryListener vfio_memory_listener;
 extern int vfio_kvm_device_fd;
 
diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 
9e21d7811f3810ca2c63d9f28bdcc9aa6f75f9ad..b2813b0c117985425c842d91f011bb895955d738
 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -17,7 +17,6 @@
 
 typedef struct VFIODevice VFIODevice;
 typedef struct VFIOIOMMUClass VFIOIOMMUClass;
-#define VFIOIOMMUOps VFIOIOMMUClass /* To remove */
 
 typedef struct {
 unsigned long *bitmap;
@@ -96,6 +95,7 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer);
 #define TYPE_VFIO_IOMMU "vfio-iommu"
 #define TYPE_VFIO_IOMMU_LEGACY TYPE_VFIO_IOMMU "-legacy"
 #define TYPE_VFIO_IOMMU_SPAPR TYPE_VFIO_IOMMU "-spapr"
+#define TYPE_VFIO_IOMMU_IOMMUFD TYPE_VFIO_IOMMU "-iommufd"
 
 /*
  * VFIOContainerBase is not an abstract QOM object because it felt
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 
2329d0efc8c1d617f0bfee5283e82b295d2d477d..89ff1c7aeda14d20b2e24f8bc251db0a71d4527c
 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1508,7 +1508,7 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
 
 #ifdef CONFIG_IOMMUFD
 if (vbasedev->iommufd) {
-ops = &vfio_iommufd_ops;
+ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
 }
 #endif
 
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 
87a561c54580adc6d7b2711331a00940ff13bd43..d4c586e842def8f04d3a914843f5eece2c75ea30
 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -319,6 +319,8 @@ static int iommufd_cdev_attach(const char *name, VFIODevice 
*vbasedev,
 int ret, devfd;
 uint32_t ioas_id;
 Error *err = NULL;
+const VFIOIOMMUClass *iommufd_vioc =
+VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
 
 if (vbasedev->fd < 0) {
 devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp);
@@ -340,7 +342,7 @@ static int iommufd_cdev_attach(const char *name, VFIODevice 
*vbasedev,
 /* try to attach to an existing container in this space */
 QLIST_FOREACH(bcontainer, &space->containers, next) {
 container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
-if (bcontainer->ops != &vfio_iommufd_ops ||
+if (bcontainer->ops != iommufd_vioc ||
 vbasedev->iommufd != container->be) {
 continue;
 }
@@ -374,7 +376,7 @@ static int iommufd_cdev_attach(const char *name, VFIODevice 
*vbasedev,
 container->ioas_id = ioas_id;
 
 bcontainer = &container->bcontainer;
-vfio_container_init(bcontainer, space, &vfio_iommufd_ops);
+vfio_container_init(bcontainer, space, iommufd_vioc);
 QLIST_INSERT_HEAD(&space->containers, bcontainer, next);
 
 ret = iommufd_cdev_attach_container(vbasedev, container, errp);
@@ -476,9 +478,11 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev)
 static VFIODevice *iommufd_cdev_pci_find_by_devid(__u32 devid)
 {
 VFIODevice *vbasedev_iter;
+const VFIOIOMMUClass *iommufd_vioc =
+VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
 
 QLIST_FOREACH(vbasedev_iter, &vfio_device_list, global_next) {
-if (vbasedev_iter->bcontainer->ops != &vfio_iommufd_ops) {
+if (vbasedev_iter->bcontainer->ops != iommufd_vioc) {
 continue;
 }
 if (devid == vbasedev_iter->devid) {
@@ -621,10 +625,23 @@ out_single:
 return ret;
 }
 
-const VFIOIOMMUOps vfio_iommufd_ops = {
-.dma_map = iommufd_cdev_map,
-.dma_unmap = iommufd_cdev_unmap,
-.attach_device = iommufd_cdev_attach,
-.detach_device = iommufd_cdev_detach,
-.pci_hot_reset = iommufd_cdev_pci_hot_reset,
+static void vfio_iommu_iommufd_class_init(ObjectClass *klass, void *data)
+{
+VFIOIOMMUClass *vioc = VFIO_IOMMU_CLASS(klass);
+
+vioc->dma

[PATCH for-9.0 v2 03/10] vfio/container: Initialize VFIOIOMMUOps under vfio_init_container()

2023-12-18 Thread Cédric Le Goater

vfio_init_container() already defines the IOMMU type of the container.
Do the same for the VFIOIOMMUOps struct. This prepares ground for the
following patches that will deduce the associated VFIOIOMMUOps struct
from the IOMMU type.

Reviewed-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 hw/vfio/container.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 
afcfe8048805c58291d1104ff0ef20bdc457f99c..f4a0434a5239bfb6a17b91c8879cb98e686afccc
 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -370,7 +370,7 @@ static int vfio_get_iommu_type(VFIOContainer *container,
 }
 
 static int vfio_init_container(VFIOContainer *container, int group_fd,
-   Error **errp)
+   VFIOAddressSpace *space, Error **errp)
 {
 int iommu_type, ret;
 
@@ -401,6 +401,7 @@ static int vfio_init_container(VFIOContainer *container, 
int group_fd,
 }
 
 container->iommu_type = iommu_type;
+vfio_container_init(&container->bcontainer, space, &vfio_legacy_ops);
 return 0;
 }
 
@@ -583,9 +584,8 @@ static int vfio_connect_container(VFIOGroup *group, 
AddressSpace *as,
 container = g_malloc0(sizeof(*container));
 container->fd = fd;
 bcontainer = &container->bcontainer;
-vfio_container_init(bcontainer, space, &vfio_legacy_ops);
 
-ret = vfio_init_container(container, group->fd, errp);
+ret = vfio_init_container(container, group->fd, space, errp);
 if (ret) {
 goto free_container_exit;
 }
-- 
2.43.0

Re: [External] Re: [PATCH v2 07/20] util/dsa: Implement DSA device start and stop logic.

2023-12-18 Thread Hao Xiang

On Mon, Dec 11, 2023 at 1:28 PM Fabiano Rosas  wrote:
>
> Hao Xiang  writes:
>
> > * DSA device open and close.
> > * DSA group contains multiple DSA devices.
> > * DSA group configure/start/stop/clean.
> >
> > Signed-off-by: Hao Xiang 
> > Signed-off-by: Bryan Zhang 
> > ---
> >  include/qemu/dsa.h |  49 +++
> >  util/dsa.c | 338 +
> >  util/meson.build   |   1 +
> >  3 files changed, 388 insertions(+)
> >  create mode 100644 include/qemu/dsa.h
> >  create mode 100644 util/dsa.c
> >
> > diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h
> > new file mode 100644
> > index 00..30246b507e
> > --- /dev/null
> > +++ b/include/qemu/dsa.h
> > @@ -0,0 +1,49 @@
> > +#ifndef QEMU_DSA_H
> > +#define QEMU_DSA_H
> > +
> > +#include "qemu/thread.h"
> > +#include "qemu/queue.h"
> > +
> > +#ifdef CONFIG_DSA_OPT
> > +
> > +#pragma GCC push_options
> > +#pragma GCC target("enqcmd")
> > +
> > +#include 
> > +#include "x86intrin.h"
> > +
> > +#endif
> > +
> > +/**
> > + * @brief Initializes DSA devices.
> > + *
> > + * @param dsa_parameter A list of DSA device path from migration parameter.
>
> This code seems pretty generic, let's decouple this doc from migration.
>
> > + * @return int Zero if successful, otherwise non zero.
> > + */
> > +int dsa_init(const char *dsa_parameter);
> > +
> > +/**
> > + * @brief Start logic to enable using DSA.
> > + */
> > +void dsa_start(void);
> > +
> > +/**
> > + * @brief Stop logic to clean up DSA by halting the device group and 
> > cleaning up
> > + * the completion thread.
>
> "Stop the device group and the completion thread"
>
> The mention of "clean/cleaning up" makes this confusing because of
> dsa_cleanup() below.

Fixed.

>
> > + */
> > +void dsa_stop(void);
> > +
> > +/**
> > + * @brief Clean up system resources created for DSA offloading.
> > + *This function is called during QEMU process teardown.
>
> This is not called during QEMU process teardown. It's called at the end
> of migration AFAICS. Maybe just leave this sentence out.

Fixed.

>
> > + */
> > +void dsa_cleanup(void);
> > +
> > +/**
> > + * @brief Check if DSA is running.
> > + *
> > + * @return True if DSA is running, otherwise false.
> > + */
> > +bool dsa_is_running(void);
> > +
> > +#endif
> > \ No newline at end of file
> > diff --git a/util/dsa.c b/util/dsa.c
> > new file mode 100644
> > index 00..8edaa892ec
> > --- /dev/null
> > +++ b/util/dsa.c
> > @@ -0,0 +1,338 @@
> > +/*
> > + * Use Intel Data Streaming Accelerator to offload certain background
> > + * operations.
> > + *
> > + * Copyright (c) 2023 Hao Xiang 
> > + *Bryan Zhang 
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a 
> > copy
> > + * of this software and associated documentation files (the "Software"), 
> > to deal
> > + * in the Software without restriction, including without limitation the 
> > rights
> > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> > sell
> > + * copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included 
> > in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> > FROM,
> > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> > IN
> > + * THE SOFTWARE.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/queue.h"
> > +#include "qemu/memalign.h"
> > +#include "qemu/lockable.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/dsa.h"
> > +#include "qemu/bswap.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/rcu.h"
> > +
> > +#ifdef CONFIG_DSA_OPT
> > +
> > +#pragma GCC push_options
> > +#pragma GCC target("enqcmd")
> > +
> > +#include 
> > +#include "x86intrin.h"
> > +
> > +#define DSA_WQ_SIZE 4096
> > +#define MAX_DSA_DEVICES 16
> > +
> > +typedef QSIMPLEQ_HEAD(dsa_task_queue, buffer_zero_batch_task) 
> > dsa_task_queue;
> > +
> > +struct dsa_device {
> > +void *work_queue;
> > +};
> > +
> > +struct dsa_device_group {
> > +struct dsa_device *dsa_devices;
> > +int num_dsa_devices;
> > +uint32_t index;
> > +bool running;
> > +QemuMutex task_queue_lock;
> > +QemuCond task_queue_cond;
> > +dsa_task_queue task_queue;
> > +};
> > +
> > +uint64_t max_retry_count;
> > +static struct dsa_device_group dsa_group;
> > +
> > +
> > +/**
> > + * @brief This function opens a DSA device's work queue and
> > + *

Re: [External] Re: [PATCH v2 09/20] util/dsa: Implement DSA task asynchronous completion thread model.

2023-12-18 Thread Hao Xiang

On Mon, Dec 18, 2023 at 5:34 PM Wang, Lei  wrote:
>
> On 12/19/2023 2:57, Hao Xiang wrote:> On Sun, Dec 17, 2023 at 7:11 PM Wang, 
> Lei
>  wrote:
> >>
> >> On 11/14/2023 13:40, Hao Xiang wrote:> * Create a dedicated thread for DSA 
> >> task
> >> completion.
> >>> * DSA completion thread runs a loop and poll for completed tasks.
> >>> * Start and stop DSA completion thread during DSA device start stop.
> >>>
> >>> User space application can directly submit task to Intel DSA
> >>> accelerator by writing to DSA's device memory (mapped in user space).
> >>
> >>> +}
> >>> +return;
> >>> +}
> >>> +} else {
> >>> +assert(batch_status == DSA_COMP_BATCH_FAIL ||
> >>> +batch_status == DSA_COMP_BATCH_PAGE_FAULT);
> >>
> >> Nit: indentation is broken here.
> >>
> >>> +}
> >>> +
> >>> +for (int i = 0; i < count; i++) {
> >>> +
> >>> +completion = &batch_task->completions[i];
> >>> +status = completion->status;
> >>> +
> >>> +if (status == DSA_COMP_SUCCESS) {
> >>> +results[i] = (completion->result == 0);
> >>> +continue;
> >>> +}
> >>> +
> >>> +if (status != DSA_COMP_PAGE_FAULT_NOBOF) {
> >>> +fprintf(stderr,
> >>> +"Unexpected completion status = %u.\n", status);
> >>> +assert(false);
> >>> +}
> >>> +}
> >>> +}
> >>> +
> >>> +/**
> >>> + * @brief Handles an asynchronous DSA batch task completion.
> >>> + *
> >>> + * @param task A pointer to the batch buffer zero task structure.
> >>> + */
> >>> +static void
> >>> +dsa_batch_task_complete(struct buffer_zero_batch_task *batch_task)
> >>> +{
> >>> +batch_task->status = DSA_TASK_COMPLETION;
> >>> +batch_task->completion_callback(batch_task);
> >>> +}
> >>> +
> >>> +/**
> >>> + * @brief The function entry point called by a dedicated DSA
> >>> + *work item completion thread.
> >>> + *
> >>> + * @param opaque A pointer to the thread context.
> >>> + *
> >>> + * @return void* Not used.
> >>> + */
> >>> +static void *
> >>> +dsa_completion_loop(void *opaque)
> >>
> >> Per my understanding, if a multifd sending thread corresponds to a DSA 
> >> device,
> >> then the batch tasks are executed in parallel which means a task may be
> >> completed slower than another even if this task is enqueued earlier than 
> >> it. If
> >> we poll on the slower task first it will block the handling of the faster 
> >> one,
> >> even if the zero checking task for that thread is finished and it can go 
> >> ahead
> >> and send the data to the wire, this may lower the network resource 
> >> utilization.
> >>
> >
> > Hi Lei, thanks for reviewing. You are correct that we can keep pulling
> > a task enqueued first while others in the queue have already been
> > completed. In fact, only one DSA completion thread (pulling thread) is
> > used here even when multiple DSA devices are used. The pulling loop is
> > the most CPU intensive activity in the DSA workflow and that acts
> > directly against the goal of saving CPU usage. The trade-off I want to
> > take here is a slightly higher latency on DSA task completion but more
> > CPU savings. A single DSA engine can reach 30 GB/s throughput on
> > memory comparison operation. We use kernel tcp stack for network
> > transfer. The best I see is around 10GB/s throughput.  RDMA can
> > potentially go higher but I am not sure if it can go higher than 30
> > GB/s throughput anytime soon.
>
> Hi Hao, that makes sense, if the DSA is faster than the network, then a little
> bit of latency in DSA checking is tolerable. In the long term, I think the 
> best
> form of the DSA task checking thread is to use an fd or such sort of thing 
> that
> can multiplex the checking of different DSA devices, then we can serve the DSA
> task in the order they complete rather than FCFS.
>
I have experimented using N completion threads and each thread pulls
tasks submitted to a particular DSA device. That approach uses too
many CPU cycles. If Intel can come up with a better workflow for DSA
completion, there is definitely space for improvement here.
> >
> >>> +{
> >>> +struct dsa_completion_thread *thread_context =
> >>> +(struct dsa_completion_thread *)opaque;
> >>> +struct buffer_zero_batch_task *batch_task;
> >>> +struct dsa_device_group *group = thread_context->group;
> >>> +
> >>> +rcu_register_thread();
> >>> +
> >>> +thread_context->thread_id = qemu_get_thread_id();
> >>> +qemu_sem_post(&thread_context->sem_init_done);
> >>> +
> >>> +while (thread_context->running) {
> >>> +batch_task = dsa_task_dequeue(group);
> >>> +assert(batch_task != NULL || !group->running);
> >>> +if (!group->running) {
> >>> +assert(!thread_context->running);
> >>> +break;
> >>> +}
> >>> +if (batch_task->task_type == DSA_TASK) {
> >>> +poll_task_completion(batch_task);
> >>> +} else {
>

Re: [PATCH 0/2] support unaligned access for some xHCI registers

2023-12-18 Thread Tomoyuki Hirose

I would be grateful if you could give any comments on my patch.

ping,

Tomoyuki HIROSE

On Mon, Dec 11, 2023 at 4:12 PM Tomoyuki HIROSE
 wrote:
>
> According to xHCI spec rev 1.2, unaligned access to xHCI Host
> Controller Capability Registers is not prohibited. But current
> implementation does not support unaligned access to 'MemoryRegion'.
> These patches contain 2 changes:
> 1. support unaligned access to 'MemoryRegion'
> 2. allow unaligned access to Host Controller Capability Registers.
>
> Tomoyuki HIROSE (2):
>   system/memory.c: support unaligned access
>   hw/usb/hcd-xhci.c: allow unaligned access to Capability Registers
>
>  hw/usb/hcd-xhci.c |  4 +++-
>  system/memory.c   | 22 --
>  2 files changed, 19 insertions(+), 7 deletions(-)
>
> --
> 2.39.2
>
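
For readers unfamiliar with the idea, one common way to serve an unaligned access on top of an aligned-only backend is to assemble the value byte by byte from aligned words. The sketch below is only an illustration of that technique, not the code from these patches; read_aligned32() and read_unaligned() are hypothetical names, and a little-endian register file is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical aligned-only backend: little-endian 32-bit read at a
 * 4-byte aligned offset. */
static uint32_t read_aligned32(const uint8_t *regs, uint32_t off)
{
    assert((off & 3) == 0);
    return (uint32_t)regs[off] |
           ((uint32_t)regs[off + 1] << 8) |
           ((uint32_t)regs[off + 2] << 16) |
           ((uint32_t)regs[off + 3] << 24);
}

/* Read `size` bytes (1, 2 or 4) at any offset by picking each byte out
 * of the aligned word that contains it. */
static uint32_t read_unaligned(const uint8_t *regs, uint32_t off,
                               unsigned size)
{
    uint32_t val = 0;

    assert(size == 1 || size == 2 || size == 4);
    for (unsigned i = 0; i < size; i++) {
        uint32_t byte_off = off + i;
        uint32_t word = read_aligned32(regs, byte_off & ~3u);
        uint32_t byte = (word >> (8 * (byte_off & 3))) & 0xff;

        val |= byte << (8 * i);
    }
    return val;
}
```

The caller has to make sure the register array extends to the end of the last aligned word that is touched.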

Re: [External] Re: [PATCH v2 09/20] util/dsa: Implement DSA task asynchronous completion thread model.

2023-12-18 Thread Wang, Lei

On 12/19/2023 2:57, Hao Xiang wrote:> On Sun, Dec 17, 2023 at 7:11 PM Wang, Lei
 wrote:
>>
>> On 11/14/2023 13:40, Hao Xiang wrote:> * Create a dedicated thread for DSA 
>> task
>> completion.
>>> * DSA completion thread runs a loop and poll for completed tasks.
>>> * Start and stop DSA completion thread during DSA device start stop.
>>>
>>> User space application can directly submit task to Intel DSA
>>> accelerator by writing to DSA's device memory (mapped in user space).
>>
>>> +}
>>> +return;
>>> +}
>>> +} else {
>>> +assert(batch_status == DSA_COMP_BATCH_FAIL ||
>>> +batch_status == DSA_COMP_BATCH_PAGE_FAULT);
>>
>> Nit: indentation is broken here.
>>
>>> +}
>>> +
>>> +for (int i = 0; i < count; i++) {
>>> +
>>> +completion = &batch_task->completions[i];
>>> +status = completion->status;
>>> +
>>> +if (status == DSA_COMP_SUCCESS) {
>>> +results[i] = (completion->result == 0);
>>> +continue;
>>> +}
>>> +
>>> +if (status != DSA_COMP_PAGE_FAULT_NOBOF) {
>>> +fprintf(stderr,
>>> +"Unexpected completion status = %u.\n", status);
>>> +assert(false);
>>> +}
>>> +}
>>> +}
>>> +
>>> +/**
>>> + * @brief Handles an asynchronous DSA batch task completion.
>>> + *
>>> + * @param task A pointer to the batch buffer zero task structure.
>>> + */
>>> +static void
>>> +dsa_batch_task_complete(struct buffer_zero_batch_task *batch_task)
>>> +{
>>> +batch_task->status = DSA_TASK_COMPLETION;
>>> +batch_task->completion_callback(batch_task);
>>> +}
>>> +
>>> +/**
>>> + * @brief The function entry point called by a dedicated DSA
>>> + *work item completion thread.
>>> + *
>>> + * @param opaque A pointer to the thread context.
>>> + *
>>> + * @return void* Not used.
>>> + */
>>> +static void *
>>> +dsa_completion_loop(void *opaque)
>>
>> Per my understanding, if a multifd sending thread corresponds to a DSA 
>> device,
>> then the batch tasks are executed in parallel which means a task may be
>> completed slower than another even if this task is enqueued earlier than it. 
>> If
>> we poll on the slower task first it will block the handling of the faster 
>> one,
>> even if the zero checking task for that thread is finished and it can go 
>> ahead
>> and send the data to the wire, this may lower the network resource 
>> utilization.
>>
> 
> Hi Lei, thanks for reviewing. You are correct that we can keep pulling
> a task enqueued first while others in the queue have already been
> completed. In fact, only one DSA completion thread (pulling thread) is
> used here even when multiple DSA devices are used. The pulling loop is
> the most CPU intensive activity in the DSA workflow and that acts
> directly against the goal of saving CPU usage. The trade-off I want to
> take here is a slightly higher latency on DSA task completion but more
> CPU savings. A single DSA engine can reach 30 GB/s throughput on
> memory comparison operation. We use kernel tcp stack for network
> transfer. The best I see is around 10GB/s throughput.  RDMA can
> potentially go higher but I am not sure if it can go higher than 30
> GB/s throughput anytime soon.

Hi Hao, that makes sense, if the DSA is faster than the network, then a little
bit of latency in DSA checking is tolerable. In the long term, I think the best
form of the DSA task checking thread is to use an fd or such sort of thing that
can multiplex the checking of different DSA devices, then we can serve the DSA
task in the order they complete rather than FCFS.
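
To make the two policies concrete, here is a small standalone sketch. It is not taken from the patch: struct task and both helpers are hypothetical, and the `done` flag stands in for a device completion status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task record; `done` models the device completion status. */
struct task {
    int id;
    bool done;
    bool handled;
};

/* FCFS: only the oldest unhandled task is eligible. Returns its index if
 * it has completed, or -1 if the caller has to keep polling it. */
static int next_fcfs(const struct task *q, size_t n)
{
    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!q[i].handled) {
            return q[i].done ? (int)i : -1;
        }
    }
    return -1;
}

/* Completion order: any completed, unhandled task is eligible. */
static int next_any_completed(const struct task *q, size_t n)
{
    assert(q != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!q[i].handled && q[i].done) {
            return (int)i;
        }
    }
    return -1;
}
```

With task 0 still running and task 1 finished, next_fcfs() reports nothing to handle while next_any_completed() picks task 1. The price is a scan over all pending tasks on every poll.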

> 
>>> +{
>>> +struct dsa_completion_thread *thread_context =
>>> +(struct dsa_completion_thread *)opaque;
>>> +struct buffer_zero_batch_task *batch_task;
>>> +struct dsa_device_group *group = thread_context->group;
>>> +
>>> +rcu_register_thread();
>>> +
>>> +thread_context->thread_id = qemu_get_thread_id();
>>> +qemu_sem_post(&thread_context->sem_init_done);
>>> +
>>> +while (thread_context->running) {
>>> +batch_task = dsa_task_dequeue(group);
>>> +assert(batch_task != NULL || !group->running);
>>> +if (!group->running) {
>>> +assert(!thread_context->running);
>>> +break;
>>> +}
>>> +if (batch_task->task_type == DSA_TASK) {
>>> +poll_task_completion(batch_task);
>>> +} else {
>>> +assert(batch_task->task_type == DSA_BATCH_TASK);
>>> +poll_batch_task_completion(batch_task);
>>> +}
>>> +
>>> +dsa_batch_task_complete(batch_task);
>>> +}
>>> +
>>> +rcu_unregister_thread();
>>> +return NULL;
>>> +}
>>> +
>>> +/**
>>> + * @brief Initializes a DSA completion thread.
>>> + *
>>> + * @param completion_thread A pointer to the completion thread context.
>>> + * @param group A pointer to the DSA device group.
>>> + */
>>> +static void
>>> +dsa_completion_thread_init(
>>> +struct dsa

Re: [PATCH v2 12/12] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

The VIA south bridges are able to relocate and toggle (enable or disable) their
SuperI/O functions. So far this is hardcoded such that all functions are always
enabled and are located at fixed addresses.

Some PC BIOSes seem to probe for I/O occupancy before activating such a function
and issue an error in case of a conflict. Since the functions are enabled on
reset, conflicts are always detected. Prevent that by implementing relocation
and toggling of the SuperI/O functions.

Note that all SuperI/O functions are now deactivated upon reset (except for
VT82C686B's serial ports where Fuloong 2e's rescue-yl seems to expect them to be
enabled by default). Rely on firmware -- or in case of pegasos2 on board code if
no -bios is given -- to configure the functions accordingly.


Pegasos2 emulates firmware when no -bios is given; this was explained in a 
previous commit, so it may not need to be explained here again. You 
could drop the comment between -- -- but I don't mind.



Signed-off-by: Bernhard Beschow 
---
hw/isa/vt82c686.c | 121 ++
1 file changed, 90 insertions(+), 31 deletions(-)

diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
index 9c2333a277..be202d23cf 100644
--- a/hw/isa/vt82c686.c
+++ b/hw/isa/vt82c686.c
@@ -15,6 +15,9 @@

#include "qemu/osdep.h"
#include "hw/isa/vt82c686.h"
+#include "hw/block/fdc.h"
+#include "hw/char/parallel-isa.h"
+#include "hw/char/serial.h"
#include "hw/pci/pci.h"
#include "hw/qdev-properties.h"
#include "hw/ide/pci.h"
@@ -343,6 +346,35 @@ static const TypeInfo via_superio_info = {

#define TYPE_VT82C686B_SUPERIO "vt82c686b-superio"

+static void vt82c686b_superio_update(ViaSuperIOState *s)
+{
+isa_parallel_set_enabled(s->superio.parallel[0],
+ (s->regs[0xe2] & 0x3) != 3);
+isa_serial_set_enabled(s->superio.serial[0], s->regs[0xe2] & BIT(2));
+isa_serial_set_enabled(s->superio.serial[1], s->regs[0xe2] & BIT(3));
+isa_fdc_set_enabled(s->superio.floppy, s->regs[0xe2] & BIT(4));
+
+isa_fdc_set_iobase(s->superio.floppy, (s->regs[0xe3] & 0xfc) << 2);
+isa_parallel_set_iobase(s->superio.parallel[0], s->regs[0xe6] << 2);
+isa_serial_set_iobase(s->superio.serial[0], (s->regs[0xe7] & 0xfe) << 2);
+isa_serial_set_iobase(s->superio.serial[1], (s->regs[0xe8] & 0xfe) << 2);
+}


I wonder if some code duplication could be saved by adding a shared 
via_superio_update() for this further up in the abstract via-superio class 
instead of this method and vt8231_superio_update() below. This common 
method in abstract class would need to handle the differences which seem 
to be reg addresses offset by 0x10 and VT8231 having only 1 serial port. 
These could either be handled by adding function parameters or fields to 
ViaSuperIOState for this that the subclasses can set and the method check. 
(Such as reg base=0xe2 for vt82c686 and 0xf2 for vt8231 and num_serial or 
similar for how many ports are there then can have a for loop for those 
that would only run once for vt8231).
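
A minimal sketch of that idea, limited to the enable bits. SuperIOModel and superio_update_enables() are hypothetical and simplified, not the real ViaSuperIOState. The bit layout is the one from vt82c686b_superio_update() above; that VT8231 uses the same layout at 0xf2 is an assumption taken from this remark. The iobase registers are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SERIAL 2

/* Hypothetical, simplified model state; not the real ViaSuperIOState. */
typedef struct {
    uint8_t regs[0x100];
    uint8_t func_sel_reg;   /* 0xe2 for vt82c686b, assumed 0xf2 for vt8231 */
    int num_serial;         /* 2 for vt82c686b, 1 for vt8231 */
    bool parallel_enabled;
    bool serial_enabled[MAX_SERIAL];
    bool fdc_enabled;
} SuperIOModel;

/* Decode the function select register once for both chip variants. */
static void superio_update_enables(SuperIOModel *s)
{
    uint8_t sel = s->regs[s->func_sel_reg];

    assert(s->num_serial >= 0 && s->num_serial <= MAX_SERIAL);
    s->parallel_enabled = (sel & 0x3) != 3;
    for (int i = 0; i < s->num_serial; i++) {
        /* serial port i is controlled by bit 2 + i */
        s->serial_enabled[i] = (sel & (1u << (2 + i))) != 0;
    }
    s->fdc_enabled = (sel & (1u << 4)) != 0;
}
```

Each subclass would then only set func_sel_reg and num_serial at init time, and the loop runs once for the single-port variant.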



+static int vmstate_vt82c686b_superio_post_load(void *opaque, int version_id)
+{
+ViaSuperIOState *s = opaque;
+
+vt82c686b_superio_update(s);
+
+return 0;


You could lose some blank lines here. You seem to love them, half of your 
cover letter is just blank lines :-) but I'm the opposite and like more 
code to fit in one screen even on today's displays that are wider than 
tall. So this function would take less space without blank lines. (Even 
the local variable may not be necessary as you don't access any fields 
within and void * should just cast without a warning but for spelling out 
the desired type as a reminder I'm ok with leaving that but no excessive 
blank lines please if you don't mind that much.)


Regards,
BALATON Zoltan


+}
+
+static const VMStateDescription vmstate_vt82c686b_superio = {
+.name = "vt82c686b_superio",
+.version_id = 1,
+.post_load = vmstate_vt82c686b_superio_post_load,
+};
+
static void vt82c686b_superio_cfg_write(void *opaque, hwaddr addr,
uint64_t data, unsigned size)
{
@@ -368,7 +400,11 @@ static void vt82c686b_superio_cfg_write(void *opaque, 
hwaddr addr,
case 0xfd ... 0xff:
/* ignore write to read only registers */
return;
-/* case 0xe6 ... 0xe8: Should set base port of parallel and serial */
+case 0xe2 ... 0xe3:
+case 0xe6 ... 0xe8:
+sc->regs[idx] = data;
+vt82c686b_superio_update(sc);
+return;
default:
qemu_log_mask(LOG_UNIMP,
  "via_superio_cfg: unimplemented register 0x%x\n", idx);
@@ -393,25 +429,24 @@ static void vt82c686b_superio_reset(DeviceState *dev)

memset(s->regs, 0, sizeof(s->regs));
/* Device ID */
-vt82c686b_superio_cfg_write(s, 0, 0xe0, 1);
-vt82c686b_superio_cfg_write(s, 1, 0x3c, 1);
-/* Function select - all disabled */
-vt82c686b_superio_cf

Re: [PATCH v2 11/12] hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

This is a preparation for implementing relocation and toggling of SuperI/O
functions in the VT8231 device model. Upon reset, all SuperI/O functions will be
deactivated, so in case if no -bios is given, let the machine configure those
functions the same way pegasos2.rom would do. For now the meantime this will be


"same way pegasos2 firmware would do". You can drop the last sentence 
about no-op as it does not make much sense as it is or reword it if you 
want to keep it.


Regards,
BALATON Zoltan


a no-op.

Signed-off-by: Bernhard Beschow 
---
hw/ppc/pegasos2.c | 15 +++
1 file changed, 15 insertions(+)

diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c
index 3203a4a728..0a40ebd542 100644
--- a/hw/ppc/pegasos2.c
+++ b/hw/ppc/pegasos2.c
@@ -285,6 +285,15 @@ static void pegasos2_pci_config_write(Pegasos2MachineState 
*pm, int bus,
pegasos2_mv_reg_write(pm, pcicfg + 4, len, val);
}

+static void pegasos2_superio_write(Pegasos2MachineState *pm, uint32_t addr,
+   uint32_t val)
+{
+AddressSpace *as = CPU(pm->cpu)->as;
+
+stb_phys(as, PCI1_IO_BASE + 0x3f0, addr);
+stb_phys(as, PCI1_IO_BASE + 0x3f1, val);
+}
+
static void pegasos2_machine_reset(MachineState *machine, ShutdownCause reason)
{
Pegasos2MachineState *pm = PEGASOS2_MACHINE(machine);
@@ -310,6 +319,12 @@ static void pegasos2_machine_reset(MachineState *machine, 
ShutdownCause reason)

pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
  PCI_INTERRUPT_LINE, 2, 0x9);
+pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
+  0x50, 1, 0x6);
+pegasos2_superio_write(pm, 0xf4, 0xbe);
+pegasos2_superio_write(pm, 0xf6, 0xef);
+pegasos2_superio_write(pm, 0xf7, 0xfc);
+pegasos2_superio_write(pm, 0xf2, 0x14);
pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
  0x50, 1, 0x2);
pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |

Re: [PATCH v2 08/12] hw/block/fdc-isa: Implement relocation and toggling for TYPE_ISA_FDC

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

Implement isa_fdc_set_{enabled,iobase} in order to implement relocation and
toggling of SuperI/O functions in the VIA south bridges without breaking
encapsulation.


You may want to revise these commit messages. What toggling means is only 
defined in the last patch but I can't think of a better name for it other 
than spelling out enable/disable. It's probably also not relevant in this 
commit message to mention VIA south bridges as this is a generic function 
not specific to that usage only.


Regards,
BALATON Zoltan


Signed-off-by: Bernhard Beschow 
---
include/hw/block/fdc.h |  3 +++
hw/block/fdc-isa.c | 14 ++
2 files changed, 17 insertions(+)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 35248c0837..c367c5efea 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -14,6 +14,9 @@ void fdctrl_init_sysbus(qemu_irq irq, hwaddr mmio_base, 
DriveInfo **fds);
void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
   DriveInfo **fds, qemu_irq *fdc_tc);

+void isa_fdc_set_iobase(ISADevice *fdc, hwaddr iobase);
+void isa_fdc_set_enabled(ISADevice *fdc, bool enabled);
+
FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
int cmos_get_fd_drive_type(FloppyDriveType fd0);

diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index b4c92b40b3..c989325de3 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -192,6 +192,20 @@ static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
return dev;
}

+void isa_fdc_set_iobase(ISADevice *fdc, hwaddr iobase)
+{
+FDCtrlISABus *isa = ISA_FDC(fdc);
+
+fdc->ioport_id = iobase;
+isa->iobase = iobase;
+portio_list_set_address(&isa->portio_list, isa->iobase);
+}
+
+void isa_fdc_set_enabled(ISADevice *fdc, bool enabled)
+{
+portio_list_set_enabled(&ISA_FDC(fdc)->portio_list, enabled);
+}
+
int cmos_get_fd_drive_type(FloppyDriveType fd0)
{
int val;

Re: [PATCH v2 06/12] exec/ioport: Add portio_list_set_address()

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
are able to relocate their SuperI/O functions. Add a convenience function for
implementing this in the VIA south bridges.

This convenience function relies on previous simplifications in exec/ioport
which avoids some duplicate synchronization of I/O port base addresses. The
naming of the function is inspired by its memory_region_set_address() counterpart.

Signed-off-by: Bernhard Beschow 
---
docs/devel/migration.rst |  1 +
include/exec/ioport.h|  2 ++
system/ioport.c  | 19 +++
3 files changed, 22 insertions(+)

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index ec55089b25..389fa24bde 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -464,6 +464,7 @@ Examples of such memory API functions are:
  - memory_region_set_enabled()
  - memory_region_set_address()
  - memory_region_set_alias_offset()


These added here aren't memory API functions so maybe make them a separate 
list with some rewording so that this is not specific to memory API but 
whatever changes memory regions such as memory API or these portio_list 
functions.

Re: [PATCH v2 04/12] hw/char/parallel: Free struct ParallelState from PortioList

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

ParallelState::portio_list isn't used inside ParallelState context but only
inside ISAParallelState context, so more it there.


Same comments as for patch 1 otherwise

Reviewed-by: BALATON Zoltan 


Signed-off-by: Bernhard Beschow 
---
include/hw/char/parallel-isa.h | 2 ++
include/hw/char/parallel.h | 2 --
hw/char/parallel.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/hw/char/parallel-isa.h b/include/hw/char/parallel-isa.h
index d24ccecf05..3b783bd08d 100644
--- a/include/hw/char/parallel-isa.h
+++ b/include/hw/char/parallel-isa.h
@@ -12,6 +12,7 @@

#include "parallel.h"

+#include "exec/ioport.h"
#include "hw/isa/isa.h"
#include "qom/object.h"

@@ -25,6 +26,7 @@ struct ISAParallelState {
uint32_t iobase;
uint32_t isairq;
ParallelState state;
+PortioList portio_list;
};

#endif /* HW_PARALLEL_ISA_H */
diff --git a/include/hw/char/parallel.h b/include/hw/char/parallel.h
index 7b5a309a03..cfb97cc7cc 100644
--- a/include/hw/char/parallel.h
+++ b/include/hw/char/parallel.h
@@ -1,7 +1,6 @@
#ifndef HW_PARALLEL_H
#define HW_PARALLEL_H

-#include "exec/ioport.h"
#include "exec/memory.h"
#include "hw/isa/isa.h"
#include "hw/irq.h"
@@ -22,7 +21,6 @@ typedef struct ParallelState {
uint32_t last_read_offset; /* For debugging */
/* Memory-mapped interface */
int it_shift;
-PortioList portio_list;
} ParallelState;

void parallel_hds_isa_init(ISABus *bus, int n);
diff --git a/hw/char/parallel.c b/hw/char/parallel.c
index 147c900f0d..c1747cbb75 100644
--- a/hw/char/parallel.c
+++ b/hw/char/parallel.c
@@ -532,7 +532,7 @@ static void parallel_isa_realizefn(DeviceState *dev, Error 
**errp)
s->status = dummy;
}

-isa_register_portio_list(isadev, &s->portio_list, base,
+isa_register_portio_list(isadev, &isa->portio_list, base,
 (s->hw_driver
  ? &isa_parallel_portio_hw_list[0]
  : &isa_parallel_portio_sw_list[0]),

Re: [PATCH v2 03/12] hw/char/serial: Free struct SerialState from MemoryRegion

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

SerialState::io isn't used within TYPE_SERIAL directly. Push it to its users to
make them the owner of the MemoryRegion.


I'm not sure this patch is needed. The users already own the SerialState,
so they can use its memory region and don't need their own. Since all of
them need this io region, putting it in SerialState saves some
duplication, unless I've missed some reason why this might be needed.


Regards,
BALATON Zoltan


Signed-off-by: Bernhard Beschow 
---
include/hw/char/serial.h   | 2 +-
hw/char/serial-isa.c   | 7 +--
hw/char/serial-pci-multi.c | 7 ---
hw/char/serial-pci.c   | 7 +--
hw/char/serial.c   | 4 ++--
5 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/hw/char/serial.h b/include/hw/char/serial.h
index 8ba7eca3d6..eb4254edde 100644
--- a/include/hw/char/serial.h
+++ b/include/hw/char/serial.h
@@ -77,7 +77,6 @@ struct SerialState {
int poll_msl;

QEMUTimer *modem_status_poll;
-MemoryRegion io;
};
typedef struct SerialState SerialState;

@@ -85,6 +84,7 @@ struct SerialMM {
SysBusDevice parent;

SerialState serial;
+MemoryRegion io;

uint8_t regshift;
uint8_t endianness;
diff --git a/hw/char/serial-isa.c b/hw/char/serial-isa.c
index 141a6cb168..2be8be980b 100644
--- a/hw/char/serial-isa.c
+++ b/hw/char/serial-isa.c
@@ -26,6 +26,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/module.h"
+#include "exec/memory.h"
#include "sysemu/sysemu.h"
#include "hw/acpi/acpi_aml_interface.h"
#include "hw/char/serial.h"
@@ -43,6 +44,7 @@ struct ISASerialState {
uint32_t iobase;
uint32_t isairq;
SerialState state;
+MemoryRegion io;
};

static const int isa_serial_io[MAX_ISA_SERIAL_PORTS] = {
@@ -79,8 +81,9 @@ static void serial_isa_realizefn(DeviceState *dev, Error 
**errp)
qdev_realize(DEVICE(s), NULL, errp);
qdev_set_legacy_instance_id(dev, isa->iobase, 3);

-memory_region_init_io(&s->io, OBJECT(isa), &serial_io_ops, s, "serial", 8);
-isa_register_ioport(isadev, &s->io, isa->iobase);
+memory_region_init_io(&isa->io, OBJECT(isa), &serial_io_ops, s, "serial",
+  8);
+isa_register_ioport(isadev, &isa->io, isa->iobase);
}

static void serial_isa_build_aml(AcpiDevAmlIf *adev, Aml *scope)
diff --git a/hw/char/serial-pci-multi.c b/hw/char/serial-pci-multi.c
index 5d65c534cb..16cb2faad7 100644
--- a/hw/char/serial-pci-multi.c
+++ b/hw/char/serial-pci-multi.c
@@ -44,6 +44,7 @@ typedef struct PCIMultiSerialState {
uint32_t ports;
char *name[PCI_SERIAL_MAX_PORTS];
SerialState  state[PCI_SERIAL_MAX_PORTS];
+MemoryRegion io[PCI_SERIAL_MAX_PORTS];
uint32_t level[PCI_SERIAL_MAX_PORTS];
qemu_irq *irqs;
uint8_t  prog_if;
@@ -58,7 +59,7 @@ static void multi_serial_pci_exit(PCIDevice *dev)
for (i = 0; i < pci->ports; i++) {
s = pci->state + i;
qdev_unrealize(DEVICE(s));
-memory_region_del_subregion(&pci->iobar, &s->io);
+memory_region_del_subregion(&pci->iobar, &pci->io[i]);
g_free(pci->name[i]);
}
qemu_free_irqs(pci->irqs, pci->ports);
@@ -112,9 +113,9 @@ static void multi_serial_pci_realize(PCIDevice *dev, Error 
**errp)
}
s->irq = pci->irqs[i];
pci->name[i] = g_strdup_printf("uart #%zu", i + 1);
-memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s,
+memory_region_init_io(&pci->io[i], OBJECT(pci), &serial_io_ops, s,
  pci->name[i], 8);
-memory_region_add_subregion(&pci->iobar, 8 * i, &s->io);
+memory_region_add_subregion(&pci->iobar, 8 * i, &pci->io[i]);
pci->ports++;
}
}
diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 087da3059a..ab3d0e56b5 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -28,6 +28,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/module.h"
+#include "exec/memory.h"
#include "hw/char/serial.h"
#include "hw/irq.h"
#include "hw/pci/pci_device.h"
@@ -38,6 +39,7 @@
struct PCISerialState {
PCIDevice dev;
SerialState state;
+MemoryRegion io;
uint8_t prog_if;
};

@@ -57,8 +59,9 @@ static void serial_pci_realize(PCIDevice *dev, Error **errp)
pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
s->irq = pci_allocate_irq(&pci->dev);

-memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, "serial", 8);
-pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
+memory_region_init_io(&pci->io, OBJECT(pci), &serial_io_ops, s, "serial",
+  8);
+pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &pci->io);
}

static void serial_pci_exit(PCIDevice *dev)
diff --git a/hw/char/serial.c b/hw/char/serial.c
index a32eb25f58..83b642aec3 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -1045,10 +1045,10 @@ static void serial_mm_realize(DeviceState *dev, Error 
**errp)
return;
}

Re: [PATCH v2 02/12] hw/block/fdc-sysbus: Free struct FDCtrl from MemoryRegion

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

FDCtrl::iomem isn't used inside FDCtrl context but only inside FDCtrlSysBus
context, so more it there.


Same comments as for patch 1 otherwise

Reviewed-by: BALATON Zoltan 


Signed-off-by: Bernhard Beschow 
---
hw/block/fdc-internal.h | 2 --
hw/block/fdc-sysbus.c   | 6 --
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index fef2bfbbf5..e219623dc7 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -25,7 +25,6 @@
#ifndef HW_BLOCK_FDC_INTERNAL_H
#define HW_BLOCK_FDC_INTERNAL_H

-#include "exec/memory.h"
#include "hw/block/block.h"
#include "hw/block/fdc.h"
#include "qapi/qapi-types-block.h"
@@ -91,7 +90,6 @@ typedef struct FDrive {
} FDrive;

struct FDCtrl {
-MemoryRegion iomem;
qemu_irq irq;
/* Controller state */
QEMUTimer *result_timer;
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 86ea51d003..e197b97262 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -26,6 +26,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qom/object.h"
+#include "exec/memory.h"
#include "hw/sysbus.h"
#include "hw/block/fdc.h"
#include "migration/vmstate.h"
@@ -52,6 +53,7 @@ struct FDCtrlSysBus {
/*< public >*/

struct FDCtrl state;
+MemoryRegion iomem;
};

static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
@@ -146,11 +148,11 @@ static void sysbus_fdc_common_instance_init(Object *obj)

qdev_set_legacy_instance_id(dev, 0 /* io */, 2); /* FIXME */

-memory_region_init_io(&fdctrl->iomem, obj,
+memory_region_init_io(&sys->iomem, obj,
  sbdc->use_strict_io ? &fdctrl_mem_strict_ops
  : &fdctrl_mem_ops,
  fdctrl, "fdc", 0x08);
-sysbus_init_mmio(sbd, &fdctrl->iomem);
+sysbus_init_mmio(sbd, &sys->iomem);

sysbus_init_irq(sbd, &fdctrl->irq);
qdev_init_gpio_in(dev, fdctrl_handle_tc, 1);

Re: [PATCH v2 01/12] hw/block/fdc-isa: Free struct FDCtrl from PortioList

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

FDCtrl::portio_list isn't used inside FDCtrl context but only inside
FDCtrlISABus context, so more it there.


"more" -> "move", you have the same typo in several other commit messages. 
Not sure I like the C++ism FDCtrl::portio_list and would write out "The 
portio_list field of FDCtrl" instead but not a big deal. Also the subject 
could say "Move portio_list from FDCtrl to FDCtrlISABus" which is less 
ambiguous than using "free", which is usually associated with freeing memory.

Otherwise

Reviewed-by: BALATON Zoltan 


Signed-off-by: Bernhard Beschow 
---
hw/block/fdc-internal.h | 2 --
hw/block/fdc-isa.c  | 4 +++-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index 036392e9fc..fef2bfbbf5 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -26,7 +26,6 @@
#define HW_BLOCK_FDC_INTERNAL_H

#include "exec/memory.h"
-#include "exec/ioport.h"
#include "hw/block/block.h"
#include "hw/block/fdc.h"
#include "qapi/qapi-types-block.h"
@@ -140,7 +139,6 @@ struct FDCtrl {
/* Timers state */
uint8_t timer0;
uint8_t timer1;
-PortioList portio_list;
};

extern const FDFormat fd_formats[];
diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index 7ec075e470..b4c92b40b3 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -42,6 +42,7 @@
#include "sysemu/block-backend.h"
#include "sysemu/blockdev.h"
#include "sysemu/sysemu.h"
+#include "exec/ioport.h"
#include "qemu/log.h"
#include "qemu/main-loop.h"
#include "qemu/module.h"
@@ -60,6 +61,7 @@ struct FDCtrlISABus {
uint32_t irq;
uint32_t dma;
struct FDCtrl state;
+PortioList portio_list;
int32_t bootindexA;
int32_t bootindexB;
};
@@ -91,7 +93,7 @@ static void isabus_fdc_realize(DeviceState *dev, Error **errp)
FDCtrl *fdctrl = &isa->state;
Error *err = NULL;

-isa_register_portio_list(isadev, &fdctrl->portio_list,
+isa_register_portio_list(isadev, &isa->portio_list,
 isa->iobase, fdc_portio_list, fdctrl,
 "fdc");

Re: [PATCH 04/12] hw/block/fdc: Expose internal header

2023-12-18 Thread BALATON Zoltan


On Mon, 18 Dec 2023, Bernhard Beschow wrote:

Am 18. Dezember 2023 10:54:56 UTC schrieb BALATON Zoltan :

On Sun, 17 Dec 2023, Bernhard Beschow wrote:

Am 17. Dezember 2023 15:47:33 UTC schrieb BALATON Zoltan :

On Sun, 17 Dec 2023, Bernhard Beschow wrote:

Exposing the internal header allows for exposing struct FDCtrlISABus which is
encouraged by qdev guidelines.


Hopefully the guidelines don't encourage this, as object orientation
encourages encapsulation: only the object itself should poke its internals,
and other objects should use methods to change object state. In QOM some
object states were exposed in public headers to allow embedding those objects
in other objects, because C needs the struct size to allow that. This was to
simplify memory management, so the embedded objects don't need to be tracked
and freed but are created and freed with the object embedding them. It does
not mean the other object should poke into these objects, or that exposing
internal object state is a general guideline. I'd say the exposed objects are
an exception rather than a recommended guideline, only allowed for objects
that need to be embedded in others; generally, object encapsulation is better
preserved where possible. This patch exposes objects so others can poke into
them, which would make those other objects dependent on the implementation of
these objects, making them harder to change in the future. A better way may be
to add methods to fdc and serial to allow changing their base address and
map/unmap their ports, and keep their internals unexposed.


Each ISADevice subclass would need convenience methods as well as each state
class. This series touches three of each: fdc, parallel, serial. And each of 
those need two convenience methods: set_enabled() and set_address(). This would 
add another 12 functions on top of the current ones.


If all ISA devices need this then these should really be methods of ISADevice.
But that is just an empty wrapper over devices, each of which handles its own
ports; ISADevice does not know about those, each device may have different
ports, and not all of them use portio lists for this, so moving port handling
to ISADevice might be too big a refactoring to do for this. Keeping these
functions with the superio component devices, so their implementation is kept
private, is still worth it in my opinion. Even if that adds 2 functions to
each superio component device (which is not all ISA devices, just a limited
set), it seems a better approach to me than breaking encapsulation of objects.
These are simple access methods for internal object state, which are common in
object oriented programming.


Then ISASuperIODevice would require at least 6 more such methods (not counting 
the unneeded ones for IDE which might be desirable for consistency). So in the 
end we'd have at least 18 more methods. Is this really worth it?


We may do without these if we say superio is just a container of components so 
don't add forwarding methods but we can call the accessor methods of component 
objects from vt82c686.c. That's still better than reaching into object 
internals from foreign objects.


Version 2 is out which should address all of your comments.


I think this version looks better. I only have time for some preliminary
comments after a quick look now; I plan to give it more testing during the
Xmas holiday.


Regards,
BALATON Zoltan
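
The accessor approach argued for in this thread can be illustrated with a
minimal, self-contained sketch. Everything here is hypothetical: ToyFdc, the
toy_* functions and the assumed register layout are invented for illustration
and are not QEMU code. The point is only that the caller changes the base
address and the enabled state through two small functions instead of writing
to the component's fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a SuperI/O component such as the FDC. */
typedef struct ToyFdc {
    uint32_t iobase;   /* current I/O port base address */
    bool enabled;      /* whether the ports are currently mapped */
} ToyFdc;

/* Relocate the device's I/O ports; callers never touch the field directly. */
void toy_fdc_set_iobase(ToyFdc *fdc, uint32_t iobase)
{
    fdc->iobase = iobase;
}

/* Map or unmap the device's I/O ports. */
void toy_fdc_set_enabled(ToyFdc *fdc, bool enabled)
{
    fdc->enabled = enabled;
}

/*
 * Example caller, playing the role of a south bridge config write handler.
 * Assumption for this sketch: the config register holds the base address
 * divided by 4.
 */
void toy_superio_config_write(ToyFdc *fdc, uint8_t base_reg, bool enable_bit)
{
    toy_fdc_set_iobase(fdc, (uint32_t)base_reg << 2);
    toy_fdc_set_enabled(fdc, enable_bit);
}
```

In a real device model the struct definition could then stay in a private
header, with only the two setters declared publicly.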

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 18:35, Michael Tokarev wrote:

18.12.2023 20:20, Daniel Henrique Barboza wrote:



On 12/18/23 13:22, Natanael Copa wrote:

strerrorname_np is non-portable and breaks building with musl libc.

Use strerror(errno) instead, like we do in other places.

Cc: qemu-sta...@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error 
msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa 
---
  target/riscv/kvm/kvm-cpu.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 45b6cf1cfa..117e33cf90 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
  multi_ext_cfg->supported = false;
  val = false;
  } else {
-    error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+    error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));



The reason I did this change, as described in 082e9e4a58ba mentioned in the
commit message, was precisely to avoid things like this:

qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error: no such 
file or directory


If KVM context puts its own unique meaning for ENOENT, maybe something like

  "unable to read KVM register: %s\n", errno == ENOENT ? "no such register" : 
strerror(errno)

would do it better?



A solution like this is something I can go after if I'm bothered enough with
how strerror() is working in the RISC-V KVM driver.

For now I think we can live with this fix as is since fixing the build is more
important than aesthetics.


Thanks,

Daniel



To me, "No such file or directory" already tells everything and does not look
weird, but that's because I've seen this error message for all sorts of contexts
and got used to this. It is definitely understandable.

/mjt

Re: [PATCH 1/1] target/riscv/kvm.c: remove group setting of KVM AIA if the machine only has 1 socket

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 06:05, Yong-Xuan Wang wrote:

The emulated AIA within the Linux kernel restores the HART index
of the IMSICs according to the configured AIA settings. During
this process, the group setting is used only when the machine
partitions harts into groups. It's unnecessary to set the group
configuration if the machine has only one socket, as its address
space might not contain the group shift.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
---


Reviewed-by: Daniel Henrique Barboza 


  target/riscv/kvm/kvm-cpu.c | 31 +--
  1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 62a1e51f0a2e..6494597157b8 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1387,21 +1387,24 @@ void kvm_riscv_aia_create(MachineState *machine, 
uint64_t group_shift,
  exit(1);
  }
  
-socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;

-ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
-KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
-&socket_bits, true, NULL);
-if (ret < 0) {
-error_report("KVM AIA: failed to set group_bits");
-exit(1);
-}
  
-ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,

-KVM_DEV_RISCV_AIA_CONFIG_GROUP_SHIFT,
-&group_shift, true, NULL);
-if (ret < 0) {
-error_report("KVM AIA: failed to set group_shift");
-exit(1);
+if (socket_count > 1) {
+socket_bits = find_last_bit(&socket_count, BITS_PER_LONG) + 1;
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_BITS,
+&socket_bits, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_bits");
+exit(1);
+}
+
+ret = kvm_device_access(aia_fd, KVM_DEV_RISCV_AIA_GRP_CONFIG,
+KVM_DEV_RISCV_AIA_CONFIG_GROUP_SHIFT,
+&group_shift, true, NULL);
+if (ret < 0) {
+error_report("KVM AIA: failed to set group_shift");
+exit(1);
+}
  }
  
  guest_bits = guest_num == 0 ? 0 :
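
The arithmetic behind the group_bits value in the hunk above can be sketched
in isolation. This is a standalone illustration, not QEMU code:
highest_set_bit() is a hand-written stand-in for find_last_bit() on a single
word, and aia_group_bits() is a hypothetical helper in which a return value of
0 means "skip the group configuration", which is what the patch does for a
single socket.

```c
/* Index of the highest set bit; the caller guarantees word != 0. */
static unsigned highest_set_bit(unsigned long word)
{
    unsigned idx = 0;

    while (word >>= 1) {
        idx++;
    }
    return idx;
}

/*
 * Number of group bits to program for a given socket count.
 * Returns 0 when there is at most one socket, i.e. no grouping is needed.
 */
unsigned aia_group_bits(unsigned long socket_count)
{
    if (socket_count <= 1) {
        return 0;
    }
    return highest_set_bit(socket_count) + 1;
}
```

Without the socket_count > 1 guard, the same formula would yield 1 for a
single socket, which is the unnecessary setting the commit message describes.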

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Michael Tokarev


18.12.2023 20:20, Daniel Henrique Barboza wrote:



On 12/18/23 13:22, Natanael Copa wrote:

strerrorname_np is non-portable and breaks building with musl libc.

Use strerror(errno) instead, like we do in other places.

Cc: qemu-sta...@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error 
msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa 
---
  target/riscv/kvm/kvm-cpu.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 45b6cf1cfa..117e33cf90 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
  multi_ext_cfg->supported = false;
  val = false;
  } else {
-    error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+    error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));



The reason I did this change, as described in 082e9e4a58ba mentioned in the
commit message, was precisely to avoid things like this:

qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error: no such 
file or directory


If KVM context puts its own unique meaning for ENOENT, maybe something like

 "unable to read KVM register: %s\n", errno == ENOENT ? "no such register" : 
strerror(errno)

would do it better?

To me, "No such file or directory" already tells everything and does not look
weird, but that's because I've seen this error message for all sorts of contexts
and got used to this. It is definitely understandable.

/mjt
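
The ENOENT special case suggested above could look roughly like the following
portable sketch. The helper names kvm_reg_errstr() and format_reg_error() are
hypothetical and exist only for this illustration; the sketch formats into a
buffer with snprintf() rather than calling QEMU's error_report(), so that it
stands alone.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Map ENOENT to a KVM-specific wording, fall back to strerror() otherwise. */
const char *kvm_reg_errstr(int err)
{
    return err == ENOENT ? "no such register" : strerror(err);
}

/* Build the message that would be handed to an error reporting function. */
int format_reg_error(char *buf, size_t len, const char *reg, int err)
{
    return snprintf(buf, len, "unable to read KVM register %s: %s",
                    reg, kvm_reg_errstr(err));
}
```

Only standard C and POSIX errno values are used here, so this would build
with musl as well as glibc.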

Re: [PATCH 1/1] hw/riscv/virt.c: fix the interrupts-extended property format of PLIC

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 06:05, Yong-Xuan Wang wrote:

The interrupts-extended property of PLIC only has 2 * hart number
fields when KVM is enabled; copying 4 * hart number fields to the fdt
will expose some uninitialized values.

In this patch, I also refactor the code about the setting of
interrupts-extended property of PLIC for improved readability.

Signed-off-by: Yong-Xuan Wang 
Reviewed-by: Jim Shu 
---


Reviewed-by: Daniel Henrique Barboza 


  hw/riscv/virt.c | 47 +++
  1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index d2eac2415619..e42baf82cab6 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -460,24 +460,6 @@ static void create_fdt_socket_plic(RISCVVirtState *s,
  "sifive,plic-1.0.0", "riscv,plic0"
  };
  
-if (kvm_enabled()) {

-plic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
-} else {
-plic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
-}
-
-for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
-if (kvm_enabled()) {
-plic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
-plic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
-} else {
-plic_cells[cpu * 4 + 0] = cpu_to_be32(intc_phandles[cpu]);
-plic_cells[cpu * 4 + 1] = cpu_to_be32(IRQ_M_EXT);
-plic_cells[cpu * 4 + 2] = cpu_to_be32(intc_phandles[cpu]);
-plic_cells[cpu * 4 + 3] = cpu_to_be32(IRQ_S_EXT);
-}
-}
-
  plic_phandles[socket] = (*phandle)++;
  plic_addr = memmap[VIRT_PLIC].base + (memmap[VIRT_PLIC].size * socket);
  plic_name = g_strdup_printf("/soc/plic@%lx", plic_addr);
@@ -490,8 +472,33 @@ static void create_fdt_socket_plic(RISCVVirtState *s,
(char **)&plic_compat,
ARRAY_SIZE(plic_compat));
  qemu_fdt_setprop(ms->fdt, plic_name, "interrupt-controller", NULL, 0);
-qemu_fdt_setprop(ms->fdt, plic_name, "interrupts-extended",
-plic_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 4);
+
+if (kvm_enabled()) {
+plic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
+
+for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+plic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
+plic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_EXT);
+}
+
+qemu_fdt_setprop(ms->fdt, plic_name, "interrupts-extended",
+ plic_cells,
+ s->soc[socket].num_harts * sizeof(uint32_t) * 2);
+   } else {
+plic_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
+
+for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
+plic_cells[cpu * 4 + 0] = cpu_to_be32(intc_phandles[cpu]);
+plic_cells[cpu * 4 + 1] = cpu_to_be32(IRQ_M_EXT);
+plic_cells[cpu * 4 + 2] = cpu_to_be32(intc_phandles[cpu]);
+plic_cells[cpu * 4 + 3] = cpu_to_be32(IRQ_S_EXT);
+}
+
+qemu_fdt_setprop(ms->fdt, plic_name, "interrupts-extended",
+ plic_cells,
+ s->soc[socket].num_harts * sizeof(uint32_t) * 4);
+}
+
  qemu_fdt_setprop_cells(ms->fdt, plic_name, "reg",
  0x0, plic_addr, 0x0, memmap[VIRT_PLIC].size);
  qemu_fdt_setprop_cell(ms->fdt, plic_name, "riscv,ndev",
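
The size mismatch fixed by this patch can be shown with a small standalone
sketch. It is not the QEMU implementation: plic_fill_cells() and to_be32() are
hypothetical helpers (to_be32() stands in for cpu_to_be32()), and the IRQ
numbers are the standard RISC-V external interrupt numbers. The function
returns the byte length that matches the number of cells actually written, so
the property size can never exceed the initialized part of the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IRQ_S_EXT 9   /* supervisor external interrupt */
#define IRQ_M_EXT 11  /* machine external interrupt */

/* Portable host-to-big-endian conversion, a stand-in for cpu_to_be32(). */
static uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;

    memcpy(&out, b, sizeof(out));
    return out;
}

/*
 * Fill the interrupts-extended cells: 2 cells per hart with KVM (S-mode
 * only), 4 cells per hart otherwise (M-mode and S-mode).
 * Returns the property size in bytes.
 */
size_t plic_fill_cells(uint32_t *cells, const uint32_t *intc_phandles,
                       size_t num_harts, bool kvm)
{
    size_t per_hart = kvm ? 2 : 4;

    for (size_t cpu = 0; cpu < num_harts; cpu++) {
        size_t i = cpu * per_hart;

        if (!kvm) {
            cells[i++] = to_be32(intc_phandles[cpu]);
            cells[i++] = to_be32(IRQ_M_EXT);
        }
        cells[i++] = to_be32(intc_phandles[cpu]);
        cells[i++] = to_be32(IRQ_S_EXT);
    }
    return num_harts * per_hart * sizeof(uint32_t);
}
```

Deriving both the fill loop and the returned size from the same per_hart
value is the design point: the original bug came from hard-coding the factor
4 in the size while the buffer only held 2 cells per hart under KVM.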

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 13:22, Natanael Copa wrote:

strerrorname_np is non-portable and breaks building with musl libc.

Use strerror(errno) instead, like we do in other places.

Cc: qemu-sta...@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error 
msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa 
---


Apart from my 'aesthetic preference' of using "error code %d" instead of
strerror(errno), which I stand by, this patch is fixing a build break
and it's an improvement from what we have now. Aesthetics can be dealt
with later.


Reviewed-by: Daniel Henrique Barboza 





  target/riscv/kvm/kvm-cpu.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 45b6cf1cfa..117e33cf90 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
  multi_ext_cfg->supported = false;
  val = false;
  } else {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));
  exit(EXIT_FAILURE);
  }
  } else {
@@ -895,8 +894,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, 
KVMScratchCPU *kvmcpu)
   *
   * Error out if we get any other errno.
   */
-error_report("Error when accessing get-reg-list, code: %s",
- strerrorname_np(errno));
+error_report("Error when accessing get-reg-list: %s",
+ strerror(errno));
  exit(EXIT_FAILURE);
  }
  
@@ -905,8 +904,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, KVMScratchCPU *kvmcpu)

  reglist->n = rl_struct.n;
  ret = ioctl(kvmcpu->cpufd, KVM_GET_REG_LIST, reglist);
  if (ret) {
-error_report("Error when reading KVM_GET_REG_LIST, code %s ",
- strerrorname_np(errno));
+error_report("Error when reading KVM_GET_REG_LIST: %s",
+ strerror(errno));
  exit(EXIT_FAILURE);
  }
  
@@ -927,9 +926,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, KVMScratchCPU *kvmcpu)

  reg.addr = (uint64_t)&val;
  ret = ioctl(kvmcpu->cpufd, KVM_GET_ONE_REG, &reg);
  if (ret != 0) {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));
  exit(EXIT_FAILURE);
  }

[PATCH v2 1/4] linux-headers: Update to Linux v6.7-rc5

2023-12-18 Thread Daniel Henrique Barboza

We'll add a new RISC-V linux-header file, but first let's update all
headers.

Headers for 'asm-loongarch' were added in this update.

Signed-off-by: Daniel Henrique Barboza 
Acked-by: Alistair Francis 
---
 include/standard-headers/drm/drm_fourcc.h |   2 +
 include/standard-headers/linux/pci_regs.h |  24 ++-
 include/standard-headers/linux/vhost_types.h  |   7 +
 .../standard-headers/linux/virtio_config.h|   5 +
 include/standard-headers/linux/virtio_pci.h   |  11 ++
 linux-headers/asm-arm64/kvm.h |  32 
 linux-headers/asm-generic/unistd.h|  14 +-
 linux-headers/asm-loongarch/bitsperlong.h |   1 +
 linux-headers/asm-loongarch/kvm.h | 108 +++
 linux-headers/asm-loongarch/mman.h|   1 +
 linux-headers/asm-loongarch/unistd.h  |   5 +
 linux-headers/asm-mips/unistd_n32.h   |   4 +
 linux-headers/asm-mips/unistd_n64.h   |   4 +
 linux-headers/asm-mips/unistd_o32.h   |   4 +
 linux-headers/asm-powerpc/unistd_32.h |   4 +
 linux-headers/asm-powerpc/unistd_64.h |   4 +
 linux-headers/asm-riscv/kvm.h |  12 ++
 linux-headers/asm-s390/unistd_32.h|   4 +
 linux-headers/asm-s390/unistd_64.h|   4 +
 linux-headers/asm-x86/unistd_32.h |   4 +
 linux-headers/asm-x86/unistd_64.h |   3 +
 linux-headers/asm-x86/unistd_x32.h|   3 +
 linux-headers/linux/iommufd.h | 180 +-
 linux-headers/linux/kvm.h |  11 ++
 linux-headers/linux/psp-sev.h |   1 +
 linux-headers/linux/stddef.h  |   9 +-
 linux-headers/linux/userfaultfd.h |   9 +-
 linux-headers/linux/vfio.h|  47 +++--
 linux-headers/linux/vhost.h   |   8 +
 29 files changed, 498 insertions(+), 27 deletions(-)
 create mode 100644 linux-headers/asm-loongarch/bitsperlong.h
 create mode 100644 linux-headers/asm-loongarch/kvm.h
 create mode 100644 linux-headers/asm-loongarch/mman.h
 create mode 100644 linux-headers/asm-loongarch/unistd.h

diff --git a/include/standard-headers/drm/drm_fourcc.h 
b/include/standard-headers/drm/drm_fourcc.h
index 72279f4d25..3afb70160f 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -322,6 +322,8 @@ extern "C" {
  * index 1 = Cr:Cb plane, [39:0] Cr1:Cb1:Cr0:Cb0 little endian
  */
 #define DRM_FORMAT_NV15fourcc_code('N', 'V', '1', '5') /* 2x2 
subsampled Cr:Cb plane */
+#define DRM_FORMAT_NV20fourcc_code('N', 'V', '2', '0') /* 2x1 
subsampled Cr:Cb plane */
+#define DRM_FORMAT_NV30fourcc_code('N', 'V', '3', '0') /* 
non-subsampled Cr:Cb plane */
 
 /*
  * 2 plane YCbCr MSB aligned
diff --git a/include/standard-headers/linux/pci_regs.h 
b/include/standard-headers/linux/pci_regs.h
index e5f558d964..a39193213f 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -80,6 +80,7 @@
 #define  PCI_HEADER_TYPE_NORMAL0
 #define  PCI_HEADER_TYPE_BRIDGE1
 #define  PCI_HEADER_TYPE_CARDBUS   2
+#define  PCI_HEADER_TYPE_MFD   0x80/* Multi-Function Device 
(possible) */
 
 #define PCI_BIST   0x0f/* 8 bits */
 #define  PCI_BIST_CODE_MASK0x0f/* Return result */
@@ -637,6 +638,7 @@
 #define PCI_EXP_RTCAP  0x1e/* Root Capabilities */
 #define  PCI_EXP_RTCAP_CRSVIS  0x0001  /* CRS Software Visibility capability */
 #define PCI_EXP_RTSTA  0x20/* Root Status */
+#define  PCI_EXP_RTSTA_PME_RQ_ID 0x /* PME Requester ID */
 #define  PCI_EXP_RTSTA_PME 0x0001 /* PME status */
 #define  PCI_EXP_RTSTA_PENDING 0x0002 /* PME pending */
 /*
@@ -930,12 +932,13 @@
 
 /* Process Address Space ID */
 #define PCI_PASID_CAP  0x04/* PASID feature register */
-#define  PCI_PASID_CAP_EXEC0x02/* Exec permissions Supported */
-#define  PCI_PASID_CAP_PRIV0x04/* Privilege Mode Supported */
+#define  PCI_PASID_CAP_EXEC0x0002  /* Exec permissions Supported */
+#define  PCI_PASID_CAP_PRIV0x0004  /* Privilege Mode Supported */
+#define  PCI_PASID_CAP_WIDTH   0x1f00
 #define PCI_PASID_CTRL 0x06/* PASID control register */
-#define  PCI_PASID_CTRL_ENABLE 0x01/* Enable bit */
-#define  PCI_PASID_CTRL_EXEC   0x02/* Exec permissions Enable */
-#define  PCI_PASID_CTRL_PRIV   0x04/* Privilege Mode Enable */
+#define  PCI_PASID_CTRL_ENABLE 0x0001  /* Enable bit */
+#define  PCI_PASID_CTRL_EXEC   0x0002  /* Exec permissions Enable */
+#define  PCI_PASID_CTRL_PRIV   0x0004  /* Privilege Mode Enable */
 #define PCI_EXT_CAP_PASID_SIZEOF   8
 
 /* Single Root I/O Virtualization */
@@ -975,6 +978,8 @@
 #define  PCI_LTR_VALUE_MASK0x03ff
 #define  PCI_LTR_SCALE_MASK0x1c00
 #define  PCI_LTR_SCALE_SHIFT   10
+#define  PCI_LTR_NOSNOOP_VALUE 0x03ff0

[PATCH v2 0/4] target/riscv: add RVV CSRs

2023-12-18 Thread Daniel Henrique Barboza

Hi,

This version was rebased on top of Alistair's riscv-to-apply.next. A
small tweak was needed in patch 4 due to changes in the branch.

I took the chance to update linux-headers to 6.7-rc5, although the
differences from the rc3 headers from v1 were minimal.

All patches acked.

Changes from v1:
- rebased to Alistair's riscv-to-apply.next
- patch 1:
  - updated headers to v6.7-rc5 
- patch 4:
  - use kvm_riscv_reg_id_ulong() instead of kvm_riscv_reg_id()
- v1 link: 
https://lore.kernel.org/qemu-riscv/20231130182748.1894790-1-dbarb...@ventanamicro.com/

Daniel Henrique Barboza (4):
  linux-headers: Update to Linux v6.7-rc5
  linux-headers: riscv: add ptrace.h
  target/riscv/kvm: do PR_RISCV_V_SET_CONTROL during realize()
  target/riscv/kvm: add RVV and Vector CSR regs

 include/standard-headers/drm/drm_fourcc.h |   2 +
 include/standard-headers/linux/pci_regs.h |  24 ++-
 include/standard-headers/linux/vhost_types.h  |   7 +
 .../standard-headers/linux/virtio_config.h|   5 +
 include/standard-headers/linux/virtio_pci.h   |  11 ++
 linux-headers/asm-arm64/kvm.h |  32 
 linux-headers/asm-generic/unistd.h|  14 +-
 linux-headers/asm-loongarch/bitsperlong.h |   1 +
 linux-headers/asm-loongarch/kvm.h | 108 +++
 linux-headers/asm-loongarch/mman.h|   1 +
 linux-headers/asm-loongarch/unistd.h  |   5 +
 linux-headers/asm-mips/unistd_n32.h   |   4 +
 linux-headers/asm-mips/unistd_n64.h   |   4 +
 linux-headers/asm-mips/unistd_o32.h   |   4 +
 linux-headers/asm-powerpc/unistd_32.h |   4 +
 linux-headers/asm-powerpc/unistd_64.h |   4 +
 linux-headers/asm-riscv/kvm.h |  12 ++
 linux-headers/asm-riscv/ptrace.h  | 132 +
 linux-headers/asm-s390/unistd_32.h|   4 +
 linux-headers/asm-s390/unistd_64.h|   4 +
 linux-headers/asm-x86/unistd_32.h |   4 +
 linux-headers/asm-x86/unistd_64.h |   3 +
 linux-headers/asm-x86/unistd_x32.h|   3 +
 linux-headers/linux/iommufd.h | 180 +-
 linux-headers/linux/kvm.h |  11 ++
 linux-headers/linux/psp-sev.h |   1 +
 linux-headers/linux/stddef.h  |   9 +-
 linux-headers/linux/userfaultfd.h |   9 +-
 linux-headers/linux/vfio.h|  47 +++--
 linux-headers/linux/vhost.h   |   8 +
 scripts/update-linux-headers.sh   |   3 +
 target/riscv/kvm/kvm-cpu.c| 103 ++
 32 files changed, 736 insertions(+), 27 deletions(-)
 create mode 100644 linux-headers/asm-loongarch/bitsperlong.h
 create mode 100644 linux-headers/asm-loongarch/kvm.h
 create mode 100644 linux-headers/asm-loongarch/mman.h
 create mode 100644 linux-headers/asm-loongarch/unistd.h
 create mode 100644 linux-headers/asm-riscv/ptrace.h

-- 
2.43.0

[PATCH v2 2/4] linux-headers: riscv: add ptrace.h

2023-12-18 Thread Daniel Henrique Barboza

KVM vector support for RISC-V requires the linux-header ptrace.h.

Signed-off-by: Daniel Henrique Barboza 
Acked-by: Alistair Francis 
---
 linux-headers/asm-riscv/ptrace.h | 132 +++
 scripts/update-linux-headers.sh  |   3 +
 2 files changed, 135 insertions(+)
 create mode 100644 linux-headers/asm-riscv/ptrace.h

diff --git a/linux-headers/asm-riscv/ptrace.h b/linux-headers/asm-riscv/ptrace.h
new file mode 100644
index 00..1e3166caca
--- /dev/null
+++ b/linux-headers/asm-riscv/ptrace.h
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ */
+
+#ifndef _ASM_RISCV_PTRACE_H
+#define _ASM_RISCV_PTRACE_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+#define PTRACE_GETFDPIC        33
+
+#define PTRACE_GETFDPIC_EXEC   0
+#define PTRACE_GETFDPIC_INTERP 1
+
+/*
+ * User-mode register state for core dumps, ptrace, sigcontext
+ *
+ * This decouples struct pt_regs from the userspace ABI.
+ * struct user_regs_struct must form a prefix of struct pt_regs.
+ */
+struct user_regs_struct {
+   unsigned long pc;
+   unsigned long ra;
+   unsigned long sp;
+   unsigned long gp;
+   unsigned long tp;
+   unsigned long t0;
+   unsigned long t1;
+   unsigned long t2;
+   unsigned long s0;
+   unsigned long s1;
+   unsigned long a0;
+   unsigned long a1;
+   unsigned long a2;
+   unsigned long a3;
+   unsigned long a4;
+   unsigned long a5;
+   unsigned long a6;
+   unsigned long a7;
+   unsigned long s2;
+   unsigned long s3;
+   unsigned long s4;
+   unsigned long s5;
+   unsigned long s6;
+   unsigned long s7;
+   unsigned long s8;
+   unsigned long s9;
+   unsigned long s10;
+   unsigned long s11;
+   unsigned long t3;
+   unsigned long t4;
+   unsigned long t5;
+   unsigned long t6;
+};
+
+struct __riscv_f_ext_state {
+   __u32 f[32];
+   __u32 fcsr;
+};
+
+struct __riscv_d_ext_state {
+   __u64 f[32];
+   __u32 fcsr;
+};
+
+struct __riscv_q_ext_state {
+   __u64 f[64] __attribute__((aligned(16)));
+   __u32 fcsr;
+   /*
+* Reserved for expansion of sigcontext structure.  Currently zeroed
+* upon signal, and must be zero upon sigreturn.
+*/
+   __u32 reserved[3];
+};
+
+struct __riscv_ctx_hdr {
+   __u32 magic;
+   __u32 size;
+};
+
+struct __riscv_extra_ext_header {
+   __u32 __padding[129] __attribute__((aligned(16)));
+   /*
+* Reserved for expansion of sigcontext structure.  Currently zeroed
+* upon signal, and must be zero upon sigreturn.
+*/
+   __u32 reserved;
+   struct __riscv_ctx_hdr hdr;
+};
+
+union __riscv_fp_state {
+   struct __riscv_f_ext_state f;
+   struct __riscv_d_ext_state d;
+   struct __riscv_q_ext_state q;
+};
+
+struct __riscv_v_ext_state {
+   unsigned long vstart;
+   unsigned long vl;
+   unsigned long vtype;
+   unsigned long vcsr;
+   unsigned long vlenb;
+   void *datap;
+   /*
+* In signal handler, datap will be set a correct user stack offset
+* and vector registers will be copied to the address of datap
+* pointer.
+*/
+};
+
+struct __riscv_v_regset_state {
+   unsigned long vstart;
+   unsigned long vl;
+   unsigned long vtype;
+   unsigned long vcsr;
+   unsigned long vlenb;
+   char vreg[];
+};
+
+/*
+ * According to spec: The number of bits in a single vector register,
+ * VLEN >= ELEN, which must be a power of 2, and must be no greater than
+ * 2^16 = 65536bits = 8192bytes
+ */
+#define RISCV_MAX_VLENB (8192)
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_PTRACE_H */
diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 34295c0fe5..a0006eec6f 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -156,6 +156,9 @@ for arch in $ARCHLIST; do
 cp_portable "$tmpdir/bootparam.h" \
 "$output/include/standard-headers/asm-$arch"
 fi
+if [ $arch = riscv ]; then
+cp "$tmpdir/include/asm/ptrace.h" "$output/linux-headers/asm-riscv/"
+fi
 done
 
 rm -rf "$output/linux-headers/linux"
-- 
2.43.0
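
The size bound at the end of ptrace.h can be checked with a little arithmetic. The sketch below assumes the 32 architectural vector registers (v0..v31) are what the flexible vreg[] member of struct __riscv_v_regset_state carries; the helper names vlenb_is_valid() and vreg_payload_size() are hypothetical and appear in neither QEMU nor the kernel header.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the constant from the header: VLEN <= 2^16 bits = 8192 bytes. */
#define RISCV_MAX_VLENB 8192

/* The RISC-V vector extension defines 32 vector registers, v0..v31. */
#define RISCV_NUM_VREGS 32

/*
 * Hypothetical helper: a register length in bytes is plausible if it is a
 * non-zero power of two that does not exceed RISCV_MAX_VLENB.
 */
static int vlenb_is_valid(unsigned long vlenb)
{
    return vlenb != 0 && vlenb <= RISCV_MAX_VLENB &&
           (vlenb & (vlenb - 1)) == 0;
}

/*
 * Hypothetical helper: bytes needed to hold all 32 vector registers
 * (assumed here to be the payload of vreg[]).
 */
static size_t vreg_payload_size(unsigned long vlenb)
{
    assert(vlenb_is_valid(vlenb));
    return (size_t)RISCV_NUM_VREGS * vlenb;
}
```

With vlenb = 16 (VLEN = 128 bits) the payload is 512 bytes; at the RISCV_MAX_VLENB limit it grows to 256 KiB.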

[PATCH v2 3/4] target/riscv/kvm: do PR_RISCV_V_SET_CONTROL during realize()

2023-12-18 Thread Daniel Henrique Barboza

Linux RISC-V vector documentation (Documentation/arch/riscv/vector.rst)
mandates a prctl() in order to allow a userspace thread to use the
Vector extension from the host.

This is something to be done at realize() time, after init(), when we
have already decided whether we're using RVV or not. We don't have a
realize() callback for KVM yet, so add kvm_cpu_realize() and enable RVV
for the thread via PR_RISCV_V_SET_CONTROL.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Alistair Francis 
---
 target/riscv/kvm/kvm-cpu.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 62a1e51f0a..0298c5dd69 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -18,6 +18,7 @@
 
 #include "qemu/osdep.h"
 #include 
+#include <sys/prctl.h>
 
 #include 
 
@@ -47,6 +48,9 @@
 #include "sysemu/runstate.h"
 #include "hw/riscv/numa.h"
 
+#define PR_RISCV_V_SET_CONTROL     69
+#define PR_RISCV_V_VSTATE_CTRL_ON  2
+
 void riscv_kvm_aplic_request(void *opaque, int irq, int level)
 {
 kvm_set_irq(kvm_state, irq, !!level);
@@ -1490,11 +1494,36 @@ static void kvm_cpu_instance_init(CPUState *cs)
 }
 }
 
+/*
+ * We'll get here via the following path:
+ *
+ * riscv_cpu_realize()
+ *   -> cpu_exec_realizefn()
+ *  -> kvm_cpu_realize() (via accel_cpu_common_realize())
+ */
+static bool kvm_cpu_realize(CPUState *cs, Error **errp)
+{
+RISCVCPU *cpu = RISCV_CPU(cs);
+int ret;
+
+if (riscv_has_ext(&cpu->env, RVV)) {
+ret = prctl(PR_RISCV_V_SET_CONTROL, PR_RISCV_V_VSTATE_CTRL_ON);
+if (ret) {
+error_setg(errp, "Error in prctl PR_RISCV_V_SET_CONTROL, code: %s",
+   strerrorname_np(errno));
+return false;
+}
+}
+
+   return true;
+}
+
 static void kvm_cpu_accel_class_init(ObjectClass *oc, void *data)
 {
 AccelCPUClass *acc = ACCEL_CPU_CLASS(oc);
 
 acc->cpu_instance_init = kvm_cpu_instance_init;
+acc->cpu_target_realize = kvm_cpu_realize;
 }
 
 static const TypeInfo kvm_cpu_accel_type_info = {
-- 
2.43.0
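
Outside of QEMU, the prctl() call that patch 3/4 issues from kvm_cpu_realize() can be tried with a small standalone helper. This is only a sketch: the two constants are copied from the patch and guarded in case the installed kernel headers already provide them, and enable_vector_for_thread() is a hypothetical name. On hosts without RISC-V vector support the call is expected to fail, which the helper reports as a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/* Values copied from the patch; newer kernel headers may already define them. */
#ifndef PR_RISCV_V_SET_CONTROL
#define PR_RISCV_V_SET_CONTROL 69
#endif
#ifndef PR_RISCV_V_VSTATE_CTRL_ON
#define PR_RISCV_V_VSTATE_CTRL_ON 2
#endif

/*
 * Hypothetical helper: ask the kernel to enable the Vector extension for
 * the calling thread. Returns 0 on success or a negative errno value.
 */
static int enable_vector_for_thread(void)
{
    /* prctl() is variadic, so pass the argument as unsigned long explicitly. */
    if (prctl(PR_RISCV_V_SET_CONTROL,
              (unsigned long)PR_RISCV_V_VSTATE_CTRL_ON) != 0) {
        int err = errno;
        assert(err > 0);   /* a failing prctl() sets errno */
        return -err;
    }
    return 0;
}
```

A caller would typically turn a negative return into an error message with strerror(-ret) and refuse to start the vCPU, which is roughly what the patch does through error_setg().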

[PATCH v2 4/4] target/riscv/kvm: add RVV and Vector CSR regs

2023-12-18 Thread Daniel Henrique Barboza

Add support for RVV and Vector CSR KVM regs vstart, vl and vtype.

Support for vregs[] requires KVM side changes and an extra reg (vlenb)
and will be added later.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Alistair Francis 
---
 target/riscv/kvm/kvm-cpu.c | 74 ++
 1 file changed, 74 insertions(+)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 0298c5dd69..dfebcc1692 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -105,6 +105,10 @@ static uint64_t kvm_riscv_reg_id_u64(uint64_t type, uint64_t idx)
 
 #define RISCV_FP_D_REG(idx)  kvm_riscv_reg_id_u64(KVM_REG_RISCV_FP_D, idx)
 
+#define RISCV_VECTOR_CSR_REG(env, name) \
+kvm_riscv_reg_id_ulong(env, KVM_REG_RISCV_VECTOR, \
+   KVM_REG_RISCV_VECTOR_CSR_REG(name))
+
 #define KVM_RISCV_GET_CSR(cs, env, csr, reg) \
 do { \
 int _ret = kvm_get_one_reg(cs, RISCV_CSR_REG(env, csr), &reg); \
@@ -158,6 +162,7 @@ static KVMCPUConfig kvm_misa_ext_cfgs[] = {
 KVM_MISA_CFG(RVH, KVM_RISCV_ISA_EXT_H),
 KVM_MISA_CFG(RVI, KVM_RISCV_ISA_EXT_I),
 KVM_MISA_CFG(RVM, KVM_RISCV_ISA_EXT_M),
+KVM_MISA_CFG(RVV, KVM_RISCV_ISA_EXT_V),
 };
 
 static void kvm_cpu_get_misa_ext_cfg(Object *obj, Visitor *v,
@@ -704,6 +709,65 @@ static void kvm_riscv_put_regs_timer(CPUState *cs)
 env->kvm_timer_dirty = false;
 }
 
+static int kvm_riscv_get_regs_vector(CPUState *cs)
+{
+CPURISCVState *env = &RISCV_CPU(cs)->env;
+target_ulong reg;
+int ret = 0;
+
+if (!riscv_has_ext(env, RVV)) {
+return 0;
+}
+
+ret = kvm_get_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vstart), &reg);
+if (ret) {
+return ret;
+}
+env->vstart = reg;
+
+ret = kvm_get_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vl), &reg);
+if (ret) {
+return ret;
+}
+env->vl = reg;
+
+ret = kvm_get_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vtype), &reg);
+if (ret) {
+return ret;
+}
+env->vtype = reg;
+
+return 0;
+}
+
+static int kvm_riscv_put_regs_vector(CPUState *cs)
+{
+CPURISCVState *env = &RISCV_CPU(cs)->env;
+target_ulong reg;
+int ret = 0;
+
+if (!riscv_has_ext(env, RVV)) {
+return 0;
+}
+
+reg = env->vstart;
+ret = kvm_set_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vstart), &reg);
+if (ret) {
+return ret;
+}
+
+reg = env->vl;
+ret = kvm_set_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vl), &reg);
+if (ret) {
+return ret;
+}
+
+reg = env->vtype;
+ret = kvm_set_one_reg(cs, RISCV_VECTOR_CSR_REG(env, vtype), &reg);
+
+return ret;
+}
+
 typedef struct KVMScratchCPU {
 int kvmfd;
 int vmfd;
@@ -1001,6 +1065,11 @@ int kvm_arch_get_registers(CPUState *cs)
 return ret;
 }
 
+ret = kvm_riscv_get_regs_vector(cs);
+if (ret) {
+return ret;
+}
+
 return ret;
 }
 
@@ -1041,6 +1110,11 @@ int kvm_arch_put_registers(CPUState *cs, int level)
 return ret;
 }
 
+ret = kvm_riscv_put_regs_vector(cs);
+if (ret) {
+return ret;
+}
+
 if (KVM_PUT_RESET_STATE == level) {
 RISCVCPU *cpu = RISCV_CPU(cs);
 if (cs->cpu_index == 0) {
-- 
2.43.0
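
Patch 4/4 reads and writes vstart, vl and vtype with three hand-written call sequences. Purely as an illustration, and not as what the patch does, the same get/put symmetry can be written as a table-driven loop. Everything below is a mock: struct vec_env stands in for the relevant CPURISCVState fields, and mock_get_one_reg()/mock_set_one_reg() stand in for kvm_get_one_reg()/kvm_set_one_reg() with made-up register ids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the vector CSR fields of CPURISCVState (hypothetical). */
struct vec_env {
    uint64_t vstart;
    uint64_t vl;
    uint64_t vtype;
};

/* Made-up register ids and a fake KVM-side register file. */
enum { REG_VSTART, REG_VL, REG_VTYPE, REG_COUNT };
static uint64_t fake_kvm_regs[REG_COUNT];

/* Mocks of the one-reg accessors: 0 on success, -1 on an unknown id. */
static int mock_get_one_reg(int id, uint64_t *val)
{
    if (id < 0 || id >= REG_COUNT) {
        return -1;
    }
    *val = fake_kvm_regs[id];
    return 0;
}

static int mock_set_one_reg(int id, const uint64_t *val)
{
    if (id < 0 || id >= REG_COUNT) {
        return -1;
    }
    fake_kvm_regs[id] = *val;
    return 0;
}

/* One row per CSR: which register id maps to which field of vec_env. */
static const struct {
    int id;
    size_t offset;
} vec_csrs[] = {
    { REG_VSTART, offsetof(struct vec_env, vstart) },
    { REG_VL,     offsetof(struct vec_env, vl) },
    { REG_VTYPE,  offsetof(struct vec_env, vtype) },
};

#define VEC_CSR_COUNT (sizeof(vec_csrs) / sizeof(vec_csrs[0]))

/* Copy every listed CSR from the fake register file into env. */
static int get_regs_vector(struct vec_env *env)
{
    assert(env != NULL);
    for (size_t i = 0; i < VEC_CSR_COUNT; i++) {
        uint64_t reg;
        int ret = mock_get_one_reg(vec_csrs[i].id, &reg);
        if (ret) {
            return ret;   /* stop at the first failure, as the patch does */
        }
        *(uint64_t *)((char *)env + vec_csrs[i].offset) = reg;
    }
    return 0;
}

/* Copy every listed CSR from env into the fake register file. */
static int put_regs_vector(const struct vec_env *env)
{
    assert(env != NULL);
    for (size_t i = 0; i < VEC_CSR_COUNT; i++) {
        uint64_t reg =
            *(const uint64_t *)((const char *)env + vec_csrs[i].offset);
        int ret = mock_set_one_reg(vec_csrs[i].id, &reg);
        if (ret) {
            return ret;
        }
    }
    return 0;
}
```

The explicit form in the patch is easier to grep and matches the surrounding code; a table mainly pays off once more registers, such as the vlenb mentioned as future work, are added.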

Re: [PATCH] hw/acpi: propagate vcpu hotplug after switch to modern interface

2023-12-18 Thread Aaron Young

Hi.  I wanted to follow up with information to test/reproduce this BUG/issue.

 Steps to reproduce:

 1. Use the following options with QEMU (configured with OVMF):
-S -smp 2,maxcpus=260,sockets=2,cores=65,threads=2

 2. Start QEMU and when QEMU reaches the paused state (due to -S),
issue the following command from the QMP Shell:
device_add driver=qemu64-x86_64-cpu socket-id=1 core-id=64 thread-id=0 

 3. Continue booting the VM and OVMF will report the following error
to the OVMF debug log indicating the BUG/error condition:
QEMU v2.7 reset bug: BootCpuCount=3 Present=2

 BTW: This BUG often results in intermittent OVMF Exceptions/ASSERTs as well.

 Thanks,

 -Aaron
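
As a rough illustration of why the device_add command in step 2 lands above the legacy limit: assuming the x86 APIC ID is packed as socket | core | thread, with each field rounded up to a whole number of bits and one die per socket (assumptions of this sketch, not something stated in this thread), cores=65 needs 7 bits and threads=2 needs 1 bit. The helper names below are hypothetical.

```c
#include <assert.h>

/* Number of bits needed to encode `count` distinct ids: ceil(log2(count)). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/*
 * Assumed layout: [ socket | core | thread ], each field padded to a
 * power-of-two range. `cores` and `threads` are the per-socket and
 * per-core counts from -smp.
 */
static unsigned apic_id(unsigned cores, unsigned threads,
                        unsigned socket_id, unsigned core_id,
                        unsigned thread_id)
{
    unsigned thread_bits = bits_for_count(threads);
    unsigned core_bits = bits_for_count(cores);

    return (socket_id << (core_bits + thread_bits)) |
           (core_id << thread_bits) |
           thread_id;
}
```

Under these assumptions socket-id=1 core-id=64 thread-id=0 maps to (1 << 8) | (64 << 1) | 0 = 384, which is above 255 and therefore outside what the legacy interface can represent.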



From: qemu-devel-bounces+aaron.young=oracle@nongnu.org 
 on behalf of Aaron Young 

Sent: Tuesday, December 12, 2023 8:51 AM
To: qemu-devel@nongnu.org
Cc: m...@redhat.com; imamm...@redhat.com
Subject: [PATCH] hw/acpi: propagate vcpu hotplug after switch to modern interface

If a vcpu with an apic-id that is not supported by the legacy
interface (>255) is hot-plugged, the legacy code will dynamically switch
to the modern interface. However, the hotplug event is not forwarded to
the new interface resulting in the vcpu not being fully/properly added
to the machine config. This BUG is evidenced by OVMF when it
attempts to count the vcpus and finds that the counts reported by the
fw_cfg interface and the modern hotplug interface are inconsistent.

The fix is to propagate the hotplug event after making the switch from
the legacy interface to the modern interface.

Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Signed-off-by: Aaron Young 
---
 hw/acpi/cpu_hotplug.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 634bbec..6f78db0 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -59,7 +59,8 @@ static const MemoryRegionOps AcpiCpuHotplug_ops = {
 },
 };

-static void acpi_set_cpu_present_bit(AcpiCpuHotplug *g, CPUState *cpu)
+static void acpi_set_cpu_present_bit(AcpiCpuHotplug *g, CPUState *cpu,
+ bool *swtchd_to_modern)
 {
 CPUClass *k = CPU_GET_CLASS(cpu);
 int64_t cpu_id;
@@ -68,23 +69,34 @@ static void acpi_set_cpu_present_bit(AcpiCpuHotplug *g, CPUState *cpu)
 if ((cpu_id / 8) >= ACPI_GPE_PROC_LEN) {
 object_property_set_bool(g->device, "cpu-hotplug-legacy", false,
  &error_abort);
+*swtchd_to_modern = true;
 return;
 }

+*swtchd_to_modern = false;
 g->sts[cpu_id / 8] |= (1 << (cpu_id % 8));
 }

 void legacy_acpi_cpu_plug_cb(HotplugHandler *hotplug_dev,
  AcpiCpuHotplug *g, DeviceState *dev, Error **errp)
 {
-acpi_set_cpu_present_bit(g, CPU(dev));
-acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
+bool swtchd_to_modern;
+Error *local_err = NULL;
+
+acpi_set_cpu_present_bit(g, CPU(dev), &swtchd_to_modern);
+if (swtchd_to_modern) {
+/* propagate the hotplug to the modern interface */
+hotplug_handler_plug(hotplug_dev, dev, &local_err);
+} else {
+acpi_send_event(DEVICE(hotplug_dev), ACPI_CPU_HOTPLUG_STATUS);
+}
 }

 void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
   AcpiCpuHotplug *gpe_cpu, uint16_t base)
 {
 CPUState *cpu;
+bool swtchd_to_modern;

 memory_region_init_io(&gpe_cpu->io, owner, &AcpiCpuHotplug_ops,
   gpe_cpu, "acpi-cpu-hotplug", ACPI_GPE_PROC_LEN);
@@ -92,7 +104,7 @@ void legacy_acpi_cpu_hotplug_init(MemoryRegion *parent, Object *owner,
 gpe_cpu->device = owner;

 CPU_FOREACH(cpu) {
-acpi_set_cpu_present_bit(gpe_cpu, cpu);
+acpi_set_cpu_present_bit(gpe_cpu, cpu, &swtchd_to_modern);
 }
 }

--
1.8.3.1
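
The bitmap logic that the patch extends can be sketched in isolation. The mock below assumes the legacy status bitmap is 32 bytes long (256 CPU ids); GPE_PROC_LEN, struct legacy_hotplug and legacy_set_present() are hypothetical stand-ins for ACPI_GPE_PROC_LEN, AcpiCpuHotplug and acpi_set_cpu_present_bit(). Unlike the patch, which reports the switch through a bool out-parameter, this sketch uses the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GPE_PROC_LEN 32   /* assumed legacy bitmap length in bytes */

struct legacy_hotplug {
    uint8_t sts[GPE_PROC_LEN];   /* one presence bit per CPU id */
};

/*
 * Returns true if the presence bit was recorded in the legacy bitmap.
 * Returns false if cpu_id does not fit, in which case the caller has to
 * hand the hotplug event over to the modern interface.
 */
static bool legacy_set_present(struct legacy_hotplug *g, int64_t cpu_id)
{
    assert(cpu_id >= 0);
    if (cpu_id / 8 >= GPE_PROC_LEN) {
        return false;
    }
    g->sts[cpu_id / 8] |= (uint8_t)(1u << (cpu_id % 8));
    return true;
}
```

The point of the fix is the false branch: once the id no longer fits, the event must still reach some handler rather than being silently dropped.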

Re: [External] Re: [PATCH v2 09/20] util/dsa: Implement DSA task asynchronous completion thread model.

2023-12-18 Thread Hao Xiang

On Sun, Dec 17, 2023 at 7:11 PM Wang, Lei  wrote:
>
> On 11/14/2023 13:40, Hao Xiang wrote:
> > * Create a dedicated thread for DSA task completion.
> > * DSA completion thread runs a loop and poll for completed tasks.
> > * Start and stop DSA completion thread during DSA device start stop.
> >
> > User space application can directly submit task to Intel DSA
> > accelerator by writing to DSA's device memory (mapped in user space).
>
> > +}
> > +return;
> > +}
> > +} else {
> > +assert(batch_status == DSA_COMP_BATCH_FAIL ||
> > +batch_status == DSA_COMP_BATCH_PAGE_FAULT);
>
> Nit: indentation is broken here.
>
> > +}
> > +
> > +for (int i = 0; i < count; i++) {
> > +
> > +completion = &batch_task->completions[i];
> > +status = completion->status;
> > +
> > +if (status == DSA_COMP_SUCCESS) {
> > +results[i] = (completion->result == 0);
> > +continue;
> > +}
> > +
> > +if (status != DSA_COMP_PAGE_FAULT_NOBOF) {
> > +fprintf(stderr,
> > +"Unexpected completion status = %u.\n", status);
> > +assert(false);
> > +}
> > +}
> > +}
> > +
> > +/**
> > + * @brief Handles an asynchronous DSA batch task completion.
> > + *
> > + * @param task A pointer to the batch buffer zero task structure.
> > + */
> > +static void
> > +dsa_batch_task_complete(struct buffer_zero_batch_task *batch_task)
> > +{
> > +batch_task->status = DSA_TASK_COMPLETION;
> > +batch_task->completion_callback(batch_task);
> > +}
> > +
> > +/**
> > + * @brief The function entry point called by a dedicated DSA
> > + *work item completion thread.
> > + *
> > + * @param opaque A pointer to the thread context.
> > + *
> > + * @return void* Not used.
> > + */
> > +static void *
> > +dsa_completion_loop(void *opaque)
>
> Per my understanding, if a multifd sending thread corresponds to a DSA device,
> then the batch tasks are executed in parallel which means a task may be
> completed slower than another even if this task is enqueued earlier than it.
> If we poll on the slower task first it will block the handling of the faster
> one, even if the zero checking task for that thread is finished and it can go
> ahead and send the data to the wire, this may lower the network resource
> utilization.
>

Hi Lei, thanks for reviewing. You are correct that we can keep polling
a task that was enqueued first while others in the queue have already
completed. In fact, only one DSA completion thread (the polling thread)
is used here even when multiple DSA devices are used. The polling loop
is the most CPU-intensive activity in the DSA workflow, and that works
directly against the goal of saving CPU usage. The trade-off I want to
make here is slightly higher latency on DSA task completion in exchange
for more CPU savings. A single DSA engine can reach 30 GB/s throughput
on memory comparison operations. We use the kernel TCP stack for network
transfer, and the best I have seen is around 10 GB/s throughput. RDMA can
potentially go higher, but I am not sure it can exceed 30 GB/s
anytime soon.
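
The head-of-line effect discussed here can be made concrete with a toy model. The sketch assumes a single poller that handles tasks strictly in enqueue order and ignores the cost of polling itself; the function names are hypothetical and unrelated to the DSA code in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * finish[i]   : time at which task i (in enqueue order) completes on the
 *               device.
 * observed[i] : time at which a single FIFO poller reports it as complete.
 */
static void fifo_observed_times(const unsigned *finish, size_t n,
                                unsigned *observed)
{
    unsigned now = 0;

    for (size_t i = 0; i < n; i++) {
        if (finish[i] > now) {
            now = finish[i];   /* spin until task i is done */
        }
        observed[i] = now;     /* later tasks cannot be reported earlier */
    }
}

/* Sum of (observed - finish): the latency added by in-order polling. */
static unsigned total_added_latency(const unsigned *finish,
                                    const unsigned *observed, size_t n)
{
    unsigned total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(observed[i] >= finish[i]);
        total += observed[i] - finish[i];
    }
    return total;
}
```

For finish times {5, 1, 3} the poller reports all three tasks at time 5, so the second and third tasks wait 4 and 2 extra time units. That is the latency being traded for running only one polling thread.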

> > +{
> > +struct dsa_completion_thread *thread_context =
> > +(struct dsa_completion_thread *)opaque;
> > +struct buffer_zero_batch_task *batch_task;
> > +struct dsa_device_group *group = thread_context->group;
> > +
> > +rcu_register_thread();
> > +
> > +thread_context->thread_id = qemu_get_thread_id();
> > +qemu_sem_post(&thread_context->sem_init_done);
> > +
> > +while (thread_context->running) {
> > +batch_task = dsa_task_dequeue(group);
> > +assert(batch_task != NULL || !group->running);
> > +if (!group->running) {
> > +assert(!thread_context->running);
> > +break;
> > +}
> > +if (batch_task->task_type == DSA_TASK) {
> > +poll_task_completion(batch_task);
> > +} else {
> > +assert(batch_task->task_type == DSA_BATCH_TASK);
> > +poll_batch_task_completion(batch_task);
> > +}
> > +
> > +dsa_batch_task_complete(batch_task);
> > +}
> > +
> > +rcu_unregister_thread();
> > +return NULL;
> > +}
> > +
> > +/**
> > + * @brief Initializes a DSA completion thread.
> > + *
> > + * @param completion_thread A pointer to the completion thread context.
> > + * @param group A pointer to the DSA device group.
> > + */
> > +static void
> > +dsa_completion_thread_init(
> > +struct dsa_completion_thread *completion_thread,
> > +struct dsa_device_group *group)
> > +{
> > +completion_thread->stopping = false;
> > +completion_thread->running = true;
> > +completion_thread->thread_id = -1;
> > +qemu_sem_init(&completion_thread->sem_init_done, 0);
> > +completion_thread->group = group;
> > +
> > +qemu_thread_create(&completion_thread->thread,
> > +   DSA_COMPLETION_THREAD,
> > +

Re: [PATCH 04/12] hw/block/fdc: Expose internal header

2023-12-18 Thread Bernhard Beschow




On 18 December 2023 10:54:56 UTC, BALATON Zoltan wrote:
>On Sun, 17 Dec 2023, Bernhard Beschow wrote:
>> On 17 December 2023 15:47:33 UTC, BALATON Zoltan wrote:
>>> On Sun, 17 Dec 2023, Bernhard Beschow wrote:
>>>> Exposing the internal header allows for exposing struct FDCtrlISABus, which is
>>>> encouraged by qdev guidelines.
>>> 
>>> Hopefully the guidelines don't encourage this as object orientation indeed
>>> encourages object encapsulation so only the object itself should poke its
>>> internals and other objects should use methods to change object state. In
>>> QOM some object states were exposed in public headers to allow embedding
>>> those objects in other objects because C needs the struct size to allow
>>> that. This was to simplify memory management so the embedded objects don't
>>> need to be tracked and freed but would be created and freed with the other
>>> object embedding it but this does not mean the other object should poke
>>> into these objects or that this is a general guideline to expose internal
>>> object state. I'd say the exposed objects are an exception instead of a
>>> recommended guideline and only allowed for objects that need to be embedded
>>> in others but generally object encapsulation would be better to preserve
>>> where possible. This patch exposes objects so others can poke into them
>>> which would make those other objects dependent on the implementation of
>>> these objects making these harder to change in the future so a better way
>>> may be to add methods to fdc and serial to allow changing their base
>>> address and map/unmap their ports and keep their internals unexposed.
>> 
>> Each ISADevice sub class would need convenience methods as well as each 
>> state class. This series touches three of each: fdc, parallel, serial. And 
>> each of those needs two convenience methods: set_enabled() and set_address(). 
>> This would add another 12 functions on top of the current ones.
>
>If all ISA devices need this then these should really be methods of ISADevice 
>but since that's just an empty wrapper over devices each of which handles its 
>own ports, the ISADevice does not know about those and since each device may 
>have different ports and not all of them use portio lists for this, moving 
>port handling to ISADevice might be too big a refactoring to do for this. 
>Keeping these functions with the superio component devices so their 
>implementation is kept private is still worth it in my opinion, so even if 
>that adds 2 functions to superio component devices (which is not all ISA 
>devices, just a limited set) it seems to be a better approach to me than 
>breaking encapsulation of objects. These are simple access methods for 
>internal object state which are common in object oriented programming.
>
>> Then ISASuperIODevice would require at least 6 more such methods (not 
>> counting the unneeded ones for IDE which might be desirable for 
>> consistency). So in the end we'd have at least 18 more methods. Is this 
>> really worth it?
>
>We may do without these if we say superio is just a container of components so 
>don't add forwarding methods but we can call the accessor methods of component 
>objects from vt82c686.c. That's still better than reaching into object 
>internals from foreign objects.

Version 2 is out which should address all of your comments.

Best regards,
Bernhard

>
>Regards,
>BALATON Zoltan
>
>> I didn't feel very comfortable going this route, so ended up with the 
>> current solution poking the states directly. I'm open to different 
>> approaches including the one above but I'd really like to know the opinion 
>> of the maintainers, too.
>> 
>> Best regards,
>> Bernhard
>> 
>>> 
>>> Regards,
>>> BALATON Zoltan
>>> 
>>>> Signed-off-by: Bernhard Beschow 
>>>> ---
>>>> MAINTAINERS   | 2 +-
>>>> hw/block/fdc-internal.h => include/hw/block/fdc.h | 4 ++--
>>>> hw/block/fdc-isa.c| 2 +-
>>>> hw/block/fdc-sysbus.c | 2 +-
>>>> hw/block/fdc.c| 2 +-
>>>> 5 files changed, 6 insertions(+), 6 deletions(-)
>>>> rename hw/block/fdc-internal.h => include/hw/block/fdc.h (98%)
>>>> 
>>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>>> index b4718fcf59..939f518701 100644
>>>> --- a/MAINTAINERS
>>>> +++ b/MAINTAINERS
>>>> @@ -1945,9 +1945,9 @@ M: John Snow 
>>>>  L: qemu-bl...@nongnu.org
>>>>  S: Odd Fixes
>>>>  F: hw/block/fdc.c
>>>> -F: hw/block/fdc-internal.h
>>>>  F: hw/block/fdc-isa.c
>>>>  F: hw/block/fdc-sysbus.c
>>>> +F: include/hw/block/fdc.h
>>>>  F: include/hw/block/fdc-isa.h
>>>>  F: tests/qtest/fdc-test.c
>>>>  T: git https://gitlab.com/jsnow/qemu.git ide
>>>> diff --git a/hw/block/fdc-internal.h b/include/hw/block/fdc.h
>>>> similarity index 98%
>>>> rename from hw/block/fdc-internal.h
>>>> rename to include/hw/block/fdc.h
>>>> index 1728231a26..acca7e0d0e 100644
>>>> --- a/hw/block/fdc-internal.h
>>>> +++ b/include/hw/bl

[PATCH v2 10/12] hw/char/parallel-isa: Implement relocation and toggling for TYPE_ISA_PARALLEL

2023-12-18 Thread Bernhard Beschow

Implement isa_parallel_set_{enabled,iobase} in order to implement relocation and
toggling of SuperI/O functions in the VIA south bridges without breaking
encapsulation.

Signed-off-by: Bernhard Beschow 
---
 include/hw/char/parallel-isa.h |  3 +++
 hw/char/parallel-isa.c | 14 ++
 2 files changed, 17 insertions(+)

diff --git a/include/hw/char/parallel-isa.h b/include/hw/char/parallel-isa.h
index 3b783bd08d..5284b2ffec 100644
--- a/include/hw/char/parallel-isa.h
+++ b/include/hw/char/parallel-isa.h
@@ -29,4 +29,7 @@ struct ISAParallelState {
 PortioList portio_list;
 };
 
+void isa_parallel_set_iobase(ISADevice *parallel, hwaddr iobase);
+void isa_parallel_set_enabled(ISADevice *parallel, bool enabled);
+
 #endif /* HW_PARALLEL_ISA_H */
diff --git a/hw/char/parallel-isa.c b/hw/char/parallel-isa.c
index ab0f879998..a5ce6ee13a 100644
--- a/hw/char/parallel-isa.c
+++ b/hw/char/parallel-isa.c
@@ -41,3 +41,17 @@ void parallel_hds_isa_init(ISABus *bus, int n)
 }
 }
 }
+
+void isa_parallel_set_iobase(ISADevice *parallel, hwaddr iobase)
+{
+ISAParallelState *s = ISA_PARALLEL(parallel);
+
+parallel->ioport_id = iobase;
+s->iobase = iobase;
+portio_list_set_address(&s->portio_list, s->iobase);
+}
+
+void isa_parallel_set_enabled(ISADevice *parallel, bool enabled)
+{
+portio_list_set_enabled(&ISA_PARALLEL(parallel)->portio_list, enabled);
+}
-- 
2.43.0

[PATCH v2 09/12] hw/char/serial-isa: Implement relocation and toggling for TYPE_ISA_SERIAL

2023-12-18 Thread Bernhard Beschow

Implement isa_serial_set_{enabled,iobase} in order to implement relocation and
toggling of SuperI/O functions in the VIA south bridges without breaking
encapsulation.

Signed-off-by: Bernhard Beschow 
---
 include/hw/char/serial.h |  2 ++
 hw/char/serial-isa.c | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/include/hw/char/serial.h b/include/hw/char/serial.h
index eb4254edde..ba9f8f21d7 100644
--- a/include/hw/char/serial.h
+++ b/include/hw/char/serial.h
@@ -112,5 +112,7 @@ SerialMM *serial_mm_init(MemoryRegion *address_space,
 
 #define TYPE_ISA_SERIAL "isa-serial"
 void serial_hds_isa_init(ISABus *bus, int from, int to);
+void isa_serial_set_iobase(ISADevice *serial, hwaddr iobase);
+void isa_serial_set_enabled(ISADevice *serial, bool enabled);
 
 #endif
diff --git a/hw/char/serial-isa.c b/hw/char/serial-isa.c
index 2be8be980b..d51c9ec87c 100644
--- a/hw/char/serial-isa.c
+++ b/hw/char/serial-isa.c
@@ -187,3 +187,17 @@ void serial_hds_isa_init(ISABus *bus, int from, int to)
 }
 }
 }
+
+void isa_serial_set_iobase(ISADevice *serial, hwaddr iobase)
+{
+ISASerialState *s = ISA_SERIAL(serial);
+
+serial->ioport_id = iobase;
+s->iobase = iobase;
+memory_region_set_address(&s->io, s->iobase);
+}
+
+void isa_serial_set_enabled(ISADevice *serial, bool enabled)
+{
+memory_region_set_enabled(&ISA_SERIAL(serial)->io, enabled);
+}
-- 
2.43.0

[PATCH v2 03/12] hw/char/serial: Free struct SerialState from MemoryRegion

2023-12-18 Thread Bernhard Beschow

SerialState::io isn't used within TYPE_SERIAL directly. Push it to its users to
make them the owner of the MemoryRegion.

Signed-off-by: Bernhard Beschow 
---
 include/hw/char/serial.h   | 2 +-
 hw/char/serial-isa.c   | 7 +--
 hw/char/serial-pci-multi.c | 7 ---
 hw/char/serial-pci.c   | 7 +--
 hw/char/serial.c   | 4 ++--
 5 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/hw/char/serial.h b/include/hw/char/serial.h
index 8ba7eca3d6..eb4254edde 100644
--- a/include/hw/char/serial.h
+++ b/include/hw/char/serial.h
@@ -77,7 +77,6 @@ struct SerialState {
 int poll_msl;
 
 QEMUTimer *modem_status_poll;
-MemoryRegion io;
 };
 typedef struct SerialState SerialState;
 
@@ -85,6 +84,7 @@ struct SerialMM {
 SysBusDevice parent;
 
 SerialState serial;
+MemoryRegion io;
 
 uint8_t regshift;
 uint8_t endianness;
diff --git a/hw/char/serial-isa.c b/hw/char/serial-isa.c
index 141a6cb168..2be8be980b 100644
--- a/hw/char/serial-isa.c
+++ b/hw/char/serial-isa.c
@@ -26,6 +26,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/module.h"
+#include "exec/memory.h"
 #include "sysemu/sysemu.h"
 #include "hw/acpi/acpi_aml_interface.h"
 #include "hw/char/serial.h"
@@ -43,6 +44,7 @@ struct ISASerialState {
 uint32_t iobase;
 uint32_t isairq;
 SerialState state;
+MemoryRegion io;
 };
 
 static const int isa_serial_io[MAX_ISA_SERIAL_PORTS] = {
@@ -79,8 +81,9 @@ static void serial_isa_realizefn(DeviceState *dev, Error **errp)
 qdev_realize(DEVICE(s), NULL, errp);
 qdev_set_legacy_instance_id(dev, isa->iobase, 3);
 
-memory_region_init_io(&s->io, OBJECT(isa), &serial_io_ops, s, "serial", 8);
-isa_register_ioport(isadev, &s->io, isa->iobase);
+memory_region_init_io(&isa->io, OBJECT(isa), &serial_io_ops, s, "serial",
+  8);
+isa_register_ioport(isadev, &isa->io, isa->iobase);
 }
 
 static void serial_isa_build_aml(AcpiDevAmlIf *adev, Aml *scope)
diff --git a/hw/char/serial-pci-multi.c b/hw/char/serial-pci-multi.c
index 5d65c534cb..16cb2faad7 100644
--- a/hw/char/serial-pci-multi.c
+++ b/hw/char/serial-pci-multi.c
@@ -44,6 +44,7 @@ typedef struct PCIMultiSerialState {
 uint32_t ports;
 char *name[PCI_SERIAL_MAX_PORTS];
 SerialState  state[PCI_SERIAL_MAX_PORTS];
+MemoryRegion io[PCI_SERIAL_MAX_PORTS];
 uint32_t level[PCI_SERIAL_MAX_PORTS];
 qemu_irq *irqs;
 uint8_t  prog_if;
@@ -58,7 +59,7 @@ static void multi_serial_pci_exit(PCIDevice *dev)
 for (i = 0; i < pci->ports; i++) {
 s = pci->state + i;
 qdev_unrealize(DEVICE(s));
-memory_region_del_subregion(&pci->iobar, &s->io);
+memory_region_del_subregion(&pci->iobar, &pci->io[i]);
 g_free(pci->name[i]);
 }
 qemu_free_irqs(pci->irqs, pci->ports);
@@ -112,9 +113,9 @@ static void multi_serial_pci_realize(PCIDevice *dev, Error **errp)
 }
 s->irq = pci->irqs[i];
 pci->name[i] = g_strdup_printf("uart #%zu", i + 1);
-memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s,
+memory_region_init_io(&pci->io[i], OBJECT(pci), &serial_io_ops, s,
   pci->name[i], 8);
-memory_region_add_subregion(&pci->iobar, 8 * i, &s->io);
+memory_region_add_subregion(&pci->iobar, 8 * i, &pci->io[i]);
 pci->ports++;
 }
 }
diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 087da3059a..ab3d0e56b5 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -28,6 +28,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/module.h"
+#include "exec/memory.h"
 #include "hw/char/serial.h"
 #include "hw/irq.h"
 #include "hw/pci/pci_device.h"
@@ -38,6 +39,7 @@
 struct PCISerialState {
 PCIDevice dev;
 SerialState state;
+MemoryRegion io;
 uint8_t prog_if;
 };
 
@@ -57,8 +59,9 @@ static void serial_pci_realize(PCIDevice *dev, Error **errp)
 pci->dev.config[PCI_INTERRUPT_PIN] = 0x01;
 s->irq = pci_allocate_irq(&pci->dev);
 
-memory_region_init_io(&s->io, OBJECT(pci), &serial_io_ops, s, "serial", 8);
-pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &s->io);
+memory_region_init_io(&pci->io, OBJECT(pci), &serial_io_ops, s, "serial",
+  8);
+pci_register_bar(&pci->dev, 0, PCI_BASE_ADDRESS_SPACE_IO, &pci->io);
 }
 
 static void serial_pci_exit(PCIDevice *dev)
diff --git a/hw/char/serial.c b/hw/char/serial.c
index a32eb25f58..83b642aec3 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -1045,10 +1045,10 @@ static void serial_mm_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-memory_region_init_io(&s->io, OBJECT(dev),
+memory_region_init_io(&smm->io, OBJECT(dev),
   &serial_mm_ops[smm->endianness], smm, "serial",
   8 << smm->regshift);
-sysbus_init_mmio(SYS_BUS_DEVICE(smm), &s->io);
+

[PATCH v2 07/12] exec/ioport: Add portio_list_set_enabled()

2023-12-18 Thread Bernhard Beschow

Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
allow to enable or disable their SuperI/O functions. Add a convenience function
for implementing this in the VIA south bridges.

The naming of the functions is inspired by its memory_region_set_enabled()
pendant.

Signed-off-by: Bernhard Beschow 
---
 docs/devel/migration.rst | 1 +
 include/exec/ioport.h| 1 +
 system/ioport.c  | 9 +
 3 files changed, 11 insertions(+)

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 389fa24bde..466be609a2 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -465,6 +465,7 @@ Examples of such memory API functions are:
   - memory_region_set_address()
   - memory_region_set_alias_offset()
   - portio_list_set_address()
+  - portio_list_set_enabled()
 
 Iterative device migration
 --
diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index 96858e5ac3..4397f12f93 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -71,6 +71,7 @@ void portio_list_add(PortioList *piolist,
  struct MemoryRegion *address_space,
  uint32_t addr);
 void portio_list_del(PortioList *piolist);
+void portio_list_set_enabled(PortioList *piolist, bool enabled);
 void portio_list_set_address(PortioList *piolist, uint32_t addr);
 
 #endif /* IOPORT_H */
diff --git a/system/ioport.c b/system/ioport.c
index 000e0ee1af..fd551d0375 100644
--- a/system/ioport.c
+++ b/system/ioport.c
@@ -324,6 +324,15 @@ void portio_list_del(PortioList *piolist)
 }
 }
 
+void portio_list_set_enabled(PortioList *piolist, bool enabled)
+{
+unsigned i;
+
+for (i = 0; i < piolist->nr; ++i) {
+memory_region_set_enabled(piolist->regions[i], enabled);
+}
+}
+
 void portio_list_set_address(PortioList *piolist, uint32_t addr)
 {
 MemoryRegionPortioList *mrpio;
-- 
2.43.0
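
To illustrate the idea behind portio_list_set_enabled(), here is a minimal
standalone sketch. It does not use QEMU's memory API: ToyRegion and
ToyPortioList are hypothetical stand-ins for MemoryRegion and PortioList, and
only the "enabled" flag is modelled. The loop has the same shape as the one in
the patch above, applying one state to every region owned by the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MemoryRegion; only the enabled flag is modelled. */
typedef struct ToyRegion {
    bool enabled;
} ToyRegion;

/* Hypothetical stand-in for PortioList: region pointers plus a count. */
typedef struct ToyPortioList {
    ToyRegion **regions;
    unsigned nr;
} ToyPortioList;

/* Apply the same enabled state to every region owned by the list. */
static void toy_portio_list_set_enabled(ToyPortioList *list, bool enabled)
{
    assert(list != NULL);
    for (unsigned i = 0; i < list->nr; ++i) {
        list->regions[i]->enabled = enabled;
    }
}

/* Count how many regions of the list are currently enabled. */
static unsigned toy_portio_list_count_enabled(const ToyPortioList *list)
{
    unsigned n = 0;

    assert(list != NULL);
    for (unsigned i = 0; i < list->nr; ++i) {
        n += list->regions[i]->enabled ? 1 : 0;
    }
    return n;
}
```

A caller would disable the whole list with one call and later re-enable it the
same way, without knowing how many regions the list was split into.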

[PATCH v2 11/12] hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions

2023-12-18 Thread Bernhard Beschow

This is a preparation for implementing relocation and toggling of SuperI/O
functions in the VT8231 device model. Upon reset, all SuperI/O functions will be
deactivated, so if no -bios is given, let the machine configure those
functions the same way pegasos2.rom would do. For the time being this will be
a no-op.

Signed-off-by: Bernhard Beschow 
---
 hw/ppc/pegasos2.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/ppc/pegasos2.c b/hw/ppc/pegasos2.c
index 3203a4a728..0a40ebd542 100644
--- a/hw/ppc/pegasos2.c
+++ b/hw/ppc/pegasos2.c
@@ -285,6 +285,15 @@ static void pegasos2_pci_config_write(Pegasos2MachineState *pm, int bus,
 pegasos2_mv_reg_write(pm, pcicfg + 4, len, val);
 }
 
+static void pegasos2_superio_write(Pegasos2MachineState *pm, uint32_t addr,
+   uint32_t val)
+{
+AddressSpace *as = CPU(pm->cpu)->as;
+
+stb_phys(as, PCI1_IO_BASE + 0x3f0, addr);
+stb_phys(as, PCI1_IO_BASE + 0x3f1, val);
+}
+
 static void pegasos2_machine_reset(MachineState *machine, ShutdownCause reason)
 {
 Pegasos2MachineState *pm = PEGASOS2_MACHINE(machine);
@@ -310,6 +319,12 @@ static void pegasos2_machine_reset(MachineState *machine, ShutdownCause reason)
 
 pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
   PCI_INTERRUPT_LINE, 2, 0x9);
+pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
+  0x50, 1, 0x6);
+pegasos2_superio_write(pm, 0xf4, 0xbe);
+pegasos2_superio_write(pm, 0xf6, 0xef);
+pegasos2_superio_write(pm, 0xf7, 0xfc);
+pegasos2_superio_write(pm, 0xf2, 0x14);
 pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
   0x50, 1, 0x2);
 pegasos2_pci_config_write(pm, 1, (PCI_DEVFN(12, 0) << 8) |
-- 
2.43.0
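
The helper pegasos2_superio_write() in the patch above uses an index/data port
pair: the register number goes to offset 0x3f0 and the value to offset 0x3f1.
The following standalone sketch models that access pattern only. ToySuperIO,
toy_outb() and toy_superio_write() are hypothetical names, and the model makes
no claim about what the individual VT8231 registers mean.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a SuperI/O configuration window: one index port
 * selects a register, one data port writes it. */
typedef struct ToySuperIO {
    uint8_t index;       /* last value written to the index port */
    uint8_t regs[0x100]; /* configuration register file */
} ToySuperIO;

enum { TOY_INDEX_PORT = 0x3f0, TOY_DATA_PORT = 0x3f1 };

/* Byte-wide port write, as firmware or board code would issue it. */
static void toy_outb(ToySuperIO *s, uint16_t port, uint8_t val)
{
    assert(port == TOY_INDEX_PORT || port == TOY_DATA_PORT);
    if (port == TOY_INDEX_PORT) {
        s->index = val;          /* select the register */
    } else {
        s->regs[s->index] = val; /* store into the selected register */
    }
}

/* Two-step write: select the register, then store the value. */
static void toy_superio_write(ToySuperIO *s, uint8_t reg, uint8_t val)
{
    toy_outb(s, TOY_INDEX_PORT, reg);
    toy_outb(s, TOY_DATA_PORT, val);
}
```

With this model, toy_superio_write(&s, 0xf4, 0xbe) leaves 0xbe in s.regs[0xf4],
mirroring the first of the four writes in pegasos2_machine_reset().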

[PATCH v2 12/12] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2023-12-18 Thread Bernhard Beschow

The VIA south bridges are able to relocate and toggle (enable or disable) their
SuperI/O functions. So far this is hardcoded such that all functions are always
enabled and are located at fixed addresses.

Some PC BIOSes seem to probe for I/O occupancy before activating such a function
and issue an error in case of a conflict. Since the functions are enabled on
reset, conflicts are always detected. Prevent that by implementing relocation
and toggling of the SuperI/O functions.

Note that all SuperI/O functions are now deactivated upon reset (except for
VT82C686B's serial ports where Fuloong 2e's rescue-yl seems to expect them to be
enabled by default). Rely on firmware -- or in case of pegasos2 on board code if
no -bios is given -- to configure the functions accordingly.

Signed-off-by: Bernhard Beschow 
---
 hw/isa/vt82c686.c | 121 ++
 1 file changed, 90 insertions(+), 31 deletions(-)

diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
index 9c2333a277..be202d23cf 100644
--- a/hw/isa/vt82c686.c
+++ b/hw/isa/vt82c686.c
@@ -15,6 +15,9 @@
 
 #include "qemu/osdep.h"
 #include "hw/isa/vt82c686.h"
+#include "hw/block/fdc.h"
+#include "hw/char/parallel-isa.h"
+#include "hw/char/serial.h"
 #include "hw/pci/pci.h"
 #include "hw/qdev-properties.h"
 #include "hw/ide/pci.h"
@@ -343,6 +346,35 @@ static const TypeInfo via_superio_info = {
 
 #define TYPE_VT82C686B_SUPERIO "vt82c686b-superio"
 
+static void vt82c686b_superio_update(ViaSuperIOState *s)
+{
+isa_parallel_set_enabled(s->superio.parallel[0],
+ (s->regs[0xe2] & 0x3) != 3);
+isa_serial_set_enabled(s->superio.serial[0], s->regs[0xe2] & BIT(2));
+isa_serial_set_enabled(s->superio.serial[1], s->regs[0xe2] & BIT(3));
+isa_fdc_set_enabled(s->superio.floppy, s->regs[0xe2] & BIT(4));
+
+isa_fdc_set_iobase(s->superio.floppy, (s->regs[0xe3] & 0xfc) << 2);
+isa_parallel_set_iobase(s->superio.parallel[0], s->regs[0xe6] << 2);
+isa_serial_set_iobase(s->superio.serial[0], (s->regs[0xe7] & 0xfe) << 2);
+isa_serial_set_iobase(s->superio.serial[1], (s->regs[0xe8] & 0xfe) << 2);
+}
+
+static int vmstate_vt82c686b_superio_post_load(void *opaque, int version_id)
+{
+ViaSuperIOState *s = opaque;
+
+vt82c686b_superio_update(s);
+
+return 0;
+}
+
+static const VMStateDescription vmstate_vt82c686b_superio = {
+.name = "vt82c686b_superio",
+.version_id = 1,
+.post_load = vmstate_vt82c686b_superio_post_load,
+};
+
 static void vt82c686b_superio_cfg_write(void *opaque, hwaddr addr,
 uint64_t data, unsigned size)
 {
@@ -368,7 +400,11 @@ static void vt82c686b_superio_cfg_write(void *opaque, hwaddr addr,
 case 0xfd ... 0xff:
 /* ignore write to read only registers */
 return;
-/* case 0xe6 ... 0xe8: Should set base port of parallel and serial */
+case 0xe2 ... 0xe3:
+case 0xe6 ... 0xe8:
+sc->regs[idx] = data;
+vt82c686b_superio_update(sc);
+return;
 default:
 qemu_log_mask(LOG_UNIMP,
   "via_superio_cfg: unimplemented register 0x%x\n", idx);
@@ -393,25 +429,24 @@ static void vt82c686b_superio_reset(DeviceState *dev)
 
 memset(s->regs, 0, sizeof(s->regs));
 /* Device ID */
-vt82c686b_superio_cfg_write(s, 0, 0xe0, 1);
-vt82c686b_superio_cfg_write(s, 1, 0x3c, 1);
-/* Function select - all disabled */
-vt82c686b_superio_cfg_write(s, 0, 0xe2, 1);
-vt82c686b_superio_cfg_write(s, 1, 0x03, 1);
+s->regs[0xe0] = 0x3c;
+/*
+ * Function select - only serial enabled
+ * Fuloong 2e's rescue-yl prints to the serial console w/o enabling it. This
+ * suggests that the serial ports are enabled by default, so override the
+ * datasheet.
+ */
+s->regs[0xe2] = 0x0f;
 /* Floppy ctrl base addr 0x3f0-7 */
-vt82c686b_superio_cfg_write(s, 0, 0xe3, 1);
-vt82c686b_superio_cfg_write(s, 1, 0xfc, 1);
+s->regs[0xe3] = 0xfc;
 /* Parallel port base addr 0x378-f */
-vt82c686b_superio_cfg_write(s, 0, 0xe6, 1);
-vt82c686b_superio_cfg_write(s, 1, 0xde, 1);
+s->regs[0xe6] = 0xde;
 /* Serial port 1 base addr 0x3f8-f */
-vt82c686b_superio_cfg_write(s, 0, 0xe7, 1);
-vt82c686b_superio_cfg_write(s, 1, 0xfe, 1);
+s->regs[0xe7] = 0xfe;
 /* Serial port 2 base addr 0x2f8-f */
-vt82c686b_superio_cfg_write(s, 0, 0xe8, 1);
-vt82c686b_superio_cfg_write(s, 1, 0xbe, 1);
+s->regs[0xe8] = 0xbe;
 
-vt82c686b_superio_cfg_write(s, 0, 0, 1);
+vt82c686b_superio_update(s);
 }
 
 static void vt82c686b_superio_init(Object *obj)
@@ -429,6 +464,7 @@ static void vt82c686b_superio_class_init(ObjectClass *klass, void *data)
 sc->parallel.count = 1;
 sc->ide.count = 0; /* emulated by via-ide */
 sc->floppy.count = 1;
+dc->vmsd = &vmstate_vt82c686b_superio;
 }
 
 static const TypeInfo vt82c686b_superio_info = {
@@ -443,6 +479,33 @@ static const TypeInfo vt82c6
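
The register decoding in vt82c686b_superio_update() can be checked by hand
against the values set in vt82c686b_superio_reset(). The sketch below is a
standalone model, not QEMU code: ToySuperIOConfig and toy_decode() are
hypothetical names, while the masks, shifts and reset values are copied from
the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the function-select and base-address registers. */
typedef struct ToySuperIOConfig {
    bool parallel_enabled;
    bool serial0_enabled;
    bool serial1_enabled;
    bool floppy_enabled;
    uint16_t floppy_base;
    uint16_t parallel_base;
    uint16_t serial0_base;
    uint16_t serial1_base;
} ToySuperIOConfig;

/* Decode the register file the same way the update function does. */
static ToySuperIOConfig toy_decode(const uint8_t regs[0x100])
{
    ToySuperIOConfig c;

    assert(regs != NULL);
    /* 0xe2: function select; parallel mode 3 means "disabled". */
    c.parallel_enabled = (regs[0xe2] & 0x3) != 3;
    c.serial0_enabled = (regs[0xe2] & (1u << 2)) != 0;
    c.serial1_enabled = (regs[0xe2] & (1u << 3)) != 0;
    c.floppy_enabled = (regs[0xe2] & (1u << 4)) != 0;
    /* Base registers hold address bits 9..2, hence the shift by 2. */
    c.floppy_base = (uint16_t)((regs[0xe3] & 0xfc) << 2);
    c.parallel_base = (uint16_t)(regs[0xe6] << 2);
    c.serial0_base = (uint16_t)((regs[0xe7] & 0xfe) << 2);
    c.serial1_base = (uint16_t)((regs[0xe8] & 0xfe) << 2);
    return c;
}
```

Feeding in the reset values (0xe2 = 0x0f, 0xe3 = 0xfc, 0xe6 = 0xde,
0xe7 = 0xfe, 0xe8 = 0xbe) yields only the two serial ports enabled and the base
addresses 0x3f0, 0x378, 0x3f8 and 0x2f8, which agrees with the comments in the
reset function.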

[PATCH v2 01/12] hw/block/fdc-isa: Free struct FDCtrl from PortioList

2023-12-18 Thread Bernhard Beschow

FDCtrl::portio_list isn't used inside FDCtrl context but only inside
FDCtrlISABus context, so move it there.

Signed-off-by: Bernhard Beschow 
---
 hw/block/fdc-internal.h | 2 --
 hw/block/fdc-isa.c  | 4 +++-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index 036392e9fc..fef2bfbbf5 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -26,7 +26,6 @@
 #define HW_BLOCK_FDC_INTERNAL_H
 
 #include "exec/memory.h"
-#include "exec/ioport.h"
 #include "hw/block/block.h"
 #include "hw/block/fdc.h"
 #include "qapi/qapi-types-block.h"
@@ -140,7 +139,6 @@ struct FDCtrl {
 /* Timers state */
 uint8_t timer0;
 uint8_t timer1;
-PortioList portio_list;
 };
 
 extern const FDFormat fd_formats[];
diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index 7ec075e470..b4c92b40b3 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -42,6 +42,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "sysemu/sysemu.h"
+#include "exec/ioport.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
@@ -60,6 +61,7 @@ struct FDCtrlISABus {
 uint32_t irq;
 uint32_t dma;
 struct FDCtrl state;
+PortioList portio_list;
 int32_t bootindexA;
 int32_t bootindexB;
 };
@@ -91,7 +93,7 @@ static void isabus_fdc_realize(DeviceState *dev, Error **errp)
 FDCtrl *fdctrl = &isa->state;
 Error *err = NULL;
 
-isa_register_portio_list(isadev, &fdctrl->portio_list,
+isa_register_portio_list(isadev, &isa->portio_list,
  isa->iobase, fdc_portio_list, fdctrl,
  "fdc");
 
-- 
2.43.0

[PATCH v2 08/12] hw/block/fdc-isa: Implement relocation and toggling for TYPE_ISA_FDC

2023-12-18 Thread Bernhard Beschow

Implement isa_fdc_set_{enabled,iobase} in order to implement relocation and
toggling of SuperI/O functions in the VIA south bridges without breaking
encapsulation.

Signed-off-by: Bernhard Beschow 
---
 include/hw/block/fdc.h |  3 +++
 hw/block/fdc-isa.c | 14 ++
 2 files changed, 17 insertions(+)

diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index 35248c0837..c367c5efea 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -14,6 +14,9 @@ void fdctrl_init_sysbus(qemu_irq irq, hwaddr mmio_base, DriveInfo **fds);
 void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc);
 
+void isa_fdc_set_iobase(ISADevice *fdc, hwaddr iobase);
+void isa_fdc_set_enabled(ISADevice *fdc, bool enabled);
+
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i);
 int cmos_get_fd_drive_type(FloppyDriveType fd0);
 
diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index b4c92b40b3..c989325de3 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -192,6 +192,20 @@ static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
 return dev;
 }
 
+void isa_fdc_set_iobase(ISADevice *fdc, hwaddr iobase)
+{
+FDCtrlISABus *isa = ISA_FDC(fdc);
+
+fdc->ioport_id = iobase;
+isa->iobase = iobase;
+portio_list_set_address(&isa->portio_list, isa->iobase);
+}
+
+void isa_fdc_set_enabled(ISADevice *fdc, bool enabled)
+{
+portio_list_set_enabled(&ISA_FDC(fdc)->portio_list, enabled);
+}
+
 int cmos_get_fd_drive_type(FloppyDriveType fd0)
 {
 int val;
-- 
2.43.0

[PATCH v2 02/12] hw/block/fdc-sysbus: Free struct FDCtrl from MemoryRegion

2023-12-18 Thread Bernhard Beschow

FDCtrl::iomem isn't used inside FDCtrl context but only inside FDCtrlSysBus
context, so move it there.

Signed-off-by: Bernhard Beschow 
---
 hw/block/fdc-internal.h | 2 --
 hw/block/fdc-sysbus.c   | 6 --
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index fef2bfbbf5..e219623dc7 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -25,7 +25,6 @@
 #ifndef HW_BLOCK_FDC_INTERNAL_H
 #define HW_BLOCK_FDC_INTERNAL_H
 
-#include "exec/memory.h"
 #include "hw/block/block.h"
 #include "hw/block/fdc.h"
 #include "qapi/qapi-types-block.h"
@@ -91,7 +90,6 @@ typedef struct FDrive {
 } FDrive;
 
 struct FDCtrl {
-MemoryRegion iomem;
 qemu_irq irq;
 /* Controller state */
 QEMUTimer *result_timer;
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 86ea51d003..e197b97262 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -26,6 +26,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qom/object.h"
+#include "exec/memory.h"
 #include "hw/sysbus.h"
 #include "hw/block/fdc.h"
 #include "migration/vmstate.h"
@@ -52,6 +53,7 @@ struct FDCtrlSysBus {
 /*< public >*/
 
 struct FDCtrl state;
+MemoryRegion iomem;
 };
 
 static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
@@ -146,11 +148,11 @@ static void sysbus_fdc_common_instance_init(Object *obj)
 
 qdev_set_legacy_instance_id(dev, 0 /* io */, 2); /* FIXME */
 
-memory_region_init_io(&fdctrl->iomem, obj,
+memory_region_init_io(&sys->iomem, obj,
   sbdc->use_strict_io ? &fdctrl_mem_strict_ops
   : &fdctrl_mem_ops,
   fdctrl, "fdc", 0x08);
-sysbus_init_mmio(sbd, &fdctrl->iomem);
+sysbus_init_mmio(sbd, &sys->iomem);
 
 sysbus_init_irq(sbd, &fdctrl->irq);
 qdev_init_gpio_in(dev, fdctrl_handle_tc, 1);
-- 
2.43.0

[PATCH v2 04/12] hw/char/parallel: Free struct ParallelState from PortioList

2023-12-18 Thread Bernhard Beschow

ParallelState::portio_list isn't used inside ParallelState context but only
inside ISAParallelState context, so move it there.

Signed-off-by: Bernhard Beschow 
---
 include/hw/char/parallel-isa.h | 2 ++
 include/hw/char/parallel.h | 2 --
 hw/char/parallel.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/hw/char/parallel-isa.h b/include/hw/char/parallel-isa.h
index d24ccecf05..3b783bd08d 100644
--- a/include/hw/char/parallel-isa.h
+++ b/include/hw/char/parallel-isa.h
@@ -12,6 +12,7 @@
 
 #include "parallel.h"
 
+#include "exec/ioport.h"
 #include "hw/isa/isa.h"
 #include "qom/object.h"
 
@@ -25,6 +26,7 @@ struct ISAParallelState {
 uint32_t iobase;
 uint32_t isairq;
 ParallelState state;
+PortioList portio_list;
 };
 
 #endif /* HW_PARALLEL_ISA_H */
diff --git a/include/hw/char/parallel.h b/include/hw/char/parallel.h
index 7b5a309a03..cfb97cc7cc 100644
--- a/include/hw/char/parallel.h
+++ b/include/hw/char/parallel.h
@@ -1,7 +1,6 @@
 #ifndef HW_PARALLEL_H
 #define HW_PARALLEL_H
 
-#include "exec/ioport.h"
 #include "exec/memory.h"
 #include "hw/isa/isa.h"
 #include "hw/irq.h"
@@ -22,7 +21,6 @@ typedef struct ParallelState {
 uint32_t last_read_offset; /* For debugging */
 /* Memory-mapped interface */
 int it_shift;
-PortioList portio_list;
 } ParallelState;
 
 void parallel_hds_isa_init(ISABus *bus, int n);
diff --git a/hw/char/parallel.c b/hw/char/parallel.c
index 147c900f0d..c1747cbb75 100644
--- a/hw/char/parallel.c
+++ b/hw/char/parallel.c
@@ -532,7 +532,7 @@ static void parallel_isa_realizefn(DeviceState *dev, Error **errp)
 s->status = dummy;
 }
 
-isa_register_portio_list(isadev, &s->portio_list, base,
+isa_register_portio_list(isadev, &isa->portio_list, base,
  (s->hw_driver
   ? &isa_parallel_portio_hw_list[0]
   : &isa_parallel_portio_sw_list[0]),
-- 
2.43.0

[PATCH v2 05/12] exec/ioport: Resolve redundant .base attribute in struct MemoryRegionPortio

2023-12-18 Thread Bernhard Beschow

portio_list_add_1() creates a MemoryRegionPortioList instance which holds a
MemoryRegion `mr` and an array of MemoryRegionPortio elements named `ports`.
Each element in the array gets assigned the same value for its .base attribute.
The same value also ends up as the .addr attribute of `mr` due to the
memory_region_add_subregion() call. This means that all .base attributes are
the same as `mr.addr`.

The only usages of MemoryRegionPortio::base were in portio_read() and
portio_write(). Both functions get the above MemoryRegionPortioList as their
opaque parameter. In both cases find_portio() can only return one of the
MemoryRegionPortio elements of the `ports` array. Due to the above observation,
any element will have the same .base value equal to `mr.addr`, which is also
accessible.

Hence, `mrpio->mr.addr` is equivalent to `mrp->base` and
MemoryRegionPortio::base is redundant and can be removed.

Signed-off-by: Bernhard Beschow 
---
 include/exec/ioport.h |  1 -
 system/ioport.c   | 13 ++---
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index e34f668998..95f1dc30d0 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -35,7 +35,6 @@ typedef struct MemoryRegionPortio {
 unsigned size;
 uint32_t (*read)(void *opaque, uint32_t address);
 void (*write)(void *opaque, uint32_t address, uint32_t data);
-uint32_t base; /* private field */
 } MemoryRegionPortio;
 
 #define PORTIO_END_OF_LIST() { }
diff --git a/system/ioport.c b/system/ioport.c
index 1824aa808c..a59e58b716 100644
--- a/system/ioport.c
+++ b/system/ioport.c
@@ -181,13 +181,13 @@ static uint64_t portio_read(void *opaque, hwaddr addr, unsigned size)
 
 data = ((uint64_t)1 << (size * 8)) - 1;
 if (mrp) {
-data = mrp->read(mrpio->portio_opaque, mrp->base + addr);
+data = mrp->read(mrpio->portio_opaque, mrpio->mr.addr + addr);
 } else if (size == 2) {
 mrp = find_portio(mrpio, addr, 1, false);
 if (mrp) {
-data = mrp->read(mrpio->portio_opaque, mrp->base + addr);
+data = mrp->read(mrpio->portio_opaque, mrpio->mr.addr + addr);
 if (addr + 1 < mrp->offset + mrp->len) {
-data |= mrp->read(mrpio->portio_opaque, mrp->base + addr + 1) << 8;
+data |= mrp->read(mrpio->portio_opaque, mrpio->mr.addr + addr + 1) << 8;
 } else {
 data |= 0xff00;
 }
@@ -203,13 +203,13 @@ static void portio_write(void *opaque, hwaddr addr, uint64_t data,
 const MemoryRegionPortio *mrp = find_portio(mrpio, addr, size, true);
 
 if (mrp) {
-mrp->write(mrpio->portio_opaque, mrp->base + addr, data);
+mrp->write(mrpio->portio_opaque, mrpio->mr.addr + addr, data);
 } else if (size == 2) {
 mrp = find_portio(mrpio, addr, 1, true);
 if (mrp) {
-mrp->write(mrpio->portio_opaque, mrp->base + addr, data & 0xff);
+mrp->write(mrpio->portio_opaque, mrpio->mr.addr + addr, data & 0xff);
 if (addr + 1 < mrp->offset + mrp->len) {
-mrp->write(mrpio->portio_opaque, mrp->base + addr + 1, data >> 8);
+mrp->write(mrpio->portio_opaque, mrpio->mr.addr + addr + 1, data >> 8);
 }
 }
 }
@@ -244,7 +244,6 @@ static void portio_list_add_1(PortioList *piolist,
 /* Adjust the offsets to all be zero-based for the region.  */
 for (i = 0; i < count; ++i) {
 mrpio->ports[i].offset -= off_low;
-mrpio->ports[i].base = start + off_low;
 }
 
 /*
-- 
2.43.0
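
The redundancy argument of the commit message above can be sketched in a few
lines of standalone C. This is a toy model under stated assumptions, not QEMU
code: ToyPort and ToyRegionList are hypothetical types, and region_addr plays
the role of `mr.addr`. The setup function mirrors the removed assignment
`ports[i].base = start + off_low`, so both ways of computing the handler
address give the same result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_NPORTS = 3 };

typedef struct ToyPort {
    uint32_t base; /* the redundant per-port copy of the region address */
} ToyPort;

typedef struct ToyRegionList {
    uint32_t region_addr; /* plays the role of mr.addr */
    ToyPort ports[TOY_NPORTS];
} ToyRegionList;

/* Mirror of the old setup loop: every port receives the same base. */
static void toy_setup(ToyRegionList *l, uint32_t start, uint32_t off_low)
{
    assert(l != NULL);
    l->region_addr = start + off_low;
    for (size_t i = 0; i < TOY_NPORTS; ++i) {
        l->ports[i].base = start + off_low;
    }
}

/* Old style: handler address derived from the per-port base. */
static uint32_t toy_addr_via_port(const ToyRegionList *l, size_t i,
                                  uint32_t addr)
{
    assert(i < TOY_NPORTS);
    return l->ports[i].base + addr;
}

/* New style: handler address derived from the region address. */
static uint32_t toy_addr_via_region(const ToyRegionList *l, uint32_t addr)
{
    return l->region_addr + addr;
}
```

Since toy_addr_via_port() and toy_addr_via_region() agree for every port, the
per-port field carries no extra information in this model.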

[PATCH v2 06/12] exec/ioport: Add portio_list_set_address()

2023-12-18 Thread Bernhard Beschow

Some SuperI/O devices such as the VIA south bridges or the PC87312 controller
are able to relocate their SuperI/O functions. Add a convenience function for
implementing this in the VIA south bridges.

This convenience function relies on previous simplifications in exec/ioport
which avoid some duplicate synchronization of I/O port base addresses. The
naming of the function is inspired by its memory_region_set_address()
counterpart.

Signed-off-by: Bernhard Beschow 
---
 docs/devel/migration.rst |  1 +
 include/exec/ioport.h|  2 ++
 system/ioport.c  | 19 +++
 3 files changed, 22 insertions(+)

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index ec55089b25..389fa24bde 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -464,6 +464,7 @@ Examples of such memory API functions are:
   - memory_region_set_enabled()
   - memory_region_set_address()
   - memory_region_set_alias_offset()
+  - portio_list_set_address()
 
 Iterative device migration
 --
diff --git a/include/exec/ioport.h b/include/exec/ioport.h
index 95f1dc30d0..96858e5ac3 100644
--- a/include/exec/ioport.h
+++ b/include/exec/ioport.h
@@ -54,6 +54,7 @@ typedef struct PortioList {
 const struct MemoryRegionPortio *ports;
 Object *owner;
 struct MemoryRegion *address_space;
+uint32_t addr;
 unsigned nr;
 struct MemoryRegion **regions;
 void *opaque;
@@ -70,5 +71,6 @@ void portio_list_add(PortioList *piolist,
  struct MemoryRegion *address_space,
  uint32_t addr);
 void portio_list_del(PortioList *piolist);
+void portio_list_set_address(PortioList *piolist, uint32_t addr);
 
 #endif /* IOPORT_H */
diff --git a/system/ioport.c b/system/ioport.c
index a59e58b716..000e0ee1af 100644
--- a/system/ioport.c
+++ b/system/ioport.c
@@ -133,6 +133,7 @@ void portio_list_init(PortioList *piolist,
 piolist->nr = 0;
 piolist->regions = g_new0(MemoryRegion *, n);
 piolist->address_space = NULL;
+piolist->addr = 0;
 piolist->opaque = opaque;
 piolist->owner = owner;
 piolist->name = name;
@@ -282,6 +283,7 @@ void portio_list_add(PortioList *piolist,
 unsigned int off_low, off_high, off_last, count;
 
 piolist->address_space = address_space;
+piolist->addr = start;
 
 /* Handle the first entry specially.  */
 off_last = off_low = pio_start->offset;
@@ -322,6 +324,23 @@ void portio_list_del(PortioList *piolist)
 }
 }
 
+void portio_list_set_address(PortioList *piolist, uint32_t addr)
+{
+MemoryRegionPortioList *mrpio;
+unsigned i, j;
+
+for (i = 0; i < piolist->nr; ++i) {
+mrpio = container_of(piolist->regions[i], MemoryRegionPortioList, mr);
+memory_region_set_address(&mrpio->mr,
+  mrpio->mr.addr - piolist->addr + addr);
+for (j = 0; mrpio->ports[j].size; ++j) {
+mrpio->ports[j].offset += addr - piolist->addr;
+}
+}
+
+piolist->addr = addr;
+}
+
 static void memory_region_portio_list_finalize(Object *obj)
 {
 MemoryRegionPortioList *mrpio = MEMORY_REGION_PORTIO_LIST(obj);
-- 
2.43.0
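
The address arithmetic in portio_list_set_address() above keeps each region at
the same distance from the list base while the base moves. Below is a minimal
standalone sketch of just that step. ToyList is a hypothetical type that stores
absolute region addresses directly; the per-port offset adjustment done by the
real function is deliberately left out of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_MAX_REGIONS = 4 };

typedef struct ToyList {
    uint32_t addr;                         /* base the list was last placed at */
    uint32_t region_addr[TOY_MAX_REGIONS]; /* absolute start of each region */
    unsigned nr;                           /* number of regions in use */
} ToyList;

/* Move every region by the same delta, then remember the new base. */
static void toy_list_set_address(ToyList *l, uint32_t addr)
{
    assert(l != NULL && l->nr <= TOY_MAX_REGIONS);
    for (unsigned i = 0; i < l->nr; ++i) {
        /* Same expression as in the patch: old - old_base + new_base. */
        l->region_addr[i] = l->region_addr[i] - l->addr + addr;
    }
    l->addr = addr;
}
```

For example, a list based at 0x3f0 with regions at 0x3f1 and 0x3f7 ends up with
regions at 0x371 and 0x377 after toy_list_set_address(&l, 0x370). Because the
arithmetic is done on unsigned 32-bit values, moving to a lower address works
the same way as moving to a higher one.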

[PATCH v2 00/12] hw/isa/vt82c686: Implement relocation and toggling of SuperI/O functions

2023-12-18 Thread Bernhard Beschow

This series implements relocation of the SuperI/O functions of the VIA south
bridges which resolves some FIXME's. It is part of my via-apollo-pro-133t
branch [1] which is an extension of bringing the VIA south bridges to the PC
machine [2]. This branch is able to run some real-world X86 BIOSes in the hope
that it allows us to form a better understanding of the real vt82c686b devices.
Implementing relocation and toggling of the SuperI/O functions is one step to
make these BIOSes run without error messages, so here we go.

The series is structured as follows: Patches 1-4 prepare the TYPE_ISA_FDC,
TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL to relocate and toggle (enable/disable)
themselves without breaking encapsulation of their respective device states.
This is achieved by moving the MemoryRegions and PortioLists from the device
states into the encapsulating ISA devices since they will be relocated and
toggled.

Inspired by the memory API, patches 5-7 add two convenience functions to the
portio_list API to toggle and relocate portio lists. Patch 5 is a preparation
for that: it removes some redundancies which would otherwise have to be dealt
with during relocation.

Patches 8-10 implement toggling and relocation for types TYPE_ISA_FDC,
TYPE_ISA_PARALLEL and TYPE_ISA_SERIAL. Patch 11 prepares the pegasos2 machine
which would end up with all SuperI/O functions disabled if no -bios argument is
given. Patch 12 finally implements the main feature which now relies on
firmware to configure the SuperI/O functions accordingly (except for pegasos2).

v2:
* Improve commit message (Zoltan)
* Split pegasos2 from vt82c686 patch (Zoltan)
* Avoid poking into device internals (Zoltan)

Testing done:
* make check
* make check-avocado
* Run MorphOS on pegasos2 with and without pegasos2.rom
* Run Linux on amigaone
* Run real-world BIOSes on via-apollo-pro-133t branch
* Start rescue-yl on fuloong2e

[1] https://github.com/shentok/qemu/tree/via-apollo-pro-133t
[2] https://github.com/shentok/qemu/tree/pc-via

Bernhard Beschow (12):
  hw/block/fdc-isa: Free struct FDCtrl from PortioList
  hw/block/fdc-sysbus: Free struct FDCtrl from MemoryRegion
  hw/char/serial: Free struct SerialState from MemoryRegion
  hw/char/parallel: Free struct ParallelState from PortioList
  exec/ioport: Resolve redundant .base attribute in struct
MemoryRegionPortio
  exec/ioport: Add portio_list_set_address()
  exec/ioport: Add portio_list_set_enabled()
  hw/block/fdc-isa: Implement relocation and toggling for TYPE_ISA_FDC
  hw/char/serial-isa: Implement relocation and toggling for
TYPE_ISA_SERIAL
  hw/char/parallel-isa: Implement relocation and toggling for
TYPE_ISA_PARALLEL
  hw/ppc/pegasos2: Let pegasos2 machine configure SuperI/O functions
  hw/isa/vt82c686: Implement relocation and toggling of SuperI/O
functions

 docs/devel/migration.rst   |   2 +
 hw/block/fdc-internal.h|   4 --
 include/exec/ioport.h  |   4 +-
 include/hw/block/fdc.h |   3 +
 include/hw/char/parallel-isa.h |   5 ++
 include/hw/char/parallel.h |   2 -
 include/hw/char/serial.h   |   4 +-
 hw/block/fdc-isa.c |  18 -
 hw/block/fdc-sysbus.c  |   6 +-
 hw/char/parallel-isa.c |  14 
 hw/char/parallel.c |   2 +-
 hw/char/serial-isa.c   |  21 +-
 hw/char/serial-pci-multi.c |   7 +-
 hw/char/serial-pci.c   |   7 +-
 hw/char/serial.c   |   4 +-
 hw/isa/vt82c686.c  | 121 -
 hw/ppc/pegasos2.c  |  15 
 system/ioport.c|  41 +--
 18 files changed, 221 insertions(+), 59 deletions(-)

-- 
2.43.0

Re: [External] Re: [PATCH v2 12/20] migration/multifd: Add new migration option for multifd DSA offloading.

2023-12-18 Thread Hao Xiang

On Mon, Dec 11, 2023 at 11:44 AM Fabiano Rosas  wrote:
>
> Hao Xiang  writes:
>
> > Intel DSA offloading is an optional feature that turns on if
> > proper hardware and software stack is available. To turn on
> > DSA offloading in multifd live migration:
> >
> > multifd-dsa-accel="[dsa_dev_path1] ] [dsa_dev_path2] ... [dsa_dev_pathX]"
> >
> > This feature is turned off by default.
>
> This patch breaks make check:
>
>  43/357 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-hmp        ERROR   0.52s
>  79/357 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test  ERROR   3.59s
> 167/357 qemu:qtest+qtest-x86_64 / qtest-x86_64/qmp-cmd-test    ERROR   3.68s
>
> Make sure you run make check before posting. Ideally also run the series
> through the Gitlab CI on your personal fork.

* I think I accidentally deleted some code in meson-buildoptions.sh.
Reverted those now.
* I also found a bug in how I handle the string in migration options. Fixed now.
* make check is passing now. Fix will be in the next patchset.

 69/818 qemu:qtest+qtest-x86_64 / qtest-x86_64/test-hmp        OK   4.22s    9 subtests passed
 37/818 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test  OK  60.16s   16 subtests passed
607/818 qemu:qtest+qtest-x86_64 / qtest-x86_64/qmp-cmd-test    OK   8.23s   65 subtests passed

Ok: 747
Expected Fail:  0
Fail:   0
Unexpected Pass:0
Skipped:71
Timeout:0

>
> > Signed-off-by: Hao Xiang 
> > ---
> >  migration/migration-hmp-cmds.c |  8 
> >  migration/options.c| 28 
> >  migration/options.h|  1 +
> >  qapi/migration.json| 17 ++---
> >  scripts/meson-buildoptions.sh  |  6 +++---
> >  5 files changed, 54 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> > index 86ae832176..d9451744dd 100644
> > --- a/migration/migration-hmp-cmds.c
> > +++ b/migration/migration-hmp-cmds.c
> > @@ -353,6 +353,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> >  monitor_printf(mon, "%s: '%s'\n",
> >  MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
> >  params->tls_authz);
> > +monitor_printf(mon, "%s: %s\n",
>
> Use '%s' here.

Fixed. Will be in the next version.

>
> > +MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL),
> > +params->multifd_dsa_accel);
> >
> >  if (params->has_block_bitmap_mapping) {
> >  const BitmapMigrationNodeAliasList *bmnal;
> > @@ -615,6 +618,11 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> >  p->has_block_incremental = true;
> >  visit_type_bool(v, param, &p->block_incremental, &err);
> >  break;
> > +case MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL:
> > +p->multifd_dsa_accel = g_new0(StrOrNull, 1);
> > +p->multifd_dsa_accel->type = QTYPE_QSTRING;
> > +visit_type_str(v, param, &p->multifd_dsa_accel->u.s, &err);
> > +break;
> >  case MIGRATION_PARAMETER_MULTIFD_CHANNELS:
> >  p->has_multifd_channels = true;
> >  visit_type_uint8(v, param, &p->multifd_channels, &err);
> > diff --git a/migration/options.c b/migration/options.c
> > index 97d121d4d7..6e424b5d63 100644
> > --- a/migration/options.c
> > +++ b/migration/options.c
> > @@ -179,6 +179,8 @@ Property migration_properties[] = {
> >  DEFINE_PROP_MIG_MODE("mode", MigrationState,
> >parameters.mode,
> >MIG_MODE_NORMAL),
> > +DEFINE_PROP_STRING("multifd-dsa-accel", MigrationState,
> > +   parameters.multifd_dsa_accel),
> >
> >  /* Migration capabilities */
> >  DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
> > @@ -901,6 +903,13 @@ const char *migrate_tls_creds(void)
> >  return s->parameters.tls_creds;
> >  }
> >
> > +const char *migrate_multifd_dsa_accel(void)
> > +{
> > +MigrationState *s = migrate_get_current();
> > +
> > +return s->parameters.multifd_dsa_accel;
> > +}
> > +
> >  const char *migrate_tls_hostname(void)
> >  {
> >  MigrationState *s = migrate_get_current();
> > @@ -1025,6 +1034,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
> >  params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit;
> >  params->has_mode = true;
> >  params->mode = s->parameters.mode;
> > +params->multifd_dsa_accel = s->parameters.multifd_dsa_accel;
> >
> >  return params;
> >  }
> > @@ -1033,6 +1043,7 @@ void migrate_params_init(MigrationParameters *params)
> >  {
> >  params->tls_hostname = g_strdup("");
> >  params->tls_creds = g_strdup("");
> > +params->multifd_dsa_accel = g_strdup("");
> >
> >  /* Set has_* up only for parameter c

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 14:53, Peter Maydell wrote:
> On Mon, 18 Dec 2023 at 17:22, Daniel Henrique Barboza
>  wrote:
>>
>> On 12/18/23 13:22, Natanael Copa wrote:
>>> strerrorname_np is non-portable and breaks building with musl libc.
>>>
>>> Use strerror(errno) instead, like we do other places.
>>>
>>> Cc: qemu-sta...@nongnu.org
>>> Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error msg)
>>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
>>> Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
>>> Signed-off-by: Natanael Copa
>>> ---
>>>   target/riscv/kvm/kvm-cpu.c | 18 --
>>>   1 file changed, 8 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
>>> index 45b6cf1cfa..117e33cf90 100644
>>> --- a/target/riscv/kvm/kvm-cpu.c
>>> +++ b/target/riscv/kvm/kvm-cpu.c
>>> @@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
>>>   multi_ext_cfg->supported = false;
>>>   val = false;
>>>   } else {
>>> -error_report("Unable to read ISA_EXT KVM register %s, "
>>> - "error code: %s", multi_ext_cfg->name,
>>> - strerrorname_np(errno));
>>> +error_report("Unable to read ISA_EXT KVM register %s: %s",
>>> + multi_ext_cfg->name, strerror(errno));
>>
>> The reason I did this change, as described in 082e9e4a58ba mentioned in the
>> commit message, was precisely to avoid things like this:
>>
>> qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error: no such file or directory
>>
>> The generic description of the error works well with file descriptors and so
>> on but it's weird in the KVM context. This patch is re-introducing it.
>
> We don't seem to worry about that in any of the other
> KVM code -- accel/kvm/ has lots of places that
> use strerror() or error_setg_errno().

I don't know how this is being used in other parts of accel/kvm, but in this
particular instance we're handling the errors from get_one_reg. The kernel docs
describe the errors the API may return as:


Errors include:

ENOENT - no such register
EINVAL - invalid register ID, or no such register (...)
EPERM - (arm64) register access not allowed before vcpu finalization
-


The API interprets ENOENT as "no such register", but strerror(errno) in this
case will output "no such file or directory". The generic description is forcing
me to think "this error makes no sense ... oh, this might be the description of
ENOENT". At this point having an "error code 2" instead is clearer to me.


Thanks,

Daniel



thanks
-- PMM
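
As an illustration of the point argued in the message above, an errno value can
be described in register terms using only portable C. The sketch below is a
hypothetical helper, not QEMU code and not what the patch under discussion
does: toy_kvm_reg_strerror() is an invented name, and the three special-cased
messages are taken from the kernel documentation excerpt quoted above. Any
other value falls back to strerror(), which is standard C and available with
musl libc.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: describe an errno from a one-reg access in register
 * terms, falling back to the generic description for anything else. */
static const char *toy_kvm_reg_strerror(int err)
{
    assert(err >= 0);
    switch (err) {
    case ENOENT:
        return "no such register";
    case EINVAL:
        return "invalid register ID, or no such register";
    case EPERM:
        return "register access not allowed";
    default:
        return strerror(err);
    }
}
```

A caller would pass the saved errno, for example
error_report("... %s", toy_kvm_reg_strerror(errno)) in QEMU terms, so that
ENOENT no longer reads as a missing file.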

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Peter Maydell

On Mon, 18 Dec 2023 at 17:22, Daniel Henrique Barboza
 wrote:
>
>
>
> On 12/18/23 13:22, Natanael Copa wrote:
> > strerrorname_np is non-portable and breaks building with musl libc.
> >
> > Use strerror(errno) instead, like we do other places.
> >
> > Cc: qemu-sta...@nongnu.org
> > Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' 
> > error msg)
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
> > Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
> > Signed-off-by: Natanael Copa 
> > ---
> >   target/riscv/kvm/kvm-cpu.c | 18 --
> >   1 file changed, 8 insertions(+), 10 deletions(-)
> >
> > diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
> > index 45b6cf1cfa..117e33cf90 100644
> > --- a/target/riscv/kvm/kvm-cpu.c
> > +++ b/target/riscv/kvm/kvm-cpu.c
> > @@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU 
> > *cpu,
> >   multi_ext_cfg->supported = false;
> >   val = false;
> >   } else {
> > -error_report("Unable to read ISA_EXT KVM register %s, "
> > - "error code: %s", multi_ext_cfg->name,
> > - strerrorname_np(errno));
> > +error_report("Unable to read ISA_EXT KVM register %s: %s",
> > + multi_ext_cfg->name, strerror(errno));
>
>
> The reason I did this change, as described in 082e9e4a58ba mentioned in the 
> commit
> message, was precisely to avoid things like this:
>
> qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error: no 
> such file or directory
>
> The generic description of the error works well with file descriptors and so 
> on but it's
> weird in the KVM context. This patch is re-introducing it.

We don't seem to worry about that in any of the other
KVM code -- accel/kvm/ has lots of places that
use strerror() or error_setg_errno().

thanks
-- PMM

Re: [PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Daniel Henrique Barboza





On 12/18/23 13:22, Natanael Copa wrote:

strerrorname_np is non-portable and breaks building with musl libc.

Use strerror(errno) instead, like we do other places.

Cc: qemu-sta...@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error 
msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa 
---
  target/riscv/kvm/kvm-cpu.c | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 45b6cf1cfa..117e33cf90 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
  multi_ext_cfg->supported = false;
  val = false;
  } else {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));



The reason I did this change, as described in 082e9e4a58ba mentioned in the 
commit
message, was precisely to avoid things like this:

qemu-system-riscv64: Unable to read ISA_EXT KVM register ssaia, error: no such 
file or directory
 
The generic description of the error works well with file descriptors and so on
but it's weird in the KVM context. This patch is re-introducing it.

If strerrorname_np() is non-portable I believe we're better off dealing with 
the numeric
errno than with its generic description. I.e:



+error_report("Unable to read ISA_EXT KVM register %s, error %d",
+ multi_ext_cfg->name, errno);



Same with the other 3 instances you changed in the patch. Thanks,


Daniel




  exit(EXIT_FAILURE);
  }
  } else {
@@ -895,8 +894,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, 
KVMScratchCPU *kvmcpu)
   *
   * Error out if we get any other errno.
   */
-error_report("Error when accessing get-reg-list, code: %s",
- strerrorname_np(errno));
+error_report("Error when accessing get-reg-list: %s",
+ strerror(errno));
  exit(EXIT_FAILURE);
  }
  
@@ -905,8 +904,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, KVMScratchCPU *kvmcpu)

  reglist->n = rl_struct.n;
  ret = ioctl(kvmcpu->cpufd, KVM_GET_REG_LIST, reglist);
  if (ret) {
-error_report("Error when reading KVM_GET_REG_LIST, code %s ",
- strerrorname_np(errno));
+error_report("Error when reading KVM_GET_REG_LIST: %s",
+ strerror(errno));
  exit(EXIT_FAILURE);
  }
  
@@ -927,9 +926,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, KVMScratchCPU *kvmcpu)

  reg.addr = (uint64_t)&val;
  ret = ioctl(kvmcpu->cpufd, KVM_GET_ONE_REG, &reg);
  if (ret != 0) {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));
  exit(EXIT_FAILURE);
  }

Re: [RFC PATCH v2 1/6] target/riscv: Remove obsolete pointer masking extension code.

2023-12-18 Thread Alexey Baturo

Hi Alistair,

Thanks for the lightning fast reply!
Could you please tell who should bump those numbers and to what values?
Do you think I could submit this patch series for the review?

Thanks

пн, 18 дек. 2023 г. в 06:11, Alistair Francis :

> On Sat, Dec 16, 2023 at 11:52 PM Alexey Baturo 
> wrote:
> >
> > From: Alexey Baturo 
> >
> > Zjpm v0.8 is almost frozen and it's much simpler compared to the
> existing one:
> > The newer version doesn't allow to specify custom mask or base for
> masking.
> > Instead it allows only certain options for masking top bits.
> >
> > Signed-off-by: Alexey Baturo 
> > ---
> >  target/riscv/cpu.c   |  10 --
> >  target/riscv/cpu.h   |  32 +---
> >  target/riscv/cpu_bits.h  |  82 -
> >  target/riscv/cpu_helper.c|  52 --
> >  target/riscv/csr.c   | 326 ---
> >  target/riscv/machine.c   |   9 -
> >  target/riscv/translate.c |  27 +--
> >  target/riscv/vector_helper.c |   2 +-
> >  8 files changed, 10 insertions(+), 530 deletions(-)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 83c7c0cf07..1e6571ce99 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -710,13 +710,6 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE
> *f, int flags)
> >  CSR_MSCRATCH,
> >  CSR_SSCRATCH,
> >  CSR_SATP,
> > -CSR_MMTE,
> > -CSR_UPMBASE,
> > -CSR_UPMMASK,
> > -CSR_SPMBASE,
> > -CSR_SPMMASK,
> > -CSR_MPMBASE,
> > -CSR_MPMMASK,
> >  };
> >
> >  for (i = 0; i < ARRAY_SIZE(dump_csrs); ++i) {
> > @@ -891,8 +884,6 @@ static void riscv_cpu_reset_hold(Object *obj)
> >  }
> >  i++;
> >  }
> > -/* mmte is supposed to have pm.current hardwired to 1 */
> > -env->mmte |= (EXT_STATUS_INITIAL | MMTE_M_PM_CURRENT);
> >
> >  /*
> >   * Clear mseccfg and unlock all the PMP entries upon reset.
> > @@ -906,7 +897,6 @@ static void riscv_cpu_reset_hold(Object *obj)
> >  pmp_unlock_entries(env);
> >  #endif
> >  env->xl = riscv_cpu_mxl(env);
> > -riscv_cpu_update_mask(env);
> >  cs->exception_index = RISCV_EXCP_NONE;
> >  env->load_res = -1;
> >  set_default_nan_mode(1, &env->fp_status);
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index d74b361be6..73f7004936 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -374,18 +374,7 @@ struct CPUArchState {
> >  /* True if in debugger mode.  */
> >  bool debugger;
> >
> > -/*
> > - * CSRs for PointerMasking extension
> > - */
> > -target_ulong mmte;
> > -target_ulong mpmmask;
> > -target_ulong mpmbase;
> > -target_ulong spmmask;
> > -target_ulong spmbase;
> > -target_ulong upmmask;
> > -target_ulong upmbase;
> > -
> > -/* CSRs for execution environment configuration */
> > +/* CSRs for execution enviornment configuration */
> >  uint64_t menvcfg;
> >  uint64_t mstateen[SMSTATEEN_MAX_COUNT];
> >  uint64_t hstateen[SMSTATEEN_MAX_COUNT];
> > @@ -393,8 +382,6 @@ struct CPUArchState {
> >  target_ulong senvcfg;
> >  uint64_t henvcfg;
> >  #endif
> > -target_ulong cur_pmmask;
> > -target_ulong cur_pmbase;
> >
> >  /* Fields from here on are preserved across CPU reset. */
> >  QEMUTimer *stimer; /* Internal timer for S-mode interrupt */
> > @@ -543,17 +530,14 @@ FIELD(TB_FLAGS, VILL, 14, 1)
> >  FIELD(TB_FLAGS, VSTART_EQ_ZERO, 15, 1)
> >  /* The combination of MXL/SXL/UXL that applies to the current cpu mode.
> */
> >  FIELD(TB_FLAGS, XL, 16, 2)
> > -/* If PointerMasking should be applied */
> > -FIELD(TB_FLAGS, PM_MASK_ENABLED, 18, 1)
> > -FIELD(TB_FLAGS, PM_BASE_ENABLED, 19, 1)
> > -FIELD(TB_FLAGS, VTA, 20, 1)
> > -FIELD(TB_FLAGS, VMA, 21, 1)
> > +FIELD(TB_FLAGS, VTA, 18, 1)
> > +FIELD(TB_FLAGS, VMA, 19, 1)
> >  /* Native debug itrigger */
> > -FIELD(TB_FLAGS, ITRIGGER, 22, 1)
> > +FIELD(TB_FLAGS, ITRIGGER, 20, 1)
> >  /* Virtual mode enabled */
> > -FIELD(TB_FLAGS, VIRT_ENABLED, 23, 1)
> > -FIELD(TB_FLAGS, PRIV, 24, 2)
> > -FIELD(TB_FLAGS, AXL, 26, 2)
> > +FIELD(TB_FLAGS, VIRT_ENABLED, 21, 1)
> > +FIELD(TB_FLAGS, PRIV, 22, 2)
> > +FIELD(TB_FLAGS, AXL, 24, 2)
> >
> >  #ifdef TARGET_RISCV32
> >  #define riscv_cpu_mxl(env)  ((void)(env), MXL_RV32)
> > @@ -680,8 +664,6 @@ static inline uint32_t vext_get_vlmax(RISCVCPU *cpu,
> target_ulong vtype)
> >  void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr *pc,
> >uint64_t *cs_base, uint32_t *pflags);
> >
> > -void riscv_cpu_update_mask(CPURISCVState *env);
> > -
> >  RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
> > target_ulong *ret_value,
> > target_ulong new_value, target_ulong
> write_mask);
> > diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> > index ebd7917d49..3f9415d68d 10

Re: [PATCH v3 11/45] Introduce Raspberry PI 4 machine

2023-12-18 Thread Peter Maydell

On Mon, 4 Dec 2023 at 00:27, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 10/45] Add BCM2838 checkpoint support

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:33, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/arm/bcm2838_peripherals.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/arm/bcm2838_peripherals.c b/hw/arm/bcm2838_peripherals.c
> index c147b6e453..196fb890a2 100644
> --- a/hw/arm/bcm2838_peripherals.c
> +++ b/hw/arm/bcm2838_peripherals.c
> @@ -22,7 +22,7 @@ static void bcm2838_peripherals_init(Object *obj)
>  {
>  BCM2838PeripheralState *s = BCM2838_PERIPHERALS(obj);
>  BCM2838PeripheralClass *bc = BCM2838_PERIPHERALS_GET_CLASS(obj);
> -RaspiPeripheralBaseState *s_base = RASPI_PERIPHERALS_BASE(obj);
> +BCMSocPeripheralBaseState *s_base = BCM_SOC_PERIPHERALS_BASE(obj);
>
>  /* Lower memory region for peripheral devices (exported to the Soc) */
>  memory_region_init(&s->peri_low_mr, obj, "bcm2838-peripherals",
> --

I don't understand the commit message here, and the contents
of the patch look like something that maybe belongs in a
different patch?

thanks
-- PMM

Re: [PATCH v4 09/45] Add GPIO and SD to BCM2838 periph

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:37, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/arm/bcm2838_peripherals.c | 140 +++
>  include/hw/arm/bcm2838_peripherals.h |   9 ++
>  2 files changed, 149 insertions(+)


> @@ -42,6 +73,115 @@ static void bcm2838_peripherals_realize(DeviceState *dev, 
> Error **errp)
>  BCM2838_VC_PERI_LOW_BASE,
>  &s->peri_low_mr_alias, 1);
>
> +/* Extended Mass Media Controller 2 */
> +object_property_set_uint(OBJECT(&s->emmc2), "sd-spec-version", 3,
> + &error_abort);
> +object_property_set_uint(OBJECT(&s->emmc2), "capareg",
> + BCM2835_SDHC_CAPAREG, &error_abort);
> +object_property_set_bool(OBJECT(&s->emmc2), "pending-insert-quirk", true,
> + &error_abort);
> +if (!sysbus_realize(SYS_BUS_DEVICE(&s->emmc2), errp)) {
> +return;
> +}
> +
> +memory_region_add_subregion(
> +&s_base->peri_mr, EMMC2_OFFSET,
> +sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->emmc2), 0));

Odd indent again here...

> +
> +/* According to DTS, EMMC and EMMC2 share one irq */
> +if (!qdev_realize(DEVICE(&s->mmc_irq_orgate), NULL, errp)) {
> +return;
> +}
> +
> +DeviceState *mmc_irq_orgate = DEVICE(&s->mmc_irq_orgate);
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s->emmc2), 0,
> +qdev_get_gpio_in(mmc_irq_orgate, 0));
> +
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->sdhci), 0,
> +qdev_get_gpio_in(mmc_irq_orgate, 1));

...and here.

> +
> +   /* Connect EMMC and EMMC2 to the interrupt controller */
> +qdev_connect_gpio_out(mmc_irq_orgate, 0,
> +  qdev_get_gpio_in_named(DEVICE(&s_base->ic),
> + BCM2835_IC_GPU_IRQ,
> + INTERRUPT_ARASANSDIO));
> +
> +/* Connect DMA 0-6 to the interrupt controller */
> +for (n = 0; n < 7; n++) {
> +sysbus_connect_irq(SYS_BUS_DEVICE(&s_base->dma), n,
> +   qdev_get_gpio_in_named(DEVICE(&s_base->ic),
> +  BCM2835_IC_GPU_IRQ,
> +  GPU_INTERRUPT_DMA0 + n));
> +}
> +
> +   /* According to DTS, DMA 7 and 8 share one irq */
> +if (!qdev_realize(DEVICE(&s->dma_7_8_irq_orgate), NULL, errp)) {
> +return;
> +}
> +DeviceState *dma_7_8_irq_orgate = DEVICE(&s->dma_7_8_irq_orgate);

Declaration not at top of code block (here and below).


Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 08/45] Connect SD controller to BCM2838 GPIO

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:33, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/gpio/bcm2838_gpio.c | 59 +++---
>  include/hw/gpio/bcm2838_gpio.h |  5 +++
>  2 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/hw/gpio/bcm2838_gpio.c b/hw/gpio/bcm2838_gpio.c
> index 51eb55b00a..f166ce7959 100644
> --- a/hw/gpio/bcm2838_gpio.c
> +++ b/hw/gpio/bcm2838_gpio.c
> @@ -17,9 +17,10 @@
>  #include "qemu/timer.h"
>  #include "qapi/error.h"
>  #include "hw/sysbus.h"
> -#include "migration/vmstate.h"
> +#include "hw/sd/sd.h"
>  #include "hw/gpio/bcm2838_gpio.h"
>  #include "hw/irq.h"
> +#include "migration/vmstate.h"

Put the #include in the order you want in the first place,
please, rather than putting it in one place in one patch and
then moving it around in a second patch.

>
>  #define GPFSEL0   0x00
>  #define GPFSEL1   0x04
> @@ -64,6 +65,16 @@

> @@ -302,15 +343,25 @@ static void bcm2838_gpio_init(Object *obj)
>  DeviceState *dev = DEVICE(obj);
>  SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
>
> -memory_region_init_io(&s->iomem, obj, &bcm2838_gpio_ops, s,
> -  "bcm2838_gpio", BCM2838_GPIO_REGS_SIZE);
> +qbus_init(&s->sdbus, sizeof(s->sdbus), TYPE_SD_BUS, DEVICE(s), "sd-bus");
> +
> +memory_region_init_io(
> +&s->iomem, obj,
> +&bcm2838_gpio_ops, s, "bcm2838_gpio", BCM2838_GPIO_REGS_SIZE);

Oddly placed newline after the "(" here.

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 07/45] Implement BCM2838 GPIO functionality

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:33, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/gpio/bcm2838_gpio.c | 192 -
>  1 file changed, 189 insertions(+), 3 deletions(-)

>  static uint64_t bcm2838_gpio_read(void *opaque, hwaddr offset, unsigned size)
>  {
> +BCM2838GpioState *s = (BCM2838GpioState *)opaque;
>  uint64_t value = 0;
>
> -qemu_log_mask(LOG_UNIMP, "%s: %s: not implemented for %"HWADDR_PRIx"\n",
> -  TYPE_BCM2838_GPIO, __func__, offset);
> +switch (offset) {
> +case GPFSEL0:
> +case GPFSEL1:
> +case GPFSEL2:
> +case GPFSEL3:
> +case GPFSEL4:
> +case GPFSEL5:
> +value = gpfsel_get(s, offset / BYTES_IN_WORD);
> +break;
> +case GPSET0:
> +case GPSET1:
> +case GPCLR0:
> +case GPCLR1:
> +/* Write Only */
> +qemu_log_mask(LOG_GUEST_ERROR, "%s: %s: Attempt reading from write only"
> +  " register. %lu will be returned. Address 0x%"HWADDR_PRIx
> +  ", size %u\n", TYPE_BCM2838_GPIO, __func__, value, offset,
> +  size);

'value' is a uint64_t, but you try to print it with a %lu format
string here. This won't compile on 32-bit machines. (In general
watch out for %lu %lx etc and don't use them unless the type
really is "long".)

You can get the compiler to tell you about these format issues by
any of the following options:
 * building for a 32-bit host (eg i386)
 * building for macos (the macos clang is stricter about these even
   when building for a 64-bit host)
 * running your patches through the gitlab CI setup: fork the QEMU
   project on gitlab as an ordinary gitlab user; then push your
   branch to your fork of the repo with some environment variables
   set to trigger a CI pipeline run:
   https://www.qemu.org/docs/master/devel/ci.html#custom-ci-cd-variables
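
As a minimal sketch of the portable alternative (the helper name format_u64()
is made up for this illustration), the <inttypes.h> macro PRIu64 expands to
the correct length modifier for uint64_t on both 32-bit and 64-bit hosts:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative helper: format a uint64_t without assuming that
 * "long" is 64 bits wide. PRIu64 is a string literal, so it is
 * concatenated with the rest of the format string.
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "value %" PRIu64, value);
}
```

The same pattern applies to qemu_log_mask() and error_report() format strings,
with PRIx64 for hexadecimal output.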


>  static void bcm2838_gpio_reset(DeviceState *dev)
>  {
>  BCM2838GpioState *s = BCM2838_GPIO(dev);
>
> +memset(s->fsel, 0, sizeof(s->fsel));
> +
>  s->lev0 = 0;
>  s->lev1 = 0;

I think this bit should go in the previous patch since we added
s->fsel there and do the other reset code there already.

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 06/45] Add BCM2838 GPIO stub

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:39, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/arm/bcm2838.c |   4 +-
>  hw/gpio/bcm2838_gpio.c   | 152 +++
>  hw/gpio/meson.build  |   5 +-
>  include/hw/arm/bcm2838_peripherals.h |   2 -
>  include/hw/gpio/bcm2838_gpio.h   |  40 +++
>  5 files changed, 198 insertions(+), 5 deletions(-)
>  create mode 100644 hw/gpio/bcm2838_gpio.c
>  create mode 100644 include/hw/gpio/bcm2838_gpio.h
>
> diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c
> index 042e543006..8925957c6c 100644
> --- a/hw/arm/bcm2838.c
> +++ b/hw/arm/bcm2838.c
> @@ -14,7 +14,7 @@
>  #include "hw/arm/bcm2838.h"
>  #include "trace.h"
>
> -#define GIC400_MAINTAINANCE_IRQ  9
> +#define GIC400_MAINTENANCE_IRQ  9
>  #define GIC400_TIMER_NS_EL2_IRQ 10
>  #define GIC400_TIMER_VIRT_IRQ   11
>  #define GIC400_LEGACY_FIQ   12
> @@ -163,7 +163,7 @@ static void bcm2838_realize(DeviceState *dev, Error 
> **errp)
>
>  sysbus_connect_irq(SYS_BUS_DEVICE(&s->gic), n + 4 * BCM283X_NCPUS,
> qdev_get_gpio_in(gicdev,
> -PPI(n, 
> GIC400_MAINTAINANCE_IRQ)));
> +PPI(n, GIC400_MAINTENANCE_IRQ)));
>
>  /* Connect timers from the CPU to the interrupt controller */
>  qdev_connect_gpio_out(cpudev, GTIMER_PHYS,

Squash these changes into the previous patch :-)

> diff --git a/hw/gpio/bcm2838_gpio.c b/hw/gpio/bcm2838_gpio.c
> new file mode 100644
> index 00..15b66cb559
> --- /dev/null
> +++ b/hw/gpio/bcm2838_gpio.c
> @@ -0,0 +1,152 @@
> +/*
> + * Raspberry Pi (BCM2838) GPIO Controller
> + * This implementation is based on bcm2835_gpio (hw/gpio/bcm2835_gpio.c)
> + *
> + * Copyright (c) 2022 Auriga LLC
> + *
> + * Authors:
> + *  Lotosh, Aleksey 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.

It would be nice to be consistent about whether you want to use
SPDX-License-Identifier tags or not in the new files you're adding.
(Patch 4's bcm2838.c uses it; this one doesn't.)

> + */

> +#define RESET_VAL_CNTRL_REG0 0xAAA9;
> +#define RESET_VAL_CNTRL_REG1 0xA0AA;
> +#define RESET_VAL_CNTRL_REG2 0x50AAA95A;
> +#define RESET_VAL_CNTRL_REG3 0x0005;

These shouldn't have trailing semicolons.
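
A small sketch of why this matters, using the first define with the value as
quoted above; the function is_reset_value() is only an illustration. A macro
is pasted textually, so a trailing semicolon ends up inside any expression
that uses it:

```c
#include <stdbool.h>
#include <stdint.h>

/* No trailing semicolon: the macro can be used inside any expression. */
#define RESET_VAL_CNTRL_REG0 0xAAA9

static bool is_reset_value(uint32_t value)
{
    /*
     * If the define ended in ';' this condition would expand to
     * "if (value == 0xAAA9;)" and fail to compile.
     */
    if (value == RESET_VAL_CNTRL_REG0) {
        return true;
    }
    return false;
}
```

Plain assignments such as "s->reg = RESET_VAL_CNTRL_REG0;" happen to compile
either way (the extra semicolon becomes an empty statement), which is why the
problem tends to go unnoticed until the macro is used elsewhere.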

> +
> +#define BYTES_IN_WORD4

> diff --git a/hw/gpio/meson.build b/hw/gpio/meson.build
> index 066ea96480..8a8d03d885 100644
> --- a/hw/gpio/meson.build
> +++ b/hw/gpio/meson.build
> @@ -9,6 +9,9 @@ system_ss.add(when: 'CONFIG_IMX', if_true: 
> files('imx_gpio.c'))
>  system_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_gpio.c'))
>  system_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_gpio.c'))
>  system_ss.add(when: 'CONFIG_OMAP', if_true: files('omap_gpio.c'))
> -system_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_gpio.c'))
> +system_ss.add(when: 'CONFIG_RASPI', if_true: files(
> +'bcm2835_gpio.c',
> +'bcm2838_gpio.c'
> +))
>  system_ss.add(when: 'CONFIG_ASPEED_SOC', if_true: files('aspeed_gpio.c'))
>  system_ss.add(when: 'CONFIG_SIFIVE_GPIO', if_true: files('sifive_gpio.c'))
> diff --git a/include/hw/arm/bcm2838_peripherals.h 
> b/include/hw/arm/bcm2838_peripherals.h
> index 5a72355183..d07831753a 100644
> --- a/include/hw/arm/bcm2838_peripherals.h
> +++ b/include/hw/arm/bcm2838_peripherals.h
> @@ -11,8 +11,6 @@
>
>  #include "hw/arm/bcm2835_peripherals.h"
>
> -#define GENET_OFFSET0x158
> -

Why does this line get deleted ?

>  /* SPI */
>  #define GIC_SPI_INTERRUPT_MBOX 33
>  #define GIC_SPI_INTERRUPT_MPHI 40

thanks
-- PMM

[PATCH] target/riscv/kvm: do not use non-portable strerrorname_np()

2023-12-18 Thread Natanael Copa

strerrorname_np is non-portable and breaks building with musl libc.

Use strerror(errno) instead, like we do other places.

Cc: qemu-sta...@nongnu.org
Fixes: commit 082e9e4a58ba (target/riscv/kvm: improve 'init_multiext_cfg' error 
msg)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2041
Buglink: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15541
Signed-off-by: Natanael Copa 
---
 target/riscv/kvm/kvm-cpu.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 45b6cf1cfa..117e33cf90 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -832,9 +832,8 @@ static void kvm_riscv_read_multiext_legacy(RISCVCPU *cpu,
 multi_ext_cfg->supported = false;
 val = false;
 } else {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));
 exit(EXIT_FAILURE);
 }
 } else {
@@ -895,8 +894,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, 
KVMScratchCPU *kvmcpu)
  *
  * Error out if we get any other errno.
  */
-error_report("Error when accessing get-reg-list, code: %s",
- strerrorname_np(errno));
+error_report("Error when accessing get-reg-list: %s",
+ strerror(errno));
 exit(EXIT_FAILURE);
 }
 
@@ -905,8 +904,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, 
KVMScratchCPU *kvmcpu)
 reglist->n = rl_struct.n;
 ret = ioctl(kvmcpu->cpufd, KVM_GET_REG_LIST, reglist);
 if (ret) {
-error_report("Error when reading KVM_GET_REG_LIST, code %s ",
- strerrorname_np(errno));
+error_report("Error when reading KVM_GET_REG_LIST: %s",
+ strerror(errno));
 exit(EXIT_FAILURE);
 }
 
@@ -927,9 +926,8 @@ static void kvm_riscv_init_multiext_cfg(RISCVCPU *cpu, 
KVMScratchCPU *kvmcpu)
 reg.addr = (uint64_t)&val;
 ret = ioctl(kvmcpu->cpufd, KVM_GET_ONE_REG, &reg);
 if (ret != 0) {
-error_report("Unable to read ISA_EXT KVM register %s, "
- "error code: %s", multi_ext_cfg->name,
- strerrorname_np(errno));
+error_report("Unable to read ISA_EXT KVM register %s: %s",
+ multi_ext_cfg->name, strerror(errno));
 exit(EXIT_FAILURE);
 }
 
-- 
2.43.0

Re: [PATCH v4 05/45] Add GIC-400 to BCM2838 SoC

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:32, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/arm/bcm2838.c | 167 +++
>  hw/arm/trace-events  |   2 +
>  include/hw/arm/bcm2838.h |   2 +
>  include/hw/arm/bcm2838_peripherals.h |  39 +++
>  4 files changed, 210 insertions(+)
>
> diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c
> index c61c59661b..042e543006 100644
> --- a/hw/arm/bcm2838.c
> +++ b/hw/arm/bcm2838.c
> @@ -14,8 +14,36 @@
>  #include "hw/arm/bcm2838.h"
>  #include "trace.h"
>
> +#define GIC400_MAINTAINANCE_IRQ  9

"MAINTENANCE"

> +#define GIC400_TIMER_NS_EL2_IRQ 10
> +#define GIC400_TIMER_VIRT_IRQ   11
> +#define GIC400_LEGACY_FIQ   12
> +#define GIC400_TIMER_S_EL1_IRQ  13
> +#define GIC400_TIMER_NS_EL1_IRQ 14
> +#define GIC400_LEGACY_IRQ   15

For the virt and sbsa-ref boards we found that having interrupt
#defines use the PPI number was on net a bit confusing, so we
standardized on having the defines be the architectural INTID
(which is the PPI number + 16). See commit 9036e917f8357f4.

But I mention this more as an FYI kind of thing because changing
the numbering base at this point is probably more likely to
introduce bugs than remove them...
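
For reference, a minimal sketch of the relationship between the two numbering
schemes. The helper ppi_to_intid() is hypothetical, and the value 16 for
GIC_NR_SGIS is restated here as an assumption so the fragment stands alone;
in QEMU the constant comes from the GIC headers.

```c
/* Assumed value: SGIs occupy INTIDs 0..15, so PPIs start at 16. */
#define GIC_NR_SGIS 16

/* Hypothetical helper: convert a PPI number (0..15) to its INTID (16..31). */
static inline int ppi_to_intid(int ppi)
{
    return ppi + GIC_NR_SGIS;
}
```

With this mapping, the maintenance interrupt defined above as PPI 9
corresponds to INTID 25.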

> +/* Number of external interrupt lines to configure the GIC with */
> +#define GIC_NUM_IRQS192
> +
> +#define PPI(cpu, irq) (GIC_NUM_IRQS + (cpu) * GIC_INTERNAL + GIC_NR_SGIS + 
> irq)
> +
> +#define GIC_BASE_OFS0x
> +#define GIC_DIST_OFS0x1000
> +#define GIC_CPU_OFS 0x2000
> +#define GIC_VIFACE_THIS_OFS 0x4000
> +#define GIC_VIFACE_OTHER_OFS(cpu)  (0x5000 + (cpu) * 0x200)
> +#define GIC_VCPU_OFS0x6000
> +
>  #define VIRTUAL_PMU_IRQ 7
>
> +static void bcm2838_gic_set_irq(void *opaque, int irq, int level)
> +{
> +BCM2838State *s = (BCM2838State *)opaque;
> +
> +trace_bcm2838_gic_set_irq(irq, level);
> +qemu_set_irq(qdev_get_gpio_in(DEVICE(&s->gic), irq), level);
> +}
> +
>  static void bcm2838_init(Object *obj)
>  {
>  BCM2838State *s = BCM2838(obj);
> @@ -28,11 +56,14 @@ static void bcm2838_init(Object *obj)
>"vcram-size");
>  object_property_add_alias(obj, "command-line", OBJECT(&s->peripherals),
>"command-line");
> +
> +object_initialize_child(obj, "gic", &s->gic, TYPE_ARM_GIC);
>  }
>
>  static void bcm2838_realize(DeviceState *dev, Error **errp)
>  {
>  int n;
> +int int_n;

This is not a good name for a variable, especially in a
function that already has an "n" variable. As far as I can see
the use added here doesn't overlap with the existing "n" so
you could just reuse that.

Note that our coding style these days permits declaration of
loop variables inside the for():

for (int i = 0; i < ARRAY_SIZE(thing); i++) {
/* do something loopy */
}

so if you prefer you can have all the loops in the function do that
and not have any local n declared at the top of the function.

>  BCM2838State *s = BCM2838(dev);
>  BCM283XBaseState *s_base = BCM283X_BASE(dev);
>  BCM283XBaseClass *bc_base = BCM283X_BASE_GET_CLASS(dev);
> @@ -56,6 +87,13 @@ static void bcm2838_realize(DeviceState *dev, Error **errp)
>  /* TODO: this should be converted to a property of ARM_CPU */
>  s_base->cpu[n].core.mp_affinity = (bc_base->clusterid << 8) | n;
>
> +/* set periphbase/CBAR value for CPU-local registers */
> +if (!object_property_set_int(OBJECT(&s_base->cpu[n].core), 
> "reset-cbar",
> + bc_base->ctrl_base + BCM2838_GIC_BASE,
> + errp)) {
> +return;
> +}

This one doesn't need an error check either; compare
https://lore.kernel.org/qemu-devel/20231123143813.42632-3-phi...@linaro.org/

> +
>  /* start powered off if not enabled */
>  if (!object_property_set_bool(OBJECT(&s_base->cpu[n].core),
>"start-powered-off",
> @@ -68,6 +106,135 @@ static void bcm2838_realize(DeviceState *dev, Error 
> **errp)
>  return;
>  }
>  }
> +
> +if (!object_property_set_uint(OBJECT(&s->gic), "revision", 2, errp)) {
> +return;
> +}
> +
> +if (!object_property_set_uint(OBJECT(&s->gic), "num-cpu", BCM283X_NCPUS,
> +  errp)) {
> +return;
> +}
> +
> +if (!object_property_set_uint(OBJECT(&s->gic), "num-irq",
> +  GIC_NUM_IRQS + GIC_INTERNAL, errp)) {
> +return;
> +}
> +
> +if (!object_property_set_bool(OBJECT(&s->gic),
> +  "has-virtualization-extensions", true,
> +  errp)) {
> +return;
> +}
> +
> +if (!sysbus_realize(SYS_BUS_DEVICE(&s->gic), errp)) {
> +return;
> +

Re: [PATCH v4 04/45] Introduce BCM2838 SoC

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:33, Sergey Kambalin  wrote:
>
> Signed-off-by: Sergey Kambalin 
> ---
>  hw/arm/bcm2838.c | 100 +++
>  hw/arm/bcm2838_peripherals.c |  72 +++
>  hw/arm/meson.build   |   2 +
>  include/hw/arm/bcm2838.h |  29 
>  include/hw/arm/bcm2838_peripherals.h |  36 ++
>  5 files changed, 239 insertions(+)
>  create mode 100644 hw/arm/bcm2838.c
>  create mode 100644 hw/arm/bcm2838_peripherals.c
>  create mode 100644 include/hw/arm/bcm2838.h
>  create mode 100644 include/hw/arm/bcm2838_peripherals.h
>
> diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c
> new file mode 100644
> index 00..c61c59661b
> --- /dev/null
> +++ b/hw/arm/bcm2838.c
> @@ -0,0 +1,100 @@
> +/*
> + * BCM2838 SoC emulation
> + *
> + * Copyright (C) 2022 Ovchinnikov Vitalii 
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/module.h"
> +#include "hw/arm/raspi_platform.h"
> +#include "hw/sysbus.h"
> +#include "hw/arm/bcm2838.h"
> +#include "trace.h"
> +
> +#define VIRTUAL_PMU_IRQ 7
> +
> +static void bcm2838_init(Object *obj)
> +{
> +BCM2838State *s = BCM2838(obj);
> +
> +object_initialize_child(obj, "peripherals", &s->peripherals,
> +TYPE_BCM2838_PERIPHERALS);
> +object_property_add_alias(obj, "board-rev", OBJECT(&s->peripherals),
> +  "board-rev");
> +object_property_add_alias(obj, "vcram-size", OBJECT(&s->peripherals),
> +  "vcram-size");
> +object_property_add_alias(obj, "command-line", OBJECT(&s->peripherals),
> +  "command-line");
> +}
> +
> +static void bcm2838_realize(DeviceState *dev, Error **errp)
> +{
> +int n;
> +BCM2838State *s = BCM2838(dev);
> +BCM283XBaseState *s_base = BCM283X_BASE(dev);
> +BCM283XBaseClass *bc_base = BCM283X_BASE_GET_CLASS(dev);
> +BCM2838PeripheralState *ps = BCM2838_PERIPHERALS(&s->peripherals);
> +BCMSocPeripheralBaseState *ps_base =
> +BCM_SOC_PERIPHERALS_BASE(&s->peripherals);
> +
> +if (!bcm283x_common_realize(dev, ps_base, errp)) {
> +return;
> +}
> +sysbus_mmio_map_overlap(SYS_BUS_DEVICE(ps), 1, BCM2838_PERI_LOW_BASE, 1);
> +
> +/* bcm2836 interrupt controller (and mailboxes, etc.) */
> +if (!sysbus_realize(SYS_BUS_DEVICE(&s_base->control), errp)) {
> +return;
> +}
> +sysbus_mmio_map(SYS_BUS_DEVICE(&s_base->control), 0, bc_base->ctrl_base);
> +
> +/* Create cores */
> +for (n = 0; n < bc_base->core_count; n++) {
> +/* TODO: this should be converted to a property of ARM_CPU */
> +s_base->cpu[n].core.mp_affinity = (bc_base->clusterid << 8) | n;

We have a property now, so we can do:

object_property_set_int(OBJECT(&s->cpu[n].core), "mp-affinity",
(bc->clusterid << 8) | n, &error_abort);

rather than propagating a TODO item.

https://lore.kernel.org/qemu-devel/20231123143813.42632-4-phi...@linaro.org/
is the patch (still pending) that does this in the existing rpi code.

> +
> +/* start powered off if not enabled */
> +if (!object_property_set_bool(OBJECT(&s_base->cpu[n].core),
> +  "start-powered-off",
> +  n >= s_base->enabled_cpus,
> +  errp)) {
> +return;
> +}

Trying to set start-powered-off can never fail, so we don't need
to error-check it, but can just error_abort.
https://lore.kernel.org/qemu-devel/20231123143813.42632-5-phi...@linaro.org/
is the patch which does that for the existing uses.

> +
> +if (!qdev_realize(DEVICE(&s_base->cpu[n].core), NULL, errp)) {
> +return;
> +}
> +}
> +}

Otherwise
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 03/45] Split out raspi machine common part

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:36, Sergey Kambalin  wrote:
>
> Pre-setup for raspberry pi 4 introduction
>
> Signed-off-by: Sergey Kambalin 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 01/45] Split out common part of BCM283X classes

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:32, Sergey Kambalin  wrote:
>
> Pre setup for BCM2838 introduction
>
> Signed-off-by: Sergey Kambalin 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v4 02/45] Split out common part of peripherals

2023-12-18 Thread Peter Maydell

On Fri, 8 Dec 2023 at 02:40, Sergey Kambalin  wrote:
>
> Pre-setup for BCM2838 introduction
>
> Signed-off-by: Sergey Kambalin 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 01/45] Split out common part of BCM283X classes

2023-12-18 Thread Peter Maydell

On Mon, 4 Dec 2023 at 00:28, Sergey Kambalin  wrote:
>
> Pre setup for BCM2838 introduction
>
> Signed-off-by: Sergey Kambalin 
> ---

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v3 01/45] Split out common part of BCM283X classes

2023-12-18 Thread Peter Maydell

On Mon, 18 Dec 2023 at 15:57, Peter Maydell  wrote:
>
> On Mon, 4 Dec 2023 at 00:28, Sergey Kambalin  wrote:
> >
> > Pre setup for BCM2838 introduction
> >
> > Signed-off-by: Sergey Kambalin 
> > ---
>
> Reviewed-by: Peter Maydell 

Whoops, I meant to send this as a reply to the v4 patch.

-- PMM

Re: [PATCH v2 10/14] aio: remove aio_context_acquire()/aio_context_release() API

2023-12-18 Thread Kevin Wolf

On 05.12.2023 at 19:20, Stefan Hajnoczi wrote:
> Delete these functions because nothing calls these functions anymore.
> 
> I introduced these APIs in commit 98563fc3ec44 ("aio: add
> aio_context_acquire() and aio_context_release()") in 2014. It's with a
> sigh of relief that I delete these APIs almost 10 years later.
> 
> Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an
> understanding of where the code needed to go in order to remove the
> limitations of the original dataplane and the IOThread/AioContext
> approach that followed it.
> 
> Emanuele Giuseppe Esposito had the splendid determination to convert
> large parts of the codebase so that they no longer needed the AioContext
> lock. This was a painstaking process, both in the actual code changes
> required and the iterations of code review that Emanuele eked out of
> Kevin and me over many months.
> 
> Kevin Wolf tackled multitudes of graph locking conversions to protect
> in-flight I/O from run-time changes to the block graph as well as the
> clang Thread Safety Analysis annotations that allow the compiler to
> check whether the graph lock is being used correctly.
> 
> And me, well, I'm just here to add some pizzazz to the QEMU multi-queue
> block layer :). Thank you to everyone who helped with this effort,
> including Eric Blake, code reviewer extraordinaire, and others who I've
> forgotten to mention.
> 
> Signed-off-by: Stefan Hajnoczi 
> Reviewed-by: Eric Blake 

Reviewed-by: Kevin Wolf

Re: [PATCH v2 08/14] scsi: remove AioContext locking

2023-12-18 Thread Kevin Wolf

On 05.12.2023 at 19:20, Stefan Hajnoczi wrote:
> The AioContext lock no longer has any effect. Remove it.
> 
> Signed-off-by: Stefan Hajnoczi 
> Reviewed-by: Eric Blake 

Reviewed-by: Kevin Wolf

Re: [PATCH v2 09/14] aio-wait: draw equivalence between AIO_WAIT_WHILE() and AIO_WAIT_WHILE_UNLOCKED()

2023-12-18 Thread Kevin Wolf

On 05.12.2023 at 19:20, Stefan Hajnoczi wrote:
> Now that the AioContext lock no longer exists, AIO_WAIT_WHILE() and
> AIO_WAIT_WHILE_UNLOCKED() are equivalent.
> 
> A future patch will get rid of AIO_WAIT_WHILE_UNLOCKED().
> 
> Signed-off-by: Stefan Hajnoczi 
> Reviewed-by: Eric Blake 

Reviewed-by: Kevin Wolf

Re: [PATCH v2 07/14] block: remove bdrv_co_lock()

2023-12-18 Thread Kevin Wolf

On 05.12.2023 at 19:20, Stefan Hajnoczi wrote:
> The bdrv_co_lock() and bdrv_co_unlock() functions are already no-ops.
> Remove them.
> 
> Signed-off-by: Stefan Hajnoczi 

Reviewed-by: Kevin Wolf

Re: [PATCH v2 02/14] scsi: assert that callbacks run in the correct AioContext

2023-12-18 Thread Kevin Wolf

On 05.12.2023 at 19:19, Stefan Hajnoczi wrote:
> Since the removal of AioContext locking, the correctness of the code
> relies on running requests from a single AioContext at any given time.
> 
> Add assertions that verify that callbacks are invoked in the correct
> AioContext.
> 
> Signed-off-by: Stefan Hajnoczi 

Reviewed-by: Kevin Wolf

Re: [PATCH v5] fsl-imx: add simple RTC emulation for i.MX6 and i.MX7 boards

2023-12-18 Thread Peter Maydell

On Sat, 16 Dec 2023 at 13:34, Nikita Ostrenkov  wrote:
>
> Signed-off-by: Nikita Ostrenkov 
> ---
>  hw/misc/imx7_snvs.c | 93 ++---
>  hw/misc/trace-events|  4 +-
>  include/hw/misc/imx7_snvs.h |  7 ++-
>  3 files changed, 94 insertions(+), 10 deletions(-)



Applied to target-arm.next for 9.0, thanks.

-- PMM

Re: [PATCH] target/arm/helper: Propagate MDCR_EL2.HPMN into PMCR_EL0.N

2023-12-18 Thread Peter Maydell

On Fri, 15 Dec 2023 at 15:16, Jean-Philippe Brucker
 wrote:
>
> MDCR_EL2.HPMN allows an hypervisor to limit the number of PMU counters
> available to EL1 and EL0 (to keep the others to itself). QEMU already
> implements this split correctly, except for PMCR_EL0.N reads: the number
> of counters read by EL1 or EL0 should be the one configured in
> MDCR_EL2.HPMN.
>
> Signed-off-by: Jean-Philippe Brucker 

Applied to target-arm.next for 9.0, thanks. I've added a
Cc: qemu-sta...@nongnu.org because it seems a fix worth
backporting.

-- PMM

[PATCH v8 13/13] hw/riscv/virt-acpi-build.c: Add PLIC in MADT

2023-12-18 Thread Sunil V L

Add PLIC structures for each socket in the MADT when the system is
configured with PLIC as the external interrupt controller.

Signed-off-by: Haibo Xu 
Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 4d03a27efd..d4a02579d6 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -94,6 +94,12 @@ static void riscv_acpi_madt_add_rintc(uint32_t uid,
   arch_ids->cpus[uid].props.node_id,
   local_cpu_id),
   4);
+} else if (s->aia_type == VIRT_AIA_TYPE_NONE) {
+build_append_int_noprefix(entry,
+  ACPI_BUILD_INTC_ID(
+  arch_ids->cpus[uid].props.node_id,
+  2 * local_cpu_id + 1),
+  4);
 } else {
 build_append_int_noprefix(entry, 0, 4);
 }
@@ -494,6 +500,29 @@ static void build_madt(GArray *table_data,
 build_append_int_noprefix(table_data,
   s->memmap[VIRT_APLIC_S].size, 4);
 }
+} else {
+/* PLICs */
+for (socket = 0; socket < riscv_socket_count(ms); socket++) {
+aplic_addr = s->memmap[VIRT_PLIC].base +
+ s->memmap[VIRT_PLIC].size * socket;
+gsi_base = VIRT_IRQCHIP_NUM_SOURCES * socket;
+build_append_int_noprefix(table_data, 0x1B, 1);   /* Type */
+build_append_int_noprefix(table_data, 36, 1); /* Length */
+build_append_int_noprefix(table_data, 1, 1);  /* Version */
+build_append_int_noprefix(table_data, socket, 1); /* PLIC ID */
+build_append_int_noprefix(table_data, 0, 8);  /* Hardware ID */
+/* Total External Interrupt Sources Supported */
+build_append_int_noprefix(table_data,
+  VIRT_IRQCHIP_NUM_SOURCES - 1, 2);
+build_append_int_noprefix(table_data, 0, 2); /* Max Priority */
+build_append_int_noprefix(table_data, 0, 4); /* Flags */
+/* PLIC Size */
+build_append_int_noprefix(table_data, s->memmap[VIRT_PLIC].size, 4);
+/* PLIC Address */
+build_append_int_noprefix(table_data, aplic_addr, 8);
+/* Global System Interrupt Vector Base */
+build_append_int_noprefix(table_data, gsi_base, 4);
+}
 }
 
 acpi_table_end(linker, &table);
-- 
2.39.2
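
The two computations this patch relies on can be sketched in isolation. The code below is only an illustration, not QEMU code: the helper names and the numbers used with them are hypothetical, and it assumes (not stated in the patch) that each hart has two PLIC contexts, with the S-mode context at the odd index, which would explain the `2 * local_cpu_id + 1` in the RINTC entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: S-mode PLIC context of a hart, assuming two
 * contexts per hart (M-mode at 2*i, S-mode at 2*i + 1). */
static inline uint32_t plic_s_context(uint32_t local_cpu_id)
{
    return 2 * local_cpu_id + 1;
}

/* Per-socket PLIC placement, following the loop in build_madt(). */
struct plic_entry {
    uint64_t addr;      /* PLIC Address */
    uint32_t gsi_base;  /* Global System Interrupt Vector Base */
};

static inline struct plic_entry plic_for_socket(uint64_t plic_base,
                                                uint64_t plic_size,
                                                uint32_t num_sources,
                                                uint32_t socket)
{
    struct plic_entry e;

    assert(plic_size != 0);
    /* Each socket gets its own PLIC window, laid out back to back. */
    e.addr = plic_base + plic_size * socket;
    /* Each socket owns a contiguous block of GSI numbers. */
    e.gsi_base = num_sources * socket;
    return e;
}
```

With made-up values, `plic_for_socket(0x0c000000, 0x600000, 96, 1)` would place socket 1 at 0x0c600000 with a GSI base of 96.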

[PATCH v8 05/13] hw/riscv/virt-acpi-build.c: Add AIA support in RINTC

2023-12-18 Thread Sunil V L

Update the RINTC structure in MADT with AIA related fields.

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Acked-by: Alistair Francis 
Reviewed-by: Andrew Jones 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 43 ++
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index d8772c2821..3f9536356e 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -38,6 +38,7 @@
 #include "hw/intc/riscv_aclint.h"
 
 #define ACPI_BUILD_TABLE_SIZE 0x2
+#define ACPI_BUILD_INTC_ID(socket, index) ((socket << 24) | (index))
 
 typedef struct AcpiBuildState {
 /* Copy of table in RAM (for patching) */
@@ -59,17 +60,50 @@ static void acpi_align_size(GArray *blob, unsigned align)
 
 static void riscv_acpi_madt_add_rintc(uint32_t uid,
   const CPUArchIdList *arch_ids,
-  GArray *entry)
+  GArray *entry,
+  RISCVVirtState *s)
 {
+uint8_t  guest_index_bits = imsic_num_bits(s->aia_guests + 1);
 uint64_t hart_id = arch_ids->cpus[uid].arch_id;
+uint32_t imsic_size, local_cpu_id, socket_id;
+uint64_t imsic_socket_addr, imsic_addr;
+MachineState *ms = MACHINE(s);
 
+socket_id = arch_ids->cpus[uid].props.node_id;
+local_cpu_id = (arch_ids->cpus[uid].arch_id -
+riscv_socket_first_hartid(ms, socket_id)) %
+riscv_socket_hart_count(ms, socket_id);
+imsic_socket_addr = s->memmap[VIRT_IMSIC_S].base +
+(socket_id * VIRT_IMSIC_GROUP_MAX_SIZE);
+imsic_size = IMSIC_HART_SIZE(guest_index_bits);
+imsic_addr = imsic_socket_addr + local_cpu_id * imsic_size;
 build_append_int_noprefix(entry, 0x18, 1);   /* Type */
-build_append_int_noprefix(entry, 20, 1); /* Length   */
+build_append_int_noprefix(entry, 36, 1); /* Length   */
 build_append_int_noprefix(entry, 1, 1);  /* Version  */
 build_append_int_noprefix(entry, 0, 1);  /* Reserved */
 build_append_int_noprefix(entry, 0x1, 4);/* Flags*/
 build_append_int_noprefix(entry, hart_id, 8);/* Hart ID  */
 build_append_int_noprefix(entry, uid, 4);/* ACPI Processor UID */
+/* External Interrupt Controller ID */
+if (s->aia_type == VIRT_AIA_TYPE_APLIC) {
+build_append_int_noprefix(entry,
+  ACPI_BUILD_INTC_ID(
+  arch_ids->cpus[uid].props.node_id,
+  local_cpu_id),
+  4);
+} else {
+build_append_int_noprefix(entry, 0, 4);
+}
+
+if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
+/* IMSIC Base address */
+build_append_int_noprefix(entry, imsic_addr, 8);
+/* IMSIC Size */
+build_append_int_noprefix(entry, imsic_size, 4);
+} else {
+build_append_int_noprefix(entry, 0, 8);
+build_append_int_noprefix(entry, 0, 4);
+}
 }
 
 static void acpi_dsdt_add_cpus(Aml *scope, RISCVVirtState *s)
@@ -88,7 +122,7 @@ static void acpi_dsdt_add_cpus(Aml *scope, RISCVVirtState *s)
aml_int(arch_ids->cpus[i].arch_id)));
 
 /* build _MAT object */
-riscv_acpi_madt_add_rintc(i, arch_ids, madt_buf);
+riscv_acpi_madt_add_rintc(i, arch_ids, madt_buf, s);
 aml_append(dev, aml_name_decl("_MAT",
   aml_buffer(madt_buf->len,
   (uint8_t *)madt_buf->data)));
@@ -227,6 +261,7 @@ static void build_dsdt(GArray *table_data,
  * 5.2.12 Multiple APIC Description Table (MADT)
  * REF: https://github.com/riscv-non-isa/riscv-acpi/issues/15
  *  https://drive.google.com/file/d/1R6k4MshhN3WTT-hwqAquu5nX6xSEqK2l/view
+ *  https://drive.google.com/file/d/1oMGPyOD58JaPgMl1pKasT-VKsIKia7zR/view
  */
 static void build_madt(GArray *table_data,
BIOSLinker *linker,
@@ -246,7 +281,7 @@ static void build_madt(GArray *table_data,
 
 /* RISC-V Local INTC structures per HART */
 for (int i = 0; i < arch_ids->len; i++) {
-riscv_acpi_madt_add_rintc(i, arch_ids, table_data);
+riscv_acpi_madt_add_rintc(i, arch_ids, table_data, s);
 }
 
 acpi_table_end(linker, &table);
-- 
2.39.2
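
For readers following the arithmetic in riscv_acpi_madt_add_rintc(), here is a minimal standalone sketch of the ID packing and the per-hart IMSIC address calculation. The function names and the sample values are hypothetical; the packing mirrors ACPI_BUILD_INTC_ID but parenthesizes its inputs, which the macro in the patch does not do for `socket`.

```c
#include <assert.h>
#include <stdint.h>

/* Socket number in bits 31:24, per-socket index in bits 23:0. */
static inline uint32_t intc_id(uint32_t socket, uint32_t index)
{
    assert(socket < 0x100 && index < 0x1000000);
    return (socket << 24) | index;
}

/* Index of a hart within its socket, as computed in the patch. */
static inline uint32_t local_cpu_id(uint64_t hart_id, uint64_t first_hartid,
                                    uint32_t harts_in_socket)
{
    assert(harts_in_socket != 0 && hart_id >= first_hartid);
    return (uint32_t)((hart_id - first_hartid) % harts_in_socket);
}

/* IMSIC address of a hart: socket group base plus a per-hart stride. */
static inline uint64_t imsic_hart_addr(uint64_t imsic_base,
                                       uint64_t group_max_size,
                                       uint32_t socket,
                                       uint32_t local_id,
                                       uint32_t hart_size)
{
    return imsic_base + (uint64_t)socket * group_max_size +
           (uint64_t)local_id * hart_size;
}
```

The entry length also checks out: the fields written by the function add up to 1 + 1 + 1 + 1 + 4 + 8 + 4 + 4 + 8 + 4 = 36 bytes, matching the new Length value.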

[PATCH v8 07/13] hw/riscv/virt-acpi-build.c: Add APLIC in the MADT

2023-12-18 Thread Sunil V L

Add APLIC structures for each socket in the MADT when the system is
configured with APLIC as the external wired interrupt controller.

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 6bb21014fd..ec49c8804b 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -274,6 +274,8 @@ static void build_madt(GArray *table_data,
 uint8_t  guest_index_bits = imsic_num_bits(s->aia_guests + 1);
 uint16_t imsic_max_hart_per_socket = 0;
 uint8_t  hart_index_bits;
+uint64_t aplic_addr;
+uint32_t gsi_base;
 uint8_t  socket;
 
 for (socket = 0; socket < riscv_socket_count(ms); socket++) {
@@ -319,6 +321,38 @@ static void build_madt(GArray *table_data,
 build_append_int_noprefix(table_data, IMSIC_MMIO_GROUP_MIN_SHIFT, 1);
 }
 
+if (s->aia_type != VIRT_AIA_TYPE_NONE) {
+/* APLICs */
+for (socket = 0; socket < riscv_socket_count(ms); socket++) {
+aplic_addr = s->memmap[VIRT_APLIC_S].base +
+ s->memmap[VIRT_APLIC_S].size * socket;
+gsi_base = VIRT_IRQCHIP_NUM_SOURCES * socket;
+build_append_int_noprefix(table_data, 0x1A, 1);/* Type */
+build_append_int_noprefix(table_data, 36, 1);  /* Length */
+build_append_int_noprefix(table_data, 1, 1);   /* Version */
+build_append_int_noprefix(table_data, socket, 1);  /* APLIC ID */
+build_append_int_noprefix(table_data, 0, 4);   /* Flags */
+build_append_int_noprefix(table_data, 0, 8);   /* Hardware ID */
+/* Number of IDCs */
+if (s->aia_type == VIRT_AIA_TYPE_APLIC) {
+build_append_int_noprefix(table_data,
+  s->soc[socket].num_harts,
+  2);
+} else {
+build_append_int_noprefix(table_data, 0, 2);
+}
+/* Total External Interrupt Sources Supported */
+build_append_int_noprefix(table_data, VIRT_IRQCHIP_NUM_SOURCES, 2);
+/* Global System Interrupt Base */
+build_append_int_noprefix(table_data, gsi_base, 4);
+/* APLIC Address */
+build_append_int_noprefix(table_data, aplic_addr, 8);
+/* APLIC size */
+build_append_int_noprefix(table_data,
+  s->memmap[VIRT_APLIC_S].size, 4);
+}
+}
+
 acpi_table_end(linker, &table);
 }
 
-- 
2.39.2
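
As a cross-check of the field widths against the declared Length of 36, the APLIC entry can be serialized with a small standalone helper. This is only a sketch: `struct buf`, `append_le()` and `build_aplic_entry()` are hypothetical stand-ins, with `append_le()` assumed to behave like build_append_int_noprefix(), that is, appending the value in little-endian order using the given number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size byte buffer standing in for the GArray. */
struct buf {
    uint8_t data[64];
    size_t len;
};

/* Append 'value' as 'size' little-endian bytes. */
static void append_le(struct buf *b, uint64_t value, int size)
{
    assert(size > 0 && b->len + (size_t)size <= sizeof(b->data));
    for (int i = 0; i < size; i++) {
        b->data[b->len++] = (uint8_t)(value & 0xff);
        value >>= 8;
    }
}

/* Emit one APLIC structure in the field order used by the patch and
 * return the number of bytes written. */
static size_t build_aplic_entry(struct buf *b, uint8_t id, uint16_t num_idcs,
                                uint16_t num_sources, uint32_t gsi_base,
                                uint64_t addr, uint32_t size)
{
    size_t start = b->len;

    append_le(b, 0x1A, 1);         /* Type */
    append_le(b, 36, 1);           /* Length */
    append_le(b, 1, 1);            /* Version */
    append_le(b, id, 1);           /* APLIC ID */
    append_le(b, 0, 4);            /* Flags */
    append_le(b, 0, 8);            /* Hardware ID */
    append_le(b, num_idcs, 2);     /* Number of IDCs */
    append_le(b, num_sources, 2);  /* Total External Interrupt Sources */
    append_le(b, gsi_base, 4);     /* Global System Interrupt Base */
    append_le(b, addr, 8);         /* APLIC Address */
    append_le(b, size, 4);         /* APLIC Size */
    return b->len - start;
}
```

The widths sum to 1 + 1 + 1 + 1 + 4 + 8 + 2 + 2 + 4 + 8 + 4 = 36, so the return value should equal the Length byte for any input.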

[PATCH v8 09/13] hw/riscv/virt-acpi-build.c: Add MMU node in RHCT

2023-12-18 Thread Sunil V L

MMU type information is available via MMU node in RHCT. Add this node in
RHCT.

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 506d487ede..86c38f7c2b 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -152,6 +152,8 @@ static void build_rhct(GArray *table_data,
 size_t len, aligned_len;
 uint32_t isa_offset, num_rhct_nodes, cmo_offset = 0;
 RISCVCPU *cpu = &s->soc[0].harts[0];
+uint32_t mmu_offset = 0;
+uint8_t satp_mode_max;
 char *isa;
 
 AcpiTable table = { .sig = "RHCT", .rev = 1, .oem_id = s->oem_id,
@@ -171,6 +173,10 @@ static void build_rhct(GArray *table_data,
 num_rhct_nodes++;
 }
 
+if (cpu->cfg.satp_mode.supported != 0) {
+num_rhct_nodes++;
+}
+
 /* Number of RHCT nodes*/
 build_append_int_noprefix(table_data, num_rhct_nodes, 4);
 
@@ -226,6 +232,26 @@ static void build_rhct(GArray *table_data,
 }
 }
 
+/* MMU node structure */
+if (cpu->cfg.satp_mode.supported != 0) {
+satp_mode_max = satp_mode_max_from_map(cpu->cfg.satp_mode.map);
+mmu_offset = table_data->len - table.table_offset;
+build_append_int_noprefix(table_data, 2, 2);/* Type */
+build_append_int_noprefix(table_data, 8, 2);/* Length */
+build_append_int_noprefix(table_data, 0x1, 2);  /* Revision */
+build_append_int_noprefix(table_data, 0, 1);/* Reserved */
+/* MMU Type */
+if (satp_mode_max == VM_1_10_SV57) {
+build_append_int_noprefix(table_data, 2, 1);/* Sv57 */
+} else if (satp_mode_max == VM_1_10_SV48) {
+build_append_int_noprefix(table_data, 1, 1);/* Sv48 */
+} else if (satp_mode_max == VM_1_10_SV39) {
+build_append_int_noprefix(table_data, 0, 1);/* Sv39 */
+} else {
+assert(1);
+}
+}
+
 /* Hart Info Node */
 for (int i = 0; i < arch_ids->len; i++) {
 len = 16;
@@ -238,17 +264,25 @@ static void build_rhct(GArray *table_data,
 num_offsets++;
 }
 
+if (mmu_offset) {
+len += 4;
+num_offsets++;
+}
+
 build_append_int_noprefix(table_data, len, 2);
 build_append_int_noprefix(table_data, 0x1, 2); /* Revision */
 /* Number of offsets */
 build_append_int_noprefix(table_data, num_offsets, 2);
 build_append_int_noprefix(table_data, i, 4);   /* ACPI Processor UID */
-
 /* Offsets */
 build_append_int_noprefix(table_data, isa_offset, 4);
 if (cmo_offset) {
 build_append_int_noprefix(table_data, cmo_offset, 4);
 }
+
+if (mmu_offset) {
+build_append_int_noprefix(table_data, mmu_offset, 4);
+}
 }
 
 acpi_table_end(linker, &table);
-- 
2.39.2
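
The MMU Type selection and the Hart Info node length can be sketched on their own. This is an illustration under assumptions: the enum values are the satp MODE encodings from the RISC-V privileged specification for RV64 and stand in for QEMU's VM_1_10_* constants, and both function names are hypothetical. One difference from the patch: `assert(1)` always passes, so the final `else` branch there cannot trap an unexpected mode; the sketch reports it to the caller instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for VM_1_10_SV39/SV48/SV57 (satp MODE encodings, RV64). */
enum {
    SATP_SV39 = 8,
    SATP_SV48 = 9,
    SATP_SV57 = 10,
};

/* RHCT MMU Type for the largest supported satp mode, or -1 when the
 * mode has no encoding in the MMU node. */
static int rhct_mmu_type(uint8_t satp_mode_max)
{
    switch (satp_mode_max) {
    case SATP_SV39:
        return 0;
    case SATP_SV48:
        return 1;
    case SATP_SV57:
        return 2;
    default:
        return -1;
    }
}

/* Hart Info node length: 16 bytes with only the ISA offset, plus
 * 4 bytes for each optional offset (CMO, MMU). */
static uint16_t hart_info_len(int has_cmo, int has_mmu)
{
    uint16_t len = 16;

    if (has_cmo) {
        len += 4;
    }
    if (has_mmu) {
        len += 4;
    }
    assert(len <= 24);
    return len;
}
```

The MMU node itself is 2 + 2 + 2 + 1 + 1 = 8 bytes, which matches the Length written in the patch.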

[PATCH v8 12/13] hw/riscv/virt-acpi-build.c: Add IO controllers and devices

2023-12-18 Thread Sunil V L

Add basic IO controllers and devices like PCI, VirtIO and UART in the
ACPI namespace.

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/Kconfig   |  1 +
 hw/riscv/virt-acpi-build.c | 79 --
 2 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index b6a5eb4452..a50717be87 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -45,6 +45,7 @@ config RISCV_VIRT
 select FW_CFG_DMA
 select PLATFORM_BUS
 select ACPI
+select ACPI_PCI
 
 config SHAKTI_C
 bool
diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 86c38f7c2b..4d03a27efd 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -27,15 +27,18 @@
 #include "hw/acpi/acpi-defs.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
+#include "hw/acpi/pci.h"
 #include "hw/acpi/utils.h"
+#include "hw/intc/riscv_aclint.h"
 #include "hw/nvram/fw_cfg_acpi.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/riscv/virt.h"
+#include "hw/riscv/numa.h"
+#include "hw/virtio/virtio-acpi.h"
+#include "migration/vmstate.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "sysemu/reset.h"
-#include "migration/vmstate.h"
-#include "hw/riscv/virt.h"
-#include "hw/riscv/numa.h"
-#include "hw/intc/riscv_aclint.h"
 
 #define ACPI_BUILD_TABLE_SIZE 0x2
 #define ACPI_BUILD_INTC_ID(socket, index) ((socket << 24) | (index))
@@ -132,6 +135,39 @@ static void acpi_dsdt_add_cpus(Aml *scope, RISCVVirtState *s)
 }
 }
 
+static void
+acpi_dsdt_add_uart(Aml *scope, const MemMapEntry *uart_memmap,
+uint32_t uart_irq)
+{
+Aml *dev = aml_device("COM0");
+aml_append(dev, aml_name_decl("_HID", aml_string("PNP0501")));
+aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+
+Aml *crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(uart_memmap->base,
+ uart_memmap->size, AML_READ_WRITE));
+aml_append(crs,
+aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
+   AML_EXCLUSIVE, &uart_irq, 1));
+aml_append(dev, aml_name_decl("_CRS", crs));
+
+Aml *pkg = aml_package(2);
+aml_append(pkg, aml_string("clock-frequency"));
+aml_append(pkg, aml_int(3686400));
+
+Aml *UUID = aml_touuid("DAFFD814-6EBA-4D8C-8A91-BC9BBF4AA301");
+
+Aml *pkg1 = aml_package(1);
+aml_append(pkg1, pkg);
+
+Aml *package = aml_package(2);
+aml_append(package, UUID);
+aml_append(package, pkg1);
+
+aml_append(dev, aml_name_decl("_DSD", package));
+aml_append(scope, dev);
+}
+
 /* RHCT Node[N] starts at offset 56 */
 #define RHCT_NODE_ARRAY_OFFSET 56
 
@@ -310,6 +346,8 @@ static void build_dsdt(GArray *table_data,
RISCVVirtState *s)
 {
 Aml *scope, *dsdt;
+MachineState *ms = MACHINE(s);
+uint8_t socket_count;
 const MemMapEntry *memmap = s->memmap;
 AcpiTable table = { .sig = "DSDT", .rev = 2, .oem_id = s->oem_id,
 .oem_table_id = s->oem_table_id };
@@ -329,6 +367,29 @@ static void build_dsdt(GArray *table_data,
 
 fw_cfg_acpi_dsdt_add(scope, &memmap[VIRT_FW_CFG]);
 
+socket_count = riscv_socket_count(ms);
+
+acpi_dsdt_add_uart(scope, &memmap[VIRT_UART0], UART0_IRQ);
+
+if (socket_count == 1) {
+virtio_acpi_dsdt_add(scope, memmap[VIRT_VIRTIO].base,
+ memmap[VIRT_VIRTIO].size,
+ VIRTIO_IRQ, 0, VIRTIO_COUNT);
+acpi_dsdt_add_gpex_host(scope, PCIE_IRQ);
+} else if (socket_count == 2) {
+virtio_acpi_dsdt_add(scope, memmap[VIRT_VIRTIO].base,
+ memmap[VIRT_VIRTIO].size,
+ VIRTIO_IRQ + VIRT_IRQCHIP_NUM_SOURCES, 0,
+ VIRTIO_COUNT);
+acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES);
+} else {
+virtio_acpi_dsdt_add(scope, memmap[VIRT_VIRTIO].base,
+ memmap[VIRT_VIRTIO].size,
+ VIRTIO_IRQ + VIRT_IRQCHIP_NUM_SOURCES, 0,
+ VIRTIO_COUNT);
+acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES * 2);
+}
+
 aml_append(dsdt, scope);
 
 /* copy AML table into ACPI tables blob and patch header there */
@@ -465,6 +526,16 @@ static void virt_acpi_build(RISCVVirtState *s, AcpiBuildTables *tables)
 acpi_add_table(table_offsets, tables_blob);
 build_rhct(tables_blob, tables->linker, s);
 
+acpi_add_table(table_offsets, tables_blob);
+{
+AcpiMcfgInfo mcfg = {
+   .base = s->memmap[VIRT_PCIE_MMIO].base,
+   .size = s->memmap[VIRT_PCIE_MMIO].size,
+};
+build_mcfg(tables_blob, tables->linker, &mcfg, s->oem_id,
+   s->oem_table

[PATCH v8 11/13] hw/riscv/virt: Update GPEX MMIO related properties

2023-12-18 Thread Sunil V L

Update the GPEX host bridge properties related to MMIO ranges with
values set for the virt machine.

Suggested-by: Igor Mammedov 
Signed-off-by: Sunil V L 
Reviewed-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt.c | 47 -
 include/hw/riscv/virt.h |  1 +
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 9e7629c51c..a7c4c3508e 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1054,21 +1054,45 @@ static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap)
 }
 
 static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
-  hwaddr ecam_base, hwaddr ecam_size,
-  hwaddr mmio_base, hwaddr mmio_size,
-  hwaddr high_mmio_base,
-  hwaddr high_mmio_size,
-  hwaddr pio_base,
-  DeviceState *irqchip)
+  DeviceState *irqchip,
+  RISCVVirtState *s)
 {
 DeviceState *dev;
 MemoryRegion *ecam_alias, *ecam_reg;
 MemoryRegion *mmio_alias, *high_mmio_alias, *mmio_reg;
+hwaddr ecam_base = s->memmap[VIRT_PCIE_ECAM].base;
+hwaddr ecam_size = s->memmap[VIRT_PCIE_ECAM].size;
+hwaddr mmio_base = s->memmap[VIRT_PCIE_MMIO].base;
+hwaddr mmio_size = s->memmap[VIRT_PCIE_MMIO].size;
+hwaddr high_mmio_base = virt_high_pcie_memmap.base;
+hwaddr high_mmio_size = virt_high_pcie_memmap.size;
+hwaddr pio_base = s->memmap[VIRT_PCIE_PIO].base;
+hwaddr pio_size = s->memmap[VIRT_PCIE_PIO].size;
 qemu_irq irq;
 int i;
 
 dev = qdev_new(TYPE_GPEX_HOST);
 
+/* Set GPEX object properties for the virt machine */
+object_property_set_uint(OBJECT(GPEX_HOST(dev)), PCI_HOST_ECAM_BASE,
+ecam_base, NULL);
+object_property_set_int(OBJECT(GPEX_HOST(dev)), PCI_HOST_ECAM_SIZE,
+ecam_size, NULL);
+object_property_set_uint(OBJECT(GPEX_HOST(dev)),
+ PCI_HOST_BELOW_4G_MMIO_BASE,
+ mmio_base, NULL);
+object_property_set_int(OBJECT(GPEX_HOST(dev)), PCI_HOST_BELOW_4G_MMIO_SIZE,
+mmio_size, NULL);
+object_property_set_uint(OBJECT(GPEX_HOST(dev)),
+ PCI_HOST_ABOVE_4G_MMIO_BASE,
+ high_mmio_base, NULL);
+object_property_set_int(OBJECT(GPEX_HOST(dev)), PCI_HOST_ABOVE_4G_MMIO_SIZE,
+high_mmio_size, NULL);
+object_property_set_uint(OBJECT(GPEX_HOST(dev)), PCI_HOST_PIO_BASE,
+pio_base, NULL);
+object_property_set_int(OBJECT(GPEX_HOST(dev)), PCI_HOST_PIO_SIZE,
+pio_size, NULL);
+
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 
 ecam_alias = g_new0(MemoryRegion, 1);
@@ -1099,6 +1123,7 @@ static inline DeviceState *gpex_pcie_init(MemoryRegion *sys_mem,
 gpex_set_irq_num(GPEX_HOST(dev), i, PCIE_IRQ + i);
 }
 
+GPEX_HOST(dev)->gpex_cfg.bus = PCI_HOST_BRIDGE(GPEX_HOST(dev))->bus;
 return dev;
 }
 
@@ -1494,15 +1519,7 @@ static void virt_machine_init(MachineState *machine)
 qdev_get_gpio_in(virtio_irqchip, VIRTIO_IRQ + i));
 }
 
-gpex_pcie_init(system_memory,
-   memmap[VIRT_PCIE_ECAM].base,
-   memmap[VIRT_PCIE_ECAM].size,
-   memmap[VIRT_PCIE_MMIO].base,
-   memmap[VIRT_PCIE_MMIO].size,
-   virt_high_pcie_memmap.base,
-   virt_high_pcie_memmap.size,
-   memmap[VIRT_PCIE_PIO].base,
-   pcie_irqchip);
+gpex_pcie_init(system_memory, pcie_irqchip, s);
 
 create_platform_bus(s, mmio_irqchip);
 
diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 5b03575ed3..f89790fd58 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -61,6 +61,7 @@ struct RISCVVirtState {
 char *oem_table_id;
 OnOffAuto acpi;
 const MemMapEntry *memmap;
+struct GPEXHost *gpex_host;
 };
 
 enum {
-- 
2.39.2

[PATCH v8 03/13] hw/i386/acpi-microvm.c: Use common function to add virtio in DSDT

2023-12-18 Thread Sunil V L

Now that a common function to add virtio in the DSDT exists, update the
microvm code to use it instead of duplicating the code.

Suggested-by: Andrew Jones 
Signed-off-by: Sunil V L 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/i386/acpi-microvm.c | 15 ++-
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
index 2909a73933..279da6b4aa 100644
--- a/hw/i386/acpi-microvm.c
+++ b/hw/i386/acpi-microvm.c
@@ -37,6 +37,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pcie_host.h"
 #include "hw/usb/xhci.h"
+#include "hw/virtio/virtio-acpi.h"
 #include "hw/virtio/virtio-mmio.h"
 #include "hw/input/i8042.h"
 
@@ -77,19 +78,7 @@ static void acpi_dsdt_add_virtio(Aml *scope,
 uint32_t irq = mms->virtio_irq_base + index;
 hwaddr base = VIRTIO_MMIO_BASE + index * 512;
 hwaddr size = 512;
-
-Aml *dev = aml_device("VR%02u", (unsigned)index);
-aml_append(dev, aml_name_decl("_HID", aml_string("LNRO0005")));
-aml_append(dev, aml_name_decl("_UID", aml_int(index)));
-aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
-
-Aml *crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(base, size, AML_READ_WRITE));
-aml_append(crs,
-   aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
- AML_EXCLUSIVE, &irq, 1));
-aml_append(dev, aml_name_decl("_CRS", crs));
-aml_append(scope, dev);
+virtio_acpi_dsdt_add(scope, base, size, irq, index, 1);
 }
 }
 }
-- 
2.39.2

[PATCH v8 10/13] hw/pci-host/gpex: Define properties for MMIO ranges

2023-12-18 Thread Sunil V L

The ACPI DSDT generator needs information about the PCI host bridge, such
as the ECAM range, the PIO range, and the 32-bit and 64-bit PCI MMIO
ranges. Instead of making these values machine specific, create
properties for the GPEX host bridge with a default value of 0. During
initialization, the firmware can set these properties to the correct
values for the platform. This keeps the DSDT generator code independent
of machine-specific memory map accesses.

Suggested-by: Igor Mammedov 
Signed-off-by: Sunil V L 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Daniel Henrique Barboza 
---
 hw/pci-host/gpex-acpi.c| 13 +
 hw/pci-host/gpex.c | 12 
 include/hw/pci-host/gpex.h | 28 
 3 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/hw/pci-host/gpex-acpi.c b/hw/pci-host/gpex-acpi.c
index 1092dc3b70..f69413ea2c 100644
--- a/hw/pci-host/gpex-acpi.c
+++ b/hw/pci-host/gpex-acpi.c
@@ -281,3 +281,16 @@ void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg)
 
 crs_range_set_free(&crs_range_set);
 }
+
+void acpi_dsdt_add_gpex_host(Aml *scope, uint32_t irq)
+{
+bool ambig;
+Object *obj = object_resolve_path_type("", TYPE_GPEX_HOST, &ambig);
+
+if (!obj || ambig) {
+return;
+}
+
+GPEX_HOST(obj)->gpex_cfg.irq = irq;
+acpi_dsdt_add_gpex(scope, &GPEX_HOST(obj)->gpex_cfg);
+}
diff --git a/hw/pci-host/gpex.c b/hw/pci-host/gpex.c
index a6752fac5e..41f4e73f6e 100644
--- a/hw/pci-host/gpex.c
+++ b/hw/pci-host/gpex.c
@@ -154,6 +154,18 @@ static Property gpex_host_properties[] = {
  */
 DEFINE_PROP_BOOL("allow-unmapped-accesses", GPEXHost,
  allow_unmapped_accesses, true),
+DEFINE_PROP_UINT64(PCI_HOST_ECAM_BASE, GPEXHost, gpex_cfg.ecam.base, 0),
+DEFINE_PROP_SIZE(PCI_HOST_ECAM_SIZE, GPEXHost, gpex_cfg.ecam.size, 0),
+DEFINE_PROP_UINT64(PCI_HOST_PIO_BASE, GPEXHost, gpex_cfg.pio.base, 0),
+DEFINE_PROP_SIZE(PCI_HOST_PIO_SIZE, GPEXHost, gpex_cfg.pio.size, 0),
+DEFINE_PROP_UINT64(PCI_HOST_BELOW_4G_MMIO_BASE, GPEXHost,
+   gpex_cfg.mmio32.base, 0),
+DEFINE_PROP_SIZE(PCI_HOST_BELOW_4G_MMIO_SIZE, GPEXHost,
+ gpex_cfg.mmio32.size, 0),
+DEFINE_PROP_UINT64(PCI_HOST_ABOVE_4G_MMIO_BASE, GPEXHost,
+   gpex_cfg.mmio64.base, 0),
+DEFINE_PROP_SIZE(PCI_HOST_ABOVE_4G_MMIO_SIZE, GPEXHost,
+ gpex_cfg.mmio64.size, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/pci-host/gpex.h b/include/hw/pci-host/gpex.h
index b0240bd768..dce883573b 100644
--- a/include/hw/pci-host/gpex.h
+++ b/include/hw/pci-host/gpex.h
@@ -40,6 +40,15 @@ struct GPEXRootState {
 /*< public >*/
 };
 
+struct GPEXConfig {
+MemMapEntry ecam;
+MemMapEntry mmio32;
+MemMapEntry mmio64;
+MemMapEntry pio;
+int irq;
+PCIBus  *bus;
+};
+
 struct GPEXHost {
 /*< private >*/
 PCIExpressHost parent_obj;
@@ -55,19 +64,22 @@ struct GPEXHost {
 int irq_num[GPEX_NUM_IRQS];
 
 bool allow_unmapped_accesses;
-};
 
-struct GPEXConfig {
-MemMapEntry ecam;
-MemMapEntry mmio32;
-MemMapEntry mmio64;
-MemMapEntry pio;
-int irq;
-PCIBus  *bus;
+struct GPEXConfig gpex_cfg;
 };
 
 int gpex_set_irq_num(GPEXHost *s, int index, int gsi);
 
 void acpi_dsdt_add_gpex(Aml *scope, struct GPEXConfig *cfg);
+void acpi_dsdt_add_gpex_host(Aml *scope, uint32_t irq);
+
+#define PCI_HOST_PIO_BASE   "x-pio-base"
+#define PCI_HOST_PIO_SIZE   "x-pio-size"
+#define PCI_HOST_ECAM_BASE  "x-ecam-base"
+#define PCI_HOST_ECAM_SIZE  "x-ecam-size"
+#define PCI_HOST_BELOW_4G_MMIO_BASE "x-below-4g-mmio-base"
+#define PCI_HOST_BELOW_4G_MMIO_SIZE "x-below-4g-mmio-size"
+#define PCI_HOST_ABOVE_4G_MMIO_BASE "x-above-4g-mmio-base"
+#define PCI_HOST_ABOVE_4G_MMIO_SIZE "x-above-4g-mmio-size"
 
 #endif /* HW_GPEX_H */
-- 
2.39.2

[PATCH v8 06/13] hw/riscv/virt-acpi-build.c: Add IMSIC in the MADT

2023-12-18 Thread Sunil V L

Add IMSIC structure in MADT when IMSIC is configured.

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 3f9536356e..6bb21014fd 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -270,6 +270,19 @@ static void build_madt(GArray *table_data,
 MachineClass *mc = MACHINE_GET_CLASS(s);
 MachineState *ms = MACHINE(s);
 const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(ms);
+uint8_t  group_index_bits = imsic_num_bits(riscv_socket_count(ms));
+uint8_t  guest_index_bits = imsic_num_bits(s->aia_guests + 1);
+uint16_t imsic_max_hart_per_socket = 0;
+uint8_t  hart_index_bits;
+uint8_t  socket;
+
+for (socket = 0; socket < riscv_socket_count(ms); socket++) {
+if (imsic_max_hart_per_socket < s->soc[socket].num_harts) {
+imsic_max_hart_per_socket = s->soc[socket].num_harts;
+}
+}
+
+hart_index_bits = imsic_num_bits(imsic_max_hart_per_socket);
 
 AcpiTable table = { .sig = "APIC", .rev = 6, .oem_id = s->oem_id,
 .oem_table_id = s->oem_table_id };
@@ -284,6 +297,28 @@ static void build_madt(GArray *table_data,
 riscv_acpi_madt_add_rintc(i, arch_ids, table_data, s);
 }
 
+/* IMSIC */
+if (s->aia_type == VIRT_AIA_TYPE_APLIC_IMSIC) {
+/* IMSIC */
+build_append_int_noprefix(table_data, 0x19, 1); /* Type */
+build_append_int_noprefix(table_data, 16, 1);   /* Length */
+build_append_int_noprefix(table_data, 1, 1);/* Version */
+build_append_int_noprefix(table_data, 0, 1);/* Reserved */
+build_append_int_noprefix(table_data, 0, 4);/* Flags */
+/* Number of supervisor mode Interrupt Identities */
+build_append_int_noprefix(table_data, VIRT_IRQCHIP_NUM_MSIS, 2);
+/* Number of guest mode Interrupt Identities */
+build_append_int_noprefix(table_data, VIRT_IRQCHIP_NUM_MSIS, 2);
+/* Guest Index Bits */
+build_append_int_noprefix(table_data, guest_index_bits, 1);
+/* Hart Index Bits */
+build_append_int_noprefix(table_data, hart_index_bits, 1);
+/* Group Index Bits */
+build_append_int_noprefix(table_data, group_index_bits, 1);
+/* Group Index Shift */
+build_append_int_noprefix(table_data, IMSIC_MMIO_GROUP_MIN_SHIFT, 1);
+}
+
 acpi_table_end(linker, &table);
 }
 
-- 
2.39.2
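
The index-bit fields in the IMSIC structure all come from imsic_num_bits(), which is defined elsewhere in QEMU and not shown in this patch. The sketch below assumes it returns the smallest number of bits b such that 2^b >= count; `num_bits()` and `max_harts_per_socket()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest b with (1 << b) >= count; assumed to match imsic_num_bits(). */
static uint32_t num_bits(uint32_t count)
{
    uint32_t bits = 0;

    while (((uint64_t)1 << bits) < count) {
        bits++;
    }
    return bits;
}

/* Largest hart count over all sockets, as computed by the loop at the
 * top of build_madt(). */
static uint32_t max_harts_per_socket(const uint32_t *harts, size_t nsockets)
{
    uint32_t max = 0;

    assert(harts != NULL || nsockets == 0);
    for (size_t i = 0; i < nsockets; i++) {
        if (max < harts[i]) {
            max = harts[i];
        }
    }
    return max;
}
```

Under that assumption, a machine with two sockets of 4 and 3 harts would get 2 hart index bits and 1 group index bit. The IMSIC structure itself is 1 + 1 + 1 + 1 + 4 + 2 + 2 + 1 + 1 + 1 + 1 = 16 bytes, matching its Length field.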

[PATCH v8 01/13] hw/arm/virt-acpi-build.c: Migrate fw_cfg creation to common location

2023-12-18 Thread Sunil V L

RISC-V also needs to use the same code to create fw_cfg in DSDT. So,
avoid code duplication by moving the code in arm and riscv to a device
specific file.

Suggested-by: Igor Mammedov 
Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Alistair Francis 
Reviewed-by: Andrew Jones 
Acked-by: Michael S. Tsirkin 
---
 hw/arm/virt-acpi-build.c   | 19 ++-
 hw/nvram/fw_cfg-acpi.c | 23 +++
 hw/nvram/meson.build   |  1 +
 hw/riscv/virt-acpi-build.c | 19 ++-
 include/hw/nvram/fw_cfg_acpi.h | 15 +++
 5 files changed, 43 insertions(+), 34 deletions(-)
 create mode 100644 hw/nvram/fw_cfg-acpi.c
 create mode 100644 include/hw/nvram/fw_cfg_acpi.h

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8bc35a483c..565af9b7ea 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -35,7 +35,7 @@
 #include "target/arm/cpu.h"
 #include "hw/acpi/acpi-defs.h"
 #include "hw/acpi/acpi.h"
-#include "hw/nvram/fw_cfg.h"
+#include "hw/nvram/fw_cfg_acpi.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/utils.h"
@@ -94,21 +94,6 @@ static void acpi_dsdt_add_uart(Aml *scope, const MemMapEntry *uart_memmap,
 aml_append(scope, dev);
 }
 
-static void acpi_dsdt_add_fw_cfg(Aml *scope, const MemMapEntry *fw_cfg_memmap)
-{
-Aml *dev = aml_device("FWCF");
-aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
-/* device present, functioning, decoding, not shown in UI */
-aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
-aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
-
-Aml *crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(fw_cfg_memmap->base,
-   fw_cfg_memmap->size, AML_READ_WRITE));
-aml_append(dev, aml_name_decl("_CRS", crs));
-aml_append(scope, dev);
-}
-
 static void acpi_dsdt_add_flash(Aml *scope, const MemMapEntry *flash_memmap)
 {
 Aml *dev, *crs;
@@ -864,7 +849,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 if (vmc->acpi_expose_flash) {
 acpi_dsdt_add_flash(scope, &memmap[VIRT_FLASH]);
 }
-acpi_dsdt_add_fw_cfg(scope, &memmap[VIRT_FW_CFG]);
+fw_cfg_acpi_dsdt_add(scope, &memmap[VIRT_FW_CFG]);
 acpi_dsdt_add_virtio(scope, &memmap[VIRT_MMIO],
 (irqmap[VIRT_MMIO] + ARM_SPI_BASE), NUM_VIRTIO_TRANSPORTS);
 acpi_dsdt_add_pci(scope, memmap, irqmap[VIRT_PCIE] + ARM_SPI_BASE, vms);
diff --git a/hw/nvram/fw_cfg-acpi.c b/hw/nvram/fw_cfg-acpi.c
new file mode 100644
index 0000000000..4e48baeaa0
--- /dev/null
+++ b/hw/nvram/fw_cfg-acpi.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Add fw_cfg device in DSDT
+ *
+ */
+
+#include "hw/nvram/fw_cfg_acpi.h"
+#include "hw/acpi/aml-build.h"
+
+void fw_cfg_acpi_dsdt_add(Aml *scope, const MemMapEntry *fw_cfg_memmap)
+{
+Aml *dev = aml_device("FWCF");
+aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
+/* device present, functioning, decoding, not shown in UI */
+aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
+aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
+
+Aml *crs = aml_resource_template();
+aml_append(crs, aml_memory32_fixed(fw_cfg_memmap->base,
+   fw_cfg_memmap->size, AML_READ_WRITE));
+aml_append(dev, aml_name_decl("_CRS", crs));
+aml_append(scope, dev);
+}
diff --git a/hw/nvram/meson.build b/hw/nvram/meson.build
index 75e415b1a0..4996c72456 100644
--- a/hw/nvram/meson.build
+++ b/hw/nvram/meson.build
@@ -17,3 +17,4 @@ system_ss.add(when: 'CONFIG_XLNX_EFUSE_ZYNQMP', if_true: files(
 system_ss.add(when: 'CONFIG_XLNX_BBRAM', if_true: files('xlnx-bbram.c'))
 
 specific_ss.add(when: 'CONFIG_PSERIES', if_true: files('spapr_nvram.c'))
+specific_ss.add(when: 'CONFIG_ACPI', if_true: files('fw_cfg-acpi.c'))
diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 7331248f59..d8772c2821 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -28,6 +28,7 @@
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/utils.h"
+#include "hw/nvram/fw_cfg_acpi.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "sysemu/reset.h"
@@ -97,22 +98,6 @@ static void acpi_dsdt_add_cpus(Aml *scope, RISCVVirtState *s)
 }
 }
 
-static void acpi_dsdt_add_fw_cfg(Aml *scope, const MemMapEntry *fw_cfg_memmap)
-{
-Aml *dev = aml_device("FWCF");
-aml_append(dev, aml_name_decl("_HID", aml_string("QEMU0002")));
-
-/* device present, functioning, decoding, not shown in UI */
-aml_append(dev, aml_name_decl("_STA", aml_int(0xB)));
-aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
-
-Aml *crs = aml_resource_template();
-aml_append(crs, aml_memory32_fixed(fw_cfg_memmap->base,
-   fw_cfg_memmap->s

[PATCH v8 08/13] hw/riscv/virt-acpi-build.c: Add CMO information in RHCT

2023-12-18 Thread Sunil V L

When CMO-related extensions such as Zicboz, Zicbom and Zicbop are enabled,
the block sizes for those extensions need to be communicated via the CMO
node in the RHCT. Add a CMO node to the RHCT if any of those CMO extensions
is detected.
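
As a rough illustration of the encoding used in this patch: each block-size
field is written as __builtin_ctz() of the configured size, i.e. the base-2
exponent for a power-of-two size, and each hart info node grows by one
4-byte offset when a CMO node exists. The helper names below are
hypothetical and do not exist in QEMU; __builtin_ctz is assumed to be
available (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: value stored in a CMO block-size field.
 * A size of 0 (extension not configured) is stored as 0, because
 * __builtin_ctz(0) is undefined.
 */
static uint8_t cmo_block_size_field(uint32_t blocksize)
{
    if (blocksize == 0) {
        return 0;
    }
    /* The exponent encoding only makes sense for powers of two. */
    assert((blocksize & (blocksize - 1)) == 0);
    return (uint8_t)__builtin_ctz(blocksize);
}

/*
 * Hypothetical helper: length of one hart info node.
 * 16 bytes cover the header, the processor UID and the ISA offset;
 * a CMO node adds one more 4-byte offset.
 */
static uint16_t hart_info_node_len(int has_cmo_node)
{
    uint16_t len = 16;

    if (has_cmo_node) {
        len += 4;
    }
    return len;
}
```

For example, a 64-byte block size is stored as 6, and a hart info node is
20 bytes long once the CMO offset is present.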

Signed-off-by: Sunil V L 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
Acked-by: Alistair Francis 
Acked-by: Michael S. Tsirkin 
---
 hw/riscv/virt-acpi-build.c | 64 +-
 1 file changed, 56 insertions(+), 8 deletions(-)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index ec49c8804b..506d487ede 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -140,6 +140,7 @@ static void acpi_dsdt_add_cpus(Aml *scope, RISCVVirtState *s)
  * 5.2.36 RISC-V Hart Capabilities Table (RHCT)
  * REF: https://github.com/riscv-non-isa/riscv-acpi/issues/16
  *  https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view
+ *  https://drive.google.com/file/d/1sKbOa8m1UZw1JkquZYe3F1zQBN1xXsaf/view
  */
 static void build_rhct(GArray *table_data,
BIOSLinker *linker,
@@ -149,8 +150,8 @@ static void build_rhct(GArray *table_data,
 MachineState *ms = MACHINE(s);
 const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(ms);
 size_t len, aligned_len;
-uint32_t isa_offset, num_rhct_nodes;
-RISCVCPU *cpu;
+uint32_t isa_offset, num_rhct_nodes, cmo_offset = 0;
+RISCVCPU *cpu = &s->soc[0].harts[0];
 char *isa;
 
 AcpiTable table = { .sig = "RHCT", .rev = 1, .oem_id = s->oem_id,
@@ -166,6 +167,9 @@ static void build_rhct(GArray *table_data,
 
 /* ISA + N hart info */
 num_rhct_nodes = 1 + ms->smp.cpus;
+if (cpu->cfg.ext_zicbom || cpu->cfg.ext_zicboz) {
+num_rhct_nodes++;
+}
 
 /* Number of RHCT nodes*/
 build_append_int_noprefix(table_data, num_rhct_nodes, 4);
@@ -177,7 +181,6 @@ static void build_rhct(GArray *table_data,
 isa_offset = table_data->len - table.table_offset;
 build_append_int_noprefix(table_data, 0, 2);   /* Type 0 */
 
-cpu = &s->soc[0].harts[0];
 isa = riscv_isa_string(cpu);
 len = 8 + strlen(isa) + 1;
 aligned_len = (len % 2) ? (len + 1) : len;
@@ -193,14 +196,59 @@ static void build_rhct(GArray *table_data,
 build_append_int_noprefix(table_data, 0x0, 1);   /* Optional Padding */
 }
 
+/* CMO node */
+if (cpu->cfg.ext_zicbom || cpu->cfg.ext_zicboz) {
+cmo_offset = table_data->len - table.table_offset;
+build_append_int_noprefix(table_data, 1, 2);/* Type */
+build_append_int_noprefix(table_data, 10, 2);   /* Length */
+build_append_int_noprefix(table_data, 0x1, 2);  /* Revision */
+build_append_int_noprefix(table_data, 0, 1);/* Reserved */
+
+/* CBOM block size */
+if (cpu->cfg.cbom_blocksize) {
+build_append_int_noprefix(table_data,
+  __builtin_ctz(cpu->cfg.cbom_blocksize),
+  1);
+} else {
+build_append_int_noprefix(table_data, 0, 1);
+}
+
+/* CBOP block size */
+build_append_int_noprefix(table_data, 0, 1);
+
+/* CBOZ block size */
+if (cpu->cfg.cboz_blocksize) {
+build_append_int_noprefix(table_data,
+  __builtin_ctz(cpu->cfg.cboz_blocksize),
+  1);
+} else {
+build_append_int_noprefix(table_data, 0, 1);
+}
+}
+
 /* Hart Info Node */
 for (int i = 0; i < arch_ids->len; i++) {
+len = 16;
+int num_offsets = 1;
 build_append_int_noprefix(table_data, 0xFFFF, 2);  /* Type */
-build_append_int_noprefix(table_data, 16, 2);  /* Length */
-build_append_int_noprefix(table_data, 0x1, 2); /* Revision */
-build_append_int_noprefix(table_data, 1, 2);/* Number of offsets */
-build_append_int_noprefix(table_data, i, 4);/* ACPI Processor UID */
-build_append_int_noprefix(table_data, isa_offset, 4); /* Offsets[0] */
+
+/* Length */
+if (cmo_offset) {
+len += 4;
+num_offsets++;
+}
+
+build_append_int_noprefix(table_data, len, 2);
+build_append_int_noprefix(table_data, 0x1, 2); /* Revision */
+/* Number of offsets */
+build_append_int_noprefix(table_data, num_offsets, 2);
+build_append_int_noprefix(table_data, i, 4);   /* ACPI Processor UID */
+
+/* Offsets */
+build_append_int_noprefix(table_data, isa_offset, 4);
+if (cmo_offset) {
+build_append_int_noprefix(table_data, cmo_offset, 4);
+}
 }
 
 acpi_table_end(linker, &table);
-- 
2.39.2
