[PATCH 4/4] multifd: reset next_packet_size after sending pages

2023-09-21 Thread Elena Ufimtseva
Sometimes multifd sends just a sync packet with no pages
(normal_num is 0). In this case the old value of next_packet_size
is preserved and accounted for, while only packet_len is actually
transferred.
Reset it to 0 after it has been sent and accounted for.

TODO: Fix duplicated packet ids in the stream.
With this patch, there is still an issue with duplicated
packet ids being sent (with a different number of pages/flags).
See the multifd_send trace below (before this change):
multifd_send 394.774 pid=55477 id=0x1 packet_num=0x6f0 normal=0x57 flags=0x1 next_packet_size=0x57000
multifd_send 181.244 pid=55477 id=0x1 packet_num=0x6f0 normal=0x0 flags=0x0 next_packet_size=0x57000

With this commit there are still duplicated packets, but since no pages
are sent with the sync flag set, next_packet_size is 0:
multifd_send 27.814 pid=18602 id=0x1 packet_num=0x574 normal=0x7b flags=0x1 next_packet_size=0x7b000
multifd_send 136054.792 pid=18602 id=0x1 packet_num=0x574 normal=0x0 flags=0x0 next_packet_size=0x0
If there is a suggestion on how to fix this properly, I will be
glad to use it.

Signed-off-by: Elena Ufimtseva 
---
 migration/multifd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/multifd.c b/migration/multifd.c
index 3281397b18..8b4e26051b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -730,6 +730,7 @@ static void *multifd_send_thread(void *opaque)
p->next_packet_size + p->packet_len);
 stat64_add(&mig_stats.transferred,
p->next_packet_size + p->packet_len);
+p->next_packet_size = 0;
 qemu_mutex_lock(&p->mutex);
 p->pending_job--;
 qemu_mutex_unlock(&p->mutex);
-- 
2.34.1




[PATCH 0/4] multifd: various fixes

2023-09-21 Thread Elena Ufimtseva
Hello

While working on and testing various live migration scenarios,
a few issues were found.

These are my first patches in live migration, and I would
appreciate suggestions from the community on whether these
patches could be done differently.

[PATCH 1/4] multifd: wait for channels_ready before sending sync
I am not certain about this change, since it seems that
the sync flag could be part of the packets with pages that are
being sent out currently.
But the traces show this is not always the case:
multifd_send 230.873 pid=55477 id=0x0 packet_num=0x6f4 normal=0x40 flags=0x1 next_packet_size=0x4
multifd_send 14.718 pid=55477 id=0x1 packet_num=0x6f5 normal=0x0 flags=0x1 next_packet_size=0x8
If the sync packet can indeed be a standalone one, then waiting for
channels_ready beforehand seems appropriate, although it wastes an
iteration on a sync-only packet.
[PATCH 4/4] is also related to 1/4, and fixes the over-accounting in
the case of a sync-only packet.


Thank you in advance, and I look forward to your feedback.

Elena

Elena Ufimtseva (4):
  multifd: wait for channels_ready before sending sync
  migration: check for rate_limit_max for RATE_LIMIT_DISABLED
  multifd: fix counters in multifd_send_thread
  multifd: reset next_packet_size after sending pages

 migration/migration-stats.c |  8 
 migration/multifd.c | 11 ++-
 2 files changed, 10 insertions(+), 9 deletions(-)

-- 
2.34.1




[PATCH 3/4] multifd: fix counters in multifd_send_thread

2023-09-21 Thread Elena Ufimtseva
The previous commit cbec7eb76879d419e7dbf531ee2506ec0722e825
"migration/multifd: Compute transferred bytes correctly"
removed accounting for packet_len in the non-RDMA
case, but next_packet_size only accounts for the pages
(normal_pages * PAGE_SIZE), not for the packet header that is
sent as iov[0]. The packet_len part should be added back to
account for the size of MultiFDPacket and the array of offsets.

Signed-off-by: Elena Ufimtseva 
---
 migration/multifd.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index e61e458151..3281397b18 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -714,8 +714,6 @@ static void *multifd_send_thread(void *opaque)
 if (ret != 0) {
 break;
 }
-stat64_add(&mig_stats.multifd_bytes, p->packet_len);
-stat64_add(&mig_stats.transferred, p->packet_len);
 } else {
 /* Send header using the same writev call */
 p->iov[0].iov_len = p->packet_len;
@@ -728,8 +726,10 @@ static void *multifd_send_thread(void *opaque)
 break;
 }
 
-stat64_add(&mig_stats.multifd_bytes, p->next_packet_size);
-stat64_add(&mig_stats.transferred, p->next_packet_size);
+stat64_add(&mig_stats.multifd_bytes,
+   p->next_packet_size + p->packet_len);
+stat64_add(&mig_stats.transferred,
+   p->next_packet_size + p->packet_len);
 qemu_mutex_lock(&p->mutex);
 p->pending_job--;
 qemu_mutex_unlock(&p->mutex);
-- 
2.34.1




[PATCH 2/4] migration: check for rate_limit_max for RATE_LIMIT_DISABLED

2023-09-21 Thread Elena Ufimtseva
Migration rate limiting uses atomic operations to read the
rate-limit variables and the transferred byte count, and these
operations are expensive. Check first whether rate_limit_max
equals RATE_LIMIT_DISABLED, and return false immediately if so.

Signed-off-by: Elena Ufimtseva 
---
 migration/migration-stats.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/migration-stats.c b/migration/migration-stats.c
index 095d6d75bb..abc31483d5 100644
--- a/migration/migration-stats.c
+++ b/migration/migration-stats.c
@@ -24,14 +24,14 @@ bool migration_rate_exceeded(QEMUFile *f)
 return true;
 }
 
-uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
-uint64_t rate_limit_current = migration_transferred_bytes(f);
-uint64_t rate_limit_used = rate_limit_current - rate_limit_start;
 uint64_t rate_limit_max = stat64_get(&mig_stats.rate_limit_max);
-
 if (rate_limit_max == RATE_LIMIT_DISABLED) {
 return false;
 }
+uint64_t rate_limit_start = stat64_get(&mig_stats.rate_limit_start);
+uint64_t rate_limit_current = migration_transferred_bytes(f);
+uint64_t rate_limit_used = rate_limit_current - rate_limit_start;
+
 if (rate_limit_max > 0 && rate_limit_used > rate_limit_max) {
 return true;
 }
-- 
2.34.1




[PATCH 1/4] multifd: wait for channels_ready before sending sync

2023-09-21 Thread Elena Ufimtseva
In multifd_send_sync_main we need to wait for channels_ready
before submitting the sync packet, as the threads may still be
sending their previous pages.
There is also no need to check channels_ready in the loop
before waiting for sem_sync; the next iteration of sending pages
or another sync will start by waiting on the channels_ready
semaphore.
This changes commit 90b3cec351996dd8ef4eb847ad38607812c5e7f5
("multifd: Fix the number of channels ready").

Signed-off-by: Elena Ufimtseva 
---
 migration/multifd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 0f6b203877..e61e458151 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -595,6 +595,7 @@ int multifd_send_sync_main(QEMUFile *f)
 }
 }
 
+qemu_sem_wait(&multifd_send_state->channels_ready);
 /*
  * When using zero-copy, it's necessary to flush the pages before any of
  * the pages can be sent again, so we'll make sure the new version of the
@@ -630,7 +631,6 @@ int multifd_send_sync_main(QEMUFile *f)
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDSendParams *p = &multifd_send_state->params[i];
 
-qemu_sem_wait(&multifd_send_state->channels_ready);
 trace_multifd_send_sync_main_wait(p->id);
 qemu_sem_wait(&p->sem_sync);
 
-- 
2.34.1




[PATCH v2 3/3] tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device

2023-09-21 Thread Tong Ho
Signed-off-by: Tong Ho 
---
 tests/qtest/meson.build |   2 +-
 tests/qtest/xlnx-versal-trng-test.c | 490 
 2 files changed, 491 insertions(+), 1 deletion(-)
 create mode 100644 tests/qtest/xlnx-versal-trng-test.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 1fba07f4ed..215d20e8cf 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -216,7 +216,7 @@ qtests_aarch64 = \
   (config_all.has_key('CONFIG_TCG') and 
config_all_devices.has_key('CONFIG_TPM_TIS_SYSBUS') ?\
 ['tpm-tis-device-test', 'tpm-tis-device-swtpm-test'] : []) +   
  \
   (config_all_devices.has_key('CONFIG_XLNX_ZYNQMP_ARM') ? ['xlnx-can-test', 
'fuzz-xlnx-dp-test'] : []) + \
-  (config_all_devices.has_key('CONFIG_XLNX_VERSAL') ? ['xlnx-canfd-test'] : 
[]) + \
+  (config_all_devices.has_key('CONFIG_XLNX_VERSAL') ? ['xlnx-canfd-test', 
'xlnx-versal-trng-test'] : []) + \
   (config_all_devices.has_key('CONFIG_RASPI') ? ['bcm2835-dma-test'] : []) +  \
   (config_all.has_key('CONFIG_TCG') and
\
config_all_devices.has_key('CONFIG_TPM_TIS_I2C') ? ['tpm-tis-i2c-test'] : 
[]) + \
diff --git a/tests/qtest/xlnx-versal-trng-test.c 
b/tests/qtest/xlnx-versal-trng-test.c
new file mode 100644
index 00..6aff00c7fc
--- /dev/null
+++ b/tests/qtest/xlnx-versal-trng-test.c
@@ -0,0 +1,490 @@
+/*
+ * QTests for the Xilinx Versal True Random Number Generator device
+ *
+ * Copyright (c) 2023 Advanced Micro Devices, Inc.
+ *
+ * SPDX-License-Identifier: MIT
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+
+/* Base Address */
+#define TRNG_BASEADDR  (0xf123)
+
+/* TRNG_INT_CTRL */
+#define R_TRNG_INT_CTRL (0x)
+#define   TRNG_INT_CTRL_CERTF_RST_MASK  (1 << 5)
+#define   TRNG_INT_CTRL_DTF_RST_MASK(1 << 4)
+#define   TRNG_INT_CTRL_DONE_RST_MASK   (1 << 3)
+#define   TRNG_INT_CTRL_CERTF_EN_MASK   (1 << 2)
+#define   TRNG_INT_CTRL_DTF_EN_MASK (1 << 1)
+#define   TRNG_INT_CTRL_DONE_EN_MASK(1)
+
+/* TRNG_STATUS */
+#define R_TRNG_STATUS  (0x0004)
+#define   TRNG_STATUS_QCNT_SHIFT   (9)
+#define   TRNG_STATUS_QCNT_MASK(7 << TRNG_STATUS_QCNT_SHIFT)
+#define   TRNG_STATUS_CERTF_MASK   (1 << 3)
+#define   TRNG_STATUS_DTF_MASK (1 << 1)
+#define   TRNG_STATUS_DONE_MASK(1)
+
+/* TRNG_CTRL */
+#define R_TRNG_CTRL(0x0008)
+#define   TRNG_CTRL_PERSODISABLE_MASK   (1 << 10)
+#define   TRNG_CTRL_SINGLEGENMODE_MASK  (1 << 9)
+#define   TRNG_CTRL_PRNGMODE_MASK   (1 << 7)
+#define   TRNG_CTRL_TSTMODE_MASK(1 << 6)
+#define   TRNG_CTRL_PRNGSTART_MASK  (1 << 5)
+#define   TRNG_CTRL_PRNGXS_MASK (1 << 3)
+#define   TRNG_CTRL_TRSSEN_MASK (1 << 2)
+#define   TRNG_CTRL_QERTUEN_MASK(1 << 1)
+#define   TRNG_CTRL_PRNGSRST_MASK   (1)
+
+/* TRNG_EXT_SEED_0 ... _11 */
+#define R_TRNG_EXT_SEED_0  (0x0040)
+#define R_TRNG_EXT_SEED_11 (R_TRNG_EXT_SEED_0 + 4 * 11)
+
+/* TRNG_PER_STRNG_0 ... 11 */
+#define R_TRNG_PER_STRNG_0 (0x0080)
+#define R_TRNG_PER_STRNG_11(R_TRNG_PER_STRNG_0 + 4 * 11)
+
+/* TRNG_CORE_OUTPUT */
+#define R_TRNG_CORE_OUTPUT (0x00c0)
+
+/* TRNG_RESET */
+#define R_TRNG_RESET   (0x00d0)
+#define   TRNG_RESET_VAL_MASK  (1)
+
+/* TRNG_OSC_EN */
+#define R_TRNG_OSC_EN  (0x00d4)
+#define   TRNG_OSC_EN_VAL_MASK (1)
+
+/* TRNG_TRNG_ISR, _IMR, _IER, _IDR */
+#define R_TRNG_ISR (0x00e0)
+#define R_TRNG_IMR (0x00e4)
+#define R_TRNG_IER (0x00e8)
+#define R_TRNG_IDR (0x00ec)
+#define   TRNG_IRQ_SLVERR_MASK (1 << 1)
+#define   TRNG_IRQ_CORE_INT_MASK   (1)
+
+#define FAILED(FMT, ...) g_error("%s(): " FMT, __func__, ## __VA_ARGS__)
+
+static const uint32_t prng_seed[12] = {
+0x01234567, 0x12345678, 0x23456789, 0x3456789a, 0x456789ab, 0x56789abc,
+0x76543210, 0x87654321, 0x98765432, 0xa9876543, 0xba987654, 0xfedcba98,
+};
+
+static const uint32_t pers_str[12] = {
+0x76543210, 0x87654321, 0x98765432, 0xa9876543, 0xba987654, 0xfedcba98,
+0x01234567, 0x12345678, 0x23456789, 0x3456789a, 0x456789ab, 0x56789abc,
+};
+
+static void trng_test_start(void)
+{
+qtest_start("-machine xlnx-versal-virt");
+}
+
+static void trng_test_stop(void)
+{
+qtest_end();
+}
+
+static void trng_test_set_uint_prop(const char *name, uint64_t value)
+{
+const char *path = "/machine/xlnx-versal/trng";
+QDict *response;
+
+response = qmp("{ 'execute': 'qom-set',"
+" 'arguments': {"
+   " 'path': %s,"
+   " 'property': %s,"
+   " 'value': %llu"
+  "} }", path,
+   name, (unsigned long long)value);
+g_assert(qdict_haskey(response, "return"));
+qobject_unref(response);
+}
+
+static void trng_write(unsigned ra, uint32_t va

[PATCH v2 2/3] hw/arm: xlnx-versal-virt: Add AMD/Xilinx TRNG device

2023-09-21 Thread Tong Ho
Connect support for the Versal True Random Number Generator
(TRNG) device.

Warning: unlike the TRNG component in a real device from the
Versal device family, the connected TRNG model is not of
cryptographic grade and is not intended for use cases where a
cryptographically strong TRNG is needed.

Signed-off-by: Tong Ho 
---
 hw/arm/Kconfig   |  1 +
 hw/arm/xlnx-versal-virt.c| 20 
 hw/arm/xlnx-versal.c | 16 
 include/hw/arm/xlnx-versal.h |  5 +
 4 files changed, 42 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 7e68348440..0a3ff6748d 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -482,6 +482,7 @@ config XLNX_VERSAL
 select XLNX_BBRAM
 select XLNX_EFUSE_VERSAL
 select XLNX_USB_SUBSYS
+select XLNX_VERSAL_TRNG
 
 config NPCM7XX
 bool
diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
index 88c561ff63..d99255ee89 100644
--- a/hw/arm/xlnx-versal-virt.c
+++ b/hw/arm/xlnx-versal-virt.c
@@ -391,6 +391,25 @@ static void fdt_add_rtc_node(VersalVirt *s)
 g_free(name);
 }
 
+static void fdt_add_trng_node(VersalVirt *s)
+{
+const char compat[] = TYPE_XLNX_VERSAL_TRNG;
+const char interrupt_names[] = "trng";
+g_autofree char *name = g_strdup_printf("/trng@%x", MM_PMC_TRNG);
+
+qemu_fdt_add_subnode(s->fdt, name);
+
+qemu_fdt_setprop_cells(s->fdt, name, "interrupts",
+   GIC_FDT_IRQ_TYPE_SPI, VERSAL_TRNG_IRQ,
+   GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+qemu_fdt_setprop(s->fdt, name, "interrupt-names",
+ interrupt_names, sizeof(interrupt_names));
+qemu_fdt_setprop_sized_cells(s->fdt, name, "reg",
+ 2, MM_PMC_TRNG,
+ 2, MM_PMC_TRNG_SIZE);
+qemu_fdt_setprop(s->fdt, name, "compatible", compat, sizeof(compat));
+}
+
 static void fdt_add_bbram_node(VersalVirt *s)
 {
 const char compat[] = TYPE_XLNX_BBRAM;
@@ -690,6 +709,7 @@ static void versal_virt_init(MachineState *machine)
 fdt_add_usb_xhci_nodes(s);
 fdt_add_sd_nodes(s);
 fdt_add_rtc_node(s);
+fdt_add_trng_node(s);
 fdt_add_bbram_node(s);
 fdt_add_efuse_ctrl_node(s);
 fdt_add_efuse_cache_node(s);
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
index fa556d8764..4f74a64a0d 100644
--- a/hw/arm/xlnx-versal.c
+++ b/hw/arm/xlnx-versal.c
@@ -373,6 +373,21 @@ static void versal_create_rtc(Versal *s, qemu_irq *pic)
qdev_get_gpio_in(DEVICE(&s->pmc.apb_irq_orgate), 0));
 }
 
+static void versal_create_trng(Versal *s, qemu_irq *pic)
+{
+SysBusDevice *sbd;
+MemoryRegion *mr;
+
+object_initialize_child(OBJECT(s), "trng", &s->pmc.trng,
+TYPE_XLNX_VERSAL_TRNG);
+sbd = SYS_BUS_DEVICE(&s->pmc.trng);
+sysbus_realize(sbd, &error_fatal);
+
+mr = sysbus_mmio_get_region(sbd, 0);
+memory_region_add_subregion(&s->mr_ps, MM_PMC_TRNG, mr);
+sysbus_connect_irq(sbd, 0, pic[VERSAL_TRNG_IRQ]);
+}
+
 static void versal_create_xrams(Versal *s, qemu_irq *pic)
 {
 int nr_xrams = ARRAY_SIZE(s->lpd.xram.ctrl);
@@ -909,6 +924,7 @@ static void versal_realize(DeviceState *dev, Error **errp)
 versal_create_sds(s, pic);
 versal_create_pmc_apb_irq_orgate(s, pic);
 versal_create_rtc(s, pic);
+versal_create_trng(s, pic);
 versal_create_xrams(s, pic);
 versal_create_bbram(s, pic);
 versal_create_efuse(s, pic);
diff --git a/include/hw/arm/xlnx-versal.h b/include/hw/arm/xlnx-versal.h
index 7b419f88c2..54f4b98d9d 100644
--- a/include/hw/arm/xlnx-versal.h
+++ b/include/hw/arm/xlnx-versal.h
@@ -31,6 +31,7 @@
 #include "hw/dma/xlnx_csu_dma.h"
 #include "hw/misc/xlnx-versal-crl.h"
 #include "hw/misc/xlnx-versal-pmc-iou-slcr.h"
+#include "hw/misc/xlnx-versal-trng.h"
 #include "hw/net/xlnx-versal-canfd.h"
 #include "hw/misc/xlnx-versal-cfu.h"
 #include "hw/misc/xlnx-versal-cframe-reg.h"
@@ -116,6 +117,7 @@ struct Versal {
 } iou;
 
 XlnxZynqMPRTC rtc;
+XlnxVersalTRng trng;
 XlnxBBRam bbram;
 XlnxEFuse efuse;
 XlnxVersalEFuseCtrl efuse_ctrl;
@@ -160,6 +162,7 @@ struct Versal {
 #define VERSAL_OSPI_IRQ124
 #define VERSAL_SD0_IRQ_0   126
 #define VERSAL_EFUSE_IRQ   139
+#define VERSAL_TRNG_IRQ141
 #define VERSAL_RTC_ALARM_IRQ   142
 #define VERSAL_RTC_SECONDS_IRQ 143
 
@@ -329,4 +332,6 @@ struct Versal {
 #define MM_PMC_CRP_SIZE 0x1
 #define MM_PMC_RTC  0xf12a
 #define MM_PMC_RTC_SIZE 0x1
+#define MM_PMC_TRNG 0xf123
+#define MM_PMC_TRNG_SIZE0x1
 #endif
-- 
2.25.1




[PATCH v2 0/3] AMD/Xilinx Versal TRNG support

2023-09-21 Thread Tong Ho
This series adds support for the True Random Number Generator
(TRNG) in the AMD/Xilinx Versal family of devices.

The series starts by introducing a non-cryptographic-grade model
of the TRNG controller in the Versal family of devices, followed
by instantiating the model in the Xilinx Versal machine.

The series ends with a qtest that sanity-checks the TRNG model
in the Xilinx Versal machine.

V1 => V2
1) Change patch #1 only
2) Use g_rand_*() PRNG from glib to replace V1's custom PRNG.
3) Implement ResettableClass for device-reset.
4) Add device-mode description to commit-message.

Best regards,
Tong Ho

Tong Ho (3):
  hw/misc: Introduce AMD/Xilinx Versal TRNG device
  hw/arm: xlnx-versal-virt: Add AMD/Xilinx TRNG device
  tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device

 hw/arm/Kconfig  |   1 +
 hw/arm/xlnx-versal-virt.c   |  20 +
 hw/arm/xlnx-versal.c|  16 +
 hw/misc/Kconfig |   3 +
 hw/misc/meson.build |   3 +
 hw/misc/xlnx-versal-trng.c  | 734 
 include/hw/arm/xlnx-versal.h|   5 +
 include/hw/misc/xlnx-versal-trng.h  |  57 +++
 tests/qtest/meson.build |   2 +-
 tests/qtest/xlnx-versal-trng-test.c | 490 +++
 10 files changed, 1330 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/xlnx-versal-trng.c
 create mode 100644 include/hw/misc/xlnx-versal-trng.h
 create mode 100644 tests/qtest/xlnx-versal-trng-test.c

-- 
2.25.1




[PATCH v2 1/3] hw/misc: Introduce AMD/Xilinx Versal TRNG device

2023-09-21 Thread Tong Ho
This adds a non-cryptographic-grade implementation of the
model for the True Random Number Generator (TRNG) component
in the AMD/Xilinx Versal device family.

It implements all 3 modes defined by the actual hardware
specs, all of which are selectable by guest software at will
at any time:
1) PRNG mode, in which the generated sequence is required to
   be reproducible after being reseeded by the same 384-bit
   value supplied by guest software.
2) Test mode, in which the generated sequence is required to
   be reproducible after being reseeded by the same 128-bit
   test seed supplied by guest software.
3) TRNG mode, in which a non-reproducible sequence is generated
   based on periodic reseeding by a suitable entropy source.

This model is only intended for non-real-world testing of
guest software, where a cryptographically strong PRNG or TRNG
is not needed.

The model supports versions 1 & 2 of the device, defaulting
to version 2; the 'hw-version' uint32 property can be set
to 0x0100 to override the default.

Other implemented properties:
- 'forced-prng', uint64
  When set to non-zero, mode 3's entropy source is implemented
  as a deterministic sequence based on the given value and other
  deterministic parameters.
  This option allows the emulation to test guest software using
  mode 3 and to reproduce data-dependent defects.

- 'fips-fault-events', uint32, bit-mask
  bit 3: Triggers the SP800-90B entropy health test fault irq
  bit 1: Triggers the FIPS 140-2 continuous test fault irq
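As exercised by the qtest in patch 3/3, these properties can be driven
through QOM; a QMP invocation mirroring the test's helper (the QOM path
is the one used by the xlnx-versal-virt machine; the value shown is
illustrative) looks like:

```json
{ "execute": "qom-set",
  "arguments": {
      "path": "/machine/xlnx-versal/trng",
      "property": "forced-prng",
      "value": 12345
  } }
```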

Signed-off-by: Tong Ho 
---
 hw/misc/Kconfig|   3 +
 hw/misc/meson.build|   3 +
 hw/misc/xlnx-versal-trng.c | 734 +
 include/hw/misc/xlnx-versal-trng.h |  57 +++
 4 files changed, 797 insertions(+)
 create mode 100644 hw/misc/xlnx-versal-trng.c
 create mode 100644 include/hw/misc/xlnx-versal-trng.h

diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index 6996d265e4..6b6105dcbf 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -186,4 +186,7 @@ config AXP2XX_PMU
 bool
 depends on I2C
 
+config XLNX_VERSAL_TRNG
+bool
+
 source macio/Kconfig
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 88ecab8392..8507ec9e86 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -102,6 +102,9 @@ system_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files(
   'xlnx-cfi-if.c',
   'xlnx-versal-cframe-reg.c',
 ))
+system_ss.add(when: 'CONFIG_XLNX_VERSAL_TRNG', if_true: files(
+  'xlnx-versal-trng.c',
+))
 system_ss.add(when: 'CONFIG_STM32F2XX_SYSCFG', if_true: 
files('stm32f2xx_syscfg.c'))
 system_ss.add(when: 'CONFIG_STM32F4XX_SYSCFG', if_true: 
files('stm32f4xx_syscfg.c'))
 system_ss.add(when: 'CONFIG_STM32F4XX_EXTI', if_true: 
files('stm32f4xx_exti.c'))
diff --git a/hw/misc/xlnx-versal-trng.c b/hw/misc/xlnx-versal-trng.c
new file mode 100644
index 00..6f52b0d636
--- /dev/null
+++ b/hw/misc/xlnx-versal-trng.c
@@ -0,0 +1,734 @@
+/*
+ * Non-crypto strength model of the True Random Number Generator
+ * in the AMD/Xilinx Versal device family.
+ *
+ * Copyright (c) 2017-2020 Xilinx Inc.
+ * Copyright (c) 2023 Advanced Micro Devices, Inc.
+ *
+ * Written by Edgar E. Iglesias 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+#include "hw/misc/xlnx-versal-trng.h"
+
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qemu/timer.h"
+#include "qapi/visitor.h"
+#include "migration/vmstate.h"
+#include "hw/qdev-properties.h"
+
+#ifndef XLNX_VERSAL_TRNG_ERR_DEBUG
+#define XLNX_VERSAL_TRNG_ERR_DEBUG 0
+#endif
+
+REG32(INT_CTRL, 0x0)
+FIELD(INT_CTRL, CERTF_RST, 5, 1)
+FIELD(INT_CTRL, DTF_RST, 4, 1)
+FIELD(INT_CTRL, DONE_RST, 3, 1)
+FIELD(INT_CTRL, CERTF_EN, 2, 1)
+FIELD(INT_CTRL, DTF_EN, 1, 1)
+FIELD(INT_CTRL, DONE_EN, 0, 1)
+REG32(STATUS, 0x4)
+FIELD(STATUS, QCNT, 9, 3)
+FIELD(STATUS, EAT, 4, 5)
+FIELD(STATUS, CERTF, 3, 1)
+FIELD(S

Re: [PATCH 2/2] seabios: remove PCI drivers from bios.bin

2023-09-21 Thread Thomas Huth

On 21/09/2023 14.10, Paolo Bonzini wrote:

bios.bin is now used only by ISA PC, so PCI drivers are not necessary.

Signed-off-by: Paolo Bonzini 
---
  pc-bios/bios.bin | Bin 131072 -> 131072 bytes
  roms/config.seabios-128k |  30 ++
  2 files changed, 22 insertions(+), 8 deletions(-)

...

diff --git a/roms/config.seabios-128k b/roms/config.seabios-128k
index d18c802c46e..06f4ba35bbe 100644
--- a/roms/config.seabios-128k
+++ b/roms/config.seabios-128k
@@ -1,21 +1,35 @@
-# for qemu machine types 1.7 + older
-# need to turn off features (xhci,uas) to make it fit into 128k
+# SeaBIOS Configuration for -M isapc
+
+#
+# General Features
+#
  CONFIG_QEMU=y
  CONFIG_ROM_SIZE=128
  CONFIG_ATA_DMA=n
  CONFIG_BOOTSPLASH=n
  CONFIG_XEN=n
-CONFIG_USB_OHCI=n
-CONFIG_USB_XHCI=n
-CONFIG_USB_UAS=n
+CONFIG_ATA_PIO32=n
+CONFIG_AHCI=n
  CONFIG_SDCARD=n
  CONFIG_TCGBIOS=n
-CONFIG_MPT_SCSI=n
-CONFIG_ESP_SCSI=n
-CONFIG_MEGASAS=n
+CONFIG_VIRTIO_BLK=n
+CONFIG_VIRTIO_SCSI=n
  CONFIG_PVSCSI=n
+CONFIG_ESP_SCSI=n
+CONFIG_LSI_SCSI=n
+CONFIG_MEGASAS=n
+CONFIG_MPT_SCSI=n


Why did you change the order of MPT, ESP and MEGASAS?

Apart from that, wrt to the config file changes:
Reviewed-by: Thomas Huth 


  CONFIG_NVME=n
  CONFIG_USE_SMM=n
  CONFIG_VGAHOOKS=n
  CONFIG_HOST_BIOS_GEOMETRY=n
+CONFIG_PS2PORT=n
+CONFIG_USB=n
+CONFIG_PMTIMER=n
+CONFIG_PCIBIOS=n
+CONFIG_DISABLE_A20=n
+CONFIG_WRITABLE_UPPERMEMORY=n
+CONFIG_ACPI=n
  CONFIG_ACPI_PARSE=n
+CONFIG_DEBUG_SERIAL=n
+CONFIG_DEBUG_SERIAL_MMIO=n





Re: [PATCH 1/2] pc_piix: remove pc-i440fx-1.4 up to pc-i440fx-1.7

2023-09-21 Thread Thomas Huth

On 21/09/2023 14.10, Paolo Bonzini wrote:

These are the last users of the 128K SeaBIOS blob in the i440FX family.
Removing them allows us to drop PCI support from the 128K blob,
thus making it easier to update SeaBIOS to newer versions.

Signed-off-by: Paolo Bonzini 
---
  docs/about/deprecated.rst   |  8 
  docs/about/removed-features.rst |  2 +-
  hw/i386/pc.c| 54 -
  hw/i386/pc_piix.c   | 73 -
  tests/qtest/test-x86-cpuid-compat.c | 10 +---
  5 files changed, 2 insertions(+), 145 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 694a165a54a..d59bcf36230 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -261,14 +261,6 @@ deprecated; use the new name ``dtb-randomness`` instead. 
The new name
  better reflects the way this property affects all random data within
  the device tree blob, not just the ``kaslr-seed`` node.
  
-``pc-i440fx-1.4`` up to ``pc-i440fx-1.7`` (since 7.0)

-'
-
-These old machine types are quite neglected nowadays and thus might have
-various pitfalls with regards to live migration. Use a newer machine type
-instead.


While you're at it ... do we maybe want to start deprecating the next batch 
of machine types already? (Say pc-i440fx-2.0 up to pc-i440fx-2.2 maybe?)



-
  Backend options
  ---
  
diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst

index 39468b6e926..56e078ad126 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -730,7 +730,7 @@ mips ``fulong2e`` machine alias (removed in 6.0)
  
  This machine has been renamed ``fuloong2e``.
  
-``pc-0.10`` up to ``pc-1.3`` (removed in 4.0 up to 6.0)

+``pc-0.10`` up to ``pc-1.7`` (removed in 4.0 up to 8.2)
  '''


The names started to change with version 1.4, so it's "pc-i440fx-1.7" and 
not "pc-1.7".



  These machine types were very old and likely could not be used for live
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 54838c0c411..1c7898a2d34 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -359,60 +359,6 @@ GlobalProperty pc_compat_2_0[] = {
  };
  const size_t pc_compat_2_0_len = G_N_ELEMENTS(pc_compat_2_0);
  
-GlobalProperty pc_compat_1_7[] = {

-PC_CPU_MODEL_IDS("1.7.0")
-{ TYPE_USB_DEVICE, "msos-desc", "no" },
-{ "PIIX4_PM", ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, "off" },
-{ "hpet", HPET_INTCAP, "4" },
-};
-const size_t pc_compat_1_7_len = G_N_ELEMENTS(pc_compat_1_7);
-
-GlobalProperty pc_compat_1_6[] = {
-PC_CPU_MODEL_IDS("1.6.0")
-{ "e1000", "mitigation", "off" },
-{ "qemu64-" TYPE_X86_CPU, "model", "2" },
-{ "qemu32-" TYPE_X86_CPU, "model", "3" },
-{ "i440FX-pcihost", "short_root_bus", "1" },
-{ "q35-pcihost", "short_root_bus", "1" },
-};
-const size_t pc_compat_1_6_len = G_N_ELEMENTS(pc_compat_1_6);
-
-GlobalProperty pc_compat_1_5[] = {
-PC_CPU_MODEL_IDS("1.5.0")
-{ "Conroe-" TYPE_X86_CPU, "model", "2" },
-{ "Conroe-" TYPE_X86_CPU, "min-level", "2" },
-{ "Penryn-" TYPE_X86_CPU, "model", "2" },
-{ "Penryn-" TYPE_X86_CPU, "min-level", "2" },
-{ "Nehalem-" TYPE_X86_CPU, "model", "2" },
-{ "Nehalem-" TYPE_X86_CPU, "min-level", "2" },
-{ "virtio-net-pci", "any_layout", "off" },
-{ TYPE_X86_CPU, "pmu", "on" },
-{ "i440FX-pcihost", "short_root_bus", "0" },
-{ "q35-pcihost", "short_root_bus", "0" },
-};
-const size_t pc_compat_1_5_len = G_N_ELEMENTS(pc_compat_1_5);
-
-GlobalProperty pc_compat_1_4[] = {
-PC_CPU_MODEL_IDS("1.4.0")
-{ "scsi-hd", "discard_granularity", "0" },
-{ "scsi-cd", "discard_granularity", "0" },
-{ "ide-hd", "discard_granularity", "0" },
-{ "ide-cd", "discard_granularity", "0" },
-{ "virtio-blk-pci", "discard_granularity", "0" },
-/* DEV_NVECTORS_UNSPECIFIED as a uint32_t string: */
-{ "virtio-serial-pci", "vectors", "0x" },
-{ "virtio-net-pci", "ctrl_guest_offloads", "off" },
-{ "e1000", "romfile", "pxe-e1000.rom" },
-{ "ne2k_pci", "romfile", "pxe-ne2k_pci.rom" },
-{ "pcnet", "romfile", "pxe-pcnet.rom" },
-{ "rtl8139", "romfile", "pxe-rtl8139.rom" },
-{ "virtio-net-pci", "romfile", "pxe-virtio.rom" },
-{ "486-" TYPE_X86_CPU, "model", "0" },
-{ "n270" "-" TYPE_X86_CPU, "movbe", "off" },
-{ "Westmere" "-" TYPE_X86_CPU, "pclmulqdq", "off" },
-};
-const size_t pc_compat_1_4_len = G_N_ELEMENTS(pc_compat_1_4);


It might be worth having a closer look at the above settings in the various 
devices - maybe we can get rid of some compatibility handling code in the 
devices now, in case the properties are not set by other targets as well.


 Thomas




Re: [PATCH v23 01/20] CPU topology: extend with s390 specifics

2023-09-21 Thread Markus Armbruster
Nina Schoetterl-Glausch  writes:

> On Wed, 2023-09-20 at 12:57 +0200, Markus Armbruster wrote:
>> Nina Schoetterl-Glausch  writes:
>> 
>> > On Tue, 2023-09-19 at 14:47 +0200, Markus Armbruster wrote:
>> > > Nina Schoetterl-Glausch  writes:
>> > > 
>> > > > From: Pierre Morel 
>> > > > 
>> > > > S390 adds two new SMP levels, drawers and books to the CPU
>> > > > topology.
>> > > > S390 CPUs have specific topology features like dedication and
>> > > > entitlement. These indicate to the guest information on host
>> > > > vCPU scheduling and help the guest make better scheduling decisions.
>> > > > 
>> > > > Let us provide the SMP properties with books and drawers levels
>> > > > and S390 CPU with dedication and entitlement,
>> > > > 
>> > > > Signed-off-by: Pierre Morel 
>> > > > Reviewed-by: Nina Schoetterl-Glausch 
>> > > > Co-developed-by: Nina Schoetterl-Glausch 
>> > > > Signed-off-by: Nina Schoetterl-Glausch 
>> > > > ---
>> > > >  qapi/machine-common.json| 21 +
>> > > >  qapi/machine.json   | 19 ++--
>> > > >  include/hw/boards.h | 10 +-
>> > > >  include/hw/qdev-properties-system.h |  4 +++
>> > > >  target/s390x/cpu.h  |  6 
>> > > >  hw/core/machine-smp.c   | 48 -
>> > > >  hw/core/machine.c   |  4 +++
>> > > >  hw/core/qdev-properties-system.c| 13 
>> > > >  hw/s390x/s390-virtio-ccw.c  |  4 +++
>> > > >  softmmu/vl.c|  6 
>> > > >  target/s390x/cpu.c  |  7 +
>> > > >  qapi/meson.build|  1 +
>> > > >  qemu-options.hx |  7 +++--
>> > > >  13 files changed, 137 insertions(+), 13 deletions(-)
>> > > >  create mode 100644 qapi/machine-common.json
>> > > > 
>> > > > diff --git a/qapi/machine-common.json b/qapi/machine-common.json
>> > > > new file mode 100644
>> > > > index 00..e40421bb37
>> > > > --- /dev/null
>> > > > +++ b/qapi/machine-common.json
>> > > 
>> > > Why do you need a separate QAPI sub-module?
>> > 
>> > See here 
>> > https://lore.kernel.org/qemu-devel/d8da6f7d1e3addcb63614f548ed77ac1b8895e63.ca...@linux.ibm.com/
>> 
>> Quote:
>> 
>> CpuS390Entitlement would be useful in both machine.json and 
>> machine-target.json
>> 
>> This is not obvious from this patch.  I figure this patch could add it
>> to machine.json just fine.  The use in machine-target.json in appears
>> only in PATCH 08.
>
> Want me to add the rationale to the commit message?

Would work for me.

If the target-specific stuff in machine.json (discussed below) bothers
us, we can clean up on top.

>> because query-cpu-fast is defined in machine.json and set-cpu-topology 
>> is defined
>> in machine-target.json.
>> 
>> So then the question is where best to define CpuS390Entitlement.
>> In machine.json and include machine.json in machine-target.json?
>> Or define it in another file and include it from both?
>> 
>> You do the latter in this patch.
>> 
>> I figure the former would be tolerable, too.
>> 
>> That said, having target-specific stuff in machine.json feels... odd.
>> Before this series, we have CpuInfoS390 and CpuS390State there, for
>> query-cpus-fast.  That command returns a list of objects where common
>> members are target-independent, and the variant members are
>> target-dependent.  qmp_query_cpus_fast() uses a CPU method to populate
>> the target-dependent members.
>> 
>> I'm not sure splitting query-cpus-fast into a target-dependent and a
>> target-independent part is worth the bother.
>> 
>> In this patch, you work with the structure you found.  Can't fault you
>> for that :)
>> 
>> > > > @@ -0,0 +1,21 @@
>> > > > +# -*- Mode: Python -*-
>> > > > +# vim: filetype=python
>> > > > +#
>> > > > +# This work is licensed under the terms of the GNU GPL, version 2 or later.
>> > > > +# See the COPYING file in the top-level directory.
>> > > > +
>> > > > +##
>> > > > +# = Machines S390 data types
>> > > > +##
>> > > > +
>> > > > +##
>> > > > +# @CpuS390Entitlement:
>> > > > +#
>> > > > +# An enumeration of cpu entitlements that can be assumed by a virtual
>> > > > +# S390 CPU
>> > > > +#
>> > > > +# Since: 8.2
>> > > > +##
>> > > > +{ 'enum': 'CpuS390Entitlement',
>> > > > +  'prefix': 'S390_CPU_ENTITLEMENT',
>> > > > +  'data': [ 'auto', 'low', 'medium', 'high' ] }
>> > > > diff --git a/qapi/machine.json b/qapi/machine.json
>> > > > index a08b6576ca..a63cb951d2 100644
>> > > > --- a/qapi/machine.json
>> > > > +++ b/qapi/machine.json
>> > > > @@ -9,6 +9,7 @@
>> > > >  ##
>> > > >  # = Machines
>> > > >  ##
>> > > >  
>> > > >  { 'include': 'common.json' }
>> > > > +{ 'include': 'machine-common.json' }
>> > > 
>> > > Section structure is borked :)
>> > > 
>> > > Existing section "Machines" now ends at the new "Machines S390 data
>> > > types" you pull in here.  The contents below move from "Machines" to
>> > > "Machines S390 data types".

Re: [PATCH v3 11/19] target/riscv: introduce KVM AccelCPUClass

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:23 PM Daniel Henrique Barboza
 wrote:
>
> Add a KVM accelerator class like we did with TCG. The difference is
> that, at least for now, we won't be using a realize() implementation for
> this accelerator.
>
> We'll start by assigning kvm_riscv_cpu_add_kvm_properties(), renamed to
> kvm_cpu_instance_init(), as a 'cpu_instance_init' implementation. Change
> riscv_cpu_post_init() to invoke accel_cpu_instance_init(), which will go
> through the 'cpu_instance_init' impl of the current acceleration (if
> available) and execute it. The end result is that the KVM initial setup,
> i.e. starting registers and adding its specific properties, will be done
> via this hook.
>
> Add a 'tcg_enabled()' condition in riscv_cpu_post_init() to avoid
> calling riscv_cpu_add_user_properties() when running KVM. We'll remove
> this condition when the TCG accel class get its own 'cpu_instance_init'
> implementation.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c   |  8 +++-
>  target/riscv/kvm.c   | 26 --
>  target/riscv/kvm_riscv.h |  1 -
>  3 files changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 50be127f36..c8a19be1af 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1219,7 +1219,9 @@ static bool riscv_cpu_has_user_properties(Object 
> *cpu_obj)
>
>  static void riscv_cpu_post_init(Object *obj)
>  {
> -if (riscv_cpu_has_user_properties(obj)) {
> +accel_cpu_instance_init(CPU(obj));
> +
> +if (tcg_enabled() && riscv_cpu_has_user_properties(obj)) {
>  riscv_cpu_add_user_properties(obj);
>  }
>
> @@ -1589,10 +1591,6 @@ static void riscv_cpu_add_multiext_prop_array(Object 
> *obj,
>  static void riscv_cpu_add_user_properties(Object *obj)
>  {
>  #ifndef CONFIG_USER_ONLY
> -if (kvm_enabled()) {
> -kvm_riscv_cpu_add_kvm_properties(obj);
> -return;
> -}
>  riscv_add_satp_mode_properties(obj);
>  #endif
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index e5e957121f..606fdab223 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -31,6 +31,7 @@
>  #include "sysemu/kvm_int.h"
>  #include "cpu.h"
>  #include "trace.h"
> +#include "hw/core/accel-cpu.h"
>  #include "hw/pci/pci.h"
>  #include "exec/memattrs.h"
>  #include "exec/address-spaces.h"
> @@ -1318,8 +1319,9 @@ void kvm_riscv_aia_create(MachineState *machine, 
> uint64_t group_shift,
>  kvm_msi_via_irqfd_allowed = kvm_irqfds_enabled();
>  }
>
> -void kvm_riscv_cpu_add_kvm_properties(Object *obj)
> +static void kvm_cpu_instance_init(CPUState *cs)
>  {
> +Object *obj = OBJECT(RISCV_CPU(cs));
>  DeviceState *dev = DEVICE(obj);
>
>  riscv_init_user_properties(obj);
> @@ -1331,7 +1333,7 @@ void kvm_riscv_cpu_add_kvm_properties(Object *obj)
>  riscv_cpu_add_kvm_unavail_prop_array(obj, riscv_cpu_experimental_exts);
>
>  for (Property *prop = riscv_cpu_options; prop && prop->name; prop++) {
> -/* Check if KVM created the property already */
> +/* Check if we have a specific KVM handler for the option */
>  if (object_property_find(obj, prop->name)) {
>  continue;
>  }
> @@ -1339,6 +1341,26 @@ void kvm_riscv_cpu_add_kvm_properties(Object *obj)
>  }
>  }
>
> +static void kvm_cpu_accel_class_init(ObjectClass *oc, void *data)
> +{
> +AccelCPUClass *acc = ACCEL_CPU_CLASS(oc);
> +
> +acc->cpu_instance_init = kvm_cpu_instance_init;
> +}
> +
> +static const TypeInfo kvm_cpu_accel_type_info = {
> +.name = ACCEL_CPU_NAME("kvm"),
> +
> +.parent = TYPE_ACCEL_CPU,
> +.class_init = kvm_cpu_accel_class_init,
> +.abstract = true,
> +};
> +static void kvm_cpu_accel_register_types(void)
> +{
> +type_register_static(&kvm_cpu_accel_type_info);
> +}
> +type_init(kvm_cpu_accel_register_types);
> +
>  static void riscv_host_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> diff --git a/target/riscv/kvm_riscv.h b/target/riscv/kvm_riscv.h
> index da9630c4af..8329cfab82 100644
> --- a/target/riscv/kvm_riscv.h
> +++ b/target/riscv/kvm_riscv.h
> @@ -19,7 +19,6 @@
>  #ifndef QEMU_KVM_RISCV_H
>  #define QEMU_KVM_RISCV_H
>
> -void kvm_riscv_cpu_add_kvm_properties(Object *obj);
>  void kvm_riscv_reset_vcpu(RISCVCPU *cpu);
>  void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level);
>  void kvm_riscv_aia_create(MachineState *machine, uint64_t group_shift,
> --
> 2.41.0
>
>



Re: [PATCH v3 10/19] target/riscv: remove kvm-stub.c

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:22 PM Daniel Henrique Barboza
 wrote:
>
> This file is not needed for some time now. Both kvm_riscv_reset_vcpu()
> and kvm_riscv_set_irq() have public declarations in kvm_riscv.h and are
> wrapped in 'if kvm_enabled()' blocks that the compiler will rip it out
> in non-KVM builds.
>
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/kvm-stub.c  | 30 --
>  target/riscv/meson.build |  2 +-
>  2 files changed, 1 insertion(+), 31 deletions(-)
>  delete mode 100644 target/riscv/kvm-stub.c
>
> diff --git a/target/riscv/kvm-stub.c b/target/riscv/kvm-stub.c
> deleted file mode 100644
> index 4e8fc31a21..00
> --- a/target/riscv/kvm-stub.c
> +++ /dev/null
> @@ -1,30 +0,0 @@
> -/*
> - * QEMU KVM RISC-V specific function stubs
> - *
> - * Copyright (c) 2020 Huawei Technologies Co., Ltd
> - *
> - * This program is free software; you can redistribute it and/or modify it
> - * under the terms and conditions of the GNU General Public License,
> - * version 2 or later, as published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope it will be useful, but WITHOUT
> - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> - * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> - * more details.
> - *
> - * You should have received a copy of the GNU General Public License along with
> - * this program.  If not, see <http://www.gnu.org/licenses/>.
> - */
> -#include "qemu/osdep.h"
> -#include "cpu.h"
> -#include "kvm_riscv.h"
> -
> -void kvm_riscv_reset_vcpu(RISCVCPU *cpu)
> -{
> -abort();
> -}
> -
> -void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level)
> -{
> -abort();
> -}
> diff --git a/target/riscv/meson.build b/target/riscv/meson.build
> index f0486183fa..3323b78b84 100644
> --- a/target/riscv/meson.build
> +++ b/target/riscv/meson.build
> @@ -24,7 +24,7 @@ riscv_ss.add(files(
>'zce_helper.c',
>'vcrypto_helper.c'
>  ))
> -riscv_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'), if_false: 
> files('kvm-stub.c'))
> +riscv_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'))
>
>  riscv_system_ss = ss.source_set()
>  riscv_system_ss.add(files(
> --
> 2.41.0
>
>



Re: [PATCH v3 09/19] target/riscv: make riscv_add_satp_mode_properties() public

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:24 PM Daniel Henrique Barboza
 wrote:
>
> This function is used for both accelerators. Make it public, and call it
> from kvm_riscv_cpu_add_kvm_properties(). This will make it easier to
> split KVM specific code for the KVM accelerator class in the next patch.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 5 ++---
>  target/riscv/cpu.h | 1 +
>  target/riscv/kvm.c | 1 +
>  3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 0dc9b3201d..50be127f36 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1115,7 +1115,7 @@ static void cpu_riscv_set_satp(Object *obj, Visitor *v, 
> const char *name,
>  satp_map->init |= 1 << satp;
>  }
>
> -static void riscv_add_satp_mode_properties(Object *obj)
> +void riscv_add_satp_mode_properties(Object *obj)
>  {
>  RISCVCPU *cpu = RISCV_CPU(obj);
>
> @@ -1589,12 +1589,11 @@ static void riscv_cpu_add_multiext_prop_array(Object 
> *obj,
>  static void riscv_cpu_add_user_properties(Object *obj)
>  {
>  #ifndef CONFIG_USER_ONLY
> -riscv_add_satp_mode_properties(obj);
> -
>  if (kvm_enabled()) {
>  kvm_riscv_cpu_add_kvm_properties(obj);
>  return;
>  }
> +riscv_add_satp_mode_properties(obj);
>  #endif
>
>  riscv_cpu_add_misa_properties(obj);
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 9dc4113812..cb13464ba6 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -726,6 +726,7 @@ extern const RISCVCPUMultiExtConfig 
> riscv_cpu_experimental_exts[];
>  extern Property riscv_cpu_options[];
>
>  void riscv_cpu_add_misa_properties(Object *cpu_obj);
> +void riscv_add_satp_mode_properties(Object *obj);
>
>  /* CSR function table */
>  extern riscv_csr_operations csr_ops[CSR_TABLE_SIZE];
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index e682a70311..e5e957121f 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -1323,6 +1323,7 @@ void kvm_riscv_cpu_add_kvm_properties(Object *obj)
>  DeviceState *dev = DEVICE(obj);
>
>  riscv_init_user_properties(obj);
> +riscv_add_satp_mode_properties(obj);
>  riscv_cpu_add_misa_properties(obj);
>
>  riscv_cpu_add_kvm_unavail_prop_array(obj, riscv_cpu_extensions);
> --
> 2.41.0
>
>



Re: [PATCH v6 2/2] tpm: add backend for mssim

2023-09-21 Thread Markus Armbruster
Found this cleaning out old mail, sorry for missing it until now!

I think we owe James a quick decision whether we're willing to take the
feature.  Stefan, thoughts?

James Bottomley  writes:

> From: James Bottomley 
>
> The Microsoft Simulator (mssim) is the reference emulation platform
> for the TCG TPM 2.0 specification.
>
> https://github.com/Microsoft/ms-tpm-20-ref.git
>
> It exports a fairly simple network socket based protocol on two
> sockets, one for command (default 2321) and one for control (default
> 2322).  This patch adds a simple backend that can speak the mssim
> protocol over the network.  It also allows the two sockets to be
> specified on the command line.  The benefits are twofold: firstly it
> gives us a backend that actually speaks a standard TPM emulation
> protocol instead of the linux specific TPM driver format of the
> current emulated TPM backend and secondly, using the microsoft
> protocol, the end point of the emulator can be anywhere on the
> network, facilitating the cloud use case where a central TPM service
> can be used over a control network.
>
> The implementation does basic control commands like power off/on, but
> doesn't implement cancellation or startup.  The former because
> cancellation is pretty much useless on a fast operating TPM emulator
> and the latter because this emulator is designed to be used with OVMF
> which itself does TPM startup and I wanted to validate that.
>
> To run this, simply download an emulator based on the MS specification
> (package ibmswtpm2 on openSUSE) and run it, then add these two lines
> to the qemu command and it will use the emulator.
>
> -tpmdev mssim,id=tpm0 \
> -device tpm-crb,tpmdev=tpm0 \
>
> to use a remote emulator replace the first line with
>
> -tpmdev 
> "{'type':'mssim','id':'tpm0','command':{'type':'inet','host':'remote','port':'2321'}}"
>
> tpm-tis also works as the backend.
>
> Signed-off-by: James Bottomley 

[...]

> diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
> index 535912a92b..1398735956 100644
> --- a/docs/specs/tpm.rst
> +++ b/docs/specs/tpm.rst
> @@ -270,6 +270,38 @@ available as a module (assuming a TPM 2 is passed 
> through):
>/sys/devices/LNXSYSTEM:00/LNXSYBUS:00/MSFT0101:00/tpm/tpm0/pcr-sha256/9
>...
>  
> +The QEMU TPM Microsoft Simulator Device
> +---------------------------------------
> +
> +The TCG provides a reference implementation for TPM 2.0 written by


Suggest to copy the cover letter's nice introductory paragraph here:

  The Microsoft Simulator (mssim) is the reference emulation platform
  for the TCG TPM 2.0 specification.

  It provides a reference implementation for TPM 2.0 written by

> +Microsoft (See `ms-tpm-20-ref`_ on github).  The reference implementation
> +starts a network server and listens for TPM commands on port 2321 and
> +TPM Platform control commands on port 2322, although these can be
> +altered.  The QEMU mssim TPM backend talks to this implementation.  By
> +default it connects to the default ports on localhost:
> +
> +.. code-block:: console
> +
> +  qemu-system-x86_64  \
> +-tpmdev mssim,id=tpm0 \
> +-device tpm-crb,tpmdev=tpm0
> +
> +
> +Although it can also communicate with a remote host, which must be
> +specified as a SocketAddress via json on the command line for each of

Is the "via JSON" part in "must be specified ... on the command line"
correct?  I'd expect to be able to use dotted keys as well, like

-tpmdev 
type=mssim,id=tpm0,command.type=inet,command.host=remote,command.port=2321,control.type=inet,control.host=remote,control.port=2322

Aside: I do recommend management applications stick to JSON.

> +the command and control ports:
> +
> +.. code-block:: console
> +
> +  qemu-system-x86_64  \
> +-tpmdev 
> "{'type':'mssim','id':'tpm0','command':{'type':'inet','host':'remote','port':'2321'},'control':{'type':'inet','host':'remote','port':'2322'}}"
>  \
> +-device tpm-crb,tpmdev=tpm0
> +
> +
> +The mssim backend supports snapshotting and migration, but the state
> +of the Microsoft Simulator server must be preserved (or the server
> +kept running) outside of QEMU for restore to be successful.
> +
>  The QEMU TPM emulator device
> ----------------------------
>  
> @@ -526,3 +558,6 @@ the following:
>  
>  .. _SWTPM protocol:
> 
> https://github.com/stefanberger/swtpm/blob/master/man/man3/swtpm_ioctls.pod
> +
> +.. _ms-tpm-20-ref:
> +   https://github.com/microsoft/ms-tpm-20-ref
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index ed78a87ddd..12482368d0 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -731,6 +731,7 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
>  unsigned int c = 0;
>  TPMPassthroughOptions *tpo;
>  TPMEmulatorOptions *teo;
> +TPMmssimOptions *tmo;
>  
>  info_list = qmp_query_tpm(&err);
>  if (err) {
> @@ -764,6 +765,14 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
>  teo = ti->options->u.emulator.data;
>   

Re: [PATCH v3 08/19] target/riscv: move riscv_cpu_add_kvm_properties() to kvm.c

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 10:47 PM Daniel Henrique Barboza
 wrote:
>
> We'll introduce the KVM accelerator class with a 'cpu_instance_init'
> implementation that is going to be invoked during the common
> riscv_cpu_post_init() (via accel_cpu_instance_init()). This
> instance_init will execute KVM exclusive code that TCG doesn't care
> about, such as adding KVM specific properties, initing registers using a
> KVM scratch CPU and so on.
>
> The core of the aforementioned cpu_instance_init impl is the current
> riscv_cpu_add_kvm_properties() that is being used by the common code via
> riscv_cpu_add_user_properties() in cpu.c. Move it to kvm.c, together
> with all the relevant artifacts, exporting and renaming it to
> kvm_riscv_cpu_add_kvm_properties() so cpu.c can keep using it for now.
>
> To make this work we'll need to export riscv_cpu_extensions,
> riscv_cpu_vendor_exts and riscv_cpu_experimental_exts from cpu.c as
> well. The TCG accelerator will also need to access those in the near
> future so this export will benefit us in the long run.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c   | 85 +++-
>  target/riscv/cpu.h   | 14 +++
>  target/riscv/kvm.c   | 68 +++-
>  target/riscv/kvm_riscv.h |  3 --
>  4 files changed, 86 insertions(+), 84 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 048a2dbc77..0dc9b3201d 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1370,7 +1370,7 @@ static RISCVCPUMisaExtConfig misa_ext_cfgs[] = {
>   * change MISA bits during realize() (RVG enables MISA
>   * bits but the user is warned about it).
>   */
> -static void riscv_cpu_add_misa_properties(Object *cpu_obj)
> +void riscv_cpu_add_misa_properties(Object *cpu_obj)
>  {
>  int i;
>
> @@ -1397,17 +1397,11 @@ static void riscv_cpu_add_misa_properties(Object 
> *cpu_obj)
>  }
>  }
>
> -typedef struct RISCVCPUMultiExtConfig {
> -const char *name;
> -uint32_t offset;
> -bool enabled;
> -} RISCVCPUMultiExtConfig;
> -
>  #define MULTI_EXT_CFG_BOOL(_name, _prop, _defval) \
>  {.name = _name, .offset = CPU_CFG_OFFSET(_prop), \
>   .enabled = _defval}
>
> -static const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
> +const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
>  /* Defaults for standard extensions */
>  MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false),
>  MULTI_EXT_CFG_BOOL("Zifencei", ext_ifencei, true),
> @@ -1469,7 +1463,7 @@ static const RISCVCPUMultiExtConfig 
> riscv_cpu_extensions[] = {
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> -static const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
> +const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
>  MULTI_EXT_CFG_BOOL("xtheadba", ext_xtheadba, false),
>  MULTI_EXT_CFG_BOOL("xtheadbb", ext_xtheadbb, false),
>  MULTI_EXT_CFG_BOOL("xtheadbs", ext_xtheadbs, false),
> @@ -1487,7 +1481,7 @@ static const RISCVCPUMultiExtConfig 
> riscv_cpu_vendor_exts[] = {
>  };
>
>  /* These are experimental so mark with 'x-' */
> -static const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
> +const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
>  /* ePMP 0.9.3 */
>  MULTI_EXT_CFG_BOOL("x-epmp", epmp, false),
>  MULTI_EXT_CFG_BOOL("x-smaia", ext_smaia, false),
> @@ -1513,7 +1507,7 @@ static const RISCVCPUMultiExtConfig 
> riscv_cpu_experimental_exts[] = {
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> -static Property riscv_cpu_options[] = {
> +Property riscv_cpu_options[] = {
>  DEFINE_PROP_UINT8("pmu-num", RISCVCPU, cfg.pmu_num, 16),
>
>  DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true),
> @@ -1586,75 +1580,6 @@ static void riscv_cpu_add_multiext_prop_array(Object 
> *obj,
>  }
>  }
>
> -#ifdef CONFIG_KVM
> -static void cpu_set_cfg_unavailable(Object *obj, Visitor *v,
> -const char *name,
> -void *opaque, Error **errp)
> -{
> -const char *propname = opaque;
> -bool value;
> -
> -if (!visit_type_bool(v, name, &value, errp)) {
> -return;
> -}
> -
> -if (value) {
> -error_setg(errp, "extension %s is not available with KVM",
> -   propname);
> -}
> -}
> -
> -static void riscv_cpu_add_kvm_unavail_prop(Object *obj, const char 
> *prop_name)
> -{
> -/* Check if KVM created the property already */
> -if (object_property_find(obj, prop_name)) {
> -return;
> -}
> -
> -/*
> - * Set the default to disabled for every extension
> - * unknown to KVM and error out if the user attempts
> - * to enable any of them.
> - */
> -object_property_add(obj, prop_name, "bool",
> -NULL, cpu_set_cfg_unavailable,
> -NULL, (void *)prop_name);
> -}
> 

Re: [PATCH v3 07/19] target/riscv/cpu.c: mark extensions arrays as 'const'

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:21 PM Daniel Henrique Barboza
 wrote:
>
> We'll need to export these arrays to the accelerator classes in the next
> patches. Mark them as 'const' now because they should not be modified at
> runtime.
>
> Note that 'riscv_cpu_options' will also be exported, but can't be marked
> as 'const', because the properties are changed via
> qdev_property_add_static().
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index f8368ce274..048a2dbc77 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1407,7 +1407,7 @@ typedef struct RISCVCPUMultiExtConfig {
>  {.name = _name, .offset = CPU_CFG_OFFSET(_prop), \
>   .enabled = _defval}
>
> -static RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
> +static const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
>  /* Defaults for standard extensions */
>  MULTI_EXT_CFG_BOOL("sscofpmf", ext_sscofpmf, false),
>  MULTI_EXT_CFG_BOOL("Zifencei", ext_ifencei, true),
> @@ -1469,7 +1469,7 @@ static RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> -static RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
> +static const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = {
>  MULTI_EXT_CFG_BOOL("xtheadba", ext_xtheadba, false),
>  MULTI_EXT_CFG_BOOL("xtheadbb", ext_xtheadbb, false),
>  MULTI_EXT_CFG_BOOL("xtheadbs", ext_xtheadbs, false),
> @@ -1487,7 +1487,7 @@ static RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[] = 
> {
>  };
>
>  /* These are experimental so mark with 'x-' */
> -static RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
> +static const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[] = {
>  /* ePMP 0.9.3 */
>  MULTI_EXT_CFG_BOOL("x-epmp", epmp, false),
>  MULTI_EXT_CFG_BOOL("x-smaia", ext_smaia, false),
> @@ -1558,7 +1558,7 @@ static void cpu_get_multi_ext_cfg(Object *obj, Visitor 
> *v, const char *name,
>  }
>
>  static void cpu_add_multi_ext_prop(Object *cpu_obj,
> -   RISCVCPUMultiExtConfig *multi_cfg)
> +   const RISCVCPUMultiExtConfig *multi_cfg)
>  {
>  object_property_add(cpu_obj, multi_cfg->name, "bool",
>  cpu_get_multi_ext_cfg,
> @@ -1575,11 +1575,13 @@ static void cpu_add_multi_ext_prop(Object *cpu_obj,
>  }
>
>  static void riscv_cpu_add_multiext_prop_array(Object *obj,
> -  RISCVCPUMultiExtConfig *array)
> +const RISCVCPUMultiExtConfig *array)
>  {
> +const RISCVCPUMultiExtConfig *prop;
> +
>  g_assert(array);
>
> -for (RISCVCPUMultiExtConfig *prop = array; prop && prop->name; prop++) {
> +for (prop = array; prop && prop->name; prop++) {
>  cpu_add_multi_ext_prop(obj, prop);
>  }
>  }
> @@ -1620,11 +1622,13 @@ static void riscv_cpu_add_kvm_unavail_prop(Object 
> *obj, const char *prop_name)
>  }
>
>  static void riscv_cpu_add_kvm_unavail_prop_array(Object *obj,
> - RISCVCPUMultiExtConfig 
> *array)
> +const RISCVCPUMultiExtConfig *array)
>  {
> +const RISCVCPUMultiExtConfig *prop;
> +
>  g_assert(array);
>
> -for (RISCVCPUMultiExtConfig *prop = array; prop && prop->name; prop++) {
> +for (prop = array; prop && prop->name; prop++) {
>  riscv_cpu_add_kvm_unavail_prop(obj, prop->name);
>  }
>  }
> @@ -1687,7 +1691,7 @@ static void riscv_init_max_cpu_extensions(Object *obj)
>  {
>  RISCVCPU *cpu = RISCV_CPU(obj);
>  CPURISCVState *env = &cpu->env;
> -RISCVCPUMultiExtConfig *prop;
> +const RISCVCPUMultiExtConfig *prop;
>
>  /* Enable RVG, RVJ and RVV that are disabled by default */
>  set_misa(env, env->misa_mxl, env->misa_ext | RVG | RVJ | RVV);
> --
> 2.41.0
>
>



Re: [PATCH v3 06/19] target/riscv: move 'host' CPU declaration to kvm.c

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:22 PM Daniel Henrique Barboza
 wrote:
>
> This CPU only exists if we're compiling with KVM so move it to the kvm
> specific file.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 15 ---
>  target/riscv/kvm.c | 21 +
>  2 files changed, 21 insertions(+), 15 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 848b58e7c4..f8368ce274 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -652,18 +652,6 @@ static void rv32_imafcu_nommu_cpu_init(Object *obj)
>  }
>  #endif
>
> -#if defined(CONFIG_KVM)
> -static void riscv_host_cpu_init(Object *obj)
> -{
> -CPURISCVState *env = &RISCV_CPU(obj)->env;
> -#if defined(TARGET_RISCV32)
> -set_misa(env, MXL_RV32, 0);
> -#elif defined(TARGET_RISCV64)
> -set_misa(env, MXL_RV64, 0);
> -#endif
> -}
> -#endif /* CONFIG_KVM */
> -
>  static ObjectClass *riscv_cpu_class_by_name(const char *cpu_model)
>  {
>  ObjectClass *oc;
> @@ -2041,9 +2029,6 @@ static const TypeInfo riscv_cpu_type_infos[] = {
>  },
>  DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_ANY,  riscv_any_cpu_init),
>  DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_MAX,  riscv_max_cpu_init),
> -#if defined(CONFIG_KVM)
> -DEFINE_CPU(TYPE_RISCV_CPU_HOST, riscv_host_cpu_init),
> -#endif
>  #if defined(TARGET_RISCV32)
>  DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_BASE32,   rv32_base_cpu_init),
>  DEFINE_CPU(TYPE_RISCV_CPU_IBEX, rv32_ibex_cpu_init),
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 1e4e4456b3..31d2ede4b6 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -1271,3 +1271,24 @@ void kvm_riscv_aia_create(MachineState *machine, 
> uint64_t group_shift,
>
>  kvm_msi_via_irqfd_allowed = kvm_irqfds_enabled();
>  }
> +
> +static void riscv_host_cpu_init(Object *obj)
> +{
> +CPURISCVState *env = &RISCV_CPU(obj)->env;
> +
> +#if defined(TARGET_RISCV32)
> +env->misa_mxl_max = env->misa_mxl = MXL_RV32;
> +#elif defined(TARGET_RISCV64)
> +env->misa_mxl_max = env->misa_mxl = MXL_RV64;
> +#endif
> +}
> +
> +static const TypeInfo riscv_kvm_cpu_type_infos[] = {
> +{
> +.name = TYPE_RISCV_CPU_HOST,
> +.parent = TYPE_RISCV_CPU,
> +.instance_init = riscv_host_cpu_init,
> +}
> +};
> +
> +DEFINE_TYPES(riscv_kvm_cpu_type_infos)
> --
> 2.41.0
>
>



Re: [PATCH v3 05/19] target/riscv/cpu.c: add .instance_post_init()

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:24 PM Daniel Henrique Barboza
 wrote:
>
> All generic CPUs call riscv_cpu_add_user_properties(). The 'max' CPU
> calls riscv_init_max_cpu_extensions(). Both can be moved to a common
> instance_post_init() callback, implemented in riscv_cpu_post_init(),
> called by all CPUs. The call order then becomes:
>
> riscv_cpu_init() -> cpu_init() of each CPU -> .instance_post_init()
>
> In the near future riscv_cpu_post_init() will call the init() function
> of the current accelerator, providing a hook for KVM and TCG accel
> classes to change the init() process of the CPU.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 43 ---
>  1 file changed, 32 insertions(+), 11 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 9426b3b9d6..848b58e7c4 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -427,8 +427,6 @@ static void riscv_max_cpu_init(Object *obj)
>  mlx = MXL_RV32;
>  #endif
>  set_misa(env, mlx, 0);
> -riscv_cpu_add_user_properties(obj);
> -riscv_init_max_cpu_extensions(obj);
>  env->priv_ver = PRIV_VERSION_LATEST;
>  #ifndef CONFIG_USER_ONLY
>  set_satp_mode_max_supported(RISCV_CPU(obj), mlx == MXL_RV32 ?
> @@ -442,7 +440,6 @@ static void rv64_base_cpu_init(Object *obj)
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
>  /* We set this in the realise function */
>  set_misa(env, MXL_RV64, 0);
> -riscv_cpu_add_user_properties(obj);
>  /* Set latest version of privileged specification */
>  env->priv_ver = PRIV_VERSION_LATEST;
>  #ifndef CONFIG_USER_ONLY
> @@ -566,7 +563,6 @@ static void rv128_base_cpu_init(Object *obj)
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
>  /* We set this in the realise function */
>  set_misa(env, MXL_RV128, 0);
> -riscv_cpu_add_user_properties(obj);
>  /* Set latest version of privileged specification */
>  env->priv_ver = PRIV_VERSION_LATEST;
>  #ifndef CONFIG_USER_ONLY
> @@ -579,7 +575,6 @@ static void rv32_base_cpu_init(Object *obj)
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
>  /* We set this in the realise function */
>  set_misa(env, MXL_RV32, 0);
> -riscv_cpu_add_user_properties(obj);
>  /* Set latest version of privileged specification */
>  env->priv_ver = PRIV_VERSION_LATEST;
>  #ifndef CONFIG_USER_ONLY
> @@ -666,7 +661,6 @@ static void riscv_host_cpu_init(Object *obj)
>  #elif defined(TARGET_RISCV64)
>  set_misa(env, MXL_RV64, 0);
>  #endif
> -riscv_cpu_add_user_properties(obj);
>  }
>  #endif /* CONFIG_KVM */
>
> @@ -1215,6 +1209,37 @@ static void riscv_cpu_set_irq(void *opaque, int irq, 
> int level)
>  }
>  #endif /* CONFIG_USER_ONLY */
>
> +static bool riscv_cpu_is_dynamic(Object *cpu_obj)
> +{
> +return object_dynamic_cast(cpu_obj, TYPE_RISCV_DYNAMIC_CPU) != NULL;
> +}
> +
> +static bool riscv_cpu_has_max_extensions(Object *cpu_obj)
> +{
> +return object_dynamic_cast(cpu_obj, TYPE_RISCV_CPU_MAX) != NULL;
> +}
> +
> +static bool riscv_cpu_has_user_properties(Object *cpu_obj)
> +{
> +if (kvm_enabled() &&
> +object_dynamic_cast(cpu_obj, TYPE_RISCV_CPU_HOST) != NULL) {
> +return true;
> +}
> +
> +return riscv_cpu_is_dynamic(cpu_obj);
> +}
> +
> +static void riscv_cpu_post_init(Object *obj)
> +{
> +if (riscv_cpu_has_user_properties(obj)) {
> +riscv_cpu_add_user_properties(obj);
> +}
> +
> +if (riscv_cpu_has_max_extensions(obj)) {
> +riscv_init_max_cpu_extensions(obj);
> +}
> +}
> +
>  static void riscv_cpu_init(Object *obj)
>  {
>  RISCVCPU *cpu = RISCV_CPU(obj);
> @@ -1768,11 +1793,6 @@ static const struct SysemuCPUOps riscv_sysemu_ops = {
>  };
>  #endif
>
> -static bool riscv_cpu_is_dynamic(Object *cpu_obj)
> -{
> -return object_dynamic_cast(cpu_obj, TYPE_RISCV_DYNAMIC_CPU) != NULL;
> -}
> -
>  static void cpu_set_mvendorid(Object *obj, Visitor *v, const char *name,
>void *opaque, Error **errp)
>  {
> @@ -2009,6 +2029,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
>  .instance_size = sizeof(RISCVCPU),
>  .instance_align = __alignof__(RISCVCPU),
>  .instance_init = riscv_cpu_init,
> +.instance_post_init = riscv_cpu_post_init,
>  .abstract = true,
>  .class_size = sizeof(RISCVCPUClass),
>  .class_init = riscv_cpu_class_init,
> --
> 2.41.0
>
>



RE: [PATCH v1 3/4] hw/arm/virt-acpi-build: patch guest SRAT for NUMA nodes

2023-09-21 Thread Ankit Agrawal
Hi Jonathan

> > +if (pcidev->pdev.has_coherent_memory) {
> > +uint64_t start_node = object_property_get_uint(obj,
> > +  "dev_mem_pxm_start", &error_abort);
> > +uint64_t node_count = object_property_get_uint(obj,
> > +  "dev_mem_pxm_count", &error_abort);
> > +uint64_t node_index;
> > +
> > +/*
> > + * Add the node_count PXM domains starting from start_node as
> > + * hot pluggable. The VM kernel parses the PXM domains and
> > + * creates NUMA nodes.
> > + */
> > +for (node_index = 0; node_index < node_count; node_index++)
> > +build_srat_memory(table_data, 0, 0, start_node + 
> > node_index,
> > +MEM_AFFINITY_ENABLED |
> > + MEM_AFFINITY_HOTPLUGGABLE);
> 
> 0 size SRAT entries for memory? That's not valid.

Can you explain in what sense these are invalid? The Linux kernel accepts
such a setting, and I have tested it.

> Seems like you've run into the same issue CXL has with dynamic addition of
> nodes to the kernel and all you want to do here is make sure it thinks there 
> are
> enough nodes so initializes various structures large enough.
>
Yes, exactly.




RE: [PATCH v1 1/4] vfio: new command line params for device memory NUMA nodes

2023-09-21 Thread Ankit Agrawal
> Also, good to say why multiple nodes per device are needed.
This is to support the GPU's MIG (Multi-Instance GPU) feature
(https://www.nvidia.com/en-in/technologies/multi-instance-gpu/), which
allows partitioning of the GPU device resources (including device memory) into
several isolated instances. We are creating multiple NUMA nodes to give
each partition its own node. The partitions are not fixed; they
can be created/deleted and resized (in memory) at runtime. This is
the reason these nodes are tagged as MEM_AFFINITY_HOTPLUGGABLE. Such a
setting gives the VM the flexibility to associate a desired partition/range
of device memory with a node (that is adjustable). Note that we are replicating
the baremetal behavior here.

I will also put this detail on the cover letter in the next version.

> QEMU already has means to assign NUMA node affinity
> to PCI hierarchies in a generic way by using a PXB per node
> (also done 'backwards') by setting the node option on it.
> So every device behind it should belong to that node as well,
> and the guest OS shall pick up device affinity from the PCI tree it belongs to.

Yes, but the problem is that only one node may be associated this way
and we have several.



Re: [PATCH v3 04/19] target/riscv: move riscv_tcg_ops to tcg-cpu.c

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:21 PM Daniel Henrique Barboza
 wrote:
>
> Move the remainder of riscv_tcg_ops now that we have a working realize()
> implementation.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 58 
>  target/riscv/cpu.h |  4 ---
>  target/riscv/tcg/tcg-cpu.c | 60 +-
>  3 files changed, 59 insertions(+), 63 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 7215a29324..9426b3b9d6 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -838,24 +838,6 @@ static vaddr riscv_cpu_get_pc(CPUState *cs)
>  return env->pc;
>  }
>
> -static void riscv_cpu_synchronize_from_tb(CPUState *cs,
> -  const TranslationBlock *tb)
> -{
> -if (!(tb_cflags(tb) & CF_PCREL)) {
> -RISCVCPU *cpu = RISCV_CPU(cs);
> -CPURISCVState *env = &cpu->env;
> -RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL);
> -
> -tcg_debug_assert(!(cs->tcg_cflags & CF_PCREL));
> -
> -if (xl == MXL_RV32) {
> -env->pc = (int32_t) tb->pc;
> -} else {
> -env->pc = tb->pc;
> -}
> -}
> -}
> -
>  static bool riscv_cpu_has_work(CPUState *cs)
>  {
>  #ifndef CONFIG_USER_ONLY
> @@ -871,29 +853,6 @@ static bool riscv_cpu_has_work(CPUState *cs)
>  #endif
>  }
>
> -static void riscv_restore_state_to_opc(CPUState *cs,
> -   const TranslationBlock *tb,
> -   const uint64_t *data)
> -{
> -RISCVCPU *cpu = RISCV_CPU(cs);
> -CPURISCVState *env = &cpu->env;
> -RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL);
> -target_ulong pc;
> -
> -if (tb_cflags(tb) & CF_PCREL) {
> -pc = (env->pc & TARGET_PAGE_MASK) | data[0];
> -} else {
> -pc = data[0];
> -}
> -
> -if (xl == MXL_RV32) {
> -env->pc = (int32_t)pc;
> -} else {
> -env->pc = pc;
> -}
> -env->bins = data[1];
> -}
> -
>  static void riscv_cpu_reset_hold(Object *obj)
>  {
>  #ifndef CONFIG_USER_ONLY
> @@ -1809,23 +1768,6 @@ static const struct SysemuCPUOps riscv_sysemu_ops = {
>  };
>  #endif
>
> -const struct TCGCPUOps riscv_tcg_ops = {
> -.initialize = riscv_translate_init,
> -.synchronize_from_tb = riscv_cpu_synchronize_from_tb,
> -.restore_state_to_opc = riscv_restore_state_to_opc,
> -
> -#ifndef CONFIG_USER_ONLY
> -.tlb_fill = riscv_cpu_tlb_fill,
> -.cpu_exec_interrupt = riscv_cpu_exec_interrupt,
> -.do_interrupt = riscv_cpu_do_interrupt,
> -.do_transaction_failed = riscv_cpu_do_transaction_failed,
> -.do_unaligned_access = riscv_cpu_do_unaligned_access,
> -.debug_excp_handler = riscv_cpu_debug_excp_handler,
> -.debug_check_breakpoint = riscv_cpu_debug_check_breakpoint,
> -.debug_check_watchpoint = riscv_cpu_debug_check_watchpoint,
> -#endif /* !CONFIG_USER_ONLY */
> -};
> -
>  static bool riscv_cpu_is_dynamic(Object *cpu_obj)
>  {
>  return object_dynamic_cast(cpu_obj, TYPE_RISCV_DYNAMIC_CPU) != NULL;
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 409d198635..b2e558f730 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -706,10 +706,6 @@ enum riscv_pmu_event_idx {
>  RISCV_PMU_EVENT_CACHE_ITLB_PREFETCH_MISS = 0x10021,
>  };
>
> -/* Export tcg_ops until we move everything to tcg/tcg-cpu.c */
> -#include "hw/core/tcg-cpu-ops.h"
> -extern const struct TCGCPUOps riscv_tcg_ops;
> -
>  /* used by tcg/tcg-cpu.c*/
>  void isa_ext_update_enabled(RISCVCPU *cpu, uint32_t ext_offset, bool en);
>  bool cpu_cfg_ext_is_user_set(uint32_t ext_offset);
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index d86172f725..e480b9f726 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -28,7 +28,66 @@
>  #include "qemu/error-report.h"
>  #include "qemu/log.h"
>  #include "hw/core/accel-cpu.h"
> +#include "hw/core/tcg-cpu-ops.h"
> +#include "tcg/tcg.h"
>
> +static void riscv_cpu_synchronize_from_tb(CPUState *cs,
> +  const TranslationBlock *tb)
> +{
> +if (!(tb_cflags(tb) & CF_PCREL)) {
> +RISCVCPU *cpu = RISCV_CPU(cs);
> +CPURISCVState *env = &cpu->env;
> +RISCVMXL xl = FIELD_EX32(tb->flags, TB_FLAGS, XL);
> +
> +tcg_debug_assert(!(cs->tcg_cflags & CF_PCREL));
> +
> +if (xl == MXL_RV32) {
> +env->pc = (int32_t) tb->pc;
> +} else {
> +env->pc = tb->pc;
> +}
> +}
> +}
> +
> +static void riscv_restore_state_to_opc(CPUState *cs,
> +   const TranslationBlock *tb,
> +   const uint64_t *data)
> +{
> +RISCVCPU *cpu = RISCV_CPU(cs);
> +CPURISCV

Re: [PATCH v3 03/19] target/riscv: move riscv_cpu_validate_set_extensions() to tcg-cpu.c

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 10:25 PM Daniel Henrique Barboza
 wrote:
>
> This function is the core of the RISC-V validations for TCG CPUs, and it
> has a lot going on.
>
> Functions in cpu.c were made public to allow them to be used by the KVM
> accelerator class later on. 'cpu_cfg_ext_get_min_version()' is notably
> hard to move it to another file due to its dependency with isa_edata_arr[]
> array, thus make it public and use it as is for now.
>
> riscv_cpu_validate_set_extensions() is kept public because it's used by
> csr.c in write_misa().
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 361 +
>  target/riscv/cpu.h |   8 +-
>  target/riscv/csr.c |   1 +
>  target/riscv/tcg/tcg-cpu.c | 357 
>  target/riscv/tcg/tcg-cpu.h |  27 +++
>  5 files changed, 397 insertions(+), 357 deletions(-)
>  create mode 100644 target/riscv/tcg/tcg-cpu.h
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 030629294f..7215a29324 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -163,22 +163,21 @@ static const struct isa_ext_data isa_edata_arr[] = {
>  /* Hash that stores user set extensions */
>  static GHashTable *multi_ext_user_opts;
>
> -static bool isa_ext_is_enabled(RISCVCPU *cpu, uint32_t ext_offset)
> +bool isa_ext_is_enabled(RISCVCPU *cpu, uint32_t ext_offset)
>  {
>  bool *ext_enabled = (void *)&cpu->cfg + ext_offset;
>
>  return *ext_enabled;
>  }
>
> -static void isa_ext_update_enabled(RISCVCPU *cpu, uint32_t ext_offset,
> -   bool en)
> +void isa_ext_update_enabled(RISCVCPU *cpu, uint32_t ext_offset, bool en)
>  {
>  bool *ext_enabled = (void *)&cpu->cfg + ext_offset;
>
>  *ext_enabled = en;
>  }
>
> -static int cpu_cfg_ext_get_min_version(uint32_t ext_offset)
> +int cpu_cfg_ext_get_min_version(uint32_t ext_offset)
>  {
>  int i;
>
> @@ -193,38 +192,12 @@ static int cpu_cfg_ext_get_min_version(uint32_t 
> ext_offset)
>  g_assert_not_reached();
>  }
>
> -static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
> +bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
>  {
>  return g_hash_table_contains(multi_ext_user_opts,
>   GUINT_TO_POINTER(ext_offset));
>  }
>
> -static void cpu_cfg_ext_auto_update(RISCVCPU *cpu, uint32_t ext_offset,
> -bool value)
> -{
> -CPURISCVState *env = &cpu->env;
> -bool prev_val = isa_ext_is_enabled(cpu, ext_offset);
> -int min_version;
> -
> -if (prev_val == value) {
> -return;
> -}
> -
> -if (cpu_cfg_ext_is_user_set(ext_offset)) {
> -return;
> -}
> -
> -if (value && env->priv_ver != PRIV_VERSION_LATEST) {
> -/* Do not enable it if priv_ver is older than min_version */
> -min_version = cpu_cfg_ext_get_min_version(ext_offset);
> -if (env->priv_ver < min_version) {
> -return;
> -}
> -}
> -
> -isa_ext_update_enabled(cpu, ext_offset, value);
> -}
> -
>  const char * const riscv_int_regnames[] = {
>  "x0/zero", "x1/ra",  "x2/sp",  "x3/gp",  "x4/tp",  "x5/t0",   "x6/t1",
>  "x7/t2",   "x8/s0",  "x9/s1",  "x10/a0", "x11/a1", "x12/a2",  "x13/a3",
> @@ -1023,46 +996,7 @@ static void riscv_cpu_disas_set_info(CPUState *s, 
> disassemble_info *info)
>  }
>  }
>
> -static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg,
> - Error **errp)
> -{
> -if (!is_power_of_2(cfg->vlen)) {
> -error_setg(errp, "Vector extension VLEN must be power of 2");
> -return;
> -}
> -if (cfg->vlen > RV_VLEN_MAX || cfg->vlen < 128) {
> -error_setg(errp,
> -   "Vector extension implementation only supports VLEN "
> -   "in the range [128, %d]", RV_VLEN_MAX);
> -return;
> -}
> -if (!is_power_of_2(cfg->elen)) {
> -error_setg(errp, "Vector extension ELEN must be power of 2");
> -return;
> -}
> -if (cfg->elen > 64 || cfg->elen < 8) {
> -error_setg(errp,
> -   "Vector extension implementation only supports ELEN "
> -   "in the range [8, 64]");
> -return;
> -}
> -if (cfg->vext_spec) {
> -if (!g_strcmp0(cfg->vext_spec, "v1.0")) {
> -env->vext_ver = VEXT_VERSION_1_00_0;
> -} else {
> -error_setg(errp, "Unsupported vector spec version '%s'",
> -   cfg->vext_spec);
> -return;
> -}
> -} else if (env->vext_ver == 0) {
> -qemu_log("vector version is not specified, "
> - "use the default value v1.0\n");
> -
> -env->vext_ver = VEXT_VERSION_1_00_0;
> -}
> -}
> -
> -static void riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU *cpu)
> +void ris

Re: [PATCH v3 02/19] target/riscv: move riscv_cpu_realize_tcg() to TCG::cpu_realizefn()

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:24 PM Daniel Henrique Barboza
 wrote:
>
> riscv_cpu_realize_tcg() was added to allow TCG cpus to have a different
> realize() path during the common riscv_cpu_realize(), making it a good
> choice to start moving TCG exclusive code to tcg-cpu.c.
>
> Rename it to tcg_cpu_realizefn() and assign it as a implementation of
> accel::cpu_realizefn(). tcg_cpu_realizefn() will then be called during
> riscv_cpu_realize() via cpu_exec_realizefn(). We'll use a similar
> approach with KVM in the near future.
>
> riscv_cpu_validate_set_extensions() is too big, with too many
> dependencies, to be moved in this same patch. We'll do that next.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 
> ---
>  target/riscv/cpu.c | 128 ---
>  target/riscv/tcg/tcg-cpu.c | 133 +
>  2 files changed, 133 insertions(+), 128 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index e72c49c881..030629294f 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -23,9 +23,7 @@
>  #include "qemu/log.h"
>  #include "cpu.h"
>  #include "cpu_vendorid.h"
> -#include "pmu.h"
>  #include "internals.h"
> -#include "time_helper.h"
>  #include "exec/exec-all.h"
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
> @@ -1064,29 +1062,6 @@ static void riscv_cpu_validate_v(CPURISCVState *env, 
> RISCVCPUConfig *cfg,
>  }
>  }
>
> -static void riscv_cpu_validate_priv_spec(RISCVCPU *cpu, Error **errp)
> -{
> -CPURISCVState *env = &cpu->env;
> -int priv_version = -1;
> -
> -if (cpu->cfg.priv_spec) {
> -if (!g_strcmp0(cpu->cfg.priv_spec, "v1.12.0")) {
> -priv_version = PRIV_VERSION_1_12_0;
> -} else if (!g_strcmp0(cpu->cfg.priv_spec, "v1.11.0")) {
> -priv_version = PRIV_VERSION_1_11_0;
> -} else if (!g_strcmp0(cpu->cfg.priv_spec, "v1.10.0")) {
> -priv_version = PRIV_VERSION_1_10_0;
> -} else {
> -error_setg(errp,
> -   "Unsupported privilege spec version '%s'",
> -   cpu->cfg.priv_spec);
> -return;
> -}
> -
> -env->priv_ver = priv_version;
> -}
> -}
> -
>  static void riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU *cpu)
>  {
>  CPURISCVState *env = &cpu->env;
> @@ -1111,33 +1086,6 @@ static void 
> riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU *cpu)
>  }
>  }
>
> -static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, Error **errp)
> -{
> -RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
> -CPUClass *cc = CPU_CLASS(mcc);
> -CPURISCVState *env = &cpu->env;
> -
> -/* Validate that MISA_MXL is set properly. */
> -switch (env->misa_mxl_max) {
> -#ifdef TARGET_RISCV64
> -case MXL_RV64:
> -case MXL_RV128:
> -cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
> -break;
> -#endif
> -case MXL_RV32:
> -cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
> -break;
> -default:
> -g_assert_not_reached();
> -}
> -
> -if (env->misa_mxl_max != env->misa_mxl) {
> -error_setg(errp, "misa_mxl_max must be equal to misa_mxl");
> -return;
> -}
> -}
> -
>  /*
>   * Check consistency between chosen extensions while setting
>   * cpu->cfg accordingly.
> @@ -1511,74 +1459,6 @@ static void riscv_cpu_finalize_features(RISCVCPU *cpu, 
> Error **errp)
>  #endif
>  }
>
> -static void riscv_cpu_validate_misa_priv(CPURISCVState *env, Error **errp)
> -{
> -if (riscv_has_ext(env, RVH) && env->priv_ver < PRIV_VERSION_1_12_0) {
> -error_setg(errp, "H extension requires priv spec 1.12.0");
> -return;
> -}
> -}
> -
> -static void riscv_cpu_realize_tcg(DeviceState *dev, Error **errp)
> -{
> -RISCVCPU *cpu = RISCV_CPU(dev);
> -CPURISCVState *env = &cpu->env;
> -Error *local_err = NULL;
> -
> -if (object_dynamic_cast(OBJECT(dev), TYPE_RISCV_CPU_HOST)) {
> -error_setg(errp, "'host' CPU is not compatible with TCG 
> acceleration");
> -return;
> -}
> -
> -riscv_cpu_validate_misa_mxl(cpu, &local_err);
> -if (local_err != NULL) {
> -error_propagate(errp, local_err);
> -return;
> -}
> -
> -riscv_cpu_validate_priv_spec(cpu, &local_err);
> -if (local_err != NULL) {
> -error_propagate(errp, local_err);
> -return;
> -}
> -
> -riscv_cpu_validate_misa_priv(env, &local_err);
> -if (local_err != NULL) {
> -error_propagate(errp, local_err);
> -return;
> -}
> -
> -if (cpu->cfg.epmp && !cpu->cfg.pmp) {
> -/*
> - * Enhanced PMP should only be available
> - * on harts with PMP support
> - */
> -error_setg(errp, "Invalid configuration: EPMP requires PMP support");
> -return;
> -}
> -
> -riscv_cpu_validate_set_extensions(cpu, &local_err);
> -if (local_err != NULL) {
> - 

Re: [PATCH v3 01/19] target/riscv: introduce TCG AccelCPUClass

2023-09-21 Thread Alistair Francis
On Wed, Sep 20, 2023 at 9:22 PM Daniel Henrique Barboza
 wrote:
>
> target/riscv/cpu.c needs to handle all possible accelerators (TCG and
> KVM at this moment) during both init() and realize() time. This forces
> us to resort to a lot of "if tcg" and "if kvm" throughout the code,
> which isn't wrong, but can get cluttered over time. Splitting
> acceleration specific code from cpu.c to its own file will help to
> declutter the existing code and it will also make it easier to support
> KVM/TCG only builds in the future.
>
> We'll start by adding a new subdir called 'tcg' and a new file called
> 'tcg-cpu.c'. This file will be used to introduce a new accelerator class
> for TCG acceleration in RISC-V, allowing us to center all TCG exclusive
> code in its file instead of using 'cpu.c' for everything. This design is
> inspired by the work Claudio Fontana did in x86 a few years ago in commit
> f5cc5a5c1 ("i386: split cpu accelerators from cpu.c, using
> AccelCPUClass").
>
> To avoid moving too much code at once we'll start by adding the new file
> and TCG AccelCPUClass declaration. The 'class_init' from the accel class
> will init 'tcg_ops', relieving the common riscv_cpu_class_init() from
> doing it.
>
> 'riscv_tcg_ops' is being exported from 'cpu.c' for now to avoid having
> to deal with moving code and files around right now. We'll focus on
> decoupling the realize() logic first.
>
> Signed-off-by: Daniel Henrique Barboza 
> Reviewed-by: Andrew Jones 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c   |  5 +---
>  target/riscv/cpu.h   |  4 +++
>  target/riscv/meson.build |  2 ++
>  target/riscv/tcg/meson.build |  2 ++
>  target/riscv/tcg/tcg-cpu.c   | 58 
>  5 files changed, 67 insertions(+), 4 deletions(-)
>  create mode 100644 target/riscv/tcg/meson.build
>  create mode 100644 target/riscv/tcg/tcg-cpu.c
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 2644638b11..e72c49c881 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -2288,9 +2288,7 @@ static const struct SysemuCPUOps riscv_sysemu_ops = {
>  };
>  #endif
>
> -#include "hw/core/tcg-cpu-ops.h"
> -
> -static const struct TCGCPUOps riscv_tcg_ops = {
> +const struct TCGCPUOps riscv_tcg_ops = {
>  .initialize = riscv_translate_init,
>  .synchronize_from_tb = riscv_cpu_synchronize_from_tb,
>  .restore_state_to_opc = riscv_restore_state_to_opc,
> @@ -2449,7 +2447,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void 
> *data)
>  #endif
>  cc->gdb_arch_name = riscv_gdb_arch_name;
>  cc->gdb_get_dynamic_xml = riscv_gdb_get_dynamic_xml;
> -cc->tcg_ops = &riscv_tcg_ops;
>
>  object_class_property_add(c, "mvendorid", "uint32", cpu_get_mvendorid,
>cpu_set_mvendorid, NULL, NULL);
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 7d6cfb07ea..16a2dfa8c7 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -707,6 +707,10 @@ enum riscv_pmu_event_idx {
>  RISCV_PMU_EVENT_CACHE_ITLB_PREFETCH_MISS = 0x10021,
>  };
>
> +/* Export tcg_ops until we move everything to tcg/tcg-cpu.c */
> +#include "hw/core/tcg-cpu-ops.h"
> +extern const struct TCGCPUOps riscv_tcg_ops;
> +
>  /* CSR function table */
>  extern riscv_csr_operations csr_ops[CSR_TABLE_SIZE];
>
> diff --git a/target/riscv/meson.build b/target/riscv/meson.build
> index 660078bda1..f0486183fa 100644
> --- a/target/riscv/meson.build
> +++ b/target/riscv/meson.build
> @@ -38,5 +38,7 @@ riscv_system_ss.add(files(
>'riscv-qmp-cmds.c',
>  ))
>
> +subdir('tcg')
> +
>  target_arch += {'riscv': riscv_ss}
>  target_softmmu_arch += {'riscv': riscv_system_ss}
> diff --git a/target/riscv/tcg/meson.build b/target/riscv/tcg/meson.build
> new file mode 100644
> index 0000000000..061df3d74a
> --- /dev/null
> +++ b/target/riscv/tcg/meson.build
> @@ -0,0 +1,2 @@
> +riscv_ss.add(when: 'CONFIG_TCG', if_true: files(
> +  'tcg-cpu.c'))
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> new file mode 100644
> index 0000000000..0326cead0d
> --- /dev/null
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -0,0 +1,58 @@
> +/*
> + * riscv TCG cpu class initialization
> + *
> + * Copyright (c) 2023 Ventana Micro Systems Inc.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> 

Re: [PATCH v3 4/5] hw/char: riscv_htif: replace exit calls with proper shutdown

2023-09-21 Thread Alistair Francis
On Thu, Sep 7, 2023 at 9:26 PM Clément Chigot  wrote:
>
> This replaces the exit calls with shutdown requests, ensuring a proper
> cleanup of QEMU. Otherwise, some connections like gdb could be broken
> before the final packet ("Wxx") is sent. This part, being done
> inside the qemu_cleanup function, can be reached only when the main loop
> exits after a shutdown request.
>
> Signed-off-by: Clément Chigot 

Do you mind rebasing this on:
https://github.com/alistair23/qemu/tree/riscv-to-apply.next

Alistair

> ---
>  hw/char/riscv_htif.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/hw/char/riscv_htif.c b/hw/char/riscv_htif.c
> index 37d3ccc76b..7e9b6fcc98 100644
> --- a/hw/char/riscv_htif.c
> +++ b/hw/char/riscv_htif.c
> @@ -31,6 +31,7 @@
>  #include "qemu/error-report.h"
>  #include "exec/address-spaces.h"
>  #include "sysemu/dma.h"
> +#include "sysemu/runstate.h"
>
>  #define RISCV_DEBUG_HTIF 0
>  #define HTIF_DEBUG(fmt, ...) 
>   \
> @@ -205,7 +206,9 @@ static void htif_handle_tohost_write(HTIFState *s, 
> uint64_t val_written)
>  g_free(sig_data);
>  }
>
> -exit(exit_code);
> +qemu_system_shutdown_request_with_code(
> +SHUTDOWN_CAUSE_GUEST_SHUTDOWN, exit_code);
> +return;
>  } else {
>  uint64_t syscall[8];
>  cpu_physical_memory_read(payload, syscall, sizeof(syscall));
> --
> 2.25.1
>



[PATCH v4] hw/i386/pc: improve physical address space bound check for 32-bit x86 systems

2023-09-21 Thread Ani Sinha
32-bit x86 systems do not have a reserved memory region for hole64. On those
32-bit systems without the PSE36 or PAE CPU features, hotplugging memory devices
is not supported by QEMU, as QEMU always places hotplugged memory above the
4 GiB boundary, which is beyond the physical address space of the processor.
Linux guests also do not support memory hotplug on those systems. Please see
Linux kernel commit b59d02ed08690 ("mm/memory_hotplug: disable the functionality
for 32b") for more details.

Therefore, the maximum limit of the guest physical address in the absence of
additional memory devices effectively coincides with the end of the
"above 4G memory space" region for 32-bit x86 without PAE/PSE36. When users
configure additional memory devices, the maximum guest physical address, once
the additional device memory region is properly accounted for, falls outside
the range of the processor's physical address space.

This change improves the bound check to take the above into consideration.

For example, previously this was allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G

With this change now it is no longer allowed:

$ ./qemu-system-x86_64 -cpu pentium -m size=10G
qemu-system-x86_64: Address space limit 0xffffffff < 0x2bfffffff phys-bits too 
low (32)

However, the following are allowed since on both cases physical address
space of the processor is 36 bits:

$ ./qemu-system-x86_64 -cpu pentium2 -m size=10G
$ ./qemu-system-x86_64 -cpu pentium,pse36=on -m size=10G

For 32-bit, without PAE/PSE36, hotplugging additional memory is no longer 
allowed.

$ ./qemu-system-i386 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too 
low (32)
$ ./qemu-system-i386 -machine q35 -m size=1G,maxmem=3G,slots=2
qemu-system-i386: Address space limit 0xffffffff < 0x1ffffffff phys-bits too 
low (32)

A new compatibility flag is introduced to make sure pc_max_used_gpa() keeps
returning the old value for machines 8.1 and older.
Therefore, the above is still allowed for older machine types in order to 
support
compatibility. Hence, the following still works:

$ ./qemu-system-i386 -machine pc-i440fx-8.1 -m size=1G,maxmem=3G,slots=2
$ ./qemu-system-i386 -machine pc-q35-8.1 -m size=1G,maxmem=3G,slots=2

Further, following is also allowed as with PSE36, the processor has 36-bit
address space:

$ ./qemu-system-i386 -cpu 486,pse36=on -m size=1G,maxmem=3G,slots=2

After calling CPUID with EAX=0x80000001, all AMD64 compliant processors
have the longmode-capable bit turned on in the extended feature flags (bit 29)
in EDX. The absence of CPUID longmode can be used to differentiate between
32-bit and 64-bit processors and is the recommended approach. QEMU takes this
approach elsewhere (for example, see x86_cpu_realizefn()). With
this change, pc_max_used_gpa() also uses the same method to detect 32-bit
processors.

Unit tests are modified to not run 32-bit x86 tests that use memory hotplug.

Suggested-by: David Hildenbrand 
Signed-off-by: Ani Sinha 
---
 hw/i386/pc.c   | 31 ---
 hw/i386/pc_piix.c  |  4 
 hw/i386/pc_q35.c   |  2 ++
 include/hw/i386/pc.h   |  6 ++
 tests/qtest/bios-tables-test.c | 26 ++
 tests/qtest/numa-test.c|  7 ++-
 6 files changed, 64 insertions(+), 12 deletions(-)

changelog:
v4: address comments from v3. Fix a bug where compat knob was absent
from q35 machines. Commit message adjustment.
v3: still accounting for additional memory device region above 4G.
unit tests fixed (not running for 32-bit where mem hotplug is used).
v2: removed memory hotplug region from max_gpa. added compat knobs.

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 54838c0c41..2a689cf0bd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -907,12 +907,37 @@ static uint64_t pc_get_cxl_range_end(PCMachineState *pcms)
 static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
 {
 X86CPU *cpu = X86_CPU(first_cpu);
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+MachineState *ms = MACHINE(pcms);
+uint64_t devmem_start = 0;
+ram_addr_t devmem_size = 0;
 
-/* 32-bit systems don't have hole64 thus return max CPU address */
-if (cpu->phys_bits <= 32) {
-return ((hwaddr)1 << cpu->phys_bits) - 1;
+/*
+ * 32-bit systems don't have hole64 but they might have a region for
+ * memory devices. Even if additional hotplugged memory devices might
+ * not be usable by most guest OSes, we need to still consider them for
+ * calculating the highest possible GPA so that we can properly report
+ * if someone configures them on a CPU that cannot possibly address them.
+ */
+if (!(cpu->env.features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM)) {
+/* 32-bit systems */
+if (!pcmc->broken_32bit_mem_addr_check) {
+if (pcmc->has_reserved_memory &&
+(ms->ram_size < ms->maxr

Re: [virtio-dev] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-21 Thread Jason Wang
On Thu, Sep 21, 2023 at 2:28 PM Chen, Jiqian  wrote:
>
> Hi Jason,
>
> On 2023/9/21 12:22, Jason Wang wrote:
> > On Tue, Sep 19, 2023 at 7:43 PM Jiqian Chen  wrote:
> >>
> >> When a guest VM does S3, QEMU will reset and clear some state of virtio
> >> devices, but the guest can't be aware of that, which may cause problems.
> >> For example, QEMU calls virtio_reset->virtio_gpu_gl_reset when the guest
> >> resumes; that function destroys the render resources of virtio-gpu. As
> >> a result, after the guest resumes, the display can't come back and we only
> >> see a black screen. Since the guest can't re-create all the resources, we
> >> need to let QEMU not destroy them during S3.
> >>
> >> For the above purpose, we need a mechanism that allows guests and QEMU to
> >> negotiate their reset behavior. So this patch adds a new field
> >> named freeze_mode to struct virtio_pci_common_cfg. When the guest
> >> suspends, it can write freeze_mode to FREEZE_S3, and then virtio
> >> devices can change their reset behavior on the QEMU side according to
> >> freeze_mode. What's more, freeze_mode can be used by all virtio
> >> devices to affect the behavior of QEMU, not just the virtio-gpu device.
> >
> > A simple question, why is this issue specific to pci?
> I thought you may have missed the previous version of these patches. At the
> beginning, I just wanted to add a new feature flag VIRTIO_GPU_F_FREEZE_S3 for
> virtio-gpu, since I encountered a virtio-gpu issue during guest S3, so that
> the guest and QEMU could negotiate and change the reset behavior during S3.
> But Parav and Mikhail hoped I could improve the feature to the PCI level, so
> that other virtio devices could also benefit from it. Although I am not sure
> whether expanding its influence is appropriate, I have not received any
> feedback from others, so I changed it to the PCI level and made this version.
> If you are interested, please see the previous version: 
> https://lists.oasis-open.org/archives/virtio-comment/202307/msg00209.html, 
> thank you.

This is not a good answer. Let me ask you differently, why don't you
see it in other forms of transport like virtio-gpu-mmio?

Thanks

>
> >
> > Thanks
> >
> >
> >>
> >> Signed-off-by: Jiqian Chen 
> >> ---
> >>  transport-pci.tex | 7 +++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/transport-pci.tex b/transport-pci.tex
> >> index a5c6719..2543536 100644
> >> --- a/transport-pci.tex
> >> +++ b/transport-pci.tex
> >> @@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
> >> layout}\label{sec:Virtio Transport
> >>  le64 queue_desc;/* read-write */
> >>  le64 queue_driver;  /* read-write */
> >>  le64 queue_device;  /* read-write */
> >> +le16 freeze_mode;   /* read-write */
> >>  le16 queue_notif_config_data;   /* read-only for driver */
> >>  le16 queue_reset;   /* read-write */
> >>
> >> @@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
> >> layout}\label{sec:Virtio Transport
> >>  \item[\field{queue_device}]
> >>  The driver writes the physical address of Device Area here.  See 
> >> section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
> >>
> >> +\item[\field{freeze_mode}]
> >> +The driver writes this to set the freeze mode of virtio pci.
> >> +VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
> >> +VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
> >> virtio-pci enters S3 suspension;
> >> +Other values are reserved for future use, like S4, etc.
> >> +
> >>  \item[\field{queue_notif_config_data}]
> >>  This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
> >> negotiated.
> >>  The driver will use this value when driver sends available buffer
> >> --
> >> 2.34.1
> >>
> >
> >
> > -
> > To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
> >
>
> --
> Best regards,
> Jiqian Chen.




[PATCH] vfio/pci: rename vfio_put_device to vfio_pci_put_device

2023-09-21 Thread Zhenzhong Duan
vfio_put_device() is a VFIO PCI specific function; rename it with a
'vfio_pci' prefix to avoid confusion.

No functional change.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
---
 hw/vfio/pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 3b2ca3c24ca2..b2d5010b9f0e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2826,7 +2826,7 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, 
Error **errp)
 }
 }
 
-static void vfio_put_device(VFIOPCIDevice *vdev)
+static void vfio_pci_put_device(VFIOPCIDevice *vdev)
 {
 g_free(vdev->vbasedev.name);
 g_free(vdev->msix);
@@ -3317,7 +3317,7 @@ static void vfio_instance_finalize(Object *obj)
  *
  * g_free(vdev->igd_opregion);
  */
-vfio_put_device(vdev);
+vfio_pci_put_device(vdev);
 vfio_put_group(group);
 }
 
-- 
2.34.1




Re: Concerns regarding e17bebd049 ("dump: Set correct vaddr for ELF dump")

2023-09-21 Thread Dave Young
Not sure if crash people subscribed to linux-debuggers, let's add more
cc for awareness about this thread.

On Thu, 21 Sept 2023 at 01:45, Stephen Brennan
 wrote:
>
> Stephen Brennan  writes:
> > Hi Jon,
> >
> > Jon Doron  writes:
> >> Hi Stephen,
> >> Like you have said the reason is as I wrote in the commit message,
> >> without "fixing" the vaddr GDB is messing up mapping and working with
> >> the generated core file.
> >
> > For the record I totally love this workaround :)
> >
> > It's clever and gets the job done and I would have done it in a
> > heartbeat. It's just that it does end up making vmcores that have
> > incorrect data, which is a pain for debuggers that are actually designed
> > to look at kernel core dumps.
> >
> >> This patch is almost 4 years old; perhaps some changes to GDB have been
> >> introduced to resolve this. I have not checked since then.
> >
> > Program Headers:
> >   Type   Offset VirtAddr   PhysAddr
> >  FileSizMemSiz  Flags  Align
> >   NOTE   0x0168 0x 0x
> >  0x1980 0x1980 0x0
> >   LOAD   0x1ae8 0x 0x
> >  0x8000 0x8000 0x0
> >   LOAD   0x80001ae8 0x 0xfffc
> >  0x0004 0x0004 0x0
> >
> > (gdb) info files
> > Local core dump file:
> > `/home/stepbren/repos/test_code/elf/dumpfile', file type 
> > elf64-x86-64.
> > 0x - 0x8000 is load1
> > 0x - 0x0004 is load2
> >
> > $ gdb --version
> > GNU gdb (GDB) Red Hat Enterprise Linux 10.2-10.0.2.el9
> > Copyright (C) 2021 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later 
> > 
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.
> >
> >
> > It doesn't *look like* anything has changed in this version of GDB. But
> > I'm not really certain that GDB is expected to use the physical
> > addresses in the load segments: it's not a kernel debugger.
> >
> > I think hacking the p_vaddr field _is_ the way to get GDB to behave in
> > the way you want: allow you to read physical memory addresses.
> >
> >> As I'm no longer using this feature and have not worked on or tested it
> >> in a long while, I have no obligations to this change, but perhaps
> >> someone else might be using it...
> >
> > I definitely think it's valuable for people to continue being able to
> > use QEMU vmcores generated with paging=off in GDB, even if GDB isn't
> > designed for it. It seems like a useful hack that appeals to the lowest
> > common denominator: most people have GDB and not a purpose-built kernel
> > debugger. But maybe we could point to a program like the below that will
> > tweak the p_paddr field after the fact, in order to appeal to GDB's
> > sensibilities?
>
> And of course I sent the wrong copy of the file. Attached is the program
> I intended to send (which properly handles endianness and sets the vaddr
> as expected).
>




Re: [PATCH v13 6/9] gfxstream + rutabaga: add initial support for gfxstream

2023-09-21 Thread Akihiko Odaki

On 2023/09/22 9:03, Gurchetan Singh wrote:



On Wed, Sep 20, 2023 at 5:05 AM Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk> wrote:


On 20/09/2023 12:42, Akihiko Odaki wrote:

 > On 2023/08/29 9:36, Gurchetan Singh wrote:
 >> This adds initial support for gfxstream and cross-domain.  Both
 >> features rely on virtio-gpu blob resources and context types, which
 >> are also implemented in this patch.
 >>
 >> gfxstream has a long and illustrious history in Android graphics
 >> paravirtualization.  It has been powering graphics in the Android
 >> Studio Emulator for more than a decade, which is the main developer
 >> platform.
 >>
 >> Originally conceived by Jesse Hall, it was first known as
"EmuGL" [a].
 >> The key design characteristic was a 1:1 threading model and
 >> auto-generation, which fit nicely with the OpenGLES spec.  It also
 >> allowed easy layering with ANGLE on the host, which provides the
GLES
implementations on Windows or MacOS environments.
 >>
 >> gfxstream has traditionally been maintained by a single
engineer, and
 >> between 2015 to 2021, the goldfish throne passed to Frank Yang.
 >> Historians often remark this glorious reign ("pax gfxstreama" is the
 >> academic term) was comparable to that of Augustus and both Queen
 >> Elizabeths.  Just to name a few accomplishments in a resplendent
 >> panoply: higher versions of GLES, address space graphics, snapshot
 >> support and CTS compliant Vulkan [b].
 >>
 >> One major drawback was the use of out-of-tree goldfish drivers.
 >> Android engineers didn't know much about DRM/KMS and especially
TTM so
 >> a simple guest to host pipe was conceived.
 >>
 >> Luckily, virtio-gpu 3D started to emerge in 2016 due to the work of
 >> the Mesa/virglrenderer communities.  In 2018, the initial virtio-gpu
 >> port of gfxstream was done by Cuttlefish enthusiast Alistair Delva.
 >> It was a symbol compatible replacement of virglrenderer [c] and
named
 >> "AVDVirglrenderer".  This implementation forms the basis of the
 >> current gfxstream host implementation still in use today.
 >>
 >> cross-domain support follows a similar arc.  Originally conceived by
 >> Wayland aficionado David Reveman and crosvm enjoyer Zach Reizner in
 >> 2018, it initially relied on the downstream "virtio-wl" device.
 >>
 >> In 2020 and 2021, virtio-gpu was extended to include blob resources
 >> and multiple timelines by yours truly, features
gfxstream/cross-domain
 >> both require to function correctly.
 >>
 >> Right now, we stand at the precipice of a truly fantastic
possibility:
 >> the Android Emulator powered by upstream QEMU and upstream Linux
 >> kernel.  gfxstream will then be packaged properfully, and app
 >> developers can even fix gfxstream bugs on their own if they
encounter
 >> them.
 >>
 >> It's been quite the ride, my friends.  Where will gfxstream head
next,
 >> nobody really knows.  I wouldn't be surprised if it's around for
 >> another decade, maintained by a new generation of Android graphics
 >> enthusiasts.
 >>
 >> Technical details:
 >>    - Very simple initial display integration: just used Pixman
 >>    - Largely, 1:1 mapping of virtio-gpu hypercalls to rutabaga
function
 >>  calls
 >>
 >> Next steps for Android VMs:
 >>    - The next step would be improving display integration and UI
interfaces
 >>  with the goal of the QEMU upstream graphics being in an
emulator
 >>  release [d].
 >>
 >> Next steps for Linux VMs for display virtualization:
 >>    - For widespread distribution, someone needs to package
Sommelier or the
 >>  wayland-proxy-virtwl [e] ideally into Debian main. In
addition, newer
 >>  versions of the Linux kernel come with DRM_VIRTIO_GPU_KMS
option,
 >>  which allows disabling KMS hypercalls.  If anyone cares
enough, it'll
 >>  probably be possible to build a custom VM variant that uses
this display
 >>  virtualization strategy.
 >>
 >> [a] https://android-review.googlesource.com/c/platform/development/+/34470
 >> [b] https://android-review.googlesource.com/q/topic:%22vulkan-hostconnection-start%22
 >> [c] https://android-review.googlesource.com/c/device/generic/goldfish-opengl/+/761927
 >> [d] https://developer.android.com/studio/releases/emulator
 >> [e] https://github.com/talex5/wayland-proxy-virtwl


RE: [PATCH v1 13/22] vfio: Add base container

2023-09-21 Thread Duan, Zhenzhong


>-Original Message-
>From: Eric Auger 
>Sent: Friday, September 22, 2023 1:20 AM
>Subject: Re: [PATCH v1 13/22] vfio: Add base container
>
>Hi Zhenzhong,
>On 9/21/23 05:35, Duan, Zhenzhong wrote:
>> Hi Eric,
>>
>>> -Original Message-
>>> From: Eric Auger 
>>> Sent: Thursday, September 21, 2023 1:31 AM
>>> Subject: Re: [PATCH v1 13/22] vfio: Add base container
>>>
>>> Hi Zhenzhong,
>>>
>>> On 9/19/23 19:23, Cédric Le Goater wrote:
 On 8/30/23 12:37, Zhenzhong Duan wrote:
> From: Yi Liu 
>
> Abstract the VFIOContainer to be a base object. It is supposed to be
> embedded by legacy VFIO container and later on, into the new iommufd
> based container.
>
> The base container implements generic code such as code related to
> memory_listener and address space management. The VFIOContainerOps
> implements callbacks that depend on the kernel user space being used.
>
> 'common.c' and vfio device code only manipulates the base container with
> wrapper functions that calls the functions defined in
> VFIOContainerOpsClass.
> Existing 'container.c' code is converted to implement the legacy
> container
> ops functions.
>
> Below is the base container. It's named as VFIOContainer, old
> VFIOContainer
> is replaced with VFIOLegacyContainer.
 Usually, we introduce the new interface solely, port the current models
 on top of the new interface, wire the new models in the current
 implementation and remove the old implementation. Then, we can start
 adding extensions to support other implementations.

 spapr should be taken care of separately, following the principle above.
 With my PPC hat, I would not even read such a massive change, too risky
 for the subsystem. This path will need (much) further splitting to be
 understandable and acceptable.
>>> We might split this patch by
>>> 1) introducing VFIOLegacyContainer encapsulating the base VFIOContainer,
>>> without using the ops in a first place:
>>> common.c would call vfio_container_* with hardcoded legacy
>>> implementation, ie. retrieving the legacy container with container_of.
>>> 2) we would introduce the BE interface without using it.
>>> 3) we would use the new BE interface
>>>
>>> Obviously this needs to be further tried out. If you wish I can try to
>>> split it that way ... Please let me know
>> Sure, thanks for your help, glad that I can cooperate with you to move
>> this series forward.
>> I just updated the branch which rebased to newest upstream for you to pick at
>https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_cdev_v1_rebased
>
>I have spent most of my day reshuffling this single patch into numerous
>ones (16!). This should help the review.
>I was short of time. This compiles, the end code should be identical to
>the original one. Besides this deserves some additional review on your
>end, commit msg tuning, ...
>
>But at least it is a move forward. Feel free to incorporate that in your
>next respin.
>
>Please find that work on the following branch
>
>https://github.com/eauger/qemu/tree/iommufd_cdev_v1_rebased_split

Thanks Eric, you have done such quick and awesome work. Let me study
your changes and integrate them with my other changes. I will get back to
you then.

BRs.
Zhenzhong


Re: [PATCH v11 6/9] gfxstream + rutabaga: add initial support for gfxstream

2023-09-21 Thread Akihiko Odaki

On 2023/09/22 8:44, Gurchetan Singh wrote:



On Tue, Sep 19, 2023 at 3:07 PM Akihiko Odaki wrote:


On 2023/09/20 3:36, Bernhard Beschow wrote:
 >
 >
 > On 15 September 2023 02:38:02 UTC, Gurchetan Singh <gurchetansi...@chromium.org> wrote:
 >> On Thu, Sep 14, 2023 at 12:23 AM Bernhard Beschow <shen...@gmail.com> wrote:
 >>
 >>>
 >>>
 >>> On 14 September 2023 04:38:51 UTC, Gurchetan Singh <gurchetansi...@chromium.org> wrote:
  On Wed, Sep 13, 2023 at 4:58 AM Bernhard Beschow <shen...@gmail.com> wrote:
 
 >
 >
 > On 23 August 2023 01:25:38 UTC, Gurchetan Singh <gurchetansi...@chromium.org> wrote:
 >> This adds initial support for gfxstream and cross-domain.  Both
 >> features rely on virtio-gpu blob resources and context
types, which
 >> are also implemented in this patch.
 >>
 >> gfxstream has a long and illustrious history in Android graphics
 >> paravirtualization.  It has been powering graphics in the
Android
 >> Studio Emulator for more than a decade, which is the main
developer
 >> platform.
 >>
 >> Originally conceived by Jesse Hall, it was first known as
"EmuGL" [a].
 >> The key design characteristic was a 1:1 threading model and
 >> auto-generation, which fit nicely with the OpenGLES spec. 
It also

 >> allowed easy layering with ANGLE on the host, which provides
the GLES
 >> implementations on Windows or MacOS environments.
 >>
 >> gfxstream has traditionally been maintained by a single
engineer, and
 >> between 2015 to 2021, the goldfish throne passed to Frank Yang.
 >> Historians often remark this glorious reign ("pax
gfxstreama" is the
 >> academic term) was comparable to that of Augustus and both Queen
 >> Elizabeths.  Just to name a few accomplishments in a resplendent
 >> panoply: higher versions of GLES, address space graphics,
snapshot
 >> support and CTS compliant Vulkan [b].
 >>
 >> One major drawback was the use of out-of-tree goldfish drivers.
 >> Android engineers didn't know much about DRM/KMS and
especially TTM so
 >> a simple guest to host pipe was conceived.
 >>
 >> Luckily, virtio-gpu 3D started to emerge in 2016 due to the
work of
 >> the Mesa/virglrenderer communities.  In 2018, the initial
virtio-gpu
 >> port of gfxstream was done by Cuttlefish enthusiast Alistair
Delva.
 >> It was a symbol compatible replacement of virglrenderer [c]
and named
 >> "AVDVirglrenderer".  This implementation forms the basis of the
 >> current gfxstream host implementation still in use today.
 >>
 >> cross-domain support follows a similar arc.  Originally
conceived by
 >> Wayland aficionado David Reveman and crosvm enjoyer Zach
Reizner in
 >> 2018, it initially relied on the downstream "virtio-wl" device.
 >>
 >> In 2020 and 2021, virtio-gpu was extended to include blob
resources
 >> and multiple timelines by yours truly, features
gfxstream/cross-domain
 >> both require to function correctly.
 >>
 >> Right now, we stand at the precipice of a truly fantastic
possibility:
 >> the Android Emulator powered by upstream QEMU and upstream Linux
 >> kernel.  gfxstream will then be packaged properfully, and app
 >> developers can even fix gfxstream bugs on their own if they
encounter
 >> them.
 >>
 >> It's been quite the ride, my friends.  Where will gfxstream
head next,
 >> nobody really knows.  I wouldn't be surprised if it's around for
 >> another decade, maintained by a new generation of Android
graphics
 >> enthusiasts.
 >>
 >> Technical details:
 >>   - Very simple initial display integration: just used Pixman
 >>   - Largely, 1:1 mapping of virtio-gpu hypercalls to
rutabaga function
 >>     calls
 >>
 >> Next steps for Android VMs:
 >>   - The next step would be improving display integration and UI
 >>> interfaces
 >>     with the goal of the QEMU upstream graphics being in an
emulator
 >>     release [d].
 >>
 >> Next steps for Linux VMs for display virtualization:
 >>   - For widespread distribution, someone needs to package
Sommelier or
 >>> the
 >>     wayland-proxy-virtwl [e] ideally into Debian main. In
addition,
 >>> newer
 >>     versions of the Linux kernel come with
DRM_VIRTIO_GPU_KMS option,

[PATCH v2] hw/sd/sdhci: Block Size Register bits [14:12] is lost

2023-09-21 Thread Lu Gao
Block Size Register bits [14:12] are the SDMA Buffer Boundary field. It is
missed in the register write, but it is needed for SDMA transfers, e.g. it
is used in sdhci_sdma_transfer_multi_blocks to calculate the boundary_
variables.

Missing this field causes incorrect operation for non-default SDMA Buffer
Boundary settings.

Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller")
Fixes: dfba99f17f ("hw/sdhci: Fix DMA Transfer Block Size field")

Signed-off-by: Lu Gao 
Signed-off-by: Jianxian Wen 
Reviewed-by: Philippe Mathieu-Daudé 
---
v2:
 - Add fixes information and reviewed-by information

 hw/sd/sdhci.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 5564765a9b..40473b0db0 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -321,6 +321,8 @@ static void sdhci_poweron_reset(DeviceState *dev)
 
 static void sdhci_data_transfer(void *opaque);
 
+#define BLOCK_SIZE_MASK (4 * KiB - 1)
+
 static void sdhci_send_command(SDHCIState *s)
 {
 SDRequest request;
@@ -371,7 +373,8 @@ static void sdhci_send_command(SDHCIState *s)
 
 sdhci_update_irq(s);
 
-if (!timeout && s->blksize && (s->cmdreg & SDHC_CMD_DATA_PRESENT)) {
+if (!timeout && (s->blksize & BLOCK_SIZE_MASK) &&
+(s->cmdreg & SDHC_CMD_DATA_PRESENT)) {
 s->data_count = 0;
 sdhci_data_transfer(s);
 }
@@ -406,7 +409,6 @@ static void sdhci_end_transfer(SDHCIState *s)
 /*
  * Programmed i/o data transfer
  */
-#define BLOCK_SIZE_MASK (4 * KiB - 1)
 
 /* Fill host controller's read buffer with BLKSIZE bytes of data from card */
 static void sdhci_read_block_from_card(SDHCIState *s)
@@ -1154,7 +1156,8 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
unsigned size)
 s->sdmasysad = (s->sdmasysad & mask) | value;
 MASKED_WRITE(s->sdmasysad, mask, value);
 /* Writing to last byte of sdmasysad might trigger transfer */
-if (!(mask & 0xFF00) && s->blkcnt && s->blksize &&
+if (!(mask & 0xFF00) && s->blkcnt &&
+(s->blksize & BLOCK_SIZE_MASK) &&
 SDHC_DMA_TYPE(s->hostctl1) == SDHC_CTRL_SDMA) {
 if (s->trnmod & SDHC_TRNS_MULTI) {
 sdhci_sdma_transfer_multi_blocks(s);
@@ -1168,7 +1171,11 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
unsigned size)
 if (!TRANSFERRING_DATA(s->prnsts)) {
 uint16_t blksize = s->blksize;
 
-MASKED_WRITE(s->blksize, mask, extract32(value, 0, 12));
+/*
+ * [14:12] SDMA Buffer Boundary
+ * [11:00] Transfer Block Size
+ */
+MASKED_WRITE(s->blksize, mask, extract32(value, 0, 15));
 MASKED_WRITE(s->blkcnt, mask >> 16, value >> 16);
 
 /* Limit block size to the maximum buffer size */
-- 
2.17.1




Re: [RFC PATCH v2 07/21] i386/pc: Drop pc_machine_kvm_type()

2023-09-21 Thread Xiaoyao Li

On 9/21/2023 4:51 PM, David Hildenbrand wrote:

On 14.09.23 05:51, Xiaoyao Li wrote:

pc_machine_kvm_type() was introduced by commit e21be724eaf5 ("i386/xen:
add pc_machine_kvm_type to initialize XEN_EMULATE mode") to do Xen
specific initialization by utilizing kvm_type method.

commit eeedfe6c6316 ("hw/xen: Simplify emulated Xen platform init")
moves the Xen specific initialization to pc_basic_device_init().

There is no need to keep the PC specific kvm_type() implementation
anymore.


So we'll fallback to kvm_arch_get_default_type(), which simply returns 0.


On the other hand, later patch will implement kvm_type()
method for all x86/i386 machines to support KVM_X86_SW_PROTECTED_VM.



^ I suggest dropping that and merging that patch ahead-of-time as a 
simple cleanup.


I suppose the "that" here means "this patch", right?

If so, I can submit this patch separately.


Reviewed-by: David Hildenbrand 






Re: [RFC PATCH v2 05/21] kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot

2023-09-21 Thread Xiaoyao Li

On 9/21/2023 4:56 PM, David Hildenbrand wrote:

On 14.09.23 05:51, Xiaoyao Li wrote:

From: Chao Peng 

Switch to KVM_SET_USER_MEMORY_REGION2 when supported by KVM.

With KVM_SET_USER_MEMORY_REGION2, QEMU can set up a memory region that
is backed both by hva-based shared memory and gmem-fd-based private
memory.

Signed-off-by: Chao Peng 
Codeveloped-by: Xiaoyao Li 


"Co-developed-by".


I will fix it, thanks!




Re: [RFC PATCH v2 04/21] memory: Introduce memory_region_has_gmem_fd()

2023-09-21 Thread Xiaoyao Li

On 9/21/2023 4:46 PM, David Hildenbrand wrote:

On 14.09.23 05:51, Xiaoyao Li wrote:

Introduce memory_region_has_gmem_fd() to query if the MemoryRegion has
KVM gmem fd allocated.


*probably* best to just squash that into patch #2.


Sure, I will do it.



Re: [RFC PATCH v2 02/21] RAMBlock: Add support of KVM private gmem

2023-09-21 Thread Xiaoyao Li

On 9/21/2023 4:55 PM, David Hildenbrand wrote:

On 14.09.23 05:50, Xiaoyao Li wrote:

From: Chao Peng 

Add KVM gmem support to RAMBlock so both normal hva based memory
and kvm gmem fd based private memory can be associated in one RAMBlock.

Introduce new flag RAM_KVM_GMEM. It calls KVM ioctl to create private
gmem for the RAMBlock when it's set.



But who sets RAM_KVM_GMEM and when? 


The answer is in the next patch. When the `private` property of a memory
backend is set to true, it passes the RAM_KVM_GMEM flag to
memory_region_init_ram_*().


Don't we simply allocate it for all 
RAMBlocks under such special VMs? 


yes, this is the direction after your comments.

I'll try to figure out how to achieve it.


What's the downside of doing that?


As far as I see, for TDX, no downside.




Re: [PATCH v13 6/9] gfxstream + rutabaga: add initial support for gfxstream

2023-09-21 Thread Gurchetan Singh
On Wed, Sep 20, 2023 at 5:05 AM Mark Cave-Ayland <
mark.cave-ayl...@ilande.co.uk> wrote:

> On 20/09/2023 12:42, Akihiko Odaki wrote:
>
> > On 2023/08/29 9:36, Gurchetan Singh wrote:
> >> This adds initial support for gfxstream and cross-domain.  Both
> >> features rely on virtio-gpu blob resources and context types, which
> >> are also implemented in this patch.
> >>
> >> gfxstream has a long and illustrious history in Android graphics
> >> paravirtualization.  It has been powering graphics in the Android
> >> Studio Emulator for more than a decade, which is the main developer
> >> platform.
> >>
> >> Originally conceived by Jesse Hall, it was first known as "EmuGL" [a].
> >> The key design characteristic was a 1:1 threading model and
> >> auto-generation, which fit nicely with the OpenGLES spec.  It also
> >> allowed easy layering with ANGLE on the host, which provides the GLES
> >> implementations on Windows or MacOS environments.
> >>
> >> gfxstream has traditionally been maintained by a single engineer, and
> >> between 2015 to 2021, the goldfish throne passed to Frank Yang.
> >> Historians often remark this glorious reign ("pax gfxstreama" is the
> >> academic term) was comparable to that of Augustus and both Queen
> >> Elizabeths.  Just to name a few accomplishments in a resplendent
> >> panoply: higher versions of GLES, address space graphics, snapshot
> >> support and CTS compliant Vulkan [b].
> >>
> >> One major drawback was the use of out-of-tree goldfish drivers.
> >> Android engineers didn't know much about DRM/KMS and especially TTM so
> >> a simple guest to host pipe was conceived.
> >>
> >> Luckily, virtio-gpu 3D started to emerge in 2016 due to the work of
> >> the Mesa/virglrenderer communities.  In 2018, the initial virtio-gpu
> >> port of gfxstream was done by Cuttlefish enthusiast Alistair Delva.
> >> It was a symbol compatible replacement of virglrenderer [c] and named
> >> "AVDVirglrenderer".  This implementation forms the basis of the
> >> current gfxstream host implementation still in use today.
> >>
> >> cross-domain support follows a similar arc.  Originally conceived by
> >> Wayland aficionado David Reveman and crosvm enjoyer Zach Reizner in
> >> 2018, it initially relied on the downstream "virtio-wl" device.
> >>
> >> In 2020 and 2021, virtio-gpu was extended to include blob resources
> >> and multiple timelines by yours truly, features gfxstream/cross-domain
> >> both require to function correctly.
> >>
> >> Right now, we stand at the precipice of a truly fantastic possibility:
> >> the Android Emulator powered by upstream QEMU and upstream Linux
> >> kernel.  gfxstream will then be packaged properfully, and app
> >> developers can even fix gfxstream bugs on their own if they encounter
> >> them.
> >>
> >> It's been quite the ride, my friends.  Where will gfxstream head next,
> >> nobody really knows.  I wouldn't be surprised if it's around for
> >> another decade, maintained by a new generation of Android graphics
> >> enthusiasts.
> >>
> >> Technical details:
> >>- Very simple initial display integration: just used Pixman
> >>- Largely, 1:1 mapping of virtio-gpu hypercalls to rutabaga function
> >>  calls
> >>
> >> Next steps for Android VMs:
> >>- The next step would be improving display integration and UI
> interfaces
> >>  with the goal of the QEMU upstream graphics being in an emulator
> >>  release [d].
> >>
> >> Next steps for Linux VMs for display virtualization:
> >>- For widespread distribution, someone needs to package Sommelier or
> the
> >>  wayland-proxy-virtwl [e] ideally into Debian main. In addition,
> newer
> >>  versions of the Linux kernel come with DRM_VIRTIO_GPU_KMS option,
> >>  which allows disabling KMS hypercalls.  If anyone cares enough,
> it'll
> >>  probably be possible to build a custom VM variant that uses this
> display
> >>  virtualization strategy.
> >>
> >> [a]
> https://android-review.googlesource.com/c/platform/development/+/34470
> >> [b]
> https://android-review.googlesource.com/q/topic:%22vulkan-hostconnection-start%22
> >> [c]
> https://android-review.googlesource.com/c/device/generic/goldfish-opengl/+/761927
> >> [d] https://developer.android.com/studio/releases/emulator
> >> [e] https://github.com/talex5/wayland-proxy-virtwl
> >>
> >> Signed-off-by: Gurchetan Singh 
> >> Tested-by: Alyssa Ross 
> >> Tested-by: Emmanouil Pitsidianakis 
> >> Tested-by: Akihiko Odaki 
> >> Reviewed-by: Emmanouil Pitsidianakis 
> >> Reviewed-by: Antonio Caggiano 
> >> Reviewed-by: Akihiko Odaki 
> >> ---
> >>   hw/display/virtio-gpu-pci-rutabaga.c |   47 ++
> >>   hw/display/virtio-gpu-rutabaga.c | 1119 ++
> >>   hw/display/virtio-vga-rutabaga.c |   50 ++
> >>   3 files changed, 1216 insertions(+)
> >>   create mode 100644 hw/display/virtio-gpu-pci-rutabaga.c
> >>   create mode 100644 hw/display/virtio-gpu-rutabaga.c
> >>   create mode 100644 hw/display/virtio-vga-rutabaga.c
>

Re: [PATCH v23 01/20] CPU topology: extend with s390 specifics

2023-09-21 Thread Nina Schoetterl-Glausch
On Wed, 2023-09-20 at 12:57 +0200, Markus Armbruster wrote:
> Nina Schoetterl-Glausch  writes:
> 
> > On Tue, 2023-09-19 at 14:47 +0200, Markus Armbruster wrote:
> > > Nina Schoetterl-Glausch  writes:
> > > 
> > > > From: Pierre Morel 
> > > > 
> > > > S390 adds two new SMP levels, drawers and books to the CPU
> > > > topology.
> > > > S390 CPUs have specific topology features like dedication and
> > > > entitlement. These indicate to the guest information on host
> > > > vCPU scheduling and help the guest make better scheduling decisions.
> > > > 
> > > > Let us provide the SMP properties with books and drawers levels
> > > > and S390 CPU with dedication and entitlement,
> > > > 
> > > > Signed-off-by: Pierre Morel 
> > > > Reviewed-by: Nina Schoetterl-Glausch 
> > > > Co-developed-by: Nina Schoetterl-Glausch 
> > > > Signed-off-by: Nina Schoetterl-Glausch 
> > > > ---
> > > >  qapi/machine-common.json| 21 +
> > > >  qapi/machine.json   | 19 ++--
> > > >  include/hw/boards.h | 10 +-
> > > >  include/hw/qdev-properties-system.h |  4 +++
> > > >  target/s390x/cpu.h  |  6 
> > > >  hw/core/machine-smp.c   | 48 -
> > > >  hw/core/machine.c   |  4 +++
> > > >  hw/core/qdev-properties-system.c| 13 
> > > >  hw/s390x/s390-virtio-ccw.c  |  4 +++
> > > >  softmmu/vl.c|  6 
> > > >  target/s390x/cpu.c  |  7 +
> > > >  qapi/meson.build|  1 +
> > > >  qemu-options.hx |  7 +++--
> > > >  13 files changed, 137 insertions(+), 13 deletions(-)
> > > >  create mode 100644 qapi/machine-common.json
> > > > 
> > > > diff --git a/qapi/machine-common.json b/qapi/machine-common.json
> > > > new file mode 100644
> > > > index 00..e40421bb37
> > > > --- /dev/null
> > > > +++ b/qapi/machine-common.json
> > > 
> > > Why do you need a separate QAPI sub-module?
> > 
> > See here 
> > https://lore.kernel.org/qemu-devel/d8da6f7d1e3addcb63614f548ed77ac1b8895e63.ca...@linux.ibm.com/
> 
> Quote:
> 
> CpuS390Entitlement would be useful in both machine.json and 
> machine-target.json
> 
> This is not obvious from this patch.  I figure this patch could add it
> to machine.json just fine.  The use in machine-target.json in appears
> only in PATCH 08.

Want me to add the rationale to the commit message?

> 
> because query-cpu-fast is defined in machine.json and set-cpu-topology is 
> defined
> in machine-target.json.
> 
> So then the question is where best to define CpuS390Entitlement.
> In machine.json and include machine.json in machine-target.json?
> Or define it in another file and include it from both?
> 
> You do the latter in this patch.
> 
> I figure the former would be tolerable, too.
> 
> That said, having target-specific stuff in machine.json feels... odd.
> Before this series, we have CpuInfoS390 and CpuS390State there, for
> query-cpus-fast.  That command returns a list of objects where common
> members are target-independent, and the variant members are
> target-dependent.  qmp_query_cpus_fast() uses a CPU method to populate
> the target-dependent members.
> 
> I'm not sure splitting query-cpus-fast into a target-dependent and a
> target-independent part is worth the bother.
> 
> In this patch, you work with the structure you found.  Can't fault you
> for that :)
> 
> > > > @@ -0,0 +1,21 @@
> > > > +# -*- Mode: Python -*-
> > > > +# vim: filetype=python
> > > > +#
> > > > +# This work is licensed under the terms of the GNU GPL, version 2 or 
> > > > later.
> > > > +# See the COPYING file in the top-level directory.
> > > > +
> > > > +##
> > > > +# = Machines S390 data types
> > > > +##
> > > > +
> > > > +##
> > > > +# @CpuS390Entitlement:
> > > > +#
> > > > +# An enumeration of cpu entitlements that can be assumed by a virtual
> > > > +# S390 CPU
> > > > +#
> > > > +# Since: 8.2
> > > > +##
> > > > +{ 'enum': 'CpuS390Entitlement',
> > > > +  'prefix': 'S390_CPU_ENTITLEMENT',
> > > > +  'data': [ 'auto', 'low', 'medium', 'high' ] }
> > > > diff --git a/qapi/machine.json b/qapi/machine.json
> > > > index a08b6576ca..a63cb951d2 100644
> > > > --- a/qapi/machine.json
> > > > +++ b/qapi/machine.json
> > > > @@ -9,6 +9,7 @@
> > > >  ##
> > > >  # = Machines
> > > >  ##
> > > >  
> > > >  { 'include': 'common.json' }
> > > > +{ 'include': 'machine-common.json' }
> > > 
> > > Section structure is borked :)
> > > 
> > > Existing section "Machine" now ends at the new "Machines S390 data
> > > types" you pull in here.  The contents of below moves from "Machines" to
> > > "Machines S390 data types".
> > > 
> > > Before I explain how to avoid this, I'd like to understand why we need a
> > > new sub-module.
> > > 
> > > >  
> > > >  ##
> > > >  # @SysEmuTarget:
> > > > @@ -71,7 +72,7 @@
> > > >  ##
> > > >  # @CpuInfoFast:
> > > >  #
> > > >  # Information about a v

Re: [PATCH 0/9] Replace remaining target_ulong in system-mode accel

2023-09-21 Thread Michael Tokarev

07.08.2023 18:56, Anton Johansson via wrote:

This patchset replaces the remaining uses of target_ulong in the accel/
directory.  Specifically, the address type of a few kvm/hvf functions
is widened to vaddr, and the address type of the cpu_[st|ld]*()
functions is changed to abi_ptr (which is re-typedef'd to vaddr in
system mode).

As a starting point, my goal is to be able to build cputlb.c once for
system mode, and this is a step in that direction by reducing the
target-dependence of accel/.

* Changes in v2:
 - Removed explicit target_ulong casts from 3rd and 4th patches.

Anton Johansson (9):
   accel/kvm: Widen pc/saved_insn for kvm_sw_breakpoint
   accel/hvf: Widen pc/saved_insn for hvf_sw_breakpoint
   target: Use vaddr for kvm_arch_[insert|remove]_hw_breakpoint
   target: Use vaddr for hvf_arch_[insert|remove]_hw_breakpoint
   Replace target_ulong with abi_ptr in cpu_[st|ld]*()
   include/exec: typedef abi_ptr to vaddr in softmmu
   include/exec: Widen tlb_hit/tlb_hit_page()
   accel/tcg: Widen address arg. in tlb_compare_set()
   accel/tcg: Update run_on_cpu_data static assert


Pinging a relatively old patchset: which fixes from here need to
go to stable-8.1?

The context: 
https://lore.kernel.org/qemu-devel/20230721205827.7502-1-a...@rev.ng/
And according to this email:

https://lore.kernel.org/qemu-devel/00e9e08eae1004ef67fe8dca3aaf5043e6863faa.ca...@gmail.com/

at least "include/exec: Widen tlb_hit/tlb_hit_page()" should go to 8.1;
anything else?

Thanks,

/mjt



[PATCH] target/arm: Implement FEAT_HPMN0

2023-09-21 Thread Peter Maydell
FEAT_HPMN0 is a small feature which defines that it is valid for
MDCR_EL2.HPMN to be set to 0, meaning "no PMU event counters provided
to an EL1 guest" (previously this setting was reserved). QEMU's
implementation almost gets HPMN == 0 right, but we need to fix
one check in pmevcntr_is_64_bit(). That is enough for us to
advertise the feature in the 'max' CPU.

(We don't need to make the behaviour conditional on feature
presence, because the FEAT_HPMN0 behaviour is within the range
of permitted UNPREDICTABLE behaviour for a non-FEAT_HPMN0
implementation.)

Signed-off-by: Peter Maydell 
---
 docs/system/arm/emulation.rst | 1 +
 target/arm/helper.c   | 2 +-
 target/arm/tcg/cpu32.c| 4 
 target/arm/tcg/cpu64.c| 1 +
 4 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 3df936fc356..b19ea198c24 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -45,6 +45,7 @@ the following architecture extensions:
 - FEAT_HCX (Support for the HCRX_EL2 register)
 - FEAT_HPDS (Hierarchical permission disables)
 - FEAT_HPDS2 (Translation table page-based hardware attributes)
+- FEAT_HPMN0 (Setting of MDCR_EL2.HPMN to zero)
 - FEAT_I8MM (AArch64 Int8 matrix multiplication instructions)
 - FEAT_IDST (ID space trap handling)
 - FEAT_IESB (Implicit error synchronization event)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3b22596eabf..ea3e5c6fd0f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1283,7 +1283,7 @@ static bool pmevcntr_is_64_bit(CPUARMState *env, int counter)
 bool hlp = env->cp15.mdcr_el2 & MDCR_HLP;
 int hpmn = env->cp15.mdcr_el2 & MDCR_HPMN;
 
-if (hpmn != 0 && counter >= hpmn) {
+if (counter >= hpmn) {
 return hlp;
 }
 }
diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
index 1f918ff5375..0d5d8e307dd 100644
--- a/target/arm/tcg/cpu32.c
+++ b/target/arm/tcg/cpu32.c
@@ -89,6 +89,10 @@ void aa32_max_features(ARMCPU *cpu)
 t = FIELD_DP32(t, ID_DFR0, COPSDBG, 9);   /* FEAT_Debugv8p4 */
 t = FIELD_DP32(t, ID_DFR0, PERFMON, 6);   /* FEAT_PMUv3p5 */
 cpu->isar.id_dfr0 = t;
+
+t = cpu->isar.id_dfr1;
+t = FIELD_DP32(t, ID_DFR1, HPMN0, 1); /* FEAT_HPMN0 */
+cpu->isar.id_dfr1 = t;
 }
 
 /* CPU models. These are not needed for the AArch64 linux-user build. */
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 7264ab5ead1..ee369f10db6 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1104,6 +1104,7 @@ void aarch64_max_tcg_initfn(Object *obj)
 t = cpu->isar.id_aa64dfr0;
 t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 9);  /* FEAT_Debugv8p4 */
 t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 6);/* FEAT_PMUv3p5 */
+t = FIELD_DP64(t, ID_AA64DFR0, HPMN0, 1); /* FEAT_HPMN0 */
 cpu->isar.id_aa64dfr0 = t;
 
 t = cpu->isar.id_aa64smfr0;
-- 
2.34.1




Re: [PATCH 0/5] file-posix: Clean up and fix zoned checks

2023-09-21 Thread Michael Tokarev

21.09.2023 21:21, Michael Tokarev wrote:
..

Is this stable-worthy (at least 1-3)?  From the bug description it smells
like it should be in 8.1.x, or maybe whole series.


N/M, this whole patchset has been Cc'd qemu-stable already.

Thanks,

/mjt




Re: [PATCH] accel/tcg: mttcg remove false-negative halted assertion

2023-09-21 Thread Michael Tokarev

29.08.2023 04:06, Nicholas Piggin wrote:

mttcg asserts that an execution ending with EXCP_HALTED must have
cpu->halted. However between the event or instruction that sets
cpu->halted and requests exit and the assertion here, an
asynchronous event could clear cpu->halted.

This leads to crashes running AIX on ppc/pseries because it uses
H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and
H_PROD sets other cpu->halted = 0 and kicks it.

H_PROD could be turned into an interrupt to wake, but several other
places in ppc, sparc, and semihosting follow what looks like a similar
pattern setting halted = 0 directly. So remove this assertion.

Reported-by: Ivan Warren 
Signed-off-by: Nicholas Piggin 


This one also smells like a stable material, is it not?

Thanks,

/mjt


diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index b276262007..d0b6f288d9 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -98,17 +98,6 @@ static void *mttcg_cpu_thread_fn(void *arg)
  case EXCP_DEBUG:
  cpu_handle_guest_debug(cpu);
  break;
-case EXCP_HALTED:
-/*
- * during start-up the vCPU is reset and the thread is
- * kicked several times. If we don't ensure we go back
- * to sleep in the halted state we won't cleanly
- * start-up when the vCPU is enabled.
- *
- * cpu->halted should ensure we sleep in wait_io_event
- */
-g_assert(cpu->halted);
-break;
  case EXCP_ATOMIC:
  qemu_mutex_unlock_iothread();
  cpu_exec_step_atomic(cpu);





Re: [PATCH 0/5] file-posix: Clean up and fix zoned checks

2023-09-21 Thread Michael Tokarev

24.08.2023 18:53, Hanna Czenczek wrote:

Hi,

As presented in [1] there is a bug in the zone code in raw_co_prw(),
specifically we don’t check whether there actually is zone information
before running code that assumes there is (and thus we run into a
division by zero).  This has now also been reported in [2].

I believe the solution [1] is incomplete, though, which is why I’m
sending this separate series: I don’t think checking bs->wps and/or
bs->bl.zone_size to determine whether there is zone information is
right; for example, raw_refresh_zoned_limits() does not clear
those values if zone information were to disappear on a refresh.

It is also weird that we separate checking bs->wps and bs->bl.zone_size
at all; raw_refresh_zoned_limits() seems to intend to ensure that either
we have information with non-NULL bs->wps and non-zero bs->bl.zone_size,
or we don’t.

I think we should have a single flag that tells whether we have valid
information or not, and it looks to me like bs->bl.zoned != BLK_Z_NONE
is the condition that fits best.

Patch 1 ensures that raw_refresh_zoned_limits() will set bs->bl.zoned to
BLK_Z_NONE on error, so that we can actually be sure that this condition
tells us whether we have valid information or not.

Patch 2 unifies all conditional checks for zone information to use
bs->bl.zoned != BLK_Z_NONE.

Patch 3 is the I/O error path fix, which is not really different from
[1].

Patch 4 does a bit of clean-up.

Patch 5 adds a regression test.


[1] https://lists.nongnu.org/archive/html/qemu-devel/2023-06/msg01742.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2234374


Is this stable-worthy (at least 1-3)?  From the bug description it smells
like it should be in 8.1.x, or maybe whole series.

Thanks,

/mjt



Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest

2023-09-21 Thread Yazen Ghannam
On 9/20/23 7:13 AM, Joao Martins wrote:
> On 18/09/2023 23:00, William Roche wrote:
>> Hi John,
>>
>> I'd like to emphasize that ignoring the SRAO error
>> for a VM is a real problem at least for a specific (rare) case I'm
>> currently working on: The VM migration.
>>
>> Context:
>>
>> - In the case of a poisoned page in the VM address space, the migration
>> can't read it and will skip this page, considering it as a zero-filled
>> page. The VM kernel (that handled the vMCE) would have marked its
>> associated page as poisoned, and if the VM touches the page, the VM
>> kernel generates the associated MCE because it already knows about the
>> poisoned page.
>>
>> - When we ignore the vMCE in the case of a SIGBUS/BUS_MCEERR_AO error
>> (what this patch does), we entirely rely on the Hypervisor to send an
>> SRAR error to qemu when the page is touched: The AMD VM kernel will
>> receive the SIGBUS/BUS_MCEERR_AR and deal with it, thanks to your
>> changes here.
>>
>> So it looks like the mechanism works fine... unless the VM has migrated
>> between the SRAO error and the first time it really touches the poisoned
>> page to get an SRAR error !  In this case, its new address space
>> (created on the migration destination) will have a zero-page where we
>> had a poisoned page, and the AMD VM kernel (that never dealt with the
>> SRAO) doesn't know about the poisoned page and will access the page
>> finding only zeros... We have a memory corruption!

I don't understand this. Why would the page be zero? Even so, why would
that affect poison?

Also, during page migration, does the data flow through the CPU core?
Sorry for the basic question. I haven't done a lot with virtualization.

Please note that current AMD systems use an internal poison marker on
memory. This cannot be cleared through normal memory operations. The
only exception, I think, is to use the CLZERO instruction. This will
completely wipe a cacheline including metadata like poison, etc.

So the hardware should not (by design) lose track of poisoned data.

>>
>> It is a very rare window, but in order to fix it the most reasonable
>> course of action would be to make the AMD emulation deal with SRAO
>> errors, instead of ignoring them.
>>
>> Do you agree with my analysis?
> 
> Under the case that SRAO aren't handled well in the kernel today[*] for AMD, 
> we
> could always add a migration blocker when we hit AO sigbus, in case ignoring
> is our only option. But this would be less than ideal to propagating the
> SRAO into the guest.
> 
> [*] Meaning knowing that handling the SRAO would generate a crash in the guest
> 
> Perhaps as an improvement, perhaps allow qemu to choose to propagate should 
> this
> limitation be lifted via a new -action value and allow it to ignore/propagate 
> or
> not e.g.
> 
>  -action mce=none # default on Intel to propagate all MCE events to the guest
>  -action mce=ignore-optional # Ignore SRAO
> 
> I suppose the second is also useful for ARM64 considering they currently 
> ignore
> SRAO events too.
> 
>> Would an AMD platform generate SRAO signal to a process
>> (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
>>
> This would be useful to confirm.
>

There is no SRAO signal on AMD. The closest equivalent may be a
"Deferred" error interrupt. This is an x86 APIC LVT interrupt, and it's
sent when a deferred (uncorrectable non-urgent) error is detected by a
memory controller.

In this case, the CPU will get the interrupt and log the error (in the
host).

An enhancement will be to take the MCA error information collected
during the interrupt and extract useful data. For example, we'll need to
translate the reported address to a system physical address that can be
mapped to a page.

Once we have the page, then we can decide how we want to signal the
process(es). We could get a deferred/AO error in the host, and signal the
guest with an AR. So the guest handling could be the same in both cases.

Would this be okay? Or is it important that the guest can distinguish
between the AO/AR cases? IOW, will guests have their own policies on
when to take action? Or is it more about allowing the guest to handle
the error less urgently?

Thanks,
Yazen



[PULL 12/30] target/arm: Implement FEAT_MOPS enable bits

2023-09-21 Thread Peter Maydell
FEAT_MOPS defines a handful of new enable bits:
 * HCRX_EL2.MSCEn, SCTLR_EL1.MSCEn, SCTLR_EL2.MSCen:
   define whether the new insns should UNDEF or not
 * HCRX_EL2.MCE2: defines whether memops exceptions from
   EL1 should be taken to EL1 or EL2

Since we don't sanitise what bits can be written for the SCTLR
registers, we only need to handle the new bits in HCRX_EL2, and
define SCTLR_MSCEN for the new SCTLR bit value.

The precedence of "HCRX bits acts as 0 if SCR_EL3.HXEn is 0" versus
"bit acts as 1 if EL2 disabled" is not clear from the register
definition text, but it is clear in the CheckMOPSEnabled()
pseudocode(), so we follow that.  We'll have to check whether other
bits we need to implement in future follow the same logic or not.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-3-peter.mayd...@linaro.org
---
 target/arm/cpu.h|  6 ++
 target/arm/helper.c | 28 +---
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bc7a69a8753..266c1a9ea1b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1315,6 +1315,7 @@ void pmu_init(ARMCPU *cpu);
 #define SCTLR_EnIB(1U << 30) /* v8.3, AArch64 only */
 #define SCTLR_EnIA(1U << 31) /* v8.3, AArch64 only */
 #define SCTLR_DSSBS_32 (1U << 31) /* v8.5, AArch32 only */
+#define SCTLR_MSCEN   (1ULL << 33) /* FEAT_MOPS */
 #define SCTLR_BT0 (1ULL << 35) /* v8.5-BTI */
 #define SCTLR_BT1 (1ULL << 36) /* v8.5-BTI */
 #define SCTLR_ITFSB   (1ULL << 37) /* v8.5-MemTag */
@@ -4281,6 +4282,11 @@ static inline bool isar_feature_aa64_doublelock(const ARMISARegisters *id)
 return FIELD_SEX64(id->id_aa64dfr0, ID_AA64DFR0, DOUBLELOCK) >= 0;
 }
 
+static inline bool isar_feature_aa64_mops(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, MOPS);
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 594985d7c8c..83620787b45 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5980,7 +5980,10 @@ static void hcrx_write(CPUARMState *env, const ARMCPRegInfo *ri,
 {
 uint64_t valid_mask = 0;
 
-/* No features adding bits to HCRX are implemented. */
+/* FEAT_MOPS adds MSCEn and MCE2 */
+if (cpu_isar_feature(aa64_mops, env_archcpu(env))) {
+valid_mask |= HCRX_MSCEN | HCRX_MCE2;
+}
 
 /* Clear RES0 bits.  */
 env->cp15.hcrx_el2 = value & valid_mask;
@@ -6009,13 +6012,24 @@ uint64_t arm_hcrx_el2_eff(CPUARMState *env)
 {
 /*
  * The bits in this register behave as 0 for all purposes other than
- * direct reads of the register if:
- *   - EL2 is not enabled in the current security state,
- *   - SCR_EL3.HXEn is 0.
+ * direct reads of the register if SCR_EL3.HXEn is 0.
+ * If EL2 is not enabled in the current security state, then the
+ * bit may behave as if 0, or as if 1, depending on the bit.
+ * For the moment, we treat the EL2-disabled case as taking
+ * priority over the HXEn-disabled case. This is true for the only
+ * bit for a feature which we implement where the answer is different
+ * for the two cases (MSCEn for FEAT_MOPS).
+ * This may need to be revisited for future bits.
  */
-if (!arm_is_el2_enabled(env)
-|| (arm_feature(env, ARM_FEATURE_EL3)
-&& !(env->cp15.scr_el3 & SCR_HXEN))) {
+if (!arm_is_el2_enabled(env)) {
+uint64_t hcrx = 0;
+if (cpu_isar_feature(aa64_mops, env_archcpu(env))) {
+/* MSCEn behaves as 1 if EL2 is not enabled */
+hcrx |= HCRX_MSCEN;
+}
+return hcrx;
+}
+if (arm_feature(env, ARM_FEATURE_EL3) && !(env->cp15.scr_el3 & SCR_HXEN)) {
 return 0;
 }
 return env->cp15.hcrx_el2;
-- 
2.34.1




[PULL 08/30] target/arm: Update user-mode ID reg mask values

2023-09-21 Thread Peter Maydell
For user-only mode we reveal a subset of the AArch64 ID registers
to the guest, to emulate the kernel's trap-and-emulate-ID-regs
handling. Update the feature bit masks to match upstream kernel
commit a48fa7efaf1161c1c.

None of these features are yet implemented by QEMU, so this
doesn't yet have a behavioural change, but implementation of
FEAT_MOPS and FEAT_HBC is imminent.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target/arm/helper.c | 11 ++-
 tests/tcg/aarch64/sysregs.c |  4 ++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3b22596eabf..594985d7c8c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8621,11 +8621,16 @@ void register_cp_regs_for_features(ARMCPU *cpu)
R_ID_AA64ZFR0_F64MM_MASK },
 { .name = "ID_AA64SMFR0_EL1",
   .exported_bits = R_ID_AA64SMFR0_F32F32_MASK |
+   R_ID_AA64SMFR0_BI32I32_MASK |
R_ID_AA64SMFR0_B16F32_MASK |
R_ID_AA64SMFR0_F16F32_MASK |
R_ID_AA64SMFR0_I8I32_MASK |
+   R_ID_AA64SMFR0_F16F16_MASK |
+   R_ID_AA64SMFR0_B16B16_MASK |
+   R_ID_AA64SMFR0_I16I32_MASK |
R_ID_AA64SMFR0_F64F64_MASK |
R_ID_AA64SMFR0_I16I64_MASK |
+   R_ID_AA64SMFR0_SMEVER_MASK |
R_ID_AA64SMFR0_FA64_MASK },
 { .name = "ID_AA64MMFR0_EL1",
   .exported_bits = R_ID_AA64MMFR0_ECV_MASK,
@@ -8676,7 +8681,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .exported_bits = R_ID_AA64ISAR2_WFXT_MASK |
R_ID_AA64ISAR2_RPRES_MASK |
R_ID_AA64ISAR2_GPA3_MASK |
-   R_ID_AA64ISAR2_APA3_MASK },
+   R_ID_AA64ISAR2_APA3_MASK |
+   R_ID_AA64ISAR2_MOPS_MASK |
+   R_ID_AA64ISAR2_BC_MASK |
+   R_ID_AA64ISAR2_RPRFM_MASK |
+   R_ID_AA64ISAR2_CSSC_MASK },
 { .name = "ID_AA64ISAR*_EL1_RESERVED",
   .is_glob = true },
 };
diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index d8eb06abcf2..f7a055f1d5f 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -126,7 +126,7 @@ int main(void)
  */
get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,ffff,f0ff,fff0));
get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,ffff,ffff));
-get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(0000,0000,0000,ffff));
+get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(00ff,0000,00ff,ffff));
 /* TGran4 & TGran64 as pegged to -1 */
get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,0000,ff00,0000));
get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(0000,f000,0000,0000));
@@ -138,7 +138,7 @@ int main(void)
get_cpu_reg_check_mask(id_aa64dfr0_el1,  _m(0000,0000,0000,0006));
 get_cpu_reg_check_zero(id_aa64dfr1_el1);
 get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1,  _m(0ff0,ff0f,00ff,00ff));
-get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000));
+get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(8ff1,fcff,0000,0000));
 
 get_cpu_reg_check_zero(id_aa64afr0_el1);
 get_cpu_reg_check_zero(id_aa64afr1_el1);
-- 
2.34.1




[PULL 04/30] linux-user/elfload.c: Correct SME feature names reported in cpuinfo

2023-09-21 Thread Peter Maydell
Some of the names we use for CPU features in linux-user's dummy
/proc/cpuinfo don't match the strings in the real kernel in
arch/arm64/kernel/cpuinfo.c. Specifically, the SME related
features have an underscore in the HWCAP_FOO define name,
but (like the SVE ones) they do not have an underscore in the
string in cpuinfo. Correct the errors.

Fixes: a55b9e7226708 ("linux-user: Emulate /proc/cpuinfo on aarch64 and arm")
Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 linux-user/elfload.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a5b28fa3e7a..5ce009d7137 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -844,13 +844,13 @@ const char *elf_hwcap2_str(uint32_t bit)
 [__builtin_ctz(ARM_HWCAP2_A64_RPRES)] = "rpres",
 [__builtin_ctz(ARM_HWCAP2_A64_MTE3 )] = "mte3",
 [__builtin_ctz(ARM_HWCAP2_A64_SME  )] = "sme",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_I16I64   )] = "sme_i16i64",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_F64F64   )] = "sme_f64f64",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_I8I32)] = "sme_i8i32",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_F16F32   )] = "sme_f16f32",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_B16F32   )] = "sme_b16f32",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_F32F32   )] = "sme_f32f32",
-[__builtin_ctz(ARM_HWCAP2_A64_SME_FA64 )] = "sme_fa64",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_I16I64   )] = "smei16i64",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_F64F64   )] = "smef64f64",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_I8I32)] = "smei8i32",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_F16F32   )] = "smef16f32",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_B16F32   )] = "smeb16f32",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_F32F32   )] = "smef32f32",
+[__builtin_ctz(ARM_HWCAP2_A64_SME_FA64 )] = "smefa64",
 };
 
 return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
-- 
2.34.1




[PULL 14/30] target/arm: Define syndrome function for MOPS exceptions

2023-09-21 Thread Peter Maydell
The FEAT_MOPS memory operations can raise a Memory Copy or Memory Set
exception if a copy or set instruction is executed when the CPU
register state is not correct for that instruction. Define the
usual syn_* function that constructs the syndrome register value
for these exceptions.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-5-peter.mayd...@linaro.org
---
 target/arm/syndrome.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index 8a6b8f8162a..5d34755508d 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -58,6 +58,7 @@ enum arm_exception_class {
 EC_DATAABORT  = 0x24,
 EC_DATAABORT_SAME_EL  = 0x25,
 EC_SPALIGNMENT= 0x26,
+EC_MOP= 0x27,
 EC_AA32_FPTRAP= 0x28,
 EC_AA64_FPTRAP= 0x2c,
 EC_SERROR = 0x2f,
@@ -334,4 +335,15 @@ static inline uint32_t syn_serror(uint32_t extra)
 return (EC_SERROR << ARM_EL_EC_SHIFT) | ARM_EL_IL | extra;
 }
 
+static inline uint32_t syn_mop(bool is_set, bool is_setg, int options,
+   bool epilogue, bool wrong_option, bool option_a,
+   int destreg, int srcreg, int sizereg)
+{
+return (EC_MOP << ARM_EL_EC_SHIFT) | ARM_EL_IL |
+(is_set << 24) | (is_setg << 23) | (options << 19) |
+(epilogue << 18) | (wrong_option << 17) | (option_a << 16) |
+(destreg << 10) | (srcreg << 5) | sizereg;
+}
+
+
 #endif /* TARGET_ARM_SYNDROME_H */
-- 
2.34.1




[PULL 11/30] target/arm: Don't skip MTE checks for LDRT/STRT at EL0

2023-09-21 Thread Peter Maydell
The LDRT/STRT "unprivileged load/store" instructions behave like
normal ones if executed at EL0. We handle this correctly for
the load/store semantics, but get the MTE checking wrong.

We always look at s->mte_active[is_unpriv] to see whether we should
be doing MTE checks, but in hflags.c when we set the TB flags that
will be used to fill the mte_active[] array we only set the
MTE0_ACTIVE bit if UNPRIV is true (i.e.  we are not at EL0).

This means that a LDRT at EL0 will see s->mte_active[1] as 0,
and will not do MTE checks even when MTE is enabled.

To avoid the translate-time code having to do an explicit check on
s->unpriv to see if it is OK to index into the mte_active[] array,
duplicate MTE_ACTIVE into MTE0_ACTIVE when UNPRIV is false.

(This isn't a very serious bug because generally nobody executes
LDRT/STRT at EL0, because they have no use there.)

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-2-peter.mayd...@linaro.org
---
 target/arm/tcg/hflags.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/target/arm/tcg/hflags.c b/target/arm/tcg/hflags.c
index 616c5fa7237..ea642384f5a 100644
--- a/target/arm/tcg/hflags.c
+++ b/target/arm/tcg/hflags.c
@@ -306,6 +306,15 @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, int el, int fp_el,
 && !(env->pstate & PSTATE_TCO)
 && (sctlr & (el == 0 ? SCTLR_TCF0 : SCTLR_TCF))) {
 DP_TBFLAG_A64(flags, MTE_ACTIVE, 1);
+if (!EX_TBFLAG_A64(flags, UNPRIV)) {
+/*
+ * In non-unpriv contexts (eg EL0), unpriv load/stores
+ * act like normal ones; duplicate the MTE info to
+ * avoid translate-a64.c having to check UNPRIV to see
+ * whether it is OK to index into MTE_ACTIVE[].
+ */
+DP_TBFLAG_A64(flags, MTE0_ACTIVE, 1);
+}
 }
 }
 /* And again for unprivileged accesses, if required.  */
-- 
2.34.1




[PULL 03/30] hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel

2023-09-21 Thread Peter Maydell
From: Fabian Vogt 

Just like d7ef5e16a17c sets SCR_EL3.HXEn for FEAT_HCX, this commit
handles SCR_EL3.FGTEn for FEAT_FGT:

When we direct boot a kernel on a CPU which emulates EL3, we need to
set up the EL3 system registers as the Linux kernel documentation
specifies:
https://www.kernel.org/doc/Documentation/arm64/booting.rst

> For CPUs with the Fine Grained Traps (FEAT_FGT) extension present:
> - If EL3 is present and the kernel is entered at EL2:
>   - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Fabian Vogt 
Message-id: 4831384.gxafrqv...@linux-e202.suse.de
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/arm/boot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 720f22531a6..24fa1690600 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -761,6 +761,10 @@ static void do_cpu_reset(void *opaque)
 if (cpu_isar_feature(aa64_hcx, cpu)) {
 env->cp15.scr_el3 |= SCR_HXEN;
 }
+if (cpu_isar_feature(aa64_fgt, cpu)) {
+env->cp15.scr_el3 |= SCR_FGTEN;
+}
+
 /* AArch64 kernels never boot in secure mode */
 assert(!info->secure_boot);
 /* This hook is only supported for AArch32 currently:
-- 
2.34.1




[PULL 29/30] elf2dmp: use Linux mmap with MAP_NORESERVE when possible

2023-09-21 Thread Peter Maydell
From: Viktor Prutyanov 

GLib's g_mapped_file_new maps the file with PROT_READ|PROT_WRITE and
MAP_PRIVATE. This leads to premature physical memory allocation of the dump
file size on Linux hosts and may fail. On Linux, mapping the file with
MAP_NORESERVE limits the allocation by available memory.

Signed-off-by: Viktor Prutyanov 
Reviewed-by: Akihiko Odaki 
Message-id: 20230915170153.10959-5-vik...@daynix.com
Signed-off-by: Peter Maydell 
---
 contrib/elf2dmp/qemu_elf.h |  2 ++
 contrib/elf2dmp/qemu_elf.c | 68 +++---
 2 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/contrib/elf2dmp/qemu_elf.h b/contrib/elf2dmp/qemu_elf.h
index b2f0d9cbc9b..afa75f10b2d 100644
--- a/contrib/elf2dmp/qemu_elf.h
+++ b/contrib/elf2dmp/qemu_elf.h
@@ -32,7 +32,9 @@ typedef struct QEMUCPUState {
 int is_system(QEMUCPUState *s);
 
 typedef struct QEMU_Elf {
+#ifndef CONFIG_LINUX
 GMappedFile *gmf;
+#endif
 size_t size;
 void *map;
 QEMUCPUState **state;
diff --git a/contrib/elf2dmp/qemu_elf.c b/contrib/elf2dmp/qemu_elf.c
index ebda60dcb8a..de6ad744c6d 100644
--- a/contrib/elf2dmp/qemu_elf.c
+++ b/contrib/elf2dmp/qemu_elf.c
@@ -165,10 +165,40 @@ static bool check_ehdr(QEMU_Elf *qe)
 return true;
 }
 
-int QEMU_Elf_init(QEMU_Elf *qe, const char *filename)
+static int QEMU_Elf_map(QEMU_Elf *qe, const char *filename)
 {
+#ifdef CONFIG_LINUX
+struct stat st;
+int fd;
+
+printf("Using Linux mmap\n");
+
+fd = open(filename, O_RDONLY, 0);
+if (fd == -1) {
+eprintf("Failed to open ELF dump file \'%s\'\n", filename);
+return 1;
+}
+
+if (fstat(fd, &st)) {
+eprintf("Failed to get size of ELF dump file\n");
+close(fd);
+return 1;
+}
+qe->size = st.st_size;
+
+qe->map = mmap(NULL, qe->size, PROT_READ | PROT_WRITE,
+MAP_PRIVATE | MAP_NORESERVE, fd, 0);
+if (qe->map == MAP_FAILED) {
+eprintf("Failed to map ELF file\n");
+close(fd);
+return 1;
+}
+
+close(fd);
+#else
 GError *gerr = NULL;
-int err = 0;
+
+printf("Using GLib mmap\n");
 
 qe->gmf = g_mapped_file_new(filename, TRUE, &gerr);
 if (gerr) {
@@ -179,29 +209,43 @@ int QEMU_Elf_init(QEMU_Elf *qe, const char *filename)
 
 qe->map = g_mapped_file_get_contents(qe->gmf);
 qe->size = g_mapped_file_get_length(qe->gmf);
+#endif
+
+return 0;
+}
+
+static void QEMU_Elf_unmap(QEMU_Elf *qe)
+{
+#ifdef CONFIG_LINUX
+munmap(qe->map, qe->size);
+#else
+g_mapped_file_unref(qe->gmf);
+#endif
+}
+
+int QEMU_Elf_init(QEMU_Elf *qe, const char *filename)
+{
+if (QEMU_Elf_map(qe, filename)) {
+return 1;
+}
 
 if (!check_ehdr(qe)) {
 eprintf("Input file has the wrong format\n");
-err = 1;
-goto out_unmap;
+QEMU_Elf_unmap(qe);
+return 1;
 }
 
 if (init_states(qe)) {
 eprintf("Failed to extract QEMU CPU states\n");
-err = 1;
-goto out_unmap;
+QEMU_Elf_unmap(qe);
+return 1;
 }
 
 return 0;
-
-out_unmap:
-g_mapped_file_unref(qe->gmf);
-
-return err;
 }
 
 void QEMU_Elf_exit(QEMU_Elf *qe)
 {
 exit_states(qe);
-g_mapped_file_unref(qe->gmf);
+QEMU_Elf_unmap(qe);
 }
-- 
2.34.1




[PULL 27/30] elf2dmp: introduce physical block alignment

2023-09-21 Thread Peter Maydell
From: Viktor Prutyanov 

Physical memory ranges may not be aligned to page size in QEMU ELF, but
DMP can only contain page-aligned runs. So, align them.

Signed-off-by: Viktor Prutyanov 
Reviewed-by: Akihiko Odaki 
Message-id: 20230915170153.10959-3-vik...@daynix.com
Signed-off-by: Peter Maydell 
---
 contrib/elf2dmp/addrspace.h |  1 +
 contrib/elf2dmp/addrspace.c | 31 +--
 contrib/elf2dmp/main.c  |  5 +++--
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/contrib/elf2dmp/addrspace.h b/contrib/elf2dmp/addrspace.h
index 00b44c12180..039c70c5b07 100644
--- a/contrib/elf2dmp/addrspace.h
+++ b/contrib/elf2dmp/addrspace.h
@@ -12,6 +12,7 @@
 
 #define ELF2DMP_PAGE_BITS 12
 #define ELF2DMP_PAGE_SIZE (1ULL << ELF2DMP_PAGE_BITS)
+#define ELF2DMP_PAGE_MASK (ELF2DMP_PAGE_SIZE - 1)
 #define ELF2DMP_PFN_MASK (~(ELF2DMP_PAGE_SIZE - 1))
 
 #define INVALID_PA  UINT64_MAX
diff --git a/contrib/elf2dmp/addrspace.c b/contrib/elf2dmp/addrspace.c
index 0b04cba00e5..64b5d680adc 100644
--- a/contrib/elf2dmp/addrspace.c
+++ b/contrib/elf2dmp/addrspace.c
@@ -14,7 +14,7 @@ static struct pa_block *pa_space_find_block(struct pa_space *ps, uint64_t pa)
 
 for (i = 0; i < ps->block_nr; i++) {
 if (ps->block[i].paddr <= pa &&
-pa <= ps->block[i].paddr + ps->block[i].size) {
+pa < ps->block[i].paddr + ps->block[i].size) {
 return ps->block + i;
 }
 }
@@ -33,6 +33,30 @@ static uint8_t *pa_space_resolve(struct pa_space *ps, uint64_t pa)
 return block->addr + (pa - block->paddr);
 }
 
+static void pa_block_align(struct pa_block *b)
+{
+uint64_t low_align = ((b->paddr - 1) | ELF2DMP_PAGE_MASK) + 1 - b->paddr;
+uint64_t high_align = (b->paddr + b->size) & ELF2DMP_PAGE_MASK;
+
+if (low_align == 0 && high_align == 0) {
+return;
+}
+
+if (low_align + high_align < b->size) {
+printf("Block 0x%"PRIx64"+:0x%"PRIx64" will be aligned to "
+"0x%"PRIx64"+:0x%"PRIx64"\n", b->paddr, b->size,
+b->paddr + low_align, b->size - low_align - high_align);
+b->size -= low_align + high_align;
+} else {
+printf("Block 0x%"PRIx64"+:0x%"PRIx64" is too small to align\n",
+b->paddr, b->size);
+b->size = 0;
+}
+
+b->addr += low_align;
+b->paddr += low_align;
+}
+
 int pa_space_create(struct pa_space *ps, QEMU_Elf *qemu_elf)
 {
 Elf64_Half phdr_nr = elf_getphdrnum(qemu_elf->map);
@@ -60,10 +84,13 @@ int pa_space_create(struct pa_space *ps, QEMU_Elf *qemu_elf)
 .paddr = phdr[i].p_paddr,
 .size = phdr[i].p_filesz,
 };
-block_i++;
+pa_block_align(&ps->block[block_i]);
+block_i = ps->block[block_i].size ? (block_i + 1) : block_i;
 }
 }
 
+ps->block_nr = block_i;
+
 return 0;
 }
 
diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index bb6744c0cd6..b7e39301641 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -400,9 +400,10 @@ static int write_dump(struct pa_space *ps,
 for (i = 0; i < ps->block_nr; i++) {
 struct pa_block *b = &ps->block[i];
 
-printf("Writing block #%zu/%zu to file...\n", i, ps->block_nr);
+printf("Writing block #%zu/%zu of %"PRIu64" bytes to file...\n", i,
+ps->block_nr, b->size);
 if (fwrite(b->addr, b->size, 1, dmp_file) != 1) {
-eprintf("Failed to write dump header\n");
+eprintf("Failed to write block\n");
 fclose(dmp_file);
 return 1;
 }
-- 
2.34.1




[PULL 30/30] elf2dmp: rework PDB_STREAM_INDEXES::segments obtaining

2023-09-21 Thread Peter Maydell
From: Viktor Prutyanov 

PDB for the Windows 11 kernel has a slightly different structure compared to
previous versions. Since elf2dmp doesn't use the other fields, copy only the
'segments' field from PDB_STREAM_INDEXES.

Signed-off-by: Viktor Prutyanov 
Reviewed-by: Akihiko Odaki 
Message-id: 20230915170153.10959-6-vik...@daynix.com
Signed-off-by: Peter Maydell 
---
 contrib/elf2dmp/pdb.h |  2 +-
 contrib/elf2dmp/pdb.c | 15 ---
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/contrib/elf2dmp/pdb.h b/contrib/elf2dmp/pdb.h
index 4ea8925ee82..2a50da56ac9 100644
--- a/contrib/elf2dmp/pdb.h
+++ b/contrib/elf2dmp/pdb.h
@@ -227,7 +227,7 @@ struct pdb_reader {
 } ds;
 uint32_t file_used[1024];
 PDB_SYMBOLS *symbols;
-PDB_STREAM_INDEXES sidx;
+uint16_t segments;
 uint8_t *modimage;
 char *segs;
 size_t segs_size;
diff --git a/contrib/elf2dmp/pdb.c b/contrib/elf2dmp/pdb.c
index adcfa7e154c..6ca5086f02e 100644
--- a/contrib/elf2dmp/pdb.c
+++ b/contrib/elf2dmp/pdb.c
@@ -160,7 +160,7 @@ static void *pdb_ds_read_file(struct pdb_reader* r, uint32_t file_number)
 static int pdb_init_segments(struct pdb_reader *r)
 {
 char *segs;
-unsigned stream_idx = r->sidx.segments;
+unsigned stream_idx = r->segments;
 
 segs = pdb_ds_read_file(r, stream_idx);
 if (!segs) {
@@ -177,9 +177,6 @@ static int pdb_init_symbols(struct pdb_reader *r)
 {
 int err = 0;
 PDB_SYMBOLS *symbols;
-PDB_STREAM_INDEXES *sidx = &r->sidx;
-
-memset(sidx, -1, sizeof(*sidx));
 
 symbols = pdb_ds_read_file(r, 3);
 if (!symbols) {
@@ -188,15 +185,11 @@ static int pdb_init_symbols(struct pdb_reader *r)
 
 r->symbols = symbols;
 
-if (symbols->stream_index_size != sizeof(PDB_STREAM_INDEXES)) {
-err = 1;
-goto out_symbols;
-}
-
-memcpy(sidx, (const char *)symbols + sizeof(PDB_SYMBOLS) +
+r->segments = *(uint16_t *)((const char *)symbols + sizeof(PDB_SYMBOLS) +
 symbols->module_size + symbols->offset_size +
 symbols->hash_size + symbols->srcmodule_size +
-symbols->pdbimport_size + symbols->unknown2_size, sizeof(*sidx));
+symbols->pdbimport_size + symbols->unknown2_size +
+offsetof(PDB_STREAM_INDEXES, segments));
 
 /* Read global symbol table */
 r->modimage = pdb_ds_read_file(r, symbols->gsym_file);
-- 
2.34.1




[PULL 24/30] audio/jackaudio: Avoid dynamic stack allocation in qjack_process()

2023-09-21 Thread Peter Maydell
Avoid a dynamic stack allocation in qjack_process().  Since this
function is a JACK process callback, we are not permitted to malloc()
here, so we allocate a working buffer in qjack_client_init() instead.

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).
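
As a rough illustration of the pattern the patch adopts (allocate the workspace once at init time; the realtime callback only uses it), here is a self-contained C sketch with invented names, not the QEMU code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Invented miniature of the pattern: a realtime-safe process callback
 * must not allocate, so the per-channel workspace is allocated once at
 * init time instead of as a VLA inside the callback. */
typedef struct Client {
    int nchannels;
    float **process_buffers;    /* workspace for the callback */
} Client;

static int client_init(Client *c, int nchannels)
{
    c->nchannels = nchannels;
    c->process_buffers = calloc(nchannels, sizeof(float *));
    return c->process_buffers ? 0 : -1;
}

/* Realtime callback: no malloc(), no VLA; it only fills in the
 * pointers in the workspace set up by client_init(). */
static void client_process(Client *c, float **port_data, size_t nframes)
{
    for (int i = 0; i < c->nchannels; i++) {
        c->process_buffers[i] = port_data[i];
        memset(c->process_buffers[i], 0, nframes * sizeof(float));
    }
}

static void client_fini(Client *c)
{
    free(c->process_buffers);
}
```

The matching free() in the teardown path mirrors the g_free() this patch adds to qjack_client_fini_locked().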

Signed-off-by: Peter Maydell 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Francisco Iglesias 
Reviewed-by: Christian Schoenebeck 
Message-id: 20230818155846.1651287-3-peter.mayd...@linaro.org
---
 audio/jackaudio.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/audio/jackaudio.c b/audio/jackaudio.c
index 7cb2a49f971..e1eaa3477dc 100644
--- a/audio/jackaudio.c
+++ b/audio/jackaudio.c
@@ -70,6 +70,9 @@ typedef struct QJackClient {
 int buffersize;
 jack_port_t   **port;
 QJackBuffer fifo;
+
+/* Used as workspace by qjack_process() */
+float **process_buffers;
 }
 QJackClient;
 
@@ -267,22 +270,21 @@ static int qjack_process(jack_nframes_t nframes, void *arg)
 }
 
 /* get the buffers for the ports */
-float *buffers[c->nchannels];
 for (int i = 0; i < c->nchannels; ++i) {
-buffers[i] = jack_port_get_buffer(c->port[i], nframes);
+c->process_buffers[i] = jack_port_get_buffer(c->port[i], nframes);
 }
 
 if (c->out) {
 if (likely(c->enabled)) {
-qjack_buffer_read_l(&c->fifo, buffers, nframes);
+qjack_buffer_read_l(&c->fifo, c->process_buffers, nframes);
 } else {
 for (int i = 0; i < c->nchannels; ++i) {
-memset(buffers[i], 0, nframes * sizeof(float));
+memset(c->process_buffers[i], 0, nframes * sizeof(float));
 }
 }
 } else {
 if (likely(c->enabled)) {
-qjack_buffer_write_l(&c->fifo, buffers, nframes);
+qjack_buffer_write_l(&c->fifo, c->process_buffers, nframes);
 }
 }
 
@@ -448,6 +450,9 @@ static int qjack_client_init(QJackClient *c)
   jack_get_client_name(c->client));
 }
 
+/* Allocate working buffer for process callback */
+c->process_buffers = g_new(float *, c->nchannels);
+
 jack_set_process_callback(c->client, qjack_process , c);
 jack_set_port_registration_callback(c->client, qjack_port_registration, c);
 jack_set_xrun_callback(c->client, qjack_xrun, c);
@@ -579,6 +584,7 @@ static void qjack_client_fini_locked(QJackClient *c)
 
 qjack_buffer_free(&c->fifo);
 g_free(c->port);
+g_free(c->process_buffers);
 
 c->state = QJACK_STATE_DISCONNECTED;
 /* fallthrough */
-- 
2.34.1




[PULL 17/30] target/arm: Implement the SET* instructions

2023-09-21 Thread Peter Maydell
Implement the SET* instructions which collectively implement a
"memset" operation.  These come in a set of three, eg SETP
(prologue), SETM (main), SETE (epilogue), and each of those has
different flavours to indicate whether memory accesses should be
unpriv or non-temporal.

This commit does not include the "memset with tag setting"
SETG* instructions.
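
The prologue/main/epilogue decomposition can be sketched as a plain software memset; this is only an illustration of the three-phase split, not the architectural semantics (the chosen split point, 16-byte alignment here, is an assumption; the architecture lets an implementation choose):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SETP analogue: handle leading bytes up to an alignment boundary */
static void set_prologue(unsigned char **p, size_t *n, int v)
{
    while (*n && ((uintptr_t)*p & 15)) {
        *(*p)++ = (unsigned char)v;
        (*n)--;
    }
}

/* SETM analogue: bulk-set whole 16-byte blocks */
static void set_main(unsigned char **p, size_t *n, int v)
{
    size_t bulk = *n & ~(size_t)15;
    memset(*p, v, bulk);
    *p += bulk;
    *n -= bulk;
}

/* SETE analogue: trailing remainder */
static void set_epilogue(unsigned char *p, size_t n, int v)
{
    memset(p, v, n);
}

/* The three phases together implement one memset */
static void set_all(unsigned char *p, size_t n, int v)
{
    set_prologue(&p, &n, v);
    set_main(&p, &n, v);
    set_epilogue(p, n, v);
}
```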

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-8-peter.mayd...@linaro.org
---
 target/arm/tcg/helper-a64.h|   4 +
 target/arm/tcg/a64.decode  |  16 ++
 target/arm/tcg/helper-a64.c| 344 +
 target/arm/tcg/translate-a64.c |  49 +
 4 files changed, 413 insertions(+)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 57cfd68569e..7ce5d2105ad 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -117,3 +117,7 @@ DEF_HELPER_FLAGS_3(stzgm_tags, TCG_CALL_NO_WG, void, env, i64, i64)
 
 DEF_HELPER_FLAGS_4(unaligned_access, TCG_CALL_NO_WG,
noreturn, env, i64, i32, i32)
+
+DEF_HELPER_3(setp, void, env, i32, i32)
+DEF_HELPER_3(setm, void, env, i32, i32)
+DEF_HELPER_3(sete, void, env, i32, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 71113173020..c2a97328eeb 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -554,3 +554,19 @@ LDGM 11011001 11 1 . 00 . . @ldst_tag_mult p=0 w=0
 STZ2G   11011001 11 1 . 01 . . @ldst_tag p=1 w=1
 STZ2G   11011001 11 1 . 10 . . @ldst_tag p=0 w=0
 STZ2G   11011001 11 1 . 11 . . @ldst_tag p=0 w=1
+
+# Memory operations (memset, memcpy, memmove)
+# Each of these comes in a set of three, eg SETP (prologue), SETM (main),
+# SETE (epilogue), and each of those has different flavours to
+# indicate whether memory accesses should be unpriv or non-temporal.
+# We don't distinguish temporal and non-temporal accesses, but we
+# do need to report it in syndrome register values.
+
+# Memset
+&set rs rn rd unpriv nontemp
+# op2 bit 1 is nontemporal bit
+@set .. . rs:5 .. nontemp:1 unpriv:1 .. rn:5 rd:5 &set
+
+SETP00 011001110 . 00 . . 01 . . @set
+SETM00 011001110 . 01 . . 01 . . @set
+SETE00 011001110 . 10 . . 01 . . @set
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index 0cf56f6dc44..24ae5ecf32e 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -968,3 +968,347 @@ void HELPER(unaligned_access)(CPUARMState *env, uint64_t addr,
 arm_cpu_do_unaligned_access(env_cpu(env), addr, access_type,
 mmu_idx, GETPC());
 }
+
+/* Memory operations (memset, memmove, memcpy) */
+
+/*
+ * Return true if the CPY* and SET* insns can execute; compare
+ * pseudocode CheckMOPSEnabled(), though we refactor it a little.
+ */
+static bool mops_enabled(CPUARMState *env)
+{
+int el = arm_current_el(env);
+
+if (el < 2 &&
+(arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE) &&
+!(arm_hcrx_el2_eff(env) & HCRX_MSCEN)) {
+return false;
+}
+
+if (el == 0) {
+if (!el_is_in_host(env, 0)) {
+return env->cp15.sctlr_el[1] & SCTLR_MSCEN;
+} else {
+return env->cp15.sctlr_el[2] & SCTLR_MSCEN;
+}
+}
+return true;
+}
+
+static void check_mops_enabled(CPUARMState *env, uintptr_t ra)
+{
+if (!mops_enabled(env)) {
+raise_exception_ra(env, EXCP_UDEF, syn_uncategorized(),
+   exception_target_el(env), ra);
+}
+}
+
+/*
+ * Return the target exception level for an exception due
+ * to mismatched arguments in a FEAT_MOPS copy or set.
+ * Compare pseudocode MismatchedCpySetTargetEL()
+ */
+static int mops_mismatch_exception_target_el(CPUARMState *env)
+{
+int el = arm_current_el(env);
+
+if (el > 1) {
+return el;
+}
+if (el == 0 && (arm_hcr_el2_eff(env) & HCR_TGE)) {
+return 2;
+}
+if (el == 1 && (arm_hcrx_el2_eff(env) & HCRX_MCE2)) {
+return 2;
+}
+return 1;
+}
+
+/*
+ * Check whether an M or E instruction was executed with a CF value
+ * indicating the wrong option for this implementation.
+ * Assumes we are always Option A.
+ */
+static void check_mops_wrong_option(CPUARMState *env, uint32_t syndrome,
+uintptr_t ra)
+{
+if (env->CF != 0) {
+syndrome |= 1 << 17; /* Set the wrong-option bit */
+raise_exception_ra(env, EXCP_UDEF, syndrome,
+   mops_mismatch_exception_target_el(env), ra);
+}
+}
+
+/*
+ * Return the maximum number of bytes we can transfer starting at addr
+ * without crossing a page boundary.
+ */
+static uint64_t page_limit(uint64_t addr)
+{
+return TARGET_PAGE_ALIGN(addr + 1) - addr;
+}
+
+/*
+ * Perform

[PULL 25/30] sbsa-ref: add non-secure EL2 virtual timer

2023-09-21 Thread Peter Maydell
From: Marcin Juszkiewicz 

Armv8.1+ cpus have Virtual Host Extension (VHE) which added non-secure
EL2 virtual timer.

This change adds it to fulfil Arm BSA (Base System Architecture)
requirements.

Signed-off-by: Marcin Juszkiewicz 
Message-id: 20230913140610.214893-2-marcin.juszkiew...@linaro.org
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/arm/sbsa-ref.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index bc89eb48062..3c7dfcd6dc5 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -61,6 +61,7 @@
 #define ARCH_TIMER_S_EL1_IRQ   13
 #define ARCH_TIMER_NS_EL1_IRQ  14
 #define ARCH_TIMER_NS_EL2_IRQ  10
+#define ARCH_TIMER_NS_EL2_VIRT_IRQ  12
 
 enum {
 SBSA_FLASH,
@@ -489,6 +490,7 @@ static void create_gic(SBSAMachineState *sms, MemoryRegion *mem)
 [GTIMER_VIRT] = ARCH_TIMER_VIRT_IRQ,
 [GTIMER_HYP]  = ARCH_TIMER_NS_EL2_IRQ,
 [GTIMER_SEC]  = ARCH_TIMER_S_EL1_IRQ,
+[GTIMER_HYPVIRT] = ARCH_TIMER_NS_EL2_VIRT_IRQ,
 };
 
 for (irq = 0; irq < ARRAY_SIZE(timer_irq); irq++) {
-- 
2.34.1




[PULL 20/30] target/arm: Implement MTE tag-checking functions for FEAT_MOPS copies

2023-09-21 Thread Peter Maydell
The FEAT_MOPS memory copy operations need an extra helper routine
for checking for MTE tag checking failures beyond the ones we
already added for memory set operations:
 * mte_mops_probe_rev() does the same job as mte_mops_probe(), but
   it checks tags starting at the provided address and working
   backwards, rather than forwards
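
The "probe backwards from the end" idea can be shown with a simplified, invented stand-in (a flat array of one-byte tags, no granules or MTE details): count how many elements, walking back from the last one, match the expected tag before the first mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: starting at the last element and working
 * backwards, return the number of matching tags before the first
 * mismatch.  A return value < count indicates a failure. */
static size_t probe_rev(const unsigned char *tags, size_t count,
                        unsigned char expected)
{
    size_t n = 0;
    while (n < count && tags[count - 1 - n] == expected) {
        n++;
    }
    return n;
}
```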

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-11-peter.mayd...@linaro.org
---
 target/arm/internals.h  | 17 +++
 target/arm/tcg/mte_helper.c | 99 +
 2 files changed, 116 insertions(+)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 642f77df29b..1dd9182a54a 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1288,6 +1288,23 @@ uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
 uint64_t mte_mops_probe(CPUARMState *env, uint64_t ptr, uint64_t size,
 uint32_t desc);
 
+/**
+ * mte_mops_probe_rev: Check where the next MTE failure is for a FEAT_MOPS
+ * operation going in the reverse direction
+ * @env: CPU env
+ * @ptr: *end* address of memory region (dirty pointer)
+ * @size: length of region (guaranteed not to cross a page boundary)
+ * @desc: MTEDESC descriptor word (0 means no MTE checks)
+ * Returns: the size of the region that can be copied without hitting
+ *  an MTE tag failure
+ *
+ * Note that we assume that the caller has already checked the TBI
+ * and TCMA bits with mte_checks_needed() and an MTE check is definitely
+ * required.
+ */
+uint64_t mte_mops_probe_rev(CPUARMState *env, uint64_t ptr, uint64_t size,
+uint32_t desc);
+
 /**
  * mte_check_fail: Record an MTE tag check failure
  * @env: CPU env
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index 66a80eeb950..2dd7eb3edbf 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -734,6 +734,55 @@ static int checkN(uint8_t *mem, int odd, int cmp, int count)
 return n;
 }
 
+/**
+ * checkNrev:
+ * @tag: tag memory to test
+ * @odd: true to begin testing at tags at odd nibble
+ * @cmp: the tag to compare against
+ * @count: number of tags to test
+ *
+ * Return the number of successful tests.
+ * Thus a return value < @count indicates a failure.
+ *
+ * This is like checkN, but it runs backwards, checking the
+ * tags starting with @tag and then the tags preceding it.
+ * This is needed by the backwards-memory-copying operations.
+ */
+static int checkNrev(uint8_t *mem, int odd, int cmp, int count)
+{
+int n = 0, diff;
+
+/* Replicate the test tag and compare.  */
+cmp *= 0x11;
+diff = *mem-- ^ cmp;
+
+if (!odd) {
+goto start_even;
+}
+
+while (1) {
+/* Test odd tag. */
+if (unlikely((diff) & 0xf0)) {
+break;
+}
+if (++n == count) {
+break;
+}
+
+start_even:
+/* Test even tag. */
+if (unlikely((diff) & 0x0f)) {
+break;
+}
+if (++n == count) {
+break;
+}
+
+diff = *mem-- ^ cmp;
+}
+return n;
+}
+
 /**
  * mte_probe_int() - helper for mte_probe and mte_check
  * @env: CPU environment
@@ -1042,6 +1091,56 @@ uint64_t mte_mops_probe(CPUARMState *env, uint64_t ptr, uint64_t size,
 }
 }
 
+uint64_t mte_mops_probe_rev(CPUARMState *env, uint64_t ptr, uint64_t size,
+uint32_t desc)
+{
+int mmu_idx, tag_count;
+uint64_t ptr_tag, tag_first, tag_last;
+void *mem;
+bool w = FIELD_EX32(desc, MTEDESC, WRITE);
+uint32_t n;
+
+mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+/* True probe; this will never fault */
+mem = allocation_tag_mem_probe(env, mmu_idx, ptr,
+   w ? MMU_DATA_STORE : MMU_DATA_LOAD,
+   size, MMU_DATA_LOAD, true, 0);
+if (!mem) {
+return size;
+}
+
+/*
+ * TODO: checkNrev() is not designed for checks of the size we expect
+ * for FEAT_MOPS operations, so we should implement this differently.
+ * Maybe we should do something like
+ *   if (region start and size are aligned nicely) {
+ *  do direct loads of 64 tag bits at a time;
+ *   } else {
+ *  call checkN()
+ *   }
+ */
+/* Round the bounds to the tag granule, and compute the number of tags. */
+ptr_tag = allocation_tag_from_addr(ptr);
+tag_first = QEMU_ALIGN_DOWN(ptr - (size - 1), TAG_GRANULE);
+tag_last = QEMU_ALIGN_DOWN(ptr, TAG_GRANULE);
+tag_count = ((tag_last - tag_first) / TAG_GRANULE) + 1;
+n = checkNrev(mem, ptr & TAG_GRANULE, ptr_tag, tag_count);
+if (likely(n == tag_count)) {
+return size;
+}
+
+/*
+ * Failure; for the first granule, it's at @ptr. Otherwise
+ * it's at the last byte of the nth granule. Calculate how
+ * many bytes we can acce

[PULL 15/30] target/arm: New function allocation_tag_mem_probe()

2023-09-21 Thread Peter Maydell
For the FEAT_MOPS operations, the existing allocation_tag_mem()
function almost does what we want, but it will take a watchpoint
exception even for an ra == 0 probe request, and it requires that the
caller guarantee that the memory is accessible.  For FEAT_MOPS we
want a function that will not take any kind of exception, and will
return NULL for the not-accessible case.

Rename allocation_tag_mem() to allocation_tag_mem_probe() and add an
extra 'probe' argument that lets us distinguish these cases;
allocation_tag_mem() is now a wrapper that always passes 'false'.
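
The refactoring shape (add a flag argument to the workhorse, keep the old name as a thin wrapper so existing callers are unchanged) can be sketched with invented names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lookup: the probe variant reports
 * failure with NULL instead of trapping. */
static const int *lookup_probe(const int *table, size_t len, size_t idx,
                               int probe)
{
    if (idx >= len) {
        if (probe) {
            return NULL;    /* pure probe: never take an exception */
        }
        abort();            /* stand-in for raising a guest exception */
    }
    return &table[idx];
}

/* Old name, old semantics: always passes 'false' for probe, so all
 * existing callers keep their behaviour. */
static const int *lookup(const int *table, size_t len, size_t idx)
{
    return lookup_probe(table, len, idx, 0);
}
```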

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-6-peter.mayd...@linaro.org
---
 target/arm/tcg/mte_helper.c | 48 -
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index e2494f73cf3..303bcc7fd84 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -50,13 +50,14 @@ static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
 }
 
 /**
- * allocation_tag_mem:
+ * allocation_tag_mem_probe:
  * @env: the cpu environment
  * @ptr_mmu_idx: the addressing regime to use for the virtual address
  * @ptr: the virtual address for which to look up tag memory
  * @ptr_access: the access to use for the virtual address
  * @ptr_size: the number of bytes in the normal memory access
  * @tag_access: the access to use for the tag memory
+ * @probe: true to merely probe, never taking an exception
  * @ra: the return address for exception handling
  *
  * Our tag memory is formatted as a sequence of little-endian nibbles.
@@ -65,15 +66,25 @@ static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
  * for the higher addr.
  *
  * Here, resolve the physical address from the virtual address, and return
- * a pointer to the corresponding tag byte.  Exit with exception if the
- * virtual address is not accessible for @ptr_access.
+ * a pointer to the corresponding tag byte.
  *
  * If there is no tag storage corresponding to @ptr, return NULL.
+ *
+ * If the page is inaccessible for @ptr_access, or has a watchpoint, there are
+ * three options:
+ * (1) probe = true, ra = 0 : pure probe -- we return NULL if the page is not
+ * accessible, and do not take watchpoint traps. The calling code must
+ * handle those cases in the right priority compared to MTE traps.
+ * (2) probe = false, ra = 0 : probe, no fault expected -- the caller guarantees
+ * that the page is going to be accessible. We will take watchpoint traps.
+ * (3) probe = false, ra != 0 : non-probe -- we will take both memory access
+ * traps and watchpoint traps.
+ * (probe = true, ra != 0 is invalid and will assert.)
  */
-static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
-   uint64_t ptr, MMUAccessType ptr_access,
-   int ptr_size, MMUAccessType tag_access,
-   uintptr_t ra)
+static uint8_t *allocation_tag_mem_probe(CPUARMState *env, int ptr_mmu_idx,
+ uint64_t ptr, MMUAccessType ptr_access,
+ int ptr_size, MMUAccessType tag_access,
+ bool probe, uintptr_t ra)
 {
 #ifdef CONFIG_USER_ONLY
 uint64_t clean_ptr = useronly_clean_ptr(ptr);
@@ -81,6 +92,8 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
 uint8_t *tags;
 uintptr_t index;
 
+assert(!(probe && ra));
+
 if (!(flags & (ptr_access == MMU_DATA_STORE ? PAGE_WRITE_ORG : PAGE_READ))) {
 cpu_loop_exit_sigsegv(env_cpu(env), ptr, ptr_access,
   !(flags & PAGE_VALID), ra);
@@ -111,12 +124,16 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
  * exception for inaccessible pages, and resolves the virtual address
  * into the softmmu tlb.
  *
- * When RA == 0, this is for mte_probe.  The page is expected to be
- * valid.  Indicate to probe_access_flags no-fault, then assert that
- * we received a valid page.
+ * When RA == 0, this is either a pure probe or a no-fault-expected probe.
+ * Indicate to probe_access_flags no-fault, then either return NULL
+ * for the pure probe, or assert that we received a valid page for the
+ * no-fault-expected probe.
  */
 flags = probe_access_full(env, ptr, 0, ptr_access, ptr_mmu_idx,
   ra == 0, &host, &full, ra);
+if (probe && (flags & TLB_INVALID_MASK)) {
+return NULL;
+}
 assert(!(flags & TLB_INVALID_MASK));
 
 /* If the virtual page MemAttr != Tagged, access unchecked. */
@@ -157,7 +174,7 @@ static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
 }
 
 /* Any debug exception has priority over a tag check exception. */
-if (unlikely(flags &

[PULL 02/30] docs/devel/loads-stores: Fix git grep regexes

2023-09-21 Thread Peter Maydell
The loads-and-stores documentation includes git grep regexes to find
occurrences of the various functions.  Some of these regexes have
errors, typically failing to escape the '?', '(' and ')' when they
should be metacharacters (since these are POSIX basic REs). We also
weren't consistent about whether to have a ':' on the end of the
line introducing the list of regexes in each section.

Fix the errors.

The following shell rune will complain about any REs in the
file which don't have any matches in the codebase:
 for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20230904161703.3996734-1-peter.mayd...@linaro.org
---
 docs/devel/loads-stores.rst | 40 ++---
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
index dab6dfa0acc..ec627aa9c06 100644
--- a/docs/devel/loads-stores.rst
+++ b/docs/devel/loads-stores.rst
@@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value
 of size ``sz`` bytes.
 
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
  - ``\``
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``
 
 ``cpu_{ld,st}*_mmu``
 
@@ -121,8 +121,8 @@ store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``
 
 
 ``cpu_{ld,st}*_mmuidx_ra``
@@ -155,8 +155,8 @@ store: ``cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmuidx, retaddr)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``
 
 ``cpu_{ld,st}*_data_ra``
 
@@ -193,8 +193,8 @@ store: ``cpu_st{size}{end}_data_ra(env, ptr, val, ra)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``
 
 ``cpu_{ld,st}*_data``
 ~
@@ -231,9 +231,9 @@ store: ``cpu_st{size}{end}_data(env, ptr, val)``
  - ``_be`` : big endian
  - ``_le`` : little endian
 
-Regexes for git grep
- - ``\``
- - ``\``
+Regexes for git grep:
+ - ``\``
+ - ``\``
 
 ``cpu_ld*_code``
 
@@ -296,7 +296,7 @@ swap: ``translator_ld{sign}{size}_swap(env, ptr, swap)``
  - ``l`` : 32 bits
  - ``q`` : 64 bits
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
 
 ``helper_{ld,st}*_mmu``
@@ -325,7 +325,7 @@ store: ``helper_{size}_mmu(env, addr, val, opindex, retaddr)``
  - ``l`` : 32 bits
  - ``q`` : 64 bits
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
 
@@ -382,7 +382,7 @@ succeeded using a MemTxResult return code.
 
 The ``_{endian}`` suffix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
  - ``\``
@@ -400,7 +400,7 @@ Note that portions of the write which attempt to write data to a
 device will be silently ignored -- only real RAM and ROM will
 be written to.
 
-Regexes for git grep
+Regexes for git grep:
  - ``address_space_write_rom``
 
 ``{ld,st}*_phys``
@@ -438,7 +438,7 @@ device doing the access has no way to report such an error.
 
 The ``_{endian}_`` infix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
 
@@ -462,7 +462,7 @@ For new code they are better avoided:
 
 ``cpu_physical_memory_rw``
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
 
 ``cpu_memory_rw_debug``
@@ -497,7 +497,7 @@ make sure our existing code is doing things correctly.
 
 ``dma_memory_rw``
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
  - ``\``
@@ -538,7 +538,7 @@ correct address space for that device.
 
 The ``_{endian}_`` infix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\``
  - ``\``
  - ``\``
-- 
2.34.1




[PULL 10/30] target/arm: Remove unused allocation_tag_mem() argument

2023-09-21 Thread Peter Maydell
The allocation_tag_mem() function takes an argument tag_size,
but it never uses it. Remove the argument. In mte_probe_int()
in particular this also lets us delete the code computing
the value we were passing in.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
---
 target/arm/tcg/mte_helper.c | 42 +
 1 file changed, 14 insertions(+), 28 deletions(-)

diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index b23d11563ab..e2494f73cf3 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -57,7 +57,6 @@ static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
  * @ptr_access: the access to use for the virtual address
  * @ptr_size: the number of bytes in the normal memory access
  * @tag_access: the access to use for the tag memory
- * @tag_size: the number of bytes in the tag memory access
  * @ra: the return address for exception handling
  *
  * Our tag memory is formatted as a sequence of little-endian nibbles.
@@ -69,15 +68,12 @@ static int choose_nonexcluded_tag(int tag, int offset, uint16_t exclude)
  * a pointer to the corresponding tag byte.  Exit with exception if the
  * virtual address is not accessible for @ptr_access.
  *
- * The @ptr_size and @tag_size values may not have an obvious relation
- * due to the alignment of @ptr, and the number of tag checks required.
- *
  * If there is no tag storage corresponding to @ptr, return NULL.
  */
 static uint8_t *allocation_tag_mem(CPUARMState *env, int ptr_mmu_idx,
uint64_t ptr, MMUAccessType ptr_access,
int ptr_size, MMUAccessType tag_access,
-   int tag_size, uintptr_t ra)
+   uintptr_t ra)
 {
 #ifdef CONFIG_USER_ONLY
 uint64_t clean_ptr = useronly_clean_ptr(ptr);
@@ -275,7 +271,7 @@ uint64_t HELPER(ldg)(CPUARMState *env, uint64_t ptr, uint64_t xt)
 
 /* Trap if accessing an invalid page.  */
 mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_LOAD, 1,
- MMU_DATA_LOAD, 1, GETPC());
+ MMU_DATA_LOAD, GETPC());
 
 /* Load if page supports tags. */
 if (mem) {
@@ -329,7 +325,7 @@ static inline void do_stg(CPUARMState *env, uint64_t ptr, uint64_t xt,
 
 /* Trap if accessing an invalid page.  */
 mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE, TAG_GRANULE,
- MMU_DATA_STORE, 1, ra);
+ MMU_DATA_STORE, ra);
 
 /* Store if page supports tags. */
 if (mem) {
@@ -372,10 +368,10 @@ static inline void do_st2g(CPUARMState *env, uint64_t ptr, uint64_t xt,
 if (ptr & TAG_GRANULE) {
 /* Two stores unaligned mod TAG_GRANULE*2 -- modify two bytes. */
 mem1 = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
-  TAG_GRANULE, MMU_DATA_STORE, 1, ra);
+  TAG_GRANULE, MMU_DATA_STORE, ra);
 mem2 = allocation_tag_mem(env, mmu_idx, ptr + TAG_GRANULE,
   MMU_DATA_STORE, TAG_GRANULE,
-  MMU_DATA_STORE, 1, ra);
+  MMU_DATA_STORE, ra);
 
 /* Store if page(s) support tags. */
 if (mem1) {
@@ -387,7 +383,7 @@ static inline void do_st2g(CPUARMState *env, uint64_t ptr, uint64_t xt,
 } else {
 /* Two stores aligned mod TAG_GRANULE*2 -- modify one byte. */
 mem1 = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
-  2 * TAG_GRANULE, MMU_DATA_STORE, 1, ra);
+  2 * TAG_GRANULE, MMU_DATA_STORE, ra);
 if (mem1) {
 tag |= tag << 4;
 qatomic_set(mem1, tag);
@@ -435,8 +431,7 @@ uint64_t HELPER(ldgm)(CPUARMState *env, uint64_t ptr)
 
 /* Trap if accessing an invalid page.  */
 tag_mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_LOAD,
- gm_bs_bytes, MMU_DATA_LOAD,
- gm_bs_bytes / (2 * TAG_GRANULE), ra);
+ gm_bs_bytes, MMU_DATA_LOAD, ra);
 
 /* The tag is squashed to zero if the page does not support tags.  */
 if (!tag_mem) {
@@ -495,8 +490,7 @@ void HELPER(stgm)(CPUARMState *env, uint64_t ptr, uint64_t val)
 
 /* Trap if accessing an invalid page.  */
 tag_mem = allocation_tag_mem(env, mmu_idx, ptr, MMU_DATA_STORE,
- gm_bs_bytes, MMU_DATA_LOAD,
- gm_bs_bytes / (2 * TAG_GRANULE), ra);
+ gm_bs_bytes, MMU_DATA_LOAD, ra);
 
 /*
  * Tag store only happens if the page support tags,
@@ -552,7 +546,7 @@ void HELPER(stzgm_tags)(CPUARMState *env, uint64_t ptr, uint64_t val)
 ptr &= -dcz_bytes;
 
 mem = allocat

[PULL 26/30] elf2dmp: replace PE export name check with PDB name check

2023-09-21 Thread Peter Maydell
From: Viktor Prutyanov 

PE export name check introduced in d399d6b179 isn't reliable enough,
because a page with the export directory may not be present for some
reason. On the other hand, elf2dmp retrieves the PDB name in any case.
It can be also used to check that a PE image is the kernel image. So,
check PDB name when searching for Windows kernel image.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2165917

Signed-off-by: Viktor Prutyanov 
Reviewed-by: Akihiko Odaki 
Message-id: 20230915170153.10959-2-vik...@daynix.com
Signed-off-by: Peter Maydell 
---
 contrib/elf2dmp/main.c | 93 +++---
 1 file changed, 33 insertions(+), 60 deletions(-)

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index 6d4d18501a3..bb6744c0cd6 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -411,89 +411,64 @@ static int write_dump(struct pa_space *ps,
 return fclose(dmp_file);
 }
 
-static bool pe_check_export_name(uint64_t base, void *start_addr,
-struct va_space *vs)
-{
-IMAGE_EXPORT_DIRECTORY export_dir;
-const char *pe_name;
-
-if (pe_get_data_dir_entry(base, start_addr, IMAGE_FILE_EXPORT_DIRECTORY,
-&export_dir, sizeof(export_dir), vs)) {
-return false;
-}
-
-pe_name = va_space_resolve(vs, base + export_dir.Name);
-if (!pe_name) {
-return false;
-}
-
-return !strcmp(pe_name, PE_NAME);
-}
-
-static int pe_get_pdb_symstore_hash(uint64_t base, void *start_addr,
-char *hash, struct va_space *vs)
+static bool pe_check_pdb_name(uint64_t base, void *start_addr,
+struct va_space *vs, OMFSignatureRSDS *rsds)
 {
 const char sign_rsds[4] = "RSDS";
 IMAGE_DEBUG_DIRECTORY debug_dir;
-OMFSignatureRSDS rsds;
-char *pdb_name;
-size_t pdb_name_sz;
-size_t i;
+char pdb_name[sizeof(PDB_NAME)];
 
 if (pe_get_data_dir_entry(base, start_addr, IMAGE_FILE_DEBUG_DIRECTORY,
 &debug_dir, sizeof(debug_dir), vs)) {
 eprintf("Failed to get Debug Directory\n");
-return 1;
+return false;
 }
 
 if (debug_dir.Type != IMAGE_DEBUG_TYPE_CODEVIEW) {
-return 1;
+eprintf("Debug Directory type is not CodeView\n");
+return false;
 }
 
 if (va_space_rw(vs,
 base + debug_dir.AddressOfRawData,
-&rsds, sizeof(rsds), 0)) {
-return 1;
+rsds, sizeof(*rsds), 0)) {
+eprintf("Failed to resolve OMFSignatureRSDS\n");
+return false;
 }
 
-printf("CodeView signature is \'%.4s\'\n", rsds.Signature);
-
-if (memcmp(&rsds.Signature, sign_rsds, sizeof(sign_rsds))) {
-return 1;
+if (memcmp(&rsds->Signature, sign_rsds, sizeof(sign_rsds))) {
+eprintf("CodeView signature is \'%.4s\', \'%s\' expected\n",
+rsds->Signature, sign_rsds);
+return false;
 }
 
-pdb_name_sz = debug_dir.SizeOfData - sizeof(rsds);
-pdb_name = malloc(pdb_name_sz);
-if (!pdb_name) {
-return 1;
+if (debug_dir.SizeOfData - sizeof(*rsds) != sizeof(PDB_NAME)) {
+eprintf("PDB name size doesn't match\n");
+return false;
 }
 
 if (va_space_rw(vs, base + debug_dir.AddressOfRawData +
-offsetof(OMFSignatureRSDS, name), pdb_name, pdb_name_sz, 0)) {
-free(pdb_name);
-return 1;
+offsetof(OMFSignatureRSDS, name), pdb_name, sizeof(PDB_NAME),
+0)) {
+eprintf("Failed to resolve PDB name\n");
+return false;
 }
 
 printf("PDB name is \'%s\', \'%s\' expected\n", pdb_name, PDB_NAME);
 
-if (strcmp(pdb_name, PDB_NAME)) {
-eprintf("Unexpected PDB name, it seems the kernel isn't found\n");
-free(pdb_name);
-return 1;
-}
+return !strcmp(pdb_name, PDB_NAME);
+}
 
-free(pdb_name);
-
-sprintf(hash, "%.08x%.04x%.04x%.02x%.02x", rsds.guid.a, rsds.guid.b,
-rsds.guid.c, rsds.guid.d[0], rsds.guid.d[1]);
+static void pe_get_pdb_symstore_hash(OMFSignatureRSDS *rsds, char *hash)
+{
+sprintf(hash, "%.08x%.04x%.04x%.02x%.02x", rsds->guid.a, rsds->guid.b,
+rsds->guid.c, rsds->guid.d[0], rsds->guid.d[1]);
 hash += 20;
-for (i = 0; i < 6; i++, hash += 2) {
-sprintf(hash, "%.02x", rsds.guid.e[i]);
+for (unsigned int i = 0; i < 6; i++, hash += 2) {
+sprintf(hash, "%.02x", rsds->guid.e[i]);
 }
 
-sprintf(hash, "%.01x", rsds.age);
-
-return 0;
+sprintf(hash, "%.01x", rsds->age);
 }
 
 int main(int argc, char *argv[])
@@ -515,6 +490,7 @@ int main(int argc, char *argv[])
 KDDEBUGGER_DATA64 *kdbg;
 uint64_t KdVersionBlock;
 bool kernel_found = false;
+OMFSignatureRSDS rsds;
 
 if (argc != 3) {
 eprintf("usage:\n\t%s elf_file dmp_file\n", argv[0]);
@@ -562,7 +538,8 @@ int main(int argc, char *argv[])
 }
 
 if (*(uint16_t *)nt_start_addr == 0x5a4d) { /* MZ */
-   

[PULL 13/30] target/arm: Pass unpriv bool to get_a64_user_mem_index()

2023-09-21 Thread Peter Maydell
In every place that we call the get_a64_user_mem_index() function
we do it like this:
 memidx = a->unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
Refactor so the caller passes in the bool that says whether they
want the 'unpriv' or 'normal' mem_index rather than having to
do the ?: themselves.
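
The shape of the refactor (push the selection into the callee, which also lets it consult state only it knows about) can be sketched with invented names:

```c
#include <assert.h>

/* Invented stand-in for DisasContext: whether an unpriv index is
 * actually in effect, plus the two candidate indexes. */
struct Ctx {
    int unpriv_active;
    int user_idx;
    int normal_idx;
};

/* Instead of every caller writing
 *     idx = unpriv ? user_index(s) : normal_index(s);
 * the callee takes the bool and does the selection once.  It only
 * returns the user index when the insn asked for the unpriv variant
 * *and* the current state actually has one. */
static int mem_index(const struct Ctx *s, int unpriv)
{
    return (unpriv && s->unpriv_active) ? s->user_idx : s->normal_idx;
}
```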

Signed-off-by: Peter Maydell 
Message-id: 20230912140434.169-4-peter.mayd...@linaro.org
---
 target/arm/tcg/translate-a64.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 1dd86edae13..24afd929144 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -105,9 +105,17 @@ void a64_translate_init(void)
 }
 
 /*
- * Return the core mmu_idx to use for A64 "unprivileged load/store" insns
+ * Return the core mmu_idx to use for A64 load/store insns which
+ * have a "unprivileged load/store" variant. Those insns access
+ * EL0 if executed from an EL which has control over EL0 (usually
+ * EL1) but behave like normal loads and stores if executed from
+ * elsewhere (eg EL3).
+ *
+ * @unpriv : true for the unprivileged encoding; false for the
+ *   normal encoding (in which case we will return the same
+ *   thing as get_mem_index().
  */
-static int get_a64_user_mem_index(DisasContext *s)
+static int get_a64_user_mem_index(DisasContext *s, bool unpriv)
 {
 /*
  * If AccType_UNPRIV is not used, the insn uses AccType_NORMAL,
@@ -115,7 +123,7 @@ static int get_a64_user_mem_index(DisasContext *s)
  */
 ARMMMUIdx useridx = s->mmu_idx;
 
-if (s->unpriv) {
+if (unpriv && s->unpriv) {
 /*
  * We have pre-computed the condition for AccType_UNPRIV.
  * Therefore we should never get here with a mmu_idx for
@@ -3088,7 +3096,7 @@ static void op_addr_ldst_imm_pre(DisasContext *s, arg_ldst_imm *a,
 if (!a->p) {
 tcg_gen_addi_i64(*dirty_addr, *dirty_addr, offset);
 }
-memidx = a->unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
+memidx = get_a64_user_mem_index(s, a->unpriv);
 *clean_addr = gen_mte_check1_mmuidx(s, *dirty_addr, is_store,
 a->w || a->rn != 31,
 mop, a->unpriv, memidx);
@@ -3109,7 +3117,7 @@ static bool trans_STR_i(DisasContext *s, arg_ldst_imm *a)
 {
 bool iss_sf, iss_valid = !a->w;
 TCGv_i64 clean_addr, dirty_addr, tcg_rt;
-int memidx = a->unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
+int memidx = get_a64_user_mem_index(s, a->unpriv);
 MemOp mop = finalize_memop(s, a->sz + a->sign * MO_SIGN);
 
 op_addr_ldst_imm_pre(s, a, &clean_addr, &dirty_addr, a->imm, true, mop);
@@ -3127,7 +3135,7 @@ static bool trans_LDR_i(DisasContext *s, arg_ldst_imm *a)
 {
 bool iss_sf, iss_valid = !a->w;
 TCGv_i64 clean_addr, dirty_addr, tcg_rt;
-int memidx = a->unpriv ? get_a64_user_mem_index(s) : get_mem_index(s);
+int memidx = get_a64_user_mem_index(s, a->unpriv);
 MemOp mop = finalize_memop(s, a->sz + a->sign * MO_SIGN);
 
 op_addr_ldst_imm_pre(s, a, &clean_addr, &dirty_addr, a->imm, false, mop);
-- 
2.34.1




[PULL 23/30] audio/jackaudio: Avoid dynamic stack allocation in qjack_client_init

2023-09-21 Thread Peter Maydell
Avoid a dynamic stack allocation in qjack_client_init() by using
a g_autofree heap allocation instead.

(We stick with allocate + snprintf() because the JACK API requires
the name to be no more than its maximum size, so g_strdup_printf()
would require an extra truncation step.)

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Peter Maydell 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Francisco Iglesias 
Reviewed-by: Christian Schoenebeck 
Message-id: 20230818155846.1651287-2-peter.mayd...@linaro.org
---
 audio/jackaudio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/audio/jackaudio.c b/audio/jackaudio.c
index 5bdf3d7a78d..7cb2a49f971 100644
--- a/audio/jackaudio.c
+++ b/audio/jackaudio.c
@@ -400,7 +400,8 @@ static void qjack_client_connect_ports(QJackClient *c)
 static int qjack_client_init(QJackClient *c)
 {
 jack_status_t status;
-char client_name[jack_client_name_size()];
+int client_name_len = jack_client_name_size(); /* includes NUL */
+g_autofree char *client_name = g_new(char, client_name_len);
 jack_options_t options = JackNullOption;
 
 if (c->state == QJACK_STATE_RUNNING) {
@@ -409,7 +410,7 @@ static int qjack_client_init(QJackClient *c)
 
 c->connect_ports = true;
 
-snprintf(client_name, sizeof(client_name), "%s-%s",
+snprintf(client_name, client_name_len, "%s-%s",
 c->out ? "out" : "in",
 c->opt->client_name ? c->opt->client_name : audio_application_name());
 
-- 
2.34.1




[PULL 01/30] target/m68k: Add URL to semihosting spec

2023-09-21 Thread Peter Maydell
The spec for m68k semihosting is documented in the libgloss
sources. Add a comment with the URL for it, as we already
have for nios2 semihosting.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20230801154451.3505492-1-peter.mayd...@linaro.org
---
 target/m68k/m68k-semi.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 239f6e44e90..80cd8d70dbb 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -15,6 +15,10 @@
  *
  *  You should have received a copy of the GNU General Public License
  *  along with this program; if not, see .
+ *
+ *  The semihosting protocol implemented here is described in the
+ *  libgloss sources:
+ *  https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=libgloss/m68k/m68k-semi.txt;hb=HEAD
  */
 
 #include "qemu/osdep.h"
-- 
2.34.1




[PULL 05/30] linux-user/elfload.c: Add missing arm and arm64 hwcap values

2023-09-21 Thread Peter Maydell
Our lists of Arm 32 and 64 bit hwcap values have lagged behind
the Linux kernel. Update them to include all the bits defined
as of upstream Linux git commit a48fa7efaf1161c1 (in the middle
of the kernel 6.6 dev cycle).

For 64-bit, we don't yet implement any of the features reported via
these hwcap bits.  For 32-bit we do in fact already implement them
all; we'll add the code to set them in a subsequent commit.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 linux-user/elfload.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5ce009d7137..d51d077998a 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -402,6 +402,12 @@ enum
 ARM_HWCAP_ARM_VFPD32= 1 << 19,
 ARM_HWCAP_ARM_LPAE  = 1 << 20,
 ARM_HWCAP_ARM_EVTSTRM   = 1 << 21,
+ARM_HWCAP_ARM_FPHP  = 1 << 22,
+ARM_HWCAP_ARM_ASIMDHP   = 1 << 23,
+ARM_HWCAP_ARM_ASIMDDP   = 1 << 24,
+ARM_HWCAP_ARM_ASIMDFHM  = 1 << 25,
+ARM_HWCAP_ARM_ASIMDBF16 = 1 << 26,
+ARM_HWCAP_ARM_I8MM  = 1 << 27,
 };
 
 enum {
@@ -410,6 +416,8 @@ enum {
 ARM_HWCAP2_ARM_SHA1 = 1 << 2,
 ARM_HWCAP2_ARM_SHA2 = 1 << 3,
 ARM_HWCAP2_ARM_CRC32= 1 << 4,
+ARM_HWCAP2_ARM_SB   = 1 << 5,
+ARM_HWCAP2_ARM_SSBS = 1 << 6,
 };
 
 /* The commpage only exists for 32 bit kernels */
@@ -540,6 +548,12 @@ const char *elf_hwcap_str(uint32_t bit)
 [__builtin_ctz(ARM_HWCAP_ARM_VFPD32   )] = "vfpd32",
 [__builtin_ctz(ARM_HWCAP_ARM_LPAE )] = "lpae",
 [__builtin_ctz(ARM_HWCAP_ARM_EVTSTRM  )] = "evtstrm",
+[__builtin_ctz(ARM_HWCAP_ARM_FPHP )] = "fphp",
+[__builtin_ctz(ARM_HWCAP_ARM_ASIMDHP  )] = "asimdhp",
+[__builtin_ctz(ARM_HWCAP_ARM_ASIMDDP  )] = "asimddp",
+[__builtin_ctz(ARM_HWCAP_ARM_ASIMDFHM )] = "asimdfhm",
+[__builtin_ctz(ARM_HWCAP_ARM_ASIMDBF16)] = "asimdbf16",
+[__builtin_ctz(ARM_HWCAP_ARM_I8MM )] = "i8mm",
 };
 
 return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
@@ -553,6 +567,8 @@ const char *elf_hwcap2_str(uint32_t bit)
 [__builtin_ctz(ARM_HWCAP2_ARM_SHA1 )] = "sha1",
 [__builtin_ctz(ARM_HWCAP2_ARM_SHA2 )] = "sha2",
 [__builtin_ctz(ARM_HWCAP2_ARM_CRC32)] = "crc32",
+[__builtin_ctz(ARM_HWCAP2_ARM_SB   )] = "sb",
+[__builtin_ctz(ARM_HWCAP2_ARM_SSBS )] = "ssbs",
 };
 
 return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
@@ -696,6 +712,20 @@ enum {
 ARM_HWCAP2_A64_SME_B16F32   = 1 << 28,
 ARM_HWCAP2_A64_SME_F32F32   = 1 << 29,
 ARM_HWCAP2_A64_SME_FA64 = 1 << 30,
+ARM_HWCAP2_A64_WFXT = 1ULL << 31,
+ARM_HWCAP2_A64_EBF16= 1ULL << 32,
+ARM_HWCAP2_A64_SVE_EBF16= 1ULL << 33,
+ARM_HWCAP2_A64_CSSC = 1ULL << 34,
+ARM_HWCAP2_A64_RPRFM= 1ULL << 35,
+ARM_HWCAP2_A64_SVE2P1   = 1ULL << 36,
+ARM_HWCAP2_A64_SME2 = 1ULL << 37,
+ARM_HWCAP2_A64_SME2P1   = 1ULL << 38,
+ARM_HWCAP2_A64_SME_I16I32   = 1ULL << 39,
+ARM_HWCAP2_A64_SME_BI32I32  = 1ULL << 40,
+ARM_HWCAP2_A64_SME_B16B16   = 1ULL << 41,
+ARM_HWCAP2_A64_SME_F16F16   = 1ULL << 42,
+ARM_HWCAP2_A64_MOPS = 1ULL << 43,
+ARM_HWCAP2_A64_HBC  = 1ULL << 44,
 };
 
 #define ELF_HWCAP   get_elf_hwcap()
@@ -851,6 +881,20 @@ const char *elf_hwcap2_str(uint32_t bit)
 [__builtin_ctz(ARM_HWCAP2_A64_SME_B16F32   )] = "smeb16f32",
 [__builtin_ctz(ARM_HWCAP2_A64_SME_F32F32   )] = "smef32f32",
 [__builtin_ctz(ARM_HWCAP2_A64_SME_FA64 )] = "smefa64",
+[__builtin_ctz(ARM_HWCAP2_A64_WFXT )] = "wfxt",
+[__builtin_ctzll(ARM_HWCAP2_A64_EBF16  )] = "ebf16",
+[__builtin_ctzll(ARM_HWCAP2_A64_SVE_EBF16  )] = "sveebf16",
+[__builtin_ctzll(ARM_HWCAP2_A64_CSSC   )] = "cssc",
+[__builtin_ctzll(ARM_HWCAP2_A64_RPRFM  )] = "rprfm",
+[__builtin_ctzll(ARM_HWCAP2_A64_SVE2P1 )] = "sve2p1",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME2   )] = "sme2",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME2P1 )] = "sme2p1",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME_I16I32 )] = "smei16i32",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME_BI32I32)] = "smebi32i32",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME_B16B16 )] = "smeb16b16",
+[__builtin_ctzll(ARM_HWCAP2_A64_SME_F16F16 )] = "smef16f16",
+[__builtin_ctzll(ARM_HWCAP2_A64_MOPS   )] = "mops",
+[__builtin_ctzll(ARM_HWCAP2_A64_HBC)] = "hbc",
 };
 
 return bit < ARRAY_SIZE(hwcap_str) ? hwcap_str[bit] : NULL;
-- 
2.34.1




[PULL 19/30] target/arm: Implement the SETG* instructions

2023-09-21 Thread Peter Maydell
The FEAT_MOPS SETG* instructions are very similar to the SET*
instructions, but as well as setting memory contents they also
set the MTE tags. They are architecturally required to operate
on tag-granule aligned regions only.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-10-peter.mayd...@linaro.org
---
 target/arm/internals.h | 10 
 target/arm/tcg/helper-a64.h|  3 ++
 target/arm/tcg/a64.decode  |  5 ++
 target/arm/tcg/helper-a64.c| 86 --
 target/arm/tcg/mte_helper.c| 40 
 target/arm/tcg/translate-a64.c | 20 +---
 6 files changed, 155 insertions(+), 9 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index a70a7fd50f6..642f77df29b 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1300,6 +1300,16 @@ uint64_t mte_mops_probe(CPUARMState *env, uint64_t ptr, uint64_t size,
 void mte_check_fail(CPUARMState *env, uint32_t desc,
 uint64_t dirty_ptr, uintptr_t ra);
 
+/**
+ * mte_mops_set_tags: Set MTE tags for a portion of a FEAT_MOPS operation
+ * @env: CPU env
+ * @dirty_ptr: Start address of memory region (dirty pointer)
+ * @size: length of region (guaranteed not to cross page boundary)
+ * @desc: MTEDESC descriptor word
+ */
+void mte_mops_set_tags(CPUARMState *env, uint64_t dirty_ptr, uint64_t size,
+   uint32_t desc);
+
 static inline int allocation_tag_from_addr(uint64_t ptr)
 {
 return extract64(ptr, 56, 4);
diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 7ce5d2105ad..10a99107124 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -121,3 +121,6 @@ DEF_HELPER_FLAGS_4(unaligned_access, TCG_CALL_NO_WG,
 DEF_HELPER_3(setp, void, env, i32, i32)
 DEF_HELPER_3(setm, void, env, i32, i32)
 DEF_HELPER_3(sete, void, env, i32, i32)
+DEF_HELPER_3(setgp, void, env, i32, i32)
+DEF_HELPER_3(setgm, void, env, i32, i32)
+DEF_HELPER_3(setge, void, env, i32, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index c2a97328eeb..a202faa17bc 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -570,3 +570,8 @@ STZ2G   11011001 11 1 . 11 . . @ldst_tag p=0 w=1
 SETP00 011001110 . 00 . . 01 . . @set
 SETM00 011001110 . 01 . . 01 . . @set
 SETE00 011001110 . 10 . . 01 . . @set
+
+# Like SET, but also setting MTE tags
+SETGP   00 011101110 . 00 . . 01 . . @set
+SETGM   00 011101110 . 01 . . 01 . . @set
+SETGE   00 011101110 . 10 . . 01 . . @set
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index 24ae5ecf32e..2cf89184d77 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -1103,6 +1103,50 @@ static uint64_t set_step(CPUARMState *env, uint64_t toaddr,
 return setsize;
 }
 
+/*
+ * Similar, but setting tags. The architecture requires us to do this
+ * in 16-byte chunks. SETG accesses are not tag checked; they set
+ * the tags.
+ */
+static uint64_t set_step_tags(CPUARMState *env, uint64_t toaddr,
+  uint64_t setsize, uint32_t data, int memidx,
+  uint32_t *mtedesc, uintptr_t ra)
+{
+void *mem;
+uint64_t cleanaddr;
+
+setsize = MIN(setsize, page_limit(toaddr));
+
+cleanaddr = useronly_clean_ptr(toaddr);
+/*
+ * Trapless lookup: returns NULL for invalid page, I/O,
+ * watchpoints, clean pages, etc.
+ */
+mem = tlb_vaddr_to_host(env, cleanaddr, MMU_DATA_STORE, memidx);
+
+#ifndef CONFIG_USER_ONLY
+if (unlikely(!mem)) {
+/*
+ * Slow-path: just do one write. This will handle the
+ * watchpoint, invalid page, etc handling correctly.
+ * The architecture requires that we do 16 bytes at a time,
+ * and we know both ptr and size are 16 byte aligned.
+ * For clean code pages, the next iteration will see
+ * the page dirty and will use the fast path.
+ */
+uint64_t repldata = data * 0x0101010101010101ULL;
+MemOpIdx oi16 = make_memop_idx(MO_TE | MO_128, memidx);
+cpu_st16_mmu(env, toaddr, int128_make128(repldata, repldata), oi16, ra);
+mte_mops_set_tags(env, toaddr, 16, *mtedesc);
+return 16;
+}
+#endif
+/* Easy case: just memset the host memory */
+memset(mem, data, setsize);
+mte_mops_set_tags(env, toaddr, setsize, *mtedesc);
+return setsize;
+}
+
 typedef uint64_t StepFn(CPUARMState *env, uint64_t toaddr,
 uint64_t setsize, uint32_t data,
 int memidx, uint32_t *mtedesc, uintptr_t ra);
@@ -1141,6 +1185,18 @@ static bool mte_checks_needed(uint64_t ptr, uint32_t desc)
 return !tcma_check(desc, bit55, allocation_tag_from_addr(ptr));
 }
 
+/* Take an exception i

[PULL 28/30] elf2dmp: introduce merging of physical memory runs

2023-09-21 Thread Peter Maydell
From: Viktor Prutyanov 

DMP supports at most 42 physical memory runs. So, merge adjacent
physical memory ranges from the QEMU ELF when possible to minimize the
total number of runs.

Signed-off-by: Viktor Prutyanov 
Reviewed-by: Akihiko Odaki 
Message-id: 20230915170153.10959-4-vik...@daynix.com
[PMM: fixed format string for printing size_t values]
Signed-off-by: Peter Maydell 
---
 contrib/elf2dmp/main.c | 56 --
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index b7e39301641..5db163bdbe8 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -20,6 +20,7 @@
 #define PE_NAME "ntoskrnl.exe"
 
 #define INITIAL_MXCSR   0x1f80
+#define MAX_NUMBER_OF_RUNS  42
 
 typedef struct idt_desc {
 uint16_t offset1;   /* offset bits 0..15 */
@@ -234,6 +235,42 @@ static int fix_dtb(struct va_space *vs, QEMU_Elf *qe)
 return 1;
 }
 
+static void try_merge_runs(struct pa_space *ps,
+WinDumpPhyMemDesc64 *PhysicalMemoryBlock)
+{
+unsigned int merge_cnt = 0, run_idx = 0;
+
+PhysicalMemoryBlock->NumberOfRuns = 0;
+
+for (size_t idx = 0; idx < ps->block_nr; idx++) {
+struct pa_block *blk = ps->block + idx;
+struct pa_block *next = blk + 1;
+
+PhysicalMemoryBlock->NumberOfPages += blk->size / ELF2DMP_PAGE_SIZE;
+
+if (idx + 1 != ps->block_nr && blk->paddr + blk->size == next->paddr) {
+printf("Block #%zu 0x%"PRIx64"+:0x%"PRIx64" and %u previous will be"
+" merged\n", idx, blk->paddr, blk->size, merge_cnt);
+merge_cnt++;
+} else {
+struct pa_block *first_merged = blk - merge_cnt;
+
+printf("Block #%zu 0x%"PRIx64"+:0x%"PRIx64" and %u previous will be"
+" merged to 0x%"PRIx64"+:0x%"PRIx64" (run #%u)\n",
+idx, blk->paddr, blk->size, merge_cnt, first_merged->paddr,
+blk->paddr + blk->size - first_merged->paddr, run_idx);
+PhysicalMemoryBlock->Run[run_idx] = (WinDumpPhyMemRun64) {
+.BasePage = first_merged->paddr / ELF2DMP_PAGE_SIZE,
+.PageCount = (blk->paddr + blk->size - first_merged->paddr) /
+ELF2DMP_PAGE_SIZE,
+};
+PhysicalMemoryBlock->NumberOfRuns++;
+run_idx++;
+merge_cnt = 0;
+}
+}
+}
+
 static int fill_header(WinDumpHeader64 *hdr, struct pa_space *ps,
 struct va_space *vs, uint64_t KdDebuggerDataBlock,
 KDDEBUGGER_DATA64 *kdbg, uint64_t KdVersionBlock, int nr_cpus)
@@ -244,7 +281,6 @@ static int fill_header(WinDumpHeader64 *hdr, struct pa_space *ps,
 KUSD_OFFSET_PRODUCT_TYPE);
 DBGKD_GET_VERSION64 kvb;
 WinDumpHeader64 h;
-size_t i;
 
 QEMU_BUILD_BUG_ON(KUSD_OFFSET_SUITE_MASK >= ELF2DMP_PAGE_SIZE);
 QEMU_BUILD_BUG_ON(KUSD_OFFSET_PRODUCT_TYPE >= ELF2DMP_PAGE_SIZE);
@@ -282,13 +318,17 @@ static int fill_header(WinDumpHeader64 *hdr, struct pa_space *ps,
 .RequiredDumpSpace = sizeof(h),
 };
 
-for (i = 0; i < ps->block_nr; i++) {
-h.PhysicalMemoryBlock.NumberOfPages +=
-ps->block[i].size / ELF2DMP_PAGE_SIZE;
-h.PhysicalMemoryBlock.Run[i] = (WinDumpPhyMemRun64) {
-.BasePage = ps->block[i].paddr / ELF2DMP_PAGE_SIZE,
-.PageCount = ps->block[i].size / ELF2DMP_PAGE_SIZE,
-};
+if (h.PhysicalMemoryBlock.NumberOfRuns <= MAX_NUMBER_OF_RUNS) {
+for (size_t idx = 0; idx < ps->block_nr; idx++) {
+h.PhysicalMemoryBlock.NumberOfPages +=
+ps->block[idx].size / ELF2DMP_PAGE_SIZE;
+h.PhysicalMemoryBlock.Run[idx] = (WinDumpPhyMemRun64) {
+.BasePage = ps->block[idx].paddr / ELF2DMP_PAGE_SIZE,
+.PageCount = ps->block[idx].size / ELF2DMP_PAGE_SIZE,
+};
+}
+} else {
+try_merge_runs(ps, &h.PhysicalMemoryBlock);
 }
 
 h.RequiredDumpSpace +=
-- 
2.34.1




[PULL 21/30] target/arm: Implement the CPY* instructions

2023-09-21 Thread Peter Maydell
The FEAT_MOPS CPY* instructions implement memory copies. These
come in both "always forwards" (memcpy-style) and "overlap OK"
(memmove-style) flavours.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-12-peter.mayd...@linaro.org
---
 target/arm/tcg/helper-a64.h|   7 +
 target/arm/tcg/a64.decode  |  14 +
 target/arm/tcg/helper-a64.c| 454 +
 target/arm/tcg/translate-a64.c |  60 +
 4 files changed, 535 insertions(+)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 10a99107124..575a5dab7dc 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -124,3 +124,10 @@ DEF_HELPER_3(sete, void, env, i32, i32)
 DEF_HELPER_3(setgp, void, env, i32, i32)
 DEF_HELPER_3(setgm, void, env, i32, i32)
 DEF_HELPER_3(setge, void, env, i32, i32)
+
+DEF_HELPER_4(cpyp, void, env, i32, i32, i32)
+DEF_HELPER_4(cpym, void, env, i32, i32, i32)
+DEF_HELPER_4(cpye, void, env, i32, i32, i32)
+DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
+DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
+DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index a202faa17bc..0cf11470741 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -575,3 +575,17 @@ SETE00 011001110 . 10 . . 01 . . @set
 SETGP   00 011101110 . 00 . . 01 . . @set
 SETGM   00 011101110 . 01 . . 01 . . @set
 SETGE   00 011101110 . 10 . . 01 . . @set
+
+# Memmove/Memcopy: the CPY insns allow overlapping src/dest and
+# copy in the correct direction; the CPYF insns always copy forwards.
+#
+# options has the nontemporal and unpriv bits for src and dest
+&cpy rs rn rd options
+@cpy.. ... . . rs:5 options:4 .. rn:5 rd:5 &cpy
+
+CPYFP   00 011 0 01000 .  01 . . @cpy
+CPYFM   00 011 0 01010 .  01 . . @cpy
+CPYFE   00 011 0 01100 .  01 . . @cpy
+CPYP00 011 1 01000 .  01 . . @cpy
+CPYM00 011 1 01010 .  01 . . @cpy
+CPYE00 011 1 01100 .  01 . . @cpy
diff --git a/target/arm/tcg/helper-a64.c b/target/arm/tcg/helper-a64.c
index 2cf89184d77..84f54750fc2 100644
--- a/target/arm/tcg/helper-a64.c
+++ b/target/arm/tcg/helper-a64.c
@@ -1048,6 +1048,15 @@ static uint64_t page_limit(uint64_t addr)
 return TARGET_PAGE_ALIGN(addr + 1) - addr;
 }
 
+/*
+ * Return the number of bytes we can copy starting from addr and working
+ * backwards without crossing a page boundary.
+ */
+static uint64_t page_limit_rev(uint64_t addr)
+{
+return (addr & ~TARGET_PAGE_MASK) + 1;
+}
+
 /*
  * Perform part of a memory set on an area of guest memory starting at
  * toaddr (a dirty address) and extending for setsize bytes.
@@ -1392,3 +1401,448 @@ void HELPER(setge)(CPUARMState *env, uint32_t syndrome, uint32_t mtedesc)
 {
 do_sete(env, syndrome, mtedesc, set_step_tags, true, GETPC());
 }
+
+/*
+ * Perform part of a memory copy from the guest memory at fromaddr
+ * and extending for copysize bytes, to the guest memory at
+ * toaddr. Both addresses are dirty.
+ *
+ * Returns the number of bytes actually copied, which might be less than
+ * copysize; the caller should loop until the whole copy has been done.
+ * The caller should ensure that the guest registers are correct
+ * for the possibility that the first byte of the copy encounters
+ * an exception or watchpoint. We guarantee not to take any faults
+ * for bytes other than the first.
+ */
+static uint64_t copy_step(CPUARMState *env, uint64_t toaddr, uint64_t fromaddr,
+  uint64_t copysize, int wmemidx, int rmemidx,
+  uint32_t *wdesc, uint32_t *rdesc, uintptr_t ra)
+{
+void *rmem;
+void *wmem;
+
+/* Don't cross a page boundary on either source or destination */
+copysize = MIN(copysize, page_limit(toaddr));
+copysize = MIN(copysize, page_limit(fromaddr));
+/*
+ * Handle MTE tag checks: either handle the tag mismatch for byte 0,
+ * or else copy up to but not including the byte with the mismatch.
+ */
+if (*rdesc) {
+uint64_t mtesize = mte_mops_probe(env, fromaddr, copysize, *rdesc);
+if (mtesize == 0) {
+mte_check_fail(env, *rdesc, fromaddr, ra);
+*rdesc = 0;
+} else {
+copysize = MIN(copysize, mtesize);
+}
+}
+if (*wdesc) {
+uint64_t mtesize = mte_mops_probe(env, toaddr, copysize, *wdesc);
+if (mtesize == 0) {
+mte_check_fail(env, *wdesc, toaddr, ra);
+*wdesc = 0;
+} else {
+copysize = MIN(copysize, mtesize);
+}
+}
+
+toaddr = useronly_clean_ptr(toaddr);
+fromaddr = useronly_clean_ptr(fromaddr);
+/* Trapless lookup of whether we can get a host m

[PULL 06/30] linux-user/elfload.c: Report previously missing arm32 hwcaps

2023-09-21 Thread Peter Maydell
Add the code to report the arm32 hwcaps we were previously missing:
 sb, ssbs, fphp, asimdhp, asimddp, asimdfhm, asimdbf16, i8mm

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 linux-user/elfload.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index d51d077998a..bbb4f08109c 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -506,6 +506,16 @@ uint32_t get_elf_hwcap(void)
 }
 }
 GET_FEATURE_ID(aa32_simdfmac, ARM_HWCAP_ARM_VFPv4);
+/*
+ * MVFR1.FPHP and .SIMDHP must be in sync, and QEMU uses the same
+ * isar_feature function for both. The kernel reports them as two hwcaps.
+ */
+GET_FEATURE_ID(aa32_fp16_arith, ARM_HWCAP_ARM_FPHP);
+GET_FEATURE_ID(aa32_fp16_arith, ARM_HWCAP_ARM_ASIMDHP);
+GET_FEATURE_ID(aa32_dp, ARM_HWCAP_ARM_ASIMDDP);
+GET_FEATURE_ID(aa32_fhm, ARM_HWCAP_ARM_ASIMDFHM);
+GET_FEATURE_ID(aa32_bf16, ARM_HWCAP_ARM_ASIMDBF16);
+GET_FEATURE_ID(aa32_i8mm, ARM_HWCAP_ARM_I8MM);
 
 return hwcaps;
 }
@@ -520,6 +530,8 @@ uint32_t get_elf_hwcap2(void)
 GET_FEATURE_ID(aa32_sha1, ARM_HWCAP2_ARM_SHA1);
 GET_FEATURE_ID(aa32_sha2, ARM_HWCAP2_ARM_SHA2);
 GET_FEATURE_ID(aa32_crc32, ARM_HWCAP2_ARM_CRC32);
+GET_FEATURE_ID(aa32_sb, ARM_HWCAP2_ARM_SB);
+GET_FEATURE_ID(aa32_ssbs, ARM_HWCAP2_ARM_SSBS);
 return hwcaps;
 }
 
-- 
2.34.1




[PULL 09/30] target/arm: Implement FEAT_HBC

2023-09-21 Thread Peter Maydell
FEAT_HBC (Hinted conditional branches) provides a new instruction
BC.cond, which behaves exactly like the existing B.cond except
that it provides a hint to the branch predictor about the
likely behaviour of the branch.

Since QEMU does not implement branch prediction, we can treat
this identically to B.cond.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 docs/system/arm/emulation.rst  | 1 +
 target/arm/cpu.h   | 5 +
 target/arm/tcg/a64.decode  | 3 ++-
 linux-user/elfload.c   | 1 +
 target/arm/tcg/cpu64.c | 4 
 target/arm/tcg/translate-a64.c | 4 
 6 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 3df936fc356..1fb6a2e8c3e 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -42,6 +42,7 @@ the following architecture extensions:
 - FEAT_FlagM2 (Enhancements to flag manipulation instructions)
 - FEAT_GTG (Guest translation granule size)
 - FEAT_HAFDBS (Hardware management of the access flag and dirty bit state)
+- FEAT_HBC (Hinted conditional branches)
 - FEAT_HCX (Support for the HCRX_EL2 register)
 - FEAT_HPDS (Hierarchical permission disables)
 - FEAT_HPDS2 (Translation table page-based hardware attributes)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 7ba2402f727..bc7a69a8753 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4088,6 +4088,11 @@ static inline bool isar_feature_aa64_i8mm(const ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, I8MM) != 0;
 }
 
+static inline bool isar_feature_aa64_hbc(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, BC) != 0;
+}
+
 static inline bool isar_feature_aa64_tgran4_lpa2(const ARMISARegisters *id)
 {
 return FIELD_SEX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN4) >= 1;
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index ef64a3f9cba..71113173020 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -126,7 +126,8 @@ CBZ sf:1 011010 nz:1 ... rt:5 &cbz imm=%imm19
 
TBZ . 011011 nz:1 . .. rt:5 &tbz  imm=%imm14 bitpos=%imm31_19
 
-B_cond  0101010 0 ... 0 cond:4 imm=%imm19
+# B.cond and BC.cond
+B_cond  0101010 0 ... c:1 cond:4 imm=%imm19
 
 BR  1101011  1 00 rn:5 0 &r
 BLR 1101011 0001 1 00 rn:5 0 &r
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index bbb4f08109c..203a2b790d5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -815,6 +815,7 @@ uint32_t get_elf_hwcap2(void)
 GET_FEATURE_ID(aa64_sme_f64f64, ARM_HWCAP2_A64_SME_F64F64);
 GET_FEATURE_ID(aa64_sme_i16i64, ARM_HWCAP2_A64_SME_I16I64);
 GET_FEATURE_ID(aa64_sme_fa64, ARM_HWCAP2_A64_SME_FA64);
+GET_FEATURE_ID(aa64_hbc, ARM_HWCAP2_A64_HBC);
 
 return hwcaps;
 }
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 7264ab5ead1..57abaea00cd 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1027,6 +1027,10 @@ void aarch64_max_tcg_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1); /* FEAT_I8MM */
 cpu->isar.id_aa64isar1 = t;
 
+t = cpu->isar.id_aa64isar2;
+t = FIELD_DP64(t, ID_AA64ISAR2, BC, 1);  /* FEAT_HBC */
+cpu->isar.id_aa64isar2 = t;
+
 t = cpu->isar.id_aa64pfr0;
 t = FIELD_DP64(t, ID_AA64PFR0, FP, 1);/* FEAT_FP16 */
 t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);   /* FEAT_FP16 */
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 1b6fbb61e2b..1dd86edae13 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -1453,6 +1453,10 @@ static bool trans_TBZ(DisasContext *s, arg_tbz *a)
 
 static bool trans_B_cond(DisasContext *s, arg_B_cond *a)
 {
+/* BC.cond is only present with FEAT_HBC */
+if (a->c && !dc_isar_feature(aa64_hbc, s)) {
+return false;
+}
 reset_btype(s);
 if (a->cond < 0x0e) {
 /* genuinely conditional branches */
-- 
2.34.1




[PULL 22/30] target/arm: Enable FEAT_MOPS for CPU 'max'

2023-09-21 Thread Peter Maydell
Enable FEAT_MOPS on the AArch64 'max' CPU, and add it to
the list of features we implement.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-13-peter.mayd...@linaro.org
---
 docs/system/arm/emulation.rst | 1 +
 linux-user/elfload.c  | 1 +
 target/arm/tcg/cpu64.c| 1 +
 3 files changed, 3 insertions(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 1fb6a2e8c3e..965cbf84c51 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -58,6 +58,7 @@ the following architecture extensions:
 - FEAT_LSE (Large System Extensions)
 - FEAT_LSE2 (Large System Extensions v2)
 - FEAT_LVA (Large Virtual Address space)
+- FEAT_MOPS (Standardization of memory operations)
 - FEAT_MTE (Memory Tagging Extension)
 - FEAT_MTE2 (Memory Tagging Extension)
 - FEAT_MTE3 (MTE Asymmetric Fault Handling)
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 203a2b790d5..db75cd4b33f 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -816,6 +816,7 @@ uint32_t get_elf_hwcap2(void)
 GET_FEATURE_ID(aa64_sme_i16i64, ARM_HWCAP2_A64_SME_I16I64);
 GET_FEATURE_ID(aa64_sme_fa64, ARM_HWCAP2_A64_SME_FA64);
 GET_FEATURE_ID(aa64_hbc, ARM_HWCAP2_A64_HBC);
+GET_FEATURE_ID(aa64_mops, ARM_HWCAP2_A64_MOPS);
 
 return hwcaps;
 }
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 57abaea00cd..68928e51272 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1028,6 +1028,7 @@ void aarch64_max_tcg_initfn(Object *obj)
 cpu->isar.id_aa64isar1 = t;
 
 t = cpu->isar.id_aa64isar2;
+t = FIELD_DP64(t, ID_AA64ISAR2, MOPS, 1); /* FEAT_MOPS */
 t = FIELD_DP64(t, ID_AA64ISAR2, BC, 1);  /* FEAT_HBC */
 cpu->isar.id_aa64isar2 = t;
 
-- 
2.34.1




[PULL 07/30] target/arm: Update AArch64 ID register field definitions

2023-09-21 Thread Peter Maydell
Update our AArch64 ID register field definitions from the 2023-06
system register XML release:
 https://developer.arm.com/documentation/ddi0601/2023-06/

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
---
 target/arm/cpu.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f2e3dc49a66..7ba2402f727 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2166,6 +2166,7 @@ FIELD(ID_AA64ISAR0, SHA1, 8, 4)
 FIELD(ID_AA64ISAR0, SHA2, 12, 4)
 FIELD(ID_AA64ISAR0, CRC32, 16, 4)
 FIELD(ID_AA64ISAR0, ATOMIC, 20, 4)
+FIELD(ID_AA64ISAR0, TME, 24, 4)
 FIELD(ID_AA64ISAR0, RDM, 28, 4)
 FIELD(ID_AA64ISAR0, SHA3, 32, 4)
 FIELD(ID_AA64ISAR0, SM3, 36, 4)
@@ -2200,6 +2201,13 @@ FIELD(ID_AA64ISAR2, APA3, 12, 4)
 FIELD(ID_AA64ISAR2, MOPS, 16, 4)
 FIELD(ID_AA64ISAR2, BC, 20, 4)
 FIELD(ID_AA64ISAR2, PAC_FRAC, 24, 4)
+FIELD(ID_AA64ISAR2, CLRBHB, 28, 4)
+FIELD(ID_AA64ISAR2, SYSREG_128, 32, 4)
+FIELD(ID_AA64ISAR2, SYSINSTR_128, 36, 4)
+FIELD(ID_AA64ISAR2, PRFMSLC, 40, 4)
+FIELD(ID_AA64ISAR2, RPRFM, 48, 4)
+FIELD(ID_AA64ISAR2, CSSC, 52, 4)
+FIELD(ID_AA64ISAR2, ATS1A, 60, 4)
 
 FIELD(ID_AA64PFR0, EL0, 0, 4)
 FIELD(ID_AA64PFR0, EL1, 4, 4)
@@ -2227,6 +2235,12 @@ FIELD(ID_AA64PFR1, SME, 24, 4)
 FIELD(ID_AA64PFR1, RNDR_TRAP, 28, 4)
 FIELD(ID_AA64PFR1, CSV2_FRAC, 32, 4)
 FIELD(ID_AA64PFR1, NMI, 36, 4)
+FIELD(ID_AA64PFR1, MTE_FRAC, 40, 4)
+FIELD(ID_AA64PFR1, GCS, 44, 4)
+FIELD(ID_AA64PFR1, THE, 48, 4)
+FIELD(ID_AA64PFR1, MTEX, 52, 4)
+FIELD(ID_AA64PFR1, DF2, 56, 4)
+FIELD(ID_AA64PFR1, PFAR, 60, 4)
 
 FIELD(ID_AA64MMFR0, PARANGE, 0, 4)
 FIELD(ID_AA64MMFR0, ASIDBITS, 4, 4)
@@ -2258,6 +2272,7 @@ FIELD(ID_AA64MMFR1, AFP, 44, 4)
 FIELD(ID_AA64MMFR1, NTLBPA, 48, 4)
 FIELD(ID_AA64MMFR1, TIDCP1, 52, 4)
 FIELD(ID_AA64MMFR1, CMOW, 56, 4)
+FIELD(ID_AA64MMFR1, ECBHB, 60, 4)
 
 FIELD(ID_AA64MMFR2, CNP, 0, 4)
 FIELD(ID_AA64MMFR2, UAO, 4, 4)
@@ -2279,7 +2294,9 @@ FIELD(ID_AA64DFR0, DEBUGVER, 0, 4)
 FIELD(ID_AA64DFR0, TRACEVER, 4, 4)
 FIELD(ID_AA64DFR0, PMUVER, 8, 4)
 FIELD(ID_AA64DFR0, BRPS, 12, 4)
+FIELD(ID_AA64DFR0, PMSS, 16, 4)
 FIELD(ID_AA64DFR0, WRPS, 20, 4)
+FIELD(ID_AA64DFR0, SEBEP, 24, 4)
 FIELD(ID_AA64DFR0, CTX_CMPS, 28, 4)
 FIELD(ID_AA64DFR0, PMSVER, 32, 4)
 FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
@@ -2287,12 +2304,14 @@ FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
 FIELD(ID_AA64DFR0, TRACEBUFFER, 44, 4)
 FIELD(ID_AA64DFR0, MTPMU, 48, 4)
 FIELD(ID_AA64DFR0, BRBE, 52, 4)
+FIELD(ID_AA64DFR0, EXTTRCBUFF, 56, 4)
 FIELD(ID_AA64DFR0, HPMN0, 60, 4)
 
 FIELD(ID_AA64ZFR0, SVEVER, 0, 4)
 FIELD(ID_AA64ZFR0, AES, 4, 4)
 FIELD(ID_AA64ZFR0, BITPERM, 16, 4)
 FIELD(ID_AA64ZFR0, BFLOAT16, 20, 4)
+FIELD(ID_AA64ZFR0, B16B16, 24, 4)
 FIELD(ID_AA64ZFR0, SHA3, 32, 4)
 FIELD(ID_AA64ZFR0, SM4, 40, 4)
 FIELD(ID_AA64ZFR0, I8MM, 44, 4)
@@ -2300,9 +2319,13 @@ FIELD(ID_AA64ZFR0, F32MM, 52, 4)
 FIELD(ID_AA64ZFR0, F64MM, 56, 4)
 
 FIELD(ID_AA64SMFR0, F32F32, 32, 1)
+FIELD(ID_AA64SMFR0, BI32I32, 33, 1)
 FIELD(ID_AA64SMFR0, B16F32, 34, 1)
 FIELD(ID_AA64SMFR0, F16F32, 35, 1)
 FIELD(ID_AA64SMFR0, I8I32, 36, 4)
+FIELD(ID_AA64SMFR0, F16F16, 42, 1)
+FIELD(ID_AA64SMFR0, B16B16, 43, 1)
+FIELD(ID_AA64SMFR0, I16I32, 44, 4)
 FIELD(ID_AA64SMFR0, F64F64, 48, 1)
 FIELD(ID_AA64SMFR0, I16I64, 52, 4)
 FIELD(ID_AA64SMFR0, SMEVER, 56, 4)
-- 
2.34.1




[PULL 00/30] target-arm queue

2023-09-21 Thread Peter Maydell
Hi; here's this week's arm pullreq. Mostly this is my
work on FEAT_MOPS and FEAT_HBC, but there are some
other bits and pieces in there too, including a recent
set of elf2dmp patches.

thanks
-- PMM

The following changes since commit 55394dcbec8f0c29c30e792c102a0edd50a52bf4:

  Merge tag 'pull-loongarch-20230920' of https://gitlab.com/gaosong/qemu into staging (2023-09-20 13:56:18 -0400)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20230921

for you to fetch changes up to 231f6a7d66254a58bedbee458591b780e0a507b1:

  elf2dmp: rework PDB_STREAM_INDEXES::segments obtaining (2023-09-21 16:13:54 
+0100)


target-arm queue:
 * target/m68k: Add URL to semihosting spec
 * docs/devel/loads-stores: Fix git grep regexes
 * hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel
 * linux-user: Correct SME feature names reported in cpuinfo
 * linux-user: Add missing arm32 hwcaps
 * Don't skip MTE checks for LDRT/STRT at EL0
 * Implement FEAT_HBC
 * Implement FEAT_MOPS
 * audio/jackaudio: Avoid dynamic stack allocation
 * sbsa-ref: add non-secure EL2 virtual timer
 * elf2dmp: improve Win2022, Win11 and large dumps


Fabian Vogt (1):
  hw/arm/boot: Set SCR_EL3.FGTEn when booting kernel

Marcin Juszkiewicz (1):
  sbsa-ref: add non-secure EL2 virtual timer

Peter Maydell (23):
  target/m68k: Add URL to semihosting spec
  docs/devel/loads-stores: Fix git grep regexes
  linux-user/elfload.c: Correct SME feature names reported in cpuinfo
  linux-user/elfload.c: Add missing arm and arm64 hwcap values
  linux-user/elfload.c: Report previously missing arm32 hwcaps
  target/arm: Update AArch64 ID register field definitions
  target/arm: Update user-mode ID reg mask values
  target/arm: Implement FEAT_HBC
  target/arm: Remove unused allocation_tag_mem() argument
  target/arm: Don't skip MTE checks for LDRT/STRT at EL0
  target/arm: Implement FEAT_MOPS enable bits
  target/arm: Pass unpriv bool to get_a64_user_mem_index()
  target/arm: Define syndrome function for MOPS exceptions
  target/arm: New function allocation_tag_mem_probe()
  target/arm: Implement MTE tag-checking functions for FEAT_MOPS
  target/arm: Implement the SET* instructions
  target/arm: Define new TB flag for ATA0
  target/arm: Implement the SETG* instructions
  target/arm: Implement MTE tag-checking functions for FEAT_MOPS copies
  target/arm: Implement the CPY* instructions
  target/arm: Enable FEAT_MOPS for CPU 'max'
  audio/jackaudio: Avoid dynamic stack allocation in qjack_client_init
  audio/jackaudio: Avoid dynamic stack allocation in qjack_process()

Viktor Prutyanov (5):
  elf2dmp: replace PE export name check with PDB name check
  elf2dmp: introduce physical block alignment
  elf2dmp: introduce merging of physical memory runs
  elf2dmp: use Linux mmap with MAP_NORESERVE when possible
  elf2dmp: rework PDB_STREAM_INDEXES::segments obtaining

 docs/devel/loads-stores.rst|  40 +-
 docs/system/arm/emulation.rst  |   2 +
 contrib/elf2dmp/addrspace.h|   1 +
 contrib/elf2dmp/pdb.h  |   2 +-
 contrib/elf2dmp/qemu_elf.h |   2 +
 target/arm/cpu.h   |  35 ++
 target/arm/internals.h |  55 +++
 target/arm/syndrome.h  |  12 +
 target/arm/tcg/helper-a64.h|  14 +
 target/arm/tcg/translate.h |   4 +-
 target/arm/tcg/a64.decode  |  38 +-
 audio/jackaudio.c  |  21 +-
 contrib/elf2dmp/addrspace.c|  31 +-
 contrib/elf2dmp/main.c | 154 
 contrib/elf2dmp/pdb.c  |  15 +-
 contrib/elf2dmp/qemu_elf.c |  68 +++-
 hw/arm/boot.c  |   4 +
 hw/arm/sbsa-ref.c  |   2 +
 linux-user/elfload.c   |  72 +++-
 target/arm/helper.c|  39 +-
 target/arm/tcg/cpu64.c |   5 +
 target/arm/tcg/helper-a64.c| 878 +
 target/arm/tcg/hflags.c|  21 +
 target/arm/tcg/mte_helper.c| 281 +++--
 target/arm/tcg/translate-a64.c | 164 +++-
 target/m68k/m68k-semi.c|   4 +
 tests/tcg/aarch64/sysregs.c|   4 +-
 27 files changed, 1768 insertions(+), 200 deletions(-)



[PULL 16/30] target/arm: Implement MTE tag-checking functions for FEAT_MOPS

2023-09-21 Thread Peter Maydell
The FEAT_MOPS instructions need a couple of helper routines that
check for MTE tag failures:
 * mte_mops_probe() checks whether there is going to be a tag
   error in the next up-to-a-page worth of data
 * mte_check_fail() is an existing function to record the fact
   of a tag failure, which we need to make global so we can
   call it from helper-a64.c
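The core of mte_mops_probe() below is granule arithmetic: round the bounds of the region to the 16-byte MTE tag granule, count the granules, and on a mismatch work out how many bytes precede the failing granule. A hedged standalone sketch of just that arithmetic (illustrative helpers, not QEMU's code; TAG_GRANULE is 16 for MTE):

```c
#include <assert.h>
#include <stdint.h>

#define TAG_GRANULE 16u

/* Round down to a power-of-two boundary, as QEMU_ALIGN_DOWN() does. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
    return x & ~(a - 1);
}

/*
 * Number of tag granules covered by [ptr, ptr + size), mirroring the
 * tag_first/tag_last/tag_count computation in mte_mops_probe().
 */
static int granule_count(uint64_t ptr, uint64_t size)
{
    uint64_t tag_first = align_down(ptr, TAG_GRANULE);
    uint64_t tag_last = align_down(ptr + size - 1, TAG_GRANULE);
    return (int)((tag_last - tag_first) / TAG_GRANULE) + 1;
}

/*
 * Bytes accessible when the first n granules (0-based count) matched,
 * mirroring the function's final return expression: for n == 0 the very
 * first access already fails at @ptr.
 */
static uint64_t accessible_bytes(uint64_t ptr, unsigned n)
{
    uint64_t tag_first = align_down(ptr, TAG_GRANULE);
    return n == 0 ? 0 : n * TAG_GRANULE - (ptr - tag_first);
}
```

Note how an unaligned start both adds a granule to the count and shortens the bytes reachable before the first failing granule.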

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-7-peter.mayd...@linaro.org
---
 target/arm/internals.h  | 28 +++
 target/arm/tcg/mte_helper.c | 54 +++--
 2 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 5f5393b25c4..a70a7fd50f6 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1272,6 +1272,34 @@ FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12)  /* size 
- 1 */
 bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
 uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t 
ra);
 
+/**
+ * mte_mops_probe: Check where the next MTE failure is for a FEAT_MOPS 
operation
+ * @env: CPU env
+ * @ptr: start address of memory region (dirty pointer)
+ * @size: length of region (guaranteed not to cross a page boundary)
+ * @desc: MTEDESC descriptor word (0 means no MTE checks)
+ * Returns: the size of the region that can be copied without hitting
+ *  an MTE tag failure
+ *
+ * Note that we assume that the caller has already checked the TBI
+ * and TCMA bits with mte_checks_needed() and an MTE check is definitely
+ * required.
+ */
+uint64_t mte_mops_probe(CPUARMState *env, uint64_t ptr, uint64_t size,
+uint32_t desc);
+
+/**
+ * mte_check_fail: Record an MTE tag check failure
+ * @env: CPU env
+ * @desc: MTEDESC descriptor word
+ * @dirty_ptr: Failing dirty address
+ * @ra: TCG retaddr
+ *
+ * This may never return (if the MTE tag checks are configured to fault).
+ */
+void mte_check_fail(CPUARMState *env, uint32_t desc,
+uint64_t dirty_ptr, uintptr_t ra);
+
 static inline int allocation_tag_from_addr(uint64_t ptr)
 {
 return extract64(ptr, 56, 4);
diff --git a/target/arm/tcg/mte_helper.c b/target/arm/tcg/mte_helper.c
index 303bcc7fd84..1cb61cea7af 100644
--- a/target/arm/tcg/mte_helper.c
+++ b/target/arm/tcg/mte_helper.c
@@ -617,8 +617,8 @@ static void mte_async_check_fail(CPUARMState *env, uint64_t 
dirty_ptr,
 }
 
 /* Record a tag check failure.  */
-static void mte_check_fail(CPUARMState *env, uint32_t desc,
-   uint64_t dirty_ptr, uintptr_t ra)
+void mte_check_fail(CPUARMState *env, uint32_t desc,
+uint64_t dirty_ptr, uintptr_t ra)
 {
 int mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
 ARMMMUIdx arm_mmu_idx = core_to_aa64_mmu_idx(mmu_idx);
@@ -991,3 +991,53 @@ uint64_t HELPER(mte_check_zva)(CPUARMState *env, uint32_t 
desc, uint64_t ptr)
  done:
 return useronly_clean_ptr(ptr);
 }
+
+uint64_t mte_mops_probe(CPUARMState *env, uint64_t ptr, uint64_t size,
+uint32_t desc)
+{
+int mmu_idx, tag_count;
+uint64_t ptr_tag, tag_first, tag_last;
+void *mem;
+bool w = FIELD_EX32(desc, MTEDESC, WRITE);
+uint32_t n;
+
+mmu_idx = FIELD_EX32(desc, MTEDESC, MIDX);
+/* True probe; this will never fault */
+mem = allocation_tag_mem_probe(env, mmu_idx, ptr,
+   w ? MMU_DATA_STORE : MMU_DATA_LOAD,
+   size, MMU_DATA_LOAD, true, 0);
+if (!mem) {
+return size;
+}
+
+/*
+ * TODO: checkN() is not designed for checks of the size we expect
+ * for FEAT_MOPS operations, so we should implement this differently.
+ * Maybe we should do something like
+ *   if (region start and size are aligned nicely) {
+ *  do direct loads of 64 tag bits at a time;
+ *   } else {
+ *  call checkN()
+ *   }
+ */
+/* Round the bounds to the tag granule, and compute the number of tags. */
+ptr_tag = allocation_tag_from_addr(ptr);
+tag_first = QEMU_ALIGN_DOWN(ptr, TAG_GRANULE);
+tag_last = QEMU_ALIGN_DOWN(ptr + size - 1, TAG_GRANULE);
+tag_count = ((tag_last - tag_first) / TAG_GRANULE) + 1;
+n = checkN(mem, ptr & TAG_GRANULE, ptr_tag, tag_count);
+if (likely(n == tag_count)) {
+return size;
+}
+
+/*
+ * Failure; for the first granule, it's at @ptr. Otherwise
+ * it's at the first byte of the nth granule. Calculate how
+ * many bytes we can access without hitting that failure.
+ */
+if (n == 0) {
+return 0;
+} else {
+return n * TAG_GRANULE - (ptr - tag_first);
+}
+}
-- 
2.34.1




[PULL 18/30] target/arm: Define new TB flag for ATA0

2023-09-21 Thread Peter Maydell
Currently the only tag-setting instructions always do so in the
context of the current EL, and so we only need one ATA bit in the TB
flags.  The FEAT_MOPS SETG instructions include ones which set tags
for a non-privileged access, so we now also need the equivalent "are
tags enabled?" information for EL0.

Add the new TB flag, and convert the existing 'bool ata' field in
DisasContext to a 'bool ata[2]' that can be indexed by the is_unpriv
bit in an instruction, similarly to mte[2].
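Independent of any TCG detail, the shape of the change is: a single flag becomes a pair indexed by the instruction's unpriv bit, and in contexts where unprivileged and normal accesses behave identically the one bit is simply duplicated so the translator never needs a special case. A toy sketch of that pattern (names invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ctx {
    /* Tag access enabled; index with the instruction's is_unpriv bit. */
    bool ata[2];
};

/*
 * Mirrors the "duplicate the bit when unpriv and normal are the same"
 * idea from the hflags change below: only when a distinct unprivileged
 * context exists does slot [1] get its own value.
 */
static void toy_set_ata(struct toy_ctx *ctx, bool ata_el, bool have_unpriv,
                        bool ata0)
{
    ctx->ata[0] = ata_el;
    ctx->ata[1] = have_unpriv ? ata0 : ata_el;
}
```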

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230912140434.169-9-peter.mayd...@linaro.org
---
 target/arm/cpu.h   |  1 +
 target/arm/tcg/translate.h |  4 ++--
 target/arm/tcg/hflags.c| 12 
 target/arm/tcg/translate-a64.c | 23 ---
 4 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 266c1a9ea1b..bd55c5dabfd 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3171,6 +3171,7 @@ FIELD(TBFLAG_A64, SVL, 24, 4)
 FIELD(TBFLAG_A64, SME_TRAP_NONSTREAMING, 28, 1)
 FIELD(TBFLAG_A64, FGT_ERET, 29, 1)
 FIELD(TBFLAG_A64, NAA, 30, 1)
+FIELD(TBFLAG_A64, ATA0, 31, 1)
 
 /*
  * Helpers for using the above.
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index f748ba6f394..63922f8bad1 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -114,8 +114,8 @@ typedef struct DisasContext {
 bool unpriv;
 /* True if v8.3-PAuth is active.  */
 bool pauth_active;
-/* True if v8.5-MTE access to tags is enabled.  */
-bool ata;
+/* True if v8.5-MTE access to tags is enabled; index with is_unpriv.  */
+bool ata[2];
 /* True if v8.5-MTE tag checks affect the PE; index with is_unpriv.  */
 bool mte_active[2];
 /* True with v8.5-BTI and SCTLR_ELx.BT* set.  */
diff --git a/target/arm/tcg/hflags.c b/target/arm/tcg/hflags.c
index ea642384f5a..cea1adb7b62 100644
--- a/target/arm/tcg/hflags.c
+++ b/target/arm/tcg/hflags.c
@@ -325,6 +325,18 @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, 
int el, int fp_el,
 && allocation_tag_access_enabled(env, 0, sctlr)) {
 DP_TBFLAG_A64(flags, MTE0_ACTIVE, 1);
 }
+/*
+ * For unpriv tag-setting accesses we also need ATA0. Again, in
+ * contexts where unpriv and normal insns are the same we
+ * duplicate the ATA bit to save effort for translate-a64.c.
+ */
+if (EX_TBFLAG_A64(flags, UNPRIV)) {
+if (allocation_tag_access_enabled(env, 0, sctlr)) {
+DP_TBFLAG_A64(flags, ATA0, 1);
+}
+} else {
+DP_TBFLAG_A64(flags, ATA0, EX_TBFLAG_A64(flags, ATA));
+}
 /* Cache TCMA as well as TBI. */
 DP_TBFLAG_A64(flags, TCMA, aa64_va_parameter_tcma(tcr, mmu_idx));
 }
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index bb7b15cb6cb..da4aabbaf4e 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -2272,7 +2272,7 @@ static void handle_sys(DisasContext *s, bool isread,
 clean_addr = clean_data_tbi(s, tcg_rt);
 gen_probe_access(s, clean_addr, MMU_DATA_STORE, MO_8);
 
-if (s->ata) {
+if (s->ata[0]) {
 /* Extract the tag from the register to match STZGM.  */
 tag = tcg_temp_new_i64();
 tcg_gen_shri_i64(tag, tcg_rt, 56);
@@ -2289,7 +2289,7 @@ static void handle_sys(DisasContext *s, bool isread,
 clean_addr = clean_data_tbi(s, tcg_rt);
 gen_helper_dc_zva(cpu_env, clean_addr);
 
-if (s->ata) {
+if (s->ata[0]) {
 /* Extract the tag from the register to match STZGM.  */
 tag = tcg_temp_new_i64();
 tcg_gen_shri_i64(tag, tcg_rt, 56);
@@ -3070,7 +3070,7 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
 tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);
 
 /* Perform the tag store, if tag access enabled. */
-if (s->ata) {
+if (s->ata[0]) {
 if (tb_cflags(s->base.tb) & CF_PARALLEL) {
 gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
 } else {
@@ -3768,7 +3768,7 @@ static bool trans_STZGM(DisasContext *s, arg_ldst_tag *a)
 tcg_gen_addi_i64(addr, addr, a->imm);
 tcg_rt = cpu_reg(s, a->rt);
 
-if (s->ata) {
+if (s->ata[0]) {
 gen_helper_stzgm_tags(cpu_env, addr, tcg_rt);
 }
 /*
@@ -3800,7 +3800,7 @@ static bool trans_STGM(DisasContext *s, arg_ldst_tag *a)
 tcg_gen_addi_i64(addr, addr, a->imm);
 tcg_rt = cpu_reg(s, a->rt);
 
-if (s->ata) {
+if (s->ata[0]) {
 gen_helper_stgm(cpu_env, addr, tcg_rt);
 } else {
 MMUAccessType acc = MMU_DATA_STORE;
@@ -3832,7 +3832,7 @@ static bool trans_LDGM(DisasContext *s, arg_ldst_tag *a)
 tcg_gen_addi_i64(addr, addr, a->imm);
 tcg_rt = cp

Re: [PATCH v1 13/22] vfio: Add base container

2023-09-21 Thread Eric Auger
Hi Zhenzhong,
On 9/21/23 05:35, Duan, Zhenzhong wrote:
> Hi Eric,
>
>> -Original Message-
>> From: Eric Auger 
>> Sent: Thursday, September 21, 2023 1:31 AM
>> Subject: Re: [PATCH v1 13/22] vfio: Add base container
>>
>> Hi Zhenzhong,
>>
>> On 9/19/23 19:23, Cédric Le Goater wrote:
>>> On 8/30/23 12:37, Zhenzhong Duan wrote:
 From: Yi Liu 

 Abstract the VFIOContainer to be a base object. It is supposed to be
 embedded by legacy VFIO container and later on, into the new iommufd
 based container.

 The base container implements generic code such as code related to
 memory_listener and address space management. The VFIOContainerOps
 implements callbacks that depend on the kernel user space being used.

 'common.c' and vfio device code only manipulate the base container with
 wrapper functions that call the functions defined in
 VFIOContainerOpsClass.
 Existing 'container.c' code is converted to implement the legacy
 container
 ops functions.

 Below is the base container. It's named as VFIOContainer, old
 VFIOContainer
 is replaced with VFIOLegacyContainer.
>>> Usually, we introduce the new interface solely, port the current models
>>> on top of the new interface, wire the new models in the current
>>> implementation and remove the old implementation. Then, we can start
>>> adding extensions to support other implementations.
>>>
>>> spapr should be taken care of separately following the principle above.
>>> With my PPC hat, I would not even read such a massive change, too risky
>>> for the subsystem. This patch will need (much) further splitting to be
>>> understandable and acceptable.
>> We might split this patch by
>> 1) introducing VFIOLegacyContainer encapsulating the base VFIOContainer,
>> without using the ops in a first place:
>>  common.c would call vfio_container_* with hardcoded legacy
>> implementation, ie. retrieving the legacy container with container_of.
>> 2) we would introduce the BE interface without using it.
>> 3) we would use the new BE interface
>>
>> Obviously this needs to be further tried out. If you wish I can try to
>> split it that way ... Please let me know
> Sure, thanks for your help, glad that I can cooperate with you to move
> this series forward.
> I just updated the branch which rebased to newest upstream for you to pick at 
> https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_cdev_v1_rebased 

I have spent most of my day reshuffling this single patch into numerous
ones (16!). This should help the review.
I was short of time; this compiles, and the end code should be identical
to the original one. That said, it deserves some additional review on
your end, commit message tuning, ...

But at least it is a move forward. Feel free to incorporate that in your
next respin.

Please find that work on the following branch

https://github.com/eauger/qemu/tree/iommufd_cdev_v1_rebased_split

Thanks

Eric
>
> Thanks
> Zhenzhong




Re: EDK2 ArmVirtQemu behaviour with multiple UARTs

2023-09-21 Thread Gerd Hoffmann
On Thu, Sep 21, 2023 at 04:34:27PM +0100, Peter Maydell wrote:
> As long as EDK2 does something sensible when the DTB says "two
> UARTs here and here" and it also finds a virtio-serial PCI
> device, I don't mind what exactly it does. The problem here is
> more that EDK2 currently does strange things when told that
> the hardware is present, rather than that anybody specifically wants
> EDK2 to use multiple serial outputs.
> 
> Though given there's no way to say in the DTB "use a PCI card
> for your console" I think the virtio-serial approach is likely
> to be awkward for users in practice.

edk2 adds a virtio console to the edk2 console multiplexer if
present (for both pci and mmio virtio transports), and systemd
also spawns a getty on /dev/hvc0 if present.  So this works mostly
automatically.  Only if you want the linux boot messages to show
up there too do you need to add 'console=hvc0' to your kernel
command line.

take care,
  Gerd




Re: [PATCH v2] linux-user: Fixes for zero_bss

2023-09-21 Thread Philippe Mathieu-Daudé

On 9/9/23 20:45, Richard Henderson wrote:

The previous change, 2d385be6152, assumed !PAGE_VALID meant that
the page would be unmapped by the elf image.  However, since we
reserved the entire image space via mmap, PAGE_VALID will always
be set.  Instead, assume PROT_NONE for the same condition.

Furthermore, assume bss is only ever present for writable segments,
and that there is no page overlap between PT_LOAD segments.
Instead of an assert, return false to indicate failure.

Cc: qemu-sta...@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1854
Fixes: 2d385be6152 ("linux-user: Do not adjust zero_bss for host page size")
Signed-off-by: Richard Henderson 
---
v2: Pass errp to zero_bss, so we can give a reasonable error message.
---
  linux-user/elfload.c | 53 +---
  1 file changed, 40 insertions(+), 13 deletions(-)


To the best of my knowledge,

Reviewed-by: Philippe Mathieu-Daudé 




Re: [Bug 1819289] Re: Windows 95 and Windows 98 will not install or run

2023-09-21 Thread John M
This is happening again in 8.1. I used Windows 95 for a while in 6.1 and it
was fine, but when I tried to upgrade to 8.1, it started happening again. I
tried reducing the memory and it still happens. Not an urgent issue though.

On Mon, Aug 30, 2021 at 2:05 AM Philippe Mathieu-Daudé <
1819...@bugs.launchpad.net> wrote:

> Brad said later after testing v6.1 it was fixed so please disregard
> previous comment ¯\_(ツ)_/¯
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1819289
>
> Title:
>   Windows 95 and Windows 98 will not install or run
>
> Status in QEMU:
>   Fix Released
>
> Bug description:
>   The last version of QEMU I have been able to run Windows 95 or Windows
>   98 on was 2.7 or 2.8. Recent versions since then even up to 3.1 will
>   either not install or will not run 95 or 98 at all. I have tried every
>   combination of options like isapc or no isapc, cpu pentium  or cpu as
>   486. Tried different memory configurations, but they just don't work
>   anymore.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1819289/+subscriptions
>
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1819289

Title:
  Windows 95 and Windows 98 will not install or run

Status in QEMU:
  Fix Released

Bug description:
  The last version of QEMU I have been able to run Windows 95 or Windows
  98 on was 2.7 or 2.8. Recent versions since then even up to 3.1 will
  either not install or will not run 95 or 98 at all. I have tried every
  combination of options like isapc or no isapc, cpu pentium  or cpu as
  486. Tried different memory configurations, but they just don't work
  anymore.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1819289/+subscriptions




[PATCH 1/1] hw/ide/core: terminate in-flight DMA on IDE bus reset

2023-09-21 Thread Simon Rowe
When an IDE controller is reset, its internal state is being cleared
before any outstanding I/O is cancelled. If a response to DMA is
received in this window, the aio callback will incorrectly continue
with the next part of the transfer (now using sector 0 from
the cleared controller state).

For a write operation, this results in user data being written to the
MBR, replacing the first stage bootloader and/or partition table. A
malicious user could exploit this bug to first read the MBR and then
rewrite it with user-controlled bootloader code.

This addresses the bug by checking if DRQ_STAT is still set in the DMA
callback (as it is otherwise cleared at the start of the bus
reset). If it is not, treat the transfer as ended.

This only appears to affect SATA controllers; plain IDE does not use
aio.
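The guard the patch adds can be viewed as a pure predicate: the transfer ends either normally (no sectors left) or because a bus reset cleared DRQ between I/O submission and the callback. A hedged standalone model of just that condition (status-bit constants copied here to keep the sketch self-contained; this is not QEMU's actual code path):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ATA status bits, matching the values used in hw/ide. */
#define DRQ_STAT   0x08
#define READY_STAT 0x40

/* True if the DMA callback should take the end-of-transfer path. */
static bool dma_transfer_done(int nsector, uint8_t status)
{
    /* A bus reset clears DRQ, so a missing DRQ means "stop now". */
    return nsector == 0 || !(status & DRQ_STAT);
}
```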

Fixes: CVE-2023-5088
Signed-off-by: Simon Rowe 
Cc: Felipe Franciosi 
---
 hw/ide/core.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index b5e0dcd29b..826b7eaeeb 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -906,8 +906,12 @@ static void ide_dma_cb(void *opaque, int ret)
 s->nsector -= n;
 }
 
-/* end of transfer ? */
-if (s->nsector == 0) {
+/*
+ * End of transfer ?
+ * If a bus reset occurs immediately before the callback is invoked the
+ * bus state will have been cleared. Terminate the transfer.
+ */
+if (s->nsector == 0 || !(s->status & DRQ_STAT)) {
 s->status = READY_STAT | SEEK_STAT;
 ide_bus_set_irq(s->bus);
 goto eot;
-- 
2.22.3




[PATCH 0/1] CVE-2023-5088

2023-09-21 Thread Simon Rowe
The attached patch fixes CVE-2023-5088 in which a bug in QEMU could
cause a guest I/O operation otherwise addressed to an arbitrary disk
offset to be targeted to offset 0 instead (potentially overwriting the
VM's boot code). This could be used, for example, by L2 guests with a
virtual disk (vdiskL2) stored on a virtual disk of an L1 (vdiskL1)
hypervisor to read and/or write data to LBA 0 of vdiskL1, potentially
gaining control of L1 at its next reboot.

The symptoms behind this bug have been previously reported as MBR
corruptions and an independent fix has been posted by Fiona Ebner in:

https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg03883.html

Simon Rowe (1):
  hw/ide/core: terminate in-flight DMA on IDE bus reset

 hw/ide/core.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

-- 
2.22.3




Re: [PATCH 2/4] target/ppc: Add recording of taken branches to BHRB

2023-09-21 Thread Glenn Miles




It all looks pretty good otherwise. I do worry about POWER8/9 which
do not have BHRB disable bit. How much do they slow down I wonder?


That is a good question!  I'll see if I can get some linux boot times
with and without the changes on P9.



I ran some tests booting ubuntu 20.04 on the powernv9 model and found
that it took about 5% longer to boot to the login prompt with the
changes than without.  As discussed offline, I'll remove this support
from P8/P9 as the 5% reduction in performance is not worth the added
functionality.  We could perhaps add it back in the future with a
command line option that has it disabled by default on P8/P9.

P10 and later have it disabled for non-problem-state branches, so we
should be good there.

Thanks,

Glenn



Re: EDK2 ArmVirtQemu behaviour with multiple UARTs

2023-09-21 Thread Peter Maydell
On Thu, 21 Sept 2023 at 16:26, Gerd Hoffmann  wrote:
>
> On Thu, Sep 21, 2023 at 11:50:20AM +0100, Peter Maydell wrote:
> > Hi; I've been looking again at a very long standing missing feature in
> > the QEMU virt board, which is that we only have one UART. One of the
> > things that has stalled this in the past has been the odd behaviour of
> > EDK2 if the DTB that QEMU passes it describes two UARTs.
>
> Note that edk2 recently got support for virtio-serial, so you can use
> that for the console and leave the uart for debug logging.  The prebuilt
> edk2 binaries in qemu were updated a few days ago and these already
> support virtio-serial.

As long as EDK2 does something sensible when the DTB says "two
UARTs here and here" and it also finds a virtio-serial PCI
device, I don't mind what exactly it does. The problem here is
more that EDK2 currently does strange things when told that
the hardware is present, rather than that anybody specifically wants
EDK2 to use multiple serial outputs.

Though given there's no way to say in the DTB "use a PCI card
for your console" I think the virtio-serial approach is likely
to be awkward for users in practice.

-- PMM



Re: EDK2 ArmVirtQemu behaviour with multiple UARTs

2023-09-21 Thread Gerd Hoffmann
On Thu, Sep 21, 2023 at 11:50:20AM +0100, Peter Maydell wrote:
> Hi; I've been looking again at a very long standing missing feature in
> the QEMU virt board, which is that we only have one UART. One of the
> things that has stalled this in the past has been the odd behaviour of
> EDK2 if the DTB that QEMU passes it describes two UARTs.

Note that edk2 recently got support for virtio-serial, so you can use
that for the console and leave the uart for debug logging.  The prebuilt
edk2 binaries in qemu were updated a few days ago and these already
support virtio-serial.

take care,
  Gerd




Re: [PULL 00/17] Trivial patches for 2023-09-21

2023-09-21 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PULL v2 00/22] implement discard operation for Parallels images

2023-09-21 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.




Re: [PULL v2 00/28] Block layer patches

2023-09-21 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.




Re: [PULL v3 0/9] testing updates (back to green!)

2023-09-21 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.




Re: [PATCH v2 3/5] elf2dmp: introduce merging of physical memory runs

2023-09-21 Thread Peter Maydell
On Fri, 15 Sept 2023 at 18:02, Viktor Prutyanov  wrote:
>
> DMP supports 42 physical memory runs at most. So, merge adjacent
> physical memory ranges from QEMU ELF when possible to minimize total
> number of runs.
>
> Signed-off-by: Viktor Prutyanov 
> ---
>  contrib/elf2dmp/main.c | 56 --
>  1 file changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
> index b7e3930164..b4683575fd 100644
> --- a/contrib/elf2dmp/main.c
> +++ b/contrib/elf2dmp/main.c
> @@ -20,6 +20,7 @@
>  #define PE_NAME "ntoskrnl.exe"
>
>  #define INITIAL_MXCSR   0x1f80
> +#define MAX_NUMBER_OF_RUNS  42
>
>  typedef struct idt_desc {
>  uint16_t offset1;   /* offset bits 0..15 */
> @@ -234,6 +235,42 @@ static int fix_dtb(struct va_space *vs, QEMU_Elf *qe)
>  return 1;
>  }
>
> +static void try_merge_runs(struct pa_space *ps,
> +WinDumpPhyMemDesc64 *PhysicalMemoryBlock)
> +{
> +unsigned int merge_cnt = 0, run_idx = 0;
> +
> +PhysicalMemoryBlock->NumberOfRuns = 0;
> +
> +for (size_t idx = 0; idx < ps->block_nr; idx++) {
> +struct pa_block *blk = ps->block + idx;
> +struct pa_block *next = blk + 1;
> +
> +PhysicalMemoryBlock->NumberOfPages += blk->size / ELF2DMP_PAGE_SIZE;
> +
> +if (idx + 1 != ps->block_nr && blk->paddr + blk->size == 
> next->paddr) {
> +printf("Block #%lu 0x%"PRIx64"+:0x%"PRIx64" and %u previous will 
> be"
> +" merged\n", idx, blk->paddr, blk->size, merge_cnt);
> +merge_cnt++;
> +} else {
> +struct pa_block *first_merged = blk - merge_cnt;
> +
> +printf("Block #%lu 0x%"PRIx64"+:0x%"PRIx64" and %u previous will 
> be"
> +" merged to 0x%"PRIx64"+:0x%"PRIx64" (run #%u)\n",
> +idx, blk->paddr, blk->size, merge_cnt, 
> first_merged->paddr,
> +blk->paddr + blk->size - first_merged->paddr, run_idx);
> +PhysicalMemoryBlock->Run[run_idx] = (WinDumpPhyMemRun64) {
> +.BasePage = first_merged->paddr / ELF2DMP_PAGE_SIZE,
> +.PageCount = (blk->paddr + blk->size - first_merged->paddr) /
> +ELF2DMP_PAGE_SIZE,
> +};

Hi; this fails to build on 32-bit hosts because in both these printf()
statements the format string uses "%lu" to print a size_t. This
doesn't work if size_t is not a 'long'. The right format specifier is "%zu".
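The portability point is easy to demonstrate: the `z` length modifier is defined to match size_t exactly, whereas `%lu` expects an unsigned long, which only coincidentally has the same width as size_t on some ABIs. A minimal sketch (the `format_size` helper is invented for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a size_t portably: %zu matches size_t on every host. */
static int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "block #%zu", value);
}
```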

I have squashed in the relevant change to this patch in target-arm.next.

thanks
-- PMM



Re: [PATCH 6/7] target/arm: Update user-mode ID reg mask values

2023-09-21 Thread Peter Maydell
On Mon, 11 Sept 2023 at 14:53, Peter Maydell  wrote:
>
> For user-only mode we reveal a subset of the AArch64 ID registers
> to the guest, to emulate the kernel's trap-and-emulate-ID-regs
> handling. Update the feature bit masks to match upstream kernel
> commit a48fa7efaf1161c1c.
>
> None of these features are yet implemented by QEMU, so this
> doesn't yet have a behavioural change, but implementation of
> FEAT_MOPS and FEAT_HBC is imminent.
>
> Signed-off-by: Peter Maydell 
> ---
>  target/arm/helper.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)

I forgot to update tests/tcg/aarch64/sysregs.c to indicate
that the new fields are permitted to be visible to userspace.
This patch needs the following squashed in:

diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c
index d8eb06abcf2..f7a055f1d5f 100644
--- a/tests/tcg/aarch64/sysregs.c
+++ b/tests/tcg/aarch64/sysregs.c
@@ -126,7 +126,7 @@ int main(void)
  */
 get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,,f0ff,fff0));
 get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,,));
-get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(,,,));
+get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(00ff,,00ff,));
 /* TGran4 & TGran64 as pegged to -1 */
 get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,,ff00,));
 get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(,f000,,));
@@ -138,7 +138,7 @@ int main(void)
 get_cpu_reg_check_mask(id_aa64dfr0_el1,  _m(,,,0006));
 get_cpu_reg_check_zero(id_aa64dfr1_el1);
 get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1,  _m(0ff0,ff0f,00ff,00ff));
-get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,,));
+get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(8ff1,fcff,,));

 get_cpu_reg_check_zero(id_aa64afr0_el1);
 get_cpu_reg_check_zero(id_aa64afr1_el1);

to avoid check-tcg failing when the new features like FEAT_MOPS
or FEAT_HBC are present in 'max'.

-- PMM



Re: [PATCH v2] linux-user: Fixes for zero_bss

2023-09-21 Thread Michael Tokarev

09.09.2023 21:45, Richard Henderson wrote:

The previous change, 2d385be6152, assumed !PAGE_VALID meant that
the page would be unmapped by the elf image.  However, since we
reserved the entire image space via mmap, PAGE_VALID will always
be set.  Instead, assume PROT_NONE for the same condition.

Furthermore, assume bss is only ever present for writable segments,
and that there is no page overlap between PT_LOAD segments.
Instead of an assert, return false to indicate failure.

Cc: qemu-sta...@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1854
Fixes: 2d385be6152 ("linux-user: Do not adjust zero_bss for host page size")
Signed-off-by: Richard Henderson 
---
v2: Pass errp to zero_bss, so we can give a reasonable error message.
---
  linux-user/elfload.c | 53 +---
  1 file changed, 40 insertions(+), 13 deletions(-)


Ping? Has this been forgotten?
I picked this one up for the debian 8.1 package; at least I don't see
regressions with it applied (together with the stuff staging for 8.1.1).

Thanks,

/mjt



[PATCH] qom/object_interfaces: Handle `len-` property first

2023-09-21 Thread Lu Gao
From: "Gao, Lu" 

Array property needs corresponding `len-` property set first to add
actual array properties. Then we need to make sure `len-` property is
set first before array property.

But when the model is used with like
`-device driver[,prop[=value][,...]]`
in QEMU command line options, this is not guaranteed in current
property set from qdict. Array property might be
handled before 'len-' property, then leads to an error.
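The fix below is a two-pass walk over the qdict: one pass applying only the `len-*` keys, then a second pass for everything else, so the array allocation always happens before any element is assigned regardless of iteration order. The ordering trick in isolation, with plain string keys standing in for the qdict (a sketch with illustrative names, not the QOM code itself):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PROP_ARRAY_LEN_PREFIX "len-"

static bool is_len_prop(const char *key)
{
    return strncmp(key, PROP_ARRAY_LEN_PREFIX,
                   strlen(PROP_ARRAY_LEN_PREFIX)) == 0;
}

/*
 * Copy the keys into out[] with every "len-" key first, preserving the
 * relative order within each group -- the same effect as the patch's
 * two loops over the qdict.
 */
static void order_props(const char *const *keys, int n, const char **out)
{
    int w = 0;
    for (int i = 0; i < n; i++) {
        if (is_len_prop(keys[i])) {
            out[w++] = keys[i];
        }
    }
    for (int i = 0; i < n; i++) {
        if (!is_len_prop(keys[i])) {
            out[w++] = keys[i];
        }
    }
}
```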

Signed-off-by: Lu Gao 
Signed-off-by: Jianxian Wen 
---
 qom/object_interfaces.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
index 7d31589b04..87500401a4 100644
--- a/qom/object_interfaces.c
+++ b/qom/object_interfaces.c
@@ -18,6 +18,7 @@
 #include "qapi/opts-visitor.h"
 #include "qemu/config-file.h"
 #include "qemu/keyval.h"
+#include "hw/qdev-properties.h"
 
 bool user_creatable_complete(UserCreatable *uc, Error **errp)
 {
@@ -52,8 +53,22 @@ static void object_set_properties_from_qdict(Object *obj, const QDict *qdict,
 return;
 }
 for (e = qdict_first(qdict); e; e = qdict_next(qdict, e)) {
-if (!object_property_set(obj, e->key, v, errp)) {
-goto out;
+/* set "len-" first for the array props to be allocated first */
+if (strncmp(e->key, PROP_ARRAY_LEN_PREFIX,
+strlen(PROP_ARRAY_LEN_PREFIX)) == 0) {
+if (!object_property_set(obj, e->key, v, errp)) {
+goto out;
+}
+}
+}
+
+for (e = qdict_first(qdict); e; e = qdict_next(qdict, e)) {
+/* "len-" has been set above */
+if (strncmp(e->key, PROP_ARRAY_LEN_PREFIX,
+strlen(PROP_ARRAY_LEN_PREFIX)) != 0) {
+if (!object_property_set(obj, e->key, v, errp)) {
+goto out;
+}
 }
 }
 visit_check_struct(v, errp);
-- 
2.17.1




Re: [PATCH v2 07/10] virtiofsd: Use qemu_get_runtime_dir()

2023-09-21 Thread Akihiko Odaki

On 2023/09/21 21:58, Stefan Hajnoczi wrote:

On Thu, Nov 10, 2022 at 07:06:26PM +0900, Akihiko Odaki wrote:

qemu_get_runtime_dir() is used to construct the path to a lock file.

Signed-off-by: Akihiko Odaki 
---
  tools/virtiofsd/fuse_virtio.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
index 9368e292e4..b9eeed85e6 100644
--- a/tools/virtiofsd/fuse_virtio.c
+++ b/tools/virtiofsd/fuse_virtio.c
@@ -901,12 +901,12 @@ static bool fv_socket_lock(struct fuse_session *se)
  {
  g_autofree gchar *sk_name = NULL;
  g_autofree gchar *pidfile = NULL;
-g_autofree gchar *state = NULL;
+g_autofree gchar *run = NULL;
  g_autofree gchar *dir = NULL;
  Error *local_err = NULL;
  
-state = qemu_get_local_state_dir();
-dir = g_build_filename(state, "run", "virtiofsd", NULL);
+run = qemu_get_runtime_dir();
+dir = g_build_filename(run, "virtiofsd", NULL);
  
  if (g_mkdir_with_parents(dir, S_IRWXU) < 0) {
  fuse_log(FUSE_LOG_ERR, "%s: Failed to create directory %s: %s\n",


tools/virtiofsd/ no longer exists. Which version of QEMU did you develop 
against?

commit e0dc2631ec4ac718ebe22ddea0ab25524eb37b0e
Author: Dr. David Alan Gilbert 
Date:   Wed Jan 18 12:11:51 2023 +

 virtiofsd: Remove source

Stefan


It is an old version of the series. You can find the latest version at:
https://patchew.org/QEMU/20230921075425.16738-1-akihiko.od...@daynix.com/

Regards,
Akihiko Odaki



Re: [PATCH v2 7/7] qobject atomics osdep: Make a few macros more hygienic

2023-09-21 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 20.09.2023 um 20:31 hat Markus Armbruster geschrieben:

[...]

>> diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
>> index 9003b71fd3..d36cc97805 100644
>> --- a/include/qapi/qmp/qobject.h
>> +++ b/include/qapi/qmp/qobject.h
>> @@ -45,10 +45,17 @@ struct QObject {
>>  struct QObjectBase_ base;
>>  };
>>  
>> -#define QOBJECT(obj) ({ \
>> +/*
>> + * Preprocessor sorcery ahead: use a different identifier for the
>> + * local variable in each expansion, so we can nest macro calls
>> + * without shadowing variables.
>> + */
>> +#define QOBJECT_INTERNAL(obj, _obj) ({  \
>>  typeof(obj) _obj = (obj);   \
>> -_obj ? container_of(&(_obj)->base, QObject, base) : NULL;   \
>> +_obj\
>> +? container_of(&(_obj)->base, QObject, base) : NULL;\
>
> What happened here? The code in this line (or now two lines) seems to be
> unchanged apart from a strange looking newline.

Accident, will fix, thanks!

>>  })
>> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), MAKE_IDENTFIER(_obj))
>
> Kevin




Re: [PATCH v3 00/10] Validate and test qapi examples

2023-09-21 Thread Markus Armbruster
Victor Toso  writes:

> Hi,
>
> v2: https://lists.gnu.org/archive/html/qemu-devel/2023-09/msg02383.html
>
> - Sorry Markus, I kept the two last 'fix example' patches as I don't
>   fully remember how we should go with it.

That's fine.

I see two sane alternatives:

1. Add suitable elision syntax.  Happy to discuss details, but I think
we should discuss my review of PATCH 10 before we complicate matters
further.

2. Decide we don't need elisions.  Delete the ones we have.  I'd like to
see an argument that the ones we have are not helpful enough to justify
the effort to keep them.

>Not taking them but taking
>   the generator would be bad as we would fail the build.

Understood.

[...]




Re: [PATCH] accel/kvm/kvm-all: Handle register access errors

2023-09-21 Thread Peter Maydell
On Thu, 21 Sept 2023 at 08:25, Akihiko Odaki  wrote:
> On 2023/06/19 21:19, Peter Maydell wrote:
> > On Sat, 10 Jun 2023 at 04:51, Akihiko Odaki  
> > wrote:
> >> On 2022/12/01 20:00, Akihiko Odaki wrote:
> >>> On 2022/12/01 19:40, Peter Maydell wrote:
>  On Thu, 1 Dec 2022 at 10:27, Akihiko Odaki 
>  wrote:
> > A register access error typically means something has gone seriously
> > wrong, so that anything may happen afterwards and recovery is
> > impossible.
> > Even a single failed register access is catastrophic, as the
> > architecture-specific code is not written to tolerate such
> > failures.
> >
> > Make sure the VM stops and nothing worse happens if such an error occurs.
> >
> > Signed-off-by: Akihiko Odaki 

> >> QEMU 8.0 is already released so I think it's time to revisit this.
> >
> > Two months ago would have been a better time :-) We're heading up
> > towards softfreeze for 8.1 in about three weeks from now.

> Hi Peter,
>
> Please apply this.

Looking again at the patch I see it hasn't been reviewed by
anybody on the KVM side of things. Paolo, does this seem like
the right way to handle errors from kvm_arch_get_registers()
and kvm_arch_put_registers() ?

The original patch is at:
https://patchew.org/QEMU/20221201102728.69751-1-akihiko.od...@daynix.com/

thanks
-- PMM



Re: [PATCH v2] plugins/hotblocks: Fix potential deadlock in plugin_exit() function

2023-09-21 Thread Philippe Mathieu-Daudé

On 21/9/23 11:23, Cong Liu wrote:

This patch fixes a potential deadlock in QEMU's plugin_exit() function.
The original code releases the mutex only when the block list `it` is
non-NULL, leaving the lock held otherwise. This patch moves the unlock
outside the conditional so the mutex is released on every path.

Signed-off-by: Cong Liu 
Suggested-by: Philippe Mathieu-Daudé 


Not really suggested, just reviewed ;)

Reviewed-by: Philippe Mathieu-Daudé 


---
  contrib/plugins/hotblocks.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/plugins/hotblocks.c b/contrib/plugins/hotblocks.c
index 6b74d25fead6..b99b93ad8dc7 100644
--- a/contrib/plugins/hotblocks.c
+++ b/contrib/plugins/hotblocks.c
@@ -69,9 +69,9 @@ static void plugin_exit(qemu_plugin_id_t id, void *p)
  }
  
  g_list_free(it);

-g_mutex_unlock(&lock);
  }
  
+g_mutex_unlock(&lock);

  qemu_plugin_outs(report->str);
  }
  




