Re: [PATCH] accel/tcg: nochain should disable goto_ptr

2024-05-31 Thread niugen

on 2024/6/1 01:32, Richard Henderson wrote:

On 5/31/24 03:17, NiuGenen wrote:

Signed-off-by: NiuGenen 
---
  accel/tcg/cpu-exec.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2972f75b96..084fa645c7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -173,7 +173,7 @@ uint32_t curr_cflags(CPUState *cpu)
  } else if (qatomic_read(_insn_per_tb)) {
  cflags |= CF_NO_GOTO_TB | 1;
  } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-    cflags |= CF_NO_GOTO_TB;
+    cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
  }
    return cflags;


Why?

The original intent of nochain was so that -d exec would log all 
blocks, which requires excluding goto_tb.  There is exec logging in 
helper_lookup_goto_ptr, so there is no need to avoid goto_ptr.


You must provide a rationale, at minimum.


r~



Sorry, my mistake. I thought nochain will disable all kinds of branches, 
including direct branch and indirect branch, but I found that indirect 
branch still call helper_lookup_tb_ptr to continue executing TB instead 
of epilogue-tblookup-prologue.


Maybe the exec logging can be removed from helper_lookup_tb_ptr and 
nochain can disable all the chaining of TB?


Thanks for your patience.




Re: [PATCH v4 00/31] Add AMD Secure Nested Paging (SEV-SNP) support

2024-05-31 Thread Gupta, Pankaj

Hi Paolo,


please check if branch qemu-coco-queue of
https://gitlab.com/bonzini/qemu works for you!


Getting compilation error here: Hope I am looking at correct branch.


Oops, sorry:

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 96dc41d355c..ede3ef1225f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -168,7 +168,7 @@ static const char *vm_type_name[] = {
  [KVM_X86_DEFAULT_VM] = "default",
  [KVM_X86_SEV_VM] = "SEV",
  [KVM_X86_SEV_ES_VM] = "SEV-ES",
-[KVM_X86_SEV_SNP_VM] = "SEV-SNP",
+[KVM_X86_SNP_VM] = "SEV-SNP",
  };

  bool kvm_is_vm_type_supported(int type)

Tested the above builds, and pushed!


Thank you for your work! I tested (quick tests) the updated branch and 
OVMF [1], it works well for single bios option[2] & direct kernel boot 
[3]. For some reason separate 'pflash' & 'bios' option, facing issue 
(maybe some other bug in my code, will try to figure it out and get back 
on this). Also, will check your comment on mailing list on patch [4],

maybe they are related.

For now I think we are good with the 'qemu-coco-queue' & single bios 
binary configuration using 'AmdSevX64'.


[1]  https://github.com/mdroth/edk2/commits/apic-mmio-fix1d/

[2]  -bios 
/home/amd/AMDSEV/OVMF_CODE-upstream-20240228-apicfix-1c-AmdSevX64.fd


[3] Direct kernel loading with '-bios 
/home/amd/AMDSEV/ovmf/OVMF_CODE-apic-mmio-fix1d-AmdSevX64.fd'


[4] "hw/i386/sev: Allow use of pflash in conjunction with -bios"

Thanks,
Pankaj


Paolo


softmmu.fa.p/target_i386_kvm_kvm.c.o.d -o
libqemu-x86_64-softmmu.fa.p/target_i386_kvm_kvm.c.o -c
../target/i386/kvm/kvm.c
../target/i386/kvm/kvm.c:171:6: error: ‘KVM_X86_SEV_SNP_VM’ undeclared
here (not in a function); did you mean ‘KVM_X86_SEV_ES_VM’?
171 | [KVM_X86_SEV_SNP_VM] = "SEV-SNP",
|  ^~
|  KVM_X86_SEV_ES_VM

Thanks,
Pankaj



I tested it successfully on CentOS 9 Stream with kernel from kvm/next
and firmware from edk2-ovmf-20240524-1.fc41.noarch.

Paolo


i386/sev: Replace error_report with error_setg
linux-headers: Update to current kvm/next
i386/sev: Introduce "sev-common" type to encapsulate common SEV state
i386/sev: Move sev_launch_update to separate class method
i386/sev: Move sev_launch_finish to separate class method
i386/sev: Introduce 'sev-snp-guest' object
i386/sev: Add a sev_snp_enabled() helper
i386/sev: Add sev_kvm_init() override for SEV class
i386/sev: Add snp_kvm_init() override for SNP class
i386/cpu: Set SEV-SNP CPUID bit when SNP enabled
i386/sev: Don't return launch measurements for SEV-SNP guests
i386/sev: Add a class method to determine KVM VM type for SNP guests
i386/sev: Update query-sev QAPI format to handle SEV-SNP
i386/sev: Add the SNP launch start context
i386/sev: Add handling to encrypt/finalize guest launch data
i386/sev: Set CPU state to protected once SNP guest payload is finalized
hw/i386/sev: Add function to get SEV metadata from OVMF header
i386/sev: Add support for populating OVMF metadata pages
i386/sev: Add support for SNP CPUID validation
i386/sev: Invoke launch_updata_data() for SEV class
i386/sev: Invoke launch_updata_data() for SNP class
i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
i386/sev: Extract build_kernel_loader_hashes
i386/sev: Reorder struct declarations
i386/sev: Allow measured direct kernel boot on SNP
hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
memory: Introduce memory_region_init_ram_guest_memfd()

These patches need a small prerequisite that I'll post soon:

hw/i386/sev: Use guest_memfd for legacy ROMs
hw/i386: Add support for loading BIOS using guest_memfd

This one definitely requires more work:

hw/i386/sev: Allow use of pflash in conjunction with -bios


Paolo











[PATCH] chardev: add path option for pty backend

2024-05-31 Thread Octavian Purdila
Add path option to the pty char backend which will create a symbolic
link to the given path that points to the allocated PTY.

Based on patch from Paulo Neves:

https://patchew.org/QEMU/1548509635-15776-1-git-send-email-ptsne...@gmail.com/

Tested with the following invocations that the link is created and
removed when qemu stops:

  qemu-system-x86_64 -nodefaults -mon chardev=compat_monitor \
  -chardev pty,path=test,id=compat_monitor0

  qemu-system-x86_64 -nodefaults -monitor pty:test

Also tested that when a link path is not passed invocations still work, e.g.:

  qemu-system-x86_64 -monitor pty

Co-authored-by: Paulo Neves 
Signed-off-by: Octavian Purdila 
---
 chardev/char-pty.c | 34 ++
 chardev/char.c |  5 +
 qapi/char.json |  4 ++--
 qemu-options.hx| 14 ++
 4 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index cc2f7617fe..7cd5d34851 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -29,6 +29,7 @@
 #include "qemu/sockets.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "qemu/option.h"
 #include "qemu/qemu-print.h"
 
 #include "chardev/char-io.h"
@@ -41,6 +42,7 @@ struct PtyChardev {
 
 int connected;
 GSource *timer_src;
+char *symlink_path;
 };
 typedef struct PtyChardev PtyChardev;
 
@@ -204,6 +206,12 @@ static void char_pty_finalize(Object *obj)
 Chardev *chr = CHARDEV(obj);
 PtyChardev *s = PTY_CHARDEV(obj);
 
+/* unlink symlink */
+if (s->symlink_path) {
+unlink(s->symlink_path);
+g_free(s->symlink_path);
+}
+
 pty_chr_state(chr, 0);
 object_unref(OBJECT(s->ioc));
 pty_chr_timer_cancel(s);
@@ -330,6 +338,7 @@ static void char_pty_open(Chardev *chr,
 int master_fd, slave_fd;
 char pty_name[PATH_MAX];
 char *name;
+char *symlink_path = backend->u.pty.data->device;
 
 master_fd = qemu_openpty_raw(_fd, pty_name);
 if (master_fd < 0) {
@@ -354,12 +363,37 @@ static void char_pty_open(Chardev *chr,
 g_free(name);
 s->timer_src = NULL;
 *be_opened = false;
+
+/* create symbolic link */
+if (symlink_path) {
+int res = symlink(pty_name, symlink_path);
+
+if (res != 0) {
+error_setg_errno(errp, errno, "Failed to create PTY symlink");
+close(master_fd);
+} else {
+s->symlink_path = g_strdup(symlink_path);
+}
+}
+}
+
+static void char_pty_parse(QemuOpts *opts, ChardevBackend *backend,
+   Error **errp)
+{
+const char *path = qemu_opt_get(opts, "path");
+ChardevHostdev *dev;
+
+backend->type = CHARDEV_BACKEND_KIND_PTY;
+dev = backend->u.pty.data = g_new0(ChardevHostdev, 1);
+qemu_chr_parse_common(opts, qapi_ChardevHostdev_base(dev));
+dev->device = path ? g_strdup(path) : NULL;
 }
 
 static void char_pty_class_init(ObjectClass *oc, void *data)
 {
 ChardevClass *cc = CHARDEV_CLASS(oc);
 
+cc->parse = char_pty_parse;
 cc->open = char_pty_open;
 cc->chr_write = char_pty_chr_write;
 cc->chr_update_read_handler = pty_chr_update_read_handler;
diff --git a/chardev/char.c b/chardev/char.c
index 3c43fb1278..404c6b8a4f 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -428,6 +428,11 @@ QemuOpts *qemu_chr_parse_compat(const char *label, const 
char *filename,
 qemu_opt_set(opts, "path", p, _abort);
 return opts;
 }
+if (strstart(filename, "pty:", )) {
+qemu_opt_set(opts, "backend", "pty", _abort);
+qemu_opt_set(opts, "path", p, _abort);
+return opts;
+}
 if (strstart(filename, "tcp:", ) ||
 strstart(filename, "telnet:", ) ||
 strstart(filename, "tn3270:", ) ||
diff --git a/qapi/char.json b/qapi/char.json
index 777dde55d9..4c74bfc437 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -509,7 +509,7 @@
 ##
 # @ChardevHostdevWrapper:
 #
-# @data: Configuration info for device and pipe chardevs
+# @data: Configuration info for device, pty and pipe chardevs
 #
 # Since: 1.4
 ##
@@ -650,7 +650,7 @@
 'pipe': 'ChardevHostdevWrapper',
 'socket': 'ChardevSocketWrapper',
 'udp': 'ChardevUdpWrapper',
-'pty': 'ChardevCommonWrapper',
+'pty': 'ChardevHostdevWrapper',
 'null': 'ChardevCommonWrapper',
 'mux': 'ChardevMuxWrapper',
 'msmouse': 'ChardevCommonWrapper',
diff --git a/qemu-options.hx b/qemu-options.hx
index 8ca7f34ef0..5eec194242 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3569,7 +3569,7 @@ DEF("chardev", HAS_ARG, QEMU_OPTION_chardev,
 "-chardev console,id=id[,mux=on|off][,logfile=PATH][,logappend=on|off]\n"
 "-chardev 
serial,id=id,path=path[,mux=on|off][,logfile=PATH][,logappend=on|off]\n"
 #else
-"-chardev pty,id=id[,mux=on|off][,logfile=PATH][,logappend=on|off]\n"
+"-chardev 

[PATCH v2 2/8] hw/riscv/virt.c: add aplic nodename helper

2024-05-31 Thread Daniel Henrique Barboza
We'll change the aplic DT nodename in the next patch and the name is
hardcoded in 2 different functions. Create a helper to change a single
place later.

While we're at it, in create_fdt_socket_aplic(), move 'aplic_name'
inside the conditional to avoid allocating a string that won't be used
when socket == NULL.

Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 1a7e1e73c5..07a07f5ce1 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -588,6 +588,12 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 
 }
 
+/* Caller must free string after use */
+static char *fdt_get_aplic_nodename(unsigned long aplic_addr)
+{
+return g_strdup_printf("/soc/aplic@%lx", aplic_addr);
+}
+
 static void create_fdt_one_aplic(RISCVVirtState *s, int socket,
  unsigned long aplic_addr, uint32_t aplic_size,
  uint32_t msi_phandle,
@@ -597,7 +603,7 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
  bool m_mode, int num_harts)
 {
 int cpu;
-g_autofree char *aplic_name = NULL;
+g_autofree char *aplic_name = fdt_get_aplic_nodename(aplic_addr);
 g_autofree uint32_t *aplic_cells = g_new0(uint32_t, num_harts * 2);
 MachineState *ms = MACHINE(s);
 
@@ -606,7 +612,6 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 aplic_cells[cpu * 2 + 1] = cpu_to_be32(m_mode ? IRQ_M_EXT : IRQ_S_EXT);
 }
 
-aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
 qemu_fdt_add_subnode(ms->fdt, aplic_name);
 qemu_fdt_setprop_string(ms->fdt, aplic_name, "compatible", "riscv,aplic");
 qemu_fdt_setprop_cell(ms->fdt, aplic_name, "#address-cells",
@@ -648,7 +653,6 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
 uint32_t *aplic_phandles,
 int num_harts)
 {
-g_autofree char *aplic_name = NULL;
 unsigned long aplic_addr;
 MachineState *ms = MACHINE(s);
 uint32_t aplic_m_phandle, aplic_s_phandle;
@@ -674,9 +678,8 @@ static void create_fdt_socket_aplic(RISCVVirtState *s,
  aplic_s_phandle, 0,
  false, num_harts);
 
-aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
-
 if (!socket) {
+g_autofree char *aplic_name = fdt_get_aplic_nodename(aplic_addr);
 platform_bus_add_all_fdt_nodes(ms->fdt, aplic_name,
memmap[VIRT_PLATFORM_BUS].base,
memmap[VIRT_PLATFORM_BUS].size,
-- 
2.45.1




[PATCH v2 7/8] hw/riscv/virt.c: imsics DT: add 'qemu, imsics' to 'compatible'

2024-05-31 Thread Daniel Henrique Barboza
The DT docs for riscv,imsics [1] predicts a 'qemu,imsics' enum in the
'compatible' property.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

Reported-by: Conor Dooley 
Fixes: 28d8c281200f ("hw/riscv: virt: Add optional AIA IMSIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 56d7e945c6..ac70993679 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -515,6 +515,9 @@ static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr 
base_addr,
 uint32_t imsic_max_hart_per_socket, imsic_addr, imsic_size;
 g_autofree uint32_t *imsic_cells = NULL;
 g_autofree uint32_t *imsic_regs = NULL;
+static const char * const imsic_compat[2] = {
+"qemu,imsics", "riscv,imsics"
+};
 
 imsic_cells = g_new0(uint32_t, ms->smp.cpus * 2);
 imsic_regs = g_new0(uint32_t, socket_count * 4);
@@ -541,7 +544,10 @@ static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr 
base_addr,
 imsic_name = g_strdup_printf("/soc/interrupt-controller@%lx",
  (unsigned long)base_addr);
 qemu_fdt_add_subnode(ms->fdt, imsic_name);
-qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible", "riscv,imsics");
+qemu_fdt_setprop_string_array(ms->fdt, imsic_name, "compatible",
+  (char **)_compat,
+  ARRAY_SIZE(imsic_compat));
+
 qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#interrupt-cells",
   FDT_IMSIC_INT_CELLS);
 qemu_fdt_setprop(ms->fdt, imsic_name, "interrupt-controller", NULL, 0);
-- 
2.45.1




[PATCH v2 5/8] hw/riscv/virt.c: aplic DT: rename prop to 'riscv, delegation'

2024-05-31 Thread Daniel Henrique Barboza
The DT docs for riscv,aplic [1] predicts a 'riscv,delegation' property.
Not 'riscv,delegate'.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

Reported-by: Conor Dooley 
Fixes: e6faee65855b ("hw/riscv: virt: Add optional AIA APLIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 366fe042cc..0a18547c6d 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -640,7 +640,7 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 if (aplic_child_phandle) {
 qemu_fdt_setprop_cell(ms->fdt, aplic_name, "riscv,children",
   aplic_child_phandle);
-qemu_fdt_setprop_cells(ms->fdt, aplic_name, "riscv,delegate",
+qemu_fdt_setprop_cells(ms->fdt, aplic_name, "riscv,delegation",
aplic_child_phandle, 0x1,
VIRT_IRQCHIP_NUM_SOURCES);
 }
-- 
2.45.1




[PATCH v2 4/8] hw/riscv/virt.c: aplic DT: add 'qemu, aplic' to 'compatible'

2024-05-31 Thread Daniel Henrique Barboza
The DT docs for riscv,aplic [1] predicts a 'qemu,aplic' enum in the
'compatible' property.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

Reported-by: Conor Dooley 
Fixes: e6faee65855b ("hw/riscv: virt: Add optional AIA APLIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 5505047945..366fe042cc 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -606,6 +606,9 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 g_autofree char *aplic_name = fdt_get_aplic_nodename(aplic_addr);
 g_autofree uint32_t *aplic_cells = g_new0(uint32_t, num_harts * 2);
 MachineState *ms = MACHINE(s);
+static const char * const aplic_compat[2] = {
+"qemu,aplic", "riscv,aplic"
+};
 
 for (cpu = 0; cpu < num_harts; cpu++) {
 aplic_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
@@ -613,7 +616,9 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 }
 
 qemu_fdt_add_subnode(ms->fdt, aplic_name);
-qemu_fdt_setprop_string(ms->fdt, aplic_name, "compatible", "riscv,aplic");
+qemu_fdt_setprop_string_array(ms->fdt, aplic_name, "compatible",
+  (char **)_compat,
+  ARRAY_SIZE(aplic_compat));
 qemu_fdt_setprop_cell(ms->fdt, aplic_name, "#address-cells",
   FDT_APLIC_ADDR_CELLS);
 qemu_fdt_setprop_cell(ms->fdt, aplic_name,
-- 
2.45.1




[PATCH v2 0/8] hw/riscv/virt.c: aplic/imsic DT fixes

2024-05-31 Thread Daniel Henrique Barboza
Hi,

This is a series that is being spun from the reviews given on patch 1
[1]. We'll fix some DT validation issues we have in the 'virt' machine
[2] that aren't related to missing extensions in the DT spec.

I'll leave to maintainers to squash the patches as they see fit. I
split it this way to make it easier to bissect possible bugs that these
individual changes can cause.

These are the types of DT warnings solved by this series:

/home/danielhb/work/qemu/riscv64_virt.dtb: aplic@d00: $nodename:0: 
'aplic@d00' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: aplic@d00: compatible:0: 
'riscv,aplic' is not one of ['qemu,aplic']
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: aplic@d00: compatible: 
['riscv,aplic'] is too short
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: aplic@d00: Unevaluated 
properties are not allowed ('compatible' was unexpected)
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,aplic.yaml#
(...)
/home/danielhb/work/qemu/riscv64_virt.dtb: imsics@2800: $nodename:0: 
'imsics@2800' does not match '^interrupt-controller(@[0-9a-f,]+)*$'
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: imsics@2800: compatible:0: 
'riscv,imsics' is not one of ['qemu,imsics']
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: imsics@2800: compatible: 
['riscv,imsics'] is too short
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: imsics@2800: '#msi-cells' is a 
required property
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#
/home/danielhb/work/qemu/riscv64_virt.dtb: imsics@2800: Unevaluated 
properties are not allowed ('compatible' was unexpected)
from schema $id: 
http://devicetree.org/schemas/interrupt-controller/riscv,imsics.yaml#

[3] explains how to run 'dt-validate' to reproduce them. To generate a
'processed schema' file what I did was:

- in the Linux kernel tree, run 'make dt_binding_check'. Please note
  that this might require installation of additional python stuff
  (e.g.swig, python3-devel)

- I used the generated file 
'Documentation/devicetree/bindings/processed-schema.json'
  as a 'processed schema'.

Series applicable on both master and alistair/riscv-to-apply.next. 

Changes from v1:
- added patches 2 to 7 to fix the dt-validate warnings on imsics and
  aplic notes
- v1 link: 
https://lore.kernel.org/qemu-riscv/20240530084949.761034-1-dbarb...@ventanamicro.com/

[1] 
https://lore.kernel.org/qemu-riscv/20240530084949.761034-1-dbarb...@ventanamicro.com/
[2] https://lore.kernel.org/all/20240529-rust-tile-a05517a6260f@spud/
[3] 
https://lore.kernel.org/qemu-riscv/20240530-landed-shriek-9362981afade@spud/ 

Daniel Henrique Barboza (8):
  hw/riscv/virt.c: add address-cells in create_fdt_one_aplic()
  hw/riscv/virt.c: add aplic nodename helper
  hw/riscv/virt.c: rename aplic nodename to 'interrupt-controller'
  hw/riscv/virt.c: aplic DT: add 'qemu,aplic' to 'compatible'
  hw/riscv/virt.c: aplic DT: rename prop to 'riscv,delegation'
  hw/riscv/virt.c: change imsic nodename to 'interrupt-controller'
  hw/riscv/virt.c: imsics DT: add 'qemu,imsics' to 'compatible'
  hw/riscv/virt.c: imsics DT: add '#msi-cells'

 hw/riscv/virt.c | 36 +++-
 include/hw/riscv/virt.h |  1 +
 2 files changed, 28 insertions(+), 9 deletions(-)

-- 
2.45.1




[PATCH v2 8/8] hw/riscv/virt.c: imsics DT: add '#msi-cells'

2024-05-31 Thread Daniel Henrique Barboza
The DT docs for riscv,imsics [1] requires a 'msi-cell' property. Add one
and set it zero.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

Reported-by: Conor Dooley 
Fixes: 28d8c281200f ("hw/riscv: virt: Add optional AIA IMSIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index ac70993679..8675c3a7d1 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -552,6 +552,7 @@ static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr 
base_addr,
   FDT_IMSIC_INT_CELLS);
 qemu_fdt_setprop(ms->fdt, imsic_name, "interrupt-controller", NULL, 0);
 qemu_fdt_setprop(ms->fdt, imsic_name, "msi-controller", NULL, 0);
+qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#msi-cells", 0);
 qemu_fdt_setprop(ms->fdt, imsic_name, "interrupts-extended",
  imsic_cells, ms->smp.cpus * sizeof(uint32_t) * 2);
 qemu_fdt_setprop(ms->fdt, imsic_name, "reg", imsic_regs,
-- 
2.45.1




[PATCH v2 3/8] hw/riscv/virt.c: rename aplic nodename to 'interrupt-controller'

2024-05-31 Thread Daniel Henrique Barboza
The correct name of the aplic controller node, as per Linux kernel DT
docs [1], is 'interrupt-controller@addr'.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,aplic.yaml

Reported-by: Conor Dooley 
Fixes: e6faee65855b ("hw/riscv: virt: Add optional AIA APLIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 07a07f5ce1..5505047945 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -591,7 +591,7 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 /* Caller must free string after use */
 static char *fdt_get_aplic_nodename(unsigned long aplic_addr)
 {
-return g_strdup_printf("/soc/aplic@%lx", aplic_addr);
+return g_strdup_printf("/soc/interrupt-controller@%lx", aplic_addr);
 }
 
 static void create_fdt_one_aplic(RISCVVirtState *s, int socket,
-- 
2.45.1




[PATCH v2 6/8] hw/riscv/virt.c: change imsic nodename to 'interrupt-controller'

2024-05-31 Thread Daniel Henrique Barboza
The Linux DT docs for imsic [1] predicts an 'interrupt-controller@addr'
node, not 'imsic@addr', given this node inherits the
'interrupt-controller' node.

[1] Documentation/devicetree/bindings/interrupt-controller/riscv,imsics.yaml

Reported-by: Conor Dooley 
Fixes: 28d8c281200f ("hw/riscv: virt: Add optional AIA IMSIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 0a18547c6d..56d7e945c6 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -538,7 +538,8 @@ static void create_fdt_one_imsic(RISCVVirtState *s, hwaddr 
base_addr,
 }
 }
 
-imsic_name = g_strdup_printf("/soc/imsics@%lx", (unsigned long)base_addr);
+imsic_name = g_strdup_printf("/soc/interrupt-controller@%lx",
+ (unsigned long)base_addr);
 qemu_fdt_add_subnode(ms->fdt, imsic_name);
 qemu_fdt_setprop_string(ms->fdt, imsic_name, "compatible", "riscv,imsics");
 qemu_fdt_setprop_cell(ms->fdt, imsic_name, "#interrupt-cells",
-- 
2.45.1




[PATCH v2 1/8] hw/riscv/virt.c: add address-cells in create_fdt_one_aplic()

2024-05-31 Thread Daniel Henrique Barboza
We need #address-cells properties in all interrupt controllers that are
referred by an interrupt-map [1]. For the RISC-V machine, both PLIC and
APLIC controllers must have this property.

PLIC already sets it in create_fdt_socket_plic(). Set the property for
APLIC in create_fdt_one_aplic().

[1] 
https://lore.kernel.org/linux-arm-kernel/cal_jsqje15d-xxxmelsmud+jqhzzxgzdxvikchn6kfwqk6n...@mail.gmail.com/

Suggested-by: Anup Patel 
Fixes: e6faee65855b ("hw/riscv: virt: Add optional AIA APLIC support to virt 
machine")
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 2 ++
 include/hw/riscv/virt.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4fdb660525..1a7e1e73c5 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -609,6 +609,8 @@ static void create_fdt_one_aplic(RISCVVirtState *s, int 
socket,
 aplic_name = g_strdup_printf("/soc/aplic@%lx", aplic_addr);
 qemu_fdt_add_subnode(ms->fdt, aplic_name);
 qemu_fdt_setprop_string(ms->fdt, aplic_name, "compatible", "riscv,aplic");
+qemu_fdt_setprop_cell(ms->fdt, aplic_name, "#address-cells",
+  FDT_APLIC_ADDR_CELLS);
 qemu_fdt_setprop_cell(ms->fdt, aplic_name,
   "#interrupt-cells", FDT_APLIC_INT_CELLS);
 qemu_fdt_setprop(ms->fdt, aplic_name, "interrupt-controller", NULL, 0);
diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h
index 3db839160f..c0dc41ff9a 100644
--- a/include/hw/riscv/virt.h
+++ b/include/hw/riscv/virt.h
@@ -118,6 +118,7 @@ enum {
 #define FDT_PLIC_ADDR_CELLS   0
 #define FDT_PLIC_INT_CELLS1
 #define FDT_APLIC_INT_CELLS   2
+#define FDT_APLIC_ADDR_CELLS  0
 #define FDT_IMSIC_INT_CELLS   0
 #define FDT_MAX_INT_CELLS 2
 #define FDT_MAX_INT_MAP_WIDTH (FDT_PCI_ADDR_CELLS + FDT_PCI_INT_CELLS + \
-- 
2.45.1




Re: [PULL 25/72] Adds migration support for Branch History Rolling Buffer (BHRB) internal state.

2024-05-31 Thread Fabiano Rosas
Nicholas Piggin  writes:

> From: Glenn Miles 
>
> Reviewed-by: Nicholas Piggin 
> Signed-off-by: Glenn Miles 
> Signed-off-by: Nicholas Piggin 
> ---
>  target/ppc/machine.c | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index 6b6c31d903..731dd8df35 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -711,6 +711,26 @@ static const VMStateDescription vmstate_reservation = {
>  }
>  };
>  
> +#ifdef TARGET_PPC64
> +static bool bhrb_needed(void *opaque)
> +{
> +PowerPCCPU *cpu = opaque;
> +return (cpu->env.flags & POWERPC_FLAG_BHRB) != 0;
> +}
> +
> +static const VMStateDescription vmstate_bhrb = {
> +.name = "cpu/bhrb",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = bhrb_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINTTL(env.bhrb_offset, PowerPCCPU),
> +VMSTATE_UINT64_ARRAY(env.bhrb, PowerPCCPU, BHRB_MAX_NUM_ENTRIES),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +#endif
> +
>  const VMStateDescription vmstate_ppc_cpu = {
>  .name = "cpu",
>  .version_id = 5,
> @@ -756,6 +776,7 @@ const VMStateDescription vmstate_ppc_cpu = {
>  #ifdef TARGET_PPC64
>  _tm,
>  _slb,
> +_bhrb,

Running some tests now that Nick re-enabled ppc for migration tests, I
see that this new state breaks backward migrations:

$ QTEST_TRACE="vmstate_*" \
  QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-ppc64 \
  QTEST_QEMU_BINARY=./qemu-system-ppc64 \
  ./tests/qtest/migration-test -p /ppc64/migration/precopy/tcp/plain
...
vmstate_load_state_field cpu/slb:env.slb
vmstate_n_elems env.slb: 64
vmstate_subsection_load cpu/slb
vmstate_subsection_load_bad cpu/slb: cpu/bhrb/(prefix)
vmstate_load_state_end cpu/slb end/0
vmstate_subsection_load_bad cpu: cpu/bhrb/(lookup)
qemu-system-ppc64: error while loading state for instance 0x0 of device 'cpu'
vmstate_downtime_checkpoint dst-precopy-loadvm-completed
qemu-system-ppc64: load of migration failed: No such file or directory

If you want to support backwards migration, then this needs to be
fixed. Otherwise we can ignore it.



Re: [PULL 00/43] target-arm queue

2024-05-31 Thread Richard Henderson

On 5/31/24 05:03, Peter Maydell wrote:

The following changes since commit 3b2fe44bb7f605f179e5e7feb2c13c2eb3abbb80:

   Merge tag 'pull-request-2024-05-29' ofhttps://gitlab.com/thuth/qemu  into 
staging (2024-05-29 08:38:20 -0700)

are available in the Git repository at:

   https://git.linaro.org/people/pmaydell/qemu-arm.git  
tags/pull-target-arm-20240531

for you to fetch changes up to 3c3c233677d4f2fe5f35c5d6d6e9b53df48054f4:

   hw/usb/hcd-ohci: Fix #1510, #303: pid not IN or OUT (2024-05-31 11:26:00 
+0100)


target-arm:
  * hw/intc/arm_gic: Fix set pending of PPIs
  * hw/intc/arm_gic: Fix writes to GICD_ITARGETSRn
  * xilinx_zynq: Add cache controller
  * xilinx_zynq: Support up to two CPU cores
  * tests/avocado: update sbsa-ref firmware
  * sbsa-ref: move to Neoverse-N2 as default
  * More decodetree conversion of A64 ASIMD insns
  * docs/system/target-arm: Re-alphabetize board list
  * Implement FEAT WFxT and enable for '-cpu max'
  * hw/usb/hcd-ohci: Fix #1510, #303: pid not IN or OUT


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as 
appropriate.


r~




Re: [PATCH V1 19/26] physmem: preserve ram blocks for cpr

2024-05-31 Thread Steven Sistare via

On 5/30/2024 2:39 PM, Peter Xu wrote:

On Thu, May 30, 2024 at 01:12:40PM -0400, Steven Sistare wrote:

On 5/29/2024 3:25 PM, Peter Xu wrote:

On Wed, May 29, 2024 at 01:31:53PM -0400, Steven Sistare wrote:

On 5/28/2024 5:44 PM, Peter Xu wrote:

On Mon, Apr 29, 2024 at 08:55:28AM -0700, Steve Sistare wrote:

Preserve fields of RAMBlocks that allocate their host memory during CPR so
the RAM allocation can be recovered.


This sentence itself did not explain much, IMHO.  QEMU can share memory
using fd based memory already of all kinds, as long as the memory backend
is path-based it can be shared by sharing the same paths to dst.

This reads very confusing as a generic concept.  I mean, QEMU migration
relies on so many things to work right.  We mostly asks the users to "use
exactly the same cmdline for src/dst QEMU unless you know what you're
doing", otherwise many things can break.  That should also include ramblock
being matched between src/dst due to the same cmdlines provided on both
sides.  It'll be confusing to mention this when we thought the ramblocks
also rely on that fact.

So IIUC this sentence should be dropped in the real patch, and I'll try to
guess the real reason with below..


The properties of the implicitly created ramblocks must be preserved.
The defaults can and do change between qemu releases, even when the command-line
parameters do not change for the explicit objects that cause these implicit
ramblocks to be created.


AFAIU, QEMU relies on ramblocks to be the same before this series.  Do you
have an example?  Would that already cause issue when migrate?


Alignment has changed, and used_length vs max_length changed when
resizeable ramblocks were introduced.  I have dealt with these issues
while supporting cpr for our internal use, and the learned lesson is to
explicitly communicate the creation-time parameters to new qemu.


Why used_length can change?  I'm looking at ram_mig_ram_block_resized():

 if (!migration_is_idle()) {
 /*
  * Precopy code on the source cannot deal with the size of RAM blocks
  * changing at random points in time - especially after sending the
  * RAM block sizes in the migration stream, they must no longer change.
  * Abort and indicate a proper reason.
  */
 error_setg(, "RAM block '%s' resized during precopy.", rb->idstr);
 migration_cancel(err);
 error_free(err);
 }

We sent used_length upfront of a migration during SETUP phase.  Looks like
what you're describing can be something different, though?


I was imprecise.  used_length did not change; it was introduced as being
different than max_length when resizeable ramblocks were introduced.

The max_length is not sent.  It is an implicit property of the implementation,
and can change.  It is the size of the memfd mapping, so we need to know it
and preserve it.

used_length is indeed sent during SETUP.  We could also send max_length
at that time, and store both in the struct ramblock, and *maybe* that would
be safe, but that is more fragile and less future proof than setting both
properties to the correct value when the ramblock struct is created.

And BTW, the ramblock properties are sent using ad-hoc code in setup.
I send them using nice clean vmstate.


Regarding to rb->align: isn't that mostly a constant, reflecting the MR's
alignment?  It's set when ramblock is created IIUC:

 rb->align = mr->align;

When will the alignment change?


The alignment specified by the mr to allocate a new block is an implicit 
property
of the implementation, and has changed before, from one qemu release to another.
Not often, but it did, and could again in the future.  Communicating the 
alignment
from old qemu to new qemu is future proof.


These are not an issue for migration because the ramblock is re-created
and the data copied into the new memory.


Mirror the mr->align field in the RAMBlock to simplify the vmstate.
Preserve the old host address, even though it is immediately discarded,
as it will be needed in the future for CPR with iommufd.  Preserve
guest_memfd, even though CPR does not yet support it, to maintain vmstate
compatibility when it becomes supported.


.. It could be about the vfio vaddr update feature that you mentioned and
only for iommufd (as IIUC vfio still relies on iova ranges, then it won't
help here)?

If so, IMHO we should have this patch (or any variance form) to be there
for your upcoming vfio support.  Keeping this around like this will make
the series harder to review.  Or is it needed even before VFIO?


This patch is needed independently of vfio or iommufd.

guest_memfd is independent of vfio or iommufd.  It is a recent addition
which I have not tried to support, but I added this placeholder field
to it can be supported in the future without adding a new field later
and maintaining backwards compatibility.


Is guest_memfd the only user so far, then?  If so, would it be possible we
split it as a separate 

Re: [PATCH V1 17/26] machine: memfd-alloc option

2024-05-31 Thread Steven Sistare via

On 5/30/2024 2:14 PM, Peter Xu wrote:

On Thu, May 30, 2024 at 01:11:09PM -0400, Steven Sistare wrote:

On 5/29/2024 3:14 PM, Peter Xu wrote:

On Wed, May 29, 2024 at 01:31:38PM -0400, Steven Sistare wrote:

diff --git a/system/memory.c b/system/memory.c
index 49f1cb2..ca04a0e 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1552,8 +1552,9 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
  uint64_t size,
  Error **errp)
{
+uint32_t flags = current_machine->memfd_alloc ? RAM_SHARED : 0;


If there's a machine option to "use memfd for allocations", then it's
shared mem... Hmm..

It is a bit confusing to me in quite a few levels:

 - Why memory allocation method will be defined by a machine property,
   even if we have memory-backend-* which should cover everything?


Some memory regions are implicitly created, and have no explicit representation
on the qemu command line.  memfd-alloc affects those.

More generally, memfd-alloc affects all ramblock allocations that are
not explicitly represented by memory-backend object.  Thus the simple
command line "qemu -m 1G" does not explicitly describe an object, so it
goes through the anonymous allocation path, and is affected by memfd-alloc.


Can we simply now allow "qemu -m 1G" to work for cpr-exec?


I assume you meant "simply not allow".

Yes, I could do that, but I would need to explicitly add code to exclude this
case, and add a blocker.  Right now it "just works" for all paths that lead to
ram_block_alloc_host, without any special logic at the memory-backend level.
And, I'm not convinced that simplifies the docs, as now I would need to tell
the user that "-m 1G" and similar constructions do not work with cpr.

I can try to clarify the doc for -memfd-alloc as currently defined.


Why do we need to keep cpr working for existing qemu cmdlines?  We'll
already need to add more new cmdline options already anyway, right?

cpr-reboot wasn't doing it, and that made sense to me, so that new features
will require the user to opt-in for it, starting with changing its
cmdlines.


I agree.  We need a new option to opt-in to cpr-friendly memory allocation, and 
I
am proposing -machine memfd-alloc. I am simply saying that I can try to do a 
better
job explaining the functionality in my proposed text for memfd-alloc, instead of
changing the functionality to exclude "-m 1G".  I believe excluding "-m 1G" is 
the
wrong approach, for the reasons I stated - messier implementation *and* 
documentation.

I am open to different syntax for opting in.


AFAIU that's
what we do with cpr-reboot: we ask the user to specify the right things to
make other thing work.  Otherwise it won't.



Internally, create_default_memdev does create a memory-backend object.
That is what my doc comment above refers to:
Any associated memory-backend objects are created with share=on

An explicit "qemu -object memory-backend-*" is not affected by memfd-alloc.

The qapi comments in patch "migration: cpr-exec mode" attempt to say all that:

+# Memory backend objects must have the share=on attribute, and
+# must be mmap'able in the new QEMU process.  For example,
+# memory-backend-file is acceptable, but memory-backend-ram is
+# not.
+#
+# The VM must be started with the '-machine memfd-alloc=on'
+# option.  This causes implicit ram blocks -- those not explicitly
+# described by a memory-backend object -- to be allocated by
+# mmap'ing a memfd.  Examples include VGA, ROM, and even guest
+# RAM when it is specified without a memory-backend object.


VGA is IIRC 16MB chunk, ROM is even smaller.  If the user specifies -object
memory-backend-file,share=on propertly, these should be the only outliers?

Are these important enough for the downtime?  Can we put them into the
migrated image alongside with the rest device states?


It's not about downtime.  vfio, vdpa, and iommufd pin all guest pages.
The pages must remain pinned during CPR to support ongoing DMA activity
which could target those pages (which we do not quiesce), and the same
physical pages must be used for the ramblocks in the new qemu process.


Ah ok, yes DMA can happen on the fly.

Guest mem is definitely the major DMA target and that can be covered by
-object memory-backend-*,shared=on cmdlines.

ROM is definitely not a DMA target.  So is VGA ram a target for, perhaps,
an assigned vGPU device?  Do we have a list of things that will need that?
Can we make them work somehow by sharing them like guest mem?


The pass-through devices map and pin all memory accessible to the guest.
We cannot make exceptions based on our intuition of how the memory will
and will not be used.

Also, we cannot simply abandon the old pinned ramblocks, owned by an mm_struct
that will become a zombie.  We would actually need to write additional code
to call device ioctls to unmap the oddball ramblocks.  It is far cleaner
and 

[PATCH RFC v2 2/2] ui/gtk: Add a new parameter to assign connectors/monitors

2024-05-31 Thread dongwon . kim
From: Vivek Kasireddy 

The new parameter named "connector" can be used to assign physical
monitors/connectors to individual GFX VCs such that when the monitor
is connected or hot-plugged, the associated GTK window would be
moved to it. If the monitor is disconnected or unplugged, the
associated GTK window would be hidden and a relevant disconnect
event would be sent to the Guest.

One usage example is here:

-display gtk,gl=on,connectors=DP-1:eDP-1:HDMI-2...

With this, the first graphic virtual console will be placed on DP-1
display, second on eDP-1 and the third on HDMI-2.

v2: Connectors is now in a string format that includes all connector
names separated with a colon (previously it was a linked list)

Code refactoring

Cc: Gerd Hoffmann 
Cc: Marc-André Lureau 
Cc: Daniel P. Berrangé 
Cc: Markus Armbruster 
Signed-off-by: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 qapi/ui.json |  25 -
 include/ui/gtk.h |   1 +
 ui/gtk.c | 250 +++
 qemu-options.hx  |   4 +
 4 files changed, 279 insertions(+), 1 deletion(-)

diff --git a/qapi/ui.json b/qapi/ui.json
index f610bce118..46ed9e76fc 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -1335,13 +1335,36 @@
 # @show-menubar: Display the main window menubar.  Defaults to "on".
 # (Since 8.0)
 #
+# @connectors: A list of physical monitor/connector names where the
+# GTK windows containing the respective GFX VCs (virtual consoles)
+# are to be placed. The connector names should be provided as
+# a string, with each name separated by a colon
+# (e.g., DP-1:DP-2:HDMI-1:HDMI-2). Each connector name in the
+# string will be used as a label for each VC in order.
+# VCs can be skipped by leaving an empty spot between colons
+# (e.g., DP-1::HDMI-2). If a connector name is not provided for
+# a VC, that VC will not be placed on any physical display and
+# the guest will see it as disconnected. If a valid connector name
+# is provided for a VC but its display cable is not plugged in
+# when the guest is launched, the VC will not be displayed initially.
+# It will appear on the display when the cable is plugged in
+# (hot-plug). If the cable is disconnected, the VC will be hidden
+# and the guest will see its virtual display as disconnected.
+# Multiple VCs can share the same connector name. In this case,
+# all VCs with that name will be displayed on the same physical
+# monitor. However, a single VC cannot have multiple connector
+# names.
+#
+# (Since 9.1)
+#
 # Since: 2.12
 ##
 { 'struct'  : 'DisplayGTK',
   'data': { '*grab-on-hover' : 'bool',
 '*zoom-to-fit'   : 'bool',
 '*show-tabs' : 'bool',
-'*show-menubar'  : 'bool'  } }
+'*show-menubar'  : 'bool',
+'*connectors': 'str' } }
 
 ##
 # @DisplayEGLHeadless:
diff --git a/include/ui/gtk.h b/include/ui/gtk.h
index aa3d637029..3f78ee5996 100644
--- a/include/ui/gtk.h
+++ b/include/ui/gtk.h
@@ -83,6 +83,7 @@ typedef struct VirtualConsole {
 GtkWidget *menu_item;
 GtkWidget *tab_item;
 GtkWidget *focus;
+GdkMonitor *monitor;
 VirtualConsoleType type;
 union {
 VirtualGfxConsole gfx;
diff --git a/ui/gtk.c b/ui/gtk.c
index 3bc84090c8..dc356e1dcf 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -38,6 +38,7 @@
 #include "qemu/cutils.h"
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
+#include "qemu/option.h"
 
 #include "ui/console.h"
 #include "ui/gtk.h"
@@ -1446,6 +1447,248 @@ static void gd_menu_untabify(GtkMenuItem *item, void 
*opaque)
 }
 }
 
+static void gd_ui_mon_enable(VirtualConsole *vc)
+{
+GdkWindow *window = gtk_widget_get_window(vc->gfx.drawing_area);
+QemuUIInfo info;
+
+if (!dpy_ui_info_supported(vc->gfx.dcl.con)) {
+return;
+}
+
+info = *dpy_get_ui_info(vc->gfx.dcl.con);
+info.width = gdk_window_get_width(window);
+info.height = gdk_window_get_height(window);
+dpy_set_ui_info(vc->gfx.dcl.con, , false);
+}
+
+static void gd_ui_mon_disable(VirtualConsole *vc)
+{
+QemuUIInfo info;
+
+if (!dpy_ui_info_supported(vc->gfx.dcl.con)) {
+return;
+}
+
+info = *dpy_get_ui_info(vc->gfx.dcl.con);
+info.width = 0;
+info.height = 0;
+dpy_set_ui_info(vc->gfx.dcl.con, , false);
+}
+
+static void gd_window_show_on_monitor(GdkDisplay *dpy, VirtualConsole *vc,
+  gint monitor_num)
+{
+GtkDisplayState *s = vc->s;
+GdkMonitor *monitor = gdk_display_get_monitor(dpy, monitor_num);
+GdkRectangle geometry;
+if (!vc->window) {
+gd_tab_window_create(vc);
+}
+
+gdk_window_show(gtk_widget_get_window(vc->window));
+gd_update_windowsize(vc);
+gdk_monitor_get_geometry(monitor, );
+
+gtk_window_move(GTK_WINDOW(vc->window), geometry.x, geometry.y);
+
+if (s->opts->has_full_screen && s->opts->full_screen) {
+ 

[RFC PATCH v2 0/2] ui/gtk: Introduce new param - Connectors

2024-05-31 Thread dongwon . kim
From: Dongwon Kim 

This patch series is a replacement of
https://mail.gnu.org/archive/html/qemu-devel/2023-06/msg03989.html

There is a need, expressed by several users, to assign ownership of one
or more physical monitors/connectors to individual guests. This creates
a clear notion of which guest's contents are being displayed on any given
monitor. Given that there is always a display server/compositor running
on the host, monitor ownership can never truly be transferred to guests.
However, the closest approximation is to request the host compositor to
fullscreen the guest's windows on individual monitors. This allows for
various configurations, such as displaying four different guests' windows
on four different monitors, a single guest's windows (or virtual consoles)
on four monitors, or any similar combination.

This patch series attempts to accomplish this by introducing a new
parameter named "connector" to assign monitors to the GFX VCs associated
with a guest. If the assigned monitor is not connected, the guest's window
will not be displayed, similar to how a host compositor behaves when
connectors are not connected. Once the monitor is hot-plugged, the guest's
window(s) will be positioned on the assigned monitor.

Usage example:

-display gtk,gl=on,connectors=DP-1:eDP-1:HDMI-2...

In this example, the first graphics virtual console will be placed on the
DP-1 display, the second on eDP-1, and the third on HDMI-2.

v2: Connectors is now in a string format that includes all connector names
separated with a colon (previously it was a linked list)

Code refactoring

Vivek Kasireddy (2):
  ui/gtk: Factor out tab window creation into a separate
  ui/gtk: Add a new parameter to assign connectors/monitors

 qapi/ui.json |  25 +++-
 include/ui/gtk.h |   1 +
 ui/gtk.c | 301 +++
 qemu-options.hx  |   4 +
 4 files changed, 308 insertions(+), 23 deletions(-)

-- 
2.34.1




[RFC PATCH v2 1/2] ui/gtk: Factor out tab window creation into a separate

2024-05-31 Thread dongwon . kim
From: Vivek Kasireddy 

Pull the code that creates a new window associated with a notebook
tab into a separate function. This new function can be useful not
just when user wants to detach a tab but also in the future when
a new window creation is needed in other scenarios.

Cc: Gerd Hoffmann 
Cc: Marc-André Lureau 
Cc: Daniel P. Berrangé 
Cc: Markus Armbruster 
Signed-off-by: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
---
 ui/gtk.c | 65 +++-
 1 file changed, 36 insertions(+), 29 deletions(-)

diff --git a/ui/gtk.c b/ui/gtk.c
index 93b13b7a30..3bc84090c8 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -1396,6 +1396,41 @@ static gboolean gd_win_grab(void *opaque)
 return TRUE;
 }
 
+static void gd_tab_window_create(VirtualConsole *vc)
+{
+GtkDisplayState *s = vc->s;
+
+gtk_widget_set_sensitive(vc->menu_item, false);
+vc->window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
+#if defined(CONFIG_OPENGL)
+if (vc->gfx.esurface) {
+eglDestroySurface(qemu_egl_display, vc->gfx.esurface);
+vc->gfx.esurface = NULL;
+}
+if (vc->gfx.ectx) {
+eglDestroyContext(qemu_egl_display, vc->gfx.ectx);
+vc->gfx.ectx = NULL;
+}
+#endif
+gd_widget_reparent(s->notebook, vc->window, vc->tab_item);
+
+g_signal_connect(vc->window, "delete-event",
+ G_CALLBACK(gd_tab_window_close), vc);
+gtk_widget_show_all(vc->window);
+
+if (qemu_console_is_graphic(vc->gfx.dcl.con)) {
+GtkAccelGroup *ag = gtk_accel_group_new();
+gtk_window_add_accel_group(GTK_WINDOW(vc->window), ag);
+
+GClosure *cb = g_cclosure_new_swap(G_CALLBACK(gd_win_grab),
+   vc, NULL);
+gtk_accel_group_connect(ag, GDK_KEY_g, HOTKEY_MODIFIERS, 0, cb);
+}
+
+gd_update_geometry_hints(vc);
+gd_update_caption(s);
+}
+
 static void gd_menu_untabify(GtkMenuItem *item, void *opaque)
 {
 GtkDisplayState *s = opaque;
@@ -1407,35 +1442,7 @@ static void gd_menu_untabify(GtkMenuItem *item, void 
*opaque)
FALSE);
 }
 if (!vc->window) {
-gtk_widget_set_sensitive(vc->menu_item, false);
-vc->window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
-#if defined(CONFIG_OPENGL)
-if (vc->gfx.esurface) {
-eglDestroySurface(qemu_egl_display, vc->gfx.esurface);
-vc->gfx.esurface = NULL;
-}
-if (vc->gfx.ectx) {
-eglDestroyContext(qemu_egl_display, vc->gfx.ectx);
-vc->gfx.ectx = NULL;
-}
-#endif
-gd_widget_reparent(s->notebook, vc->window, vc->tab_item);
-
-g_signal_connect(vc->window, "delete-event",
- G_CALLBACK(gd_tab_window_close), vc);
-gtk_widget_show_all(vc->window);
-
-if (qemu_console_is_graphic(vc->gfx.dcl.con)) {
-GtkAccelGroup *ag = gtk_accel_group_new();
-gtk_window_add_accel_group(GTK_WINDOW(vc->window), ag);
-
-GClosure *cb = g_cclosure_new_swap(G_CALLBACK(gd_win_grab),
-   vc, NULL);
-gtk_accel_group_connect(ag, GDK_KEY_g, HOTKEY_MODIFIERS, 0, cb);
-}
-
-gd_update_geometry_hints(vc);
-gd_update_caption(s);
+gd_tab_window_create(vc);
 }
 }
 
-- 
2.34.1




Re: [PATCH] hw/usb/hcd-ohci: Fix ohci_service_td: accept valid TDs

2024-05-31 Thread Cord Amfmgm
On Fri, May 31, 2024 at 9:03 AM Peter Maydell 
wrote:

> On Tue, 21 May 2024 at 00:26, David Hubbard  wrote:
> >
> > From: Cord Amfmgm 
> >
> > This changes the way the ohci emulation handles a Transfer Descriptor
> with
> > "Current Buffer Pointer" set to "Buffer End" + 1.
> >
> > The OHCI spec 4.3.1.2 Table 4-2 allows td.cbp to be one byte more than
> td.be
> > to signal the buffer has zero length. Currently qemu only accepts
> zero-length
> > Transfer Descriptors if the td.cbp is equal to 0, while actual OHCI
> hardware
> > accepts both cases.
> >
> > The qemu ohci emulation has a regression in ohci_service_td. Version 4.2
> > and earlier matched the spec. (I haven't taken the time to bisect exactly
> > where the logic was changed.)
> >
> > With a tiny OS[1] that boots and executes a test, the issue can be seen:
> >
> > * OS that sends USB requests to a USB mass storage device
> >   but sends td.cbp = td.be + 1
> > * qemu 4.2
> > * qemu HEAD (4e66a0854)
> > * Actual OHCI controller (hardware)
> >
> > Command line:
> > qemu-system-x86_64 -m 20 \
> >  -device pci-ohci,id=ohci \
> >  -drive if=none,format=raw,id=d,file=testmbr.raw \
> >  -device usb-storage,bus=ohci.0,drive=d \
> >  --trace "usb_*" --trace "ohci_*" -D qemu.log
> >
> > Results are:
> >
> >  qemu 4.2   | qemu HEAD  | actual HW
> > ++
> >  works fine | ohci_die() | works fine
> >
> > Tip: if the flags "-serial pty -serial stdio" are added to the command
> line
> > the test will output USB requests like this:
> >
> > Testing qemu HEAD:
> >
> > > Free mem 2M ohci port2 conn FS
> > > setup { 80 6 0 1 0 0 8 0 }
> > > ED info=8 { mps=8 en=0 d=0 } tail=c20920
> > >   td0 c20880 nxt=c20960 f200 setup cbp=c20900 be=c20907
> > >   td1 c20960 nxt=c20980 f314in cbp=c20908 be=c2090f
> > >   td2 c20980 nxt=c20920 f308   out cbp=c20910 be=c2090f ohci20
> host err
> > > usb stopped
> >
> > And in qemu.log:
> >
> > usb_ohci_iso_td_bad_cc_overrun ISO_TD start_offset=0x00c20910 >
> next_offset=0x00c2090f
> >
> > Testing qemu 4.2:
> >
> > > Free mem 2M ohci port2 conn FS
> > > setup { 80 6 0 1 0 0 8 0 }
> > > ED info=8 { mps=8 en=0 d=0 } tail=620920
> > >   td0 620880 nxt=620960 f200 setup cbp=620900 be=620907
>  cbp=0 be=620907
> > >   td1 620960 nxt=620980 f314in cbp=620908 be=62090f
>  cbp=0 be=62090f
> > >   td2 620980 nxt=620920 f308   out cbp=620910 be=62090f
>  cbp=0 be=62090f
> > >rx { 12 1 0 2 0 0 0 8 }
> > > setup { 0 5 1 0 0 0 0 0 } tx {}
> > > ED info=8 { mps=8 en=0 d=0 } tail=620880
> > >   td0 620920 nxt=620960 f200 setup cbp=620900 be=620907
>  cbp=0 be=620907
> > >   td1 620960 nxt=620880 f310in cbp=620908 be=620907
>  cbp=0 be=620907
> > > setup { 80 6 0 1 0 0 12 0 }
> > > ED info=80001 { mps=8 en=0 d=1 } tail=620960
> > >   td0 620880 nxt=6209c0 f200 setup cbp=620920 be=620927
>  cbp=0 be=620927
> > >   td1 6209c0 nxt=6209e0 f314in cbp=620928 be=620939
>  cbp=0 be=620939
> > >   td2 6209e0 nxt=620960 f308   out cbp=62093a be=620939
>  cbp=0 be=620939
> > >rx { 12 1 0 2 0 0 0 8 f4 46 1 0 0 0 1 2 3 1 }
> > > setup { 80 6 0 2 0 0 0 1 }
> > > ED info=80001 { mps=8 en=0 d=1 } tail=620880
> > >   td0 620960 nxt=6209a0 f200 setup cbp=620a20 be=620a27
>  cbp=0 be=620a27
> > >   td1 6209a0 nxt=6209c0 f3140004in cbp=620a28 be=620b27
>  cbp=620a48 be=620b27
> > >   td2 6209c0 nxt=620880 f308   out cbp=620b28 be=620b27
>  cbp=0 be=620b27
> > >rx { 9 2 20 0 1 1 4 c0 0 9 4 0 0 2 8 6 50 0 7 5 81 2 40 0 0 7 5 2 2
> 40 0 0 }
> > > setup { 0 9 1 0 0 0 0 0 } tx {}
> > > ED info=80001 { mps=8 en=0 d=1 } tail=620900
> > >   td0 620880 nxt=620940 f200 setup cbp=620a00 be=620a07
>  cbp=0 be=620a07
> > >   td1 620940 nxt=620900 f310in cbp=620a08 be=620a07
>  cbp=0 be=620a07
> >
> > [1] The OS disk image has been emailed to phi...@linaro.org,
> m...@tls.msk.ru,
> > and kra...@redhat.com:
> >
> > * testCbpOffBy1.img.xz
> > * sha256:
> f87baddcb86de845de12f002c698670a426affb40946025cc32694f9daa3abed
> >
> > Signed-off-by: Cord Amfmgm 
> > ---
> >  hw/usb/hcd-ohci.c   | 4 ++--
> >  hw/usb/trace-events | 1 +
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> > index acd6016980..71b54914d3 100644
> > --- a/hw/usb/hcd-ohci.c
> > +++ b/hw/usb/hcd-ohci.c
> > @@ -941,8 +941,8 @@ static int ohci_service_td(OHCIState *ohci, struct
> ohci_ed *ed)
> >  if ((td.cbp & 0xf000) != (td.be & 0xf000)) {
> >  len = (td.be & 0xfff) + 0x1001 - (td.cbp & 0xfff);
> >  } else {
> > -if (td.cbp > td.be) {
> > -trace_usb_ohci_iso_td_bad_cc_overrun(td.cbp, td.be);
> > +if (td.cbp - 1 > td.be) {  /* rely on td.cbp != 0 */
> > +trace_usb_ohci_td_bad_buf(td.cbp, td.be);
> >  ohci_die(ohci);
> >  return 1;
> >  }
>
> This patch seems to me to do what the commit 

[PATCH v3 1/2] qio: Inherit follow_coroutine_ctx across TLS

2024-05-31 Thread Eric Blake
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:

qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion 
`qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.

It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner.  (I'll add that in the next patch)

But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.

CC: Stefan Hajnoczi 
CC: Daniel P. Berrangé 
CC: qemu-sta...@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", 
v8.2.0)
Signed-off-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
Message-ID: <20240518025246.791593-5-ebl...@redhat.com>
---
 io/channel-tls.c | 26 +++---
 io/channel-websock.c |  1 +
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 1d9c9c72bfb..67b9760 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master,
const char *aclname,
Error **errp)
 {
-QIOChannelTLS *ioc;
+QIOChannelTLS *tioc;
+QIOChannel *ioc;

-ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+ioc = QIO_CHANNEL(tioc);

-ioc->master = master;
+tioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
-qio_channel_set_feature(QIO_CHANNEL(ioc), 
QIO_CHANNEL_FEATURE_SHUTDOWN);
+qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
 object_ref(OBJECT(master));

-ioc->session = qcrypto_tls_session_new(
+tioc->session = qcrypto_tls_session_new(
 creds,
 NULL,
 aclname,
 QCRYPTO_TLS_CREDS_ENDPOINT_SERVER,
 errp);
-if (!ioc->session) {
+if (!tioc->session) {
 goto error;
 }

 qcrypto_tls_session_set_callbacks(
-ioc->session,
+tioc->session,
 qio_channel_tls_write_handler,
 qio_channel_tls_read_handler,
-ioc);
+tioc);

-trace_qio_channel_tls_new_server(ioc, master, creds, aclname);
-return ioc;
+trace_qio_channel_tls_new_server(tioc, master, creds, aclname);
+return tioc;

  error:
-object_unref(OBJECT(ioc));
+object_unref(OBJECT(tioc));
 return NULL;
 }

@@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master,
 ioc = QIO_CHANNEL(tioc);

 tioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
 qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
diff --git a/io/channel-websock.c b/io/channel-websock.c
index a12acc27cf2..de39f0d182d 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -883,6 +883,7 @@ qio_channel_websock_new_server(QIOChannel *master)
 ioc = QIO_CHANNEL(wioc);

 wioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
 qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
-- 
2.45.1




[PATCH v3 0/2] Fix NBD+TLS regression in presence of iothread

2024-05-31 Thread Eric Blake
In v3:
- 2/2: fix iotest filtering [kwolf]

v2 was here:
https://lists.gnu.org/archive/html/qemu-devel/2024-05/msg03517.html

and this time, I'll wait for R-b before sending my v2 pull request:
https://lists.gnu.org/archive/html/qemu-devel/2024-05/msg06206.html

Eric Blake (2):
  qio: Inherit follow_coroutine_ctx across TLS
  iotests: test NBD+TLS+iothread

 io/channel-tls.c  |  26 +--
 io/channel-websock.c  |   1 +
 tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++
 tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
 4 files changed, 238 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
 create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out

-- 
2.45.1




[PATCH v3 2/2] iotests: test NBD+TLS+iothread

2024-05-31 Thread Eric Blake
Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage the fix to qio channels made in the
previous patch.

The shell function pick_unused_port() was copied from
nbdkit.git/tests/functions.sh.in, where it had all authors from Red
Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2.

CC: qemu-sta...@nongnu.org
CC: "Richard W.M. Jones" 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++
 tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
 2 files changed, 222 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
 create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out

diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread 
b/tests/qemu-iotests/tests/nbd-tls-iothread
new file mode 100755
index 000..a2fb07206e5
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-tls-iothread
@@ -0,0 +1,168 @@
+#!/usr/bin/env bash
+# group: rw quick
+#
+# Test of NBD+TLS+iothread
+#
+# Copyright (C) 2024 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=ebl...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1# failure is the default!
+
+_cleanup()
+{
+_cleanup_qemu
+_cleanup_test_img
+rm -f "$dst_image"
+tls_x509_cleanup
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+. ./common.tls
+. ./common.nbd
+
+_supported_fmt qcow2  # Hardcoded to qcow2 command line and QMP below
+_supported_proto file
+
+# pick_unused_port
+#
+# Picks and returns an "unused" port, setting the global variable
+# $port.
+#
+# This is inherently racy, but we need it because qemu does not currently
+# permit NBD+TLS over a Unix domain socket
+pick_unused_port ()
+{
+if ! (ss --version) >/dev/null 2>&1; then
+_notrun "ss utility required, skipped this test"
+fi
+
+# Start at a random port to make it less likely that two parallel
+# tests will conflict.
+port=$(( 5 + (RANDOM%15000) ))
+while ss -ltn | grep -sqE ":$port\b"; do
+((port++))
+if [ $port -eq 65000 ]; then port=5; fi
+done
+echo picked unused port
+}
+
+tls_x509_init
+
+size=1G
+DST_IMG="$TEST_DIR/dst.qcow2"
+
+echo
+echo "== preparing TLS creds and spare port =="
+
+pick_unused_port
+tls_x509_create_root_ca "ca1"
+tls_x509_create_server "ca1" "server1"
+tls_x509_create_client "ca1" "client1"
+tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}"
+
+echo
+echo "== preparing image =="
+
+_make_test_img $size
+$QEMU_IMG create -f qcow2 "$DST_IMG" $size | _filter_img_create
+
+echo
+echo === Starting Src QEMU ===
+echo
+
+_launch_qemu -machine q35 \
+-object iothread,id=iothread0 \
+-object "${tls_obj_base}"/client1,endpoint=client \
+-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true,
+  "bus":"pcie.0"}' \
+-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0",
+  "bus":"root0", "iothread":"iothread0"}' \
+-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1",
+  "bus":"virtio_scsi_pci0.0"}' \
+-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+"filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \
+-blockdev '{"driver":"qcow2", "node-name":"drive_image1",
+"file":"drive_sys1"}'
+h1=$QEMU_HANDLE
+_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return'
+
+echo
+echo === Starting Dst VM2 ===
+echo
+
+_launch_qemu -machine q35 \
+-object iothread,id=iothread0 \
+-object "${tls_obj_base}"/server1,endpoint=server \
+-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true,
+  "bus":"pcie.0"}' \
+-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0",
+  "bus":"root0", "iothread":"iothread0"}' \
+-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1",
+  "bus":"virtio_scsi_pci0.0"}' \
+-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+"filename":"'"$DST_IMG"'", "node-name":"drive_sys1"}' \
+-blockdev '{"driver":"qcow2", "node-name":"drive_image1",
+"file":"drive_sys1"}' \
+-incoming 

Re: [PATCH v4 00/31] Add AMD Secure Nested Paging (SEV-SNP) support

2024-05-31 Thread Paolo Bonzini
On Fri, May 31, 2024 at 7:41 PM Gupta, Pankaj  wrote:
> > please check if branch qemu-coco-queue of
> > https://gitlab.com/bonzini/qemu works for you!
>
> Getting compilation error here: Hope I am looking at correct branch.

Oops, sorry:

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 96dc41d355c..ede3ef1225f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -168,7 +168,7 @@ static const char *vm_type_name[] = {
 [KVM_X86_DEFAULT_VM] = "default",
 [KVM_X86_SEV_VM] = "SEV",
 [KVM_X86_SEV_ES_VM] = "SEV-ES",
-[KVM_X86_SEV_SNP_VM] = "SEV-SNP",
+[KVM_X86_SNP_VM] = "SEV-SNP",
 };

 bool kvm_is_vm_type_supported(int type)

Tested the above builds, and pushed!

Paolo

> softmmu.fa.p/target_i386_kvm_kvm.c.o.d -o
> libqemu-x86_64-softmmu.fa.p/target_i386_kvm_kvm.c.o -c
> ../target/i386/kvm/kvm.c
> ../target/i386/kvm/kvm.c:171:6: error: ‘KVM_X86_SEV_SNP_VM’ undeclared
> here (not in a function); did you mean ‘KVM_X86_SEV_ES_VM’?
>171 | [KVM_X86_SEV_SNP_VM] = "SEV-SNP",
>|  ^~
>|  KVM_X86_SEV_ES_VM
>
> Thanks,
> Pankaj
>
> >
> > I tested it successfully on CentOS 9 Stream with kernel from kvm/next
> > and firmware from edk2-ovmf-20240524-1.fc41.noarch.
> >
> > Paolo
> >
> >> i386/sev: Replace error_report with error_setg
> >> linux-headers: Update to current kvm/next
> >> i386/sev: Introduce "sev-common" type to encapsulate common SEV state
> >> i386/sev: Move sev_launch_update to separate class method
> >> i386/sev: Move sev_launch_finish to separate class method
> >> i386/sev: Introduce 'sev-snp-guest' object
> >> i386/sev: Add a sev_snp_enabled() helper
> >> i386/sev: Add sev_kvm_init() override for SEV class
> >> i386/sev: Add snp_kvm_init() override for SNP class
> >> i386/cpu: Set SEV-SNP CPUID bit when SNP enabled
> >> i386/sev: Don't return launch measurements for SEV-SNP guests
> >> i386/sev: Add a class method to determine KVM VM type for SNP guests
> >> i386/sev: Update query-sev QAPI format to handle SEV-SNP
> >> i386/sev: Add the SNP launch start context
> >> i386/sev: Add handling to encrypt/finalize guest launch data
> >> i386/sev: Set CPU state to protected once SNP guest payload is finalized
> >> hw/i386/sev: Add function to get SEV metadata from OVMF header
> >> i386/sev: Add support for populating OVMF metadata pages
> >> i386/sev: Add support for SNP CPUID validation
> >> i386/sev: Invoke launch_updata_data() for SEV class
> >> i386/sev: Invoke launch_updata_data() for SNP class
> >> i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
> >> i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
> >> i386/sev: Extract build_kernel_loader_hashes
> >> i386/sev: Reorder struct declarations
> >> i386/sev: Allow measured direct kernel boot on SNP
> >> hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
> >> memory: Introduce memory_region_init_ram_guest_memfd()
> >>
> >> These patches need a small prerequisite that I'll post soon:
> >>
> >> hw/i386/sev: Use guest_memfd for legacy ROMs
> >> hw/i386: Add support for loading BIOS using guest_memfd
> >>
> >> This one definitely requires more work:
> >>
> >> hw/i386/sev: Allow use of pflash in conjunction with -bios
> >>
> >>
> >> Paolo
> >
>




[RFC PATCH v2 6/6] target/riscv: rvv: Optimize vl8re8.v/vs8r.v with limitations

2024-05-31 Thread Max Chou
The vector load/store whole register instructions (e.g. vl8re8.v/vs8r.v)
perform unmasked continuous load/store. We can optimize these
instructions by replacing the corresponding helper functions by TCG ops
to copy more data at a time with following assumptions:

* Host and target are little endian

Signed-off-by: Max Chou 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 196 +++-
 1 file changed, 194 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index bbac73bb12b..44763ccec06 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -1402,11 +1402,108 @@ GEN_LDST_WHOLE_TRANS(vl4re8_v,  4)
 GEN_LDST_WHOLE_TRANS(vl4re16_v, 4)
 GEN_LDST_WHOLE_TRANS(vl4re32_v, 4)
 GEN_LDST_WHOLE_TRANS(vl4re64_v, 4)
-GEN_LDST_WHOLE_TRANS(vl8re8_v,  8)
 GEN_LDST_WHOLE_TRANS(vl8re16_v, 8)
 GEN_LDST_WHOLE_TRANS(vl8re32_v, 8)
 GEN_LDST_WHOLE_TRANS(vl8re64_v, 8)
 
+static bool trans_vl8re8_v(DisasContext *s, arg_r2 * a)
+{
+if (require_rvv(s) && QEMU_IS_ALIGNED(a->rd, 8)) {
+if (!HOST_BIG_ENDIAN && s->vstart_eq_zero) {
+uint32_t vofs = vreg_ofs(s, a->rd);
+uint32_t midx = s->mem_idx;
+uint32_t evl = s->cfg_ptr->vlenb << 3;
+
+TCGv_i64 t0, t1;
+TCGv_i128 t16;
+TCGv_ptr tp;
+TCGv_ptr i = tcg_temp_new_ptr();
+TCGv len_remain = tcg_temp_new();
+TCGv rs1 = get_gpr(s, a->rs1, EXT_NONE);
+TCGv addr = tcg_temp_new();
+
+TCGLabel *loop_128 = gen_new_label();
+TCGLabel *remain_64 = gen_new_label();
+TCGLabel *remain_32 = gen_new_label();
+TCGLabel *remain_16 = gen_new_label();
+TCGLabel *remain_8 = gen_new_label();
+TCGLabel *over = gen_new_label();
+
+tcg_gen_mov_tl(addr, rs1);
+tcg_gen_movi_tl(len_remain, evl);
+tcg_gen_movi_ptr(i, 0);
+
+tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, evl, over);
+gen_helper_check_probe_read(tcg_env, addr, len_remain);
+
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 16, remain_64);
+
+gen_set_label(loop_128);
+
+t16 = tcg_temp_new_i128();
+tcg_gen_qemu_ld_i128(t16, addr, midx,
+ MO_LE | MO_128 | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 16);
+
+tp = tcg_temp_new_ptr();
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 16);
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+tcg_gen_extr_i128_i64(t0, t1, t16);
+
+tcg_gen_st_i64(t0, tp, vofs);
+tcg_gen_st_i64(t1, tp, vofs + 8);
+tcg_gen_subi_tl(len_remain, len_remain, 16);
+
+tcg_gen_brcondi_tl(TCG_COND_GEU, len_remain, 16, loop_128);
+
+gen_set_label(remain_64);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 8, remain_32);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUQ | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 8);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 8);
+tcg_gen_st_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 8);
+
+gen_set_label(remain_32);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 4, remain_16);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 4);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 4);
+tcg_gen_st32_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 4);
+
+gen_set_label(remain_16);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 2, remain_8);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUW | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 2);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 2);
+tcg_gen_st16_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 2);
+
+gen_set_label(remain_8);
+tcg_gen_brcondi_tl(TCG_COND_EQ, len_remain, 0, over);
+tcg_gen_qemu_ld_i64(t0, addr, midx,
+MO_LE | MO_8 | MO_ATOM_NONE);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_st8_i64(t0, tp, vofs);
+
+gen_set_label(over);
+
+finalize_rvv_inst(s);
+} else {
+return ldst_whole_trans(a->rd, a->rs1, 8, gen_helper_vl8re8_v, s);
+}
+return true;
+}
+return false;
+}
+
 /*
  * The vector whole register store instructions are encoded similar to
  * unmasked unit-stride store of elements with EEW=8.
@@ -1414,7 +1511,102 @@ GEN_LDST_WHOLE_TRANS(vl8re64_v, 8)
 GEN_LDST_WHOLE_TRANS(vs1r_v, 1)
 

Re: [PATCH v4 00/31] Add AMD Secure Nested Paging (SEV-SNP) support

2024-05-31 Thread Gupta, Pankaj




These patches implement SEV-SNP base support along with CPUID enforcement
support for QEMU, and are also available at:

https://github.com/pagupta/qemu/tree/snp_v4

Latest version of kvm changes are posted here [2] and also queued in kvm/next.

Patch Layout

01-03: 'error_setg' independent fix, kvm/next header sync & patch from
Xiaoyao's TDX v5 patchset.
04-29: Introduction of sev-snp-guest object and various configuration
requirements for SNP. Support for creating a cryptographic "launch" 
context
and populating various OVMF metadata pages, BIOS regions, and vCPU/VMSA
pages with the initial encrypted/measured/validated launch data prior to
launching the SNP guest.
30-31: Handling for KVM_HC_MAP_GPA_RANGE hypercall for userspace VMEXIT.


These patches are more or less okay, with only a few nits, and I can
queue them already:


Hey,

please check if branch qemu-coco-queue of
https://gitlab.com/bonzini/qemu works for you!


Getting compilation error here: Hope I am looking at correct branch.

softmmu.fa.p/target_i386_kvm_kvm.c.o.d -o 
libqemu-x86_64-softmmu.fa.p/target_i386_kvm_kvm.c.o -c 
../target/i386/kvm/kvm.c
../target/i386/kvm/kvm.c:171:6: error: ‘KVM_X86_SEV_SNP_VM’ undeclared 
here (not in a function); did you mean ‘KVM_X86_SEV_ES_VM’?

  171 | [KVM_X86_SEV_SNP_VM] = "SEV-SNP",
  |  ^~
  |  KVM_X86_SEV_ES_VM

Thanks,
Pankaj



I tested it successfully on CentOS 9 Stream with kernel from kvm/next
and firmware from edk2-ovmf-20240524-1.fc41.noarch.

Paolo


i386/sev: Replace error_report with error_setg
linux-headers: Update to current kvm/next
i386/sev: Introduce "sev-common" type to encapsulate common SEV state
i386/sev: Move sev_launch_update to separate class method
i386/sev: Move sev_launch_finish to separate class method
i386/sev: Introduce 'sev-snp-guest' object
i386/sev: Add a sev_snp_enabled() helper
i386/sev: Add sev_kvm_init() override for SEV class
i386/sev: Add snp_kvm_init() override for SNP class
i386/cpu: Set SEV-SNP CPUID bit when SNP enabled
i386/sev: Don't return launch measurements for SEV-SNP guests
i386/sev: Add a class method to determine KVM VM type for SNP guests
i386/sev: Update query-sev QAPI format to handle SEV-SNP
i386/sev: Add the SNP launch start context
i386/sev: Add handling to encrypt/finalize guest launch data
i386/sev: Set CPU state to protected once SNP guest payload is finalized
hw/i386/sev: Add function to get SEV metadata from OVMF header
i386/sev: Add support for populating OVMF metadata pages
i386/sev: Add support for SNP CPUID validation
i386/sev: Invoke launch_updata_data() for SEV class
i386/sev: Invoke launch_updata_data() for SNP class
i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
i386/sev: Extract build_kernel_loader_hashes
i386/sev: Reorder struct declarations
i386/sev: Allow measured direct kernel boot on SNP
hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
memory: Introduce memory_region_init_ram_guest_memfd()

These patches need a small prerequisite that I'll post soon:

hw/i386/sev: Use guest_memfd for legacy ROMs
hw/i386: Add support for loading BIOS using guest_memfd

This one definitely requires more work:

hw/i386/sev: Allow use of pflash in conjunction with -bios


Paolo







[RFC PATCH v2 1/6] target/riscv: Separate vector segment ld/st instructions

2024-05-31 Thread Max Chou
This commit separate the helper function implementations of vector
segment load/store instructions from other vector load/store
instructions.
This can improve performance by avoiding unnecessary segment operation
when NF = 1.

Signed-off-by: Max Chou 
---
 target/riscv/helper.h   |   4 +
 target/riscv/insn32.decode  |  11 ++-
 target/riscv/insn_trans/trans_rvv.c.inc |  61 +++
 target/riscv/vector_helper.c| 100 +---
 4 files changed, 164 insertions(+), 12 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 451261ce5a4..aaf68eadfb7 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -158,18 +158,22 @@ DEF_HELPER_FLAGS_3(hyp_hsv_d, TCG_CALL_NO_WG, void, env, 
tl, tl)
 /* Vector functions */
 DEF_HELPER_3(vsetvl, tl, env, tl, tl)
 DEF_HELPER_5(vle8_v, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlsege8_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle16_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle32_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle64_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle8_v_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vlsege8_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle16_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle32_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vle64_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse8_v, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vssege8_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse16_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse32_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse64_v, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse8_v_mask, void, ptr, ptr, tl, env, i32)
+DEF_HELPER_5(vssege8_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse16_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse32_v_mask, void, ptr, ptr, tl, env, i32)
 DEF_HELPER_5(vse64_v_mask, void, ptr, ptr, tl, env, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f22df04cfd1..0712e9f6314 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -77,6 +77,7 @@
 @r2  ...   . . ... . ...  %rs1 %rd
 @r2_vm_1 .. . . . ... . ...  vm=1 %rs2 %rd
 @r2_nfvm ... ... vm:1 . . ... . ...  %nf %rs1 %rd
+@r2_nf_1_vm ... ... vm:1 . . ... . ...  nf=1 %rs1 %rd
 @r2_vm   .. vm:1 . . ... . ...  %rs2 %rd
 @r1_vm   .. vm:1 . . ... . ... %rd
 @r_nfvm  ... ... vm:1 . . ... . ...  %nf %rs2 %rs1 %rd
@@ -349,11 +350,17 @@ hsv_d 0110111  .   . 100 0 1110011 @r2_s
 
 # *** Vector loads and stores are encoded within LOADFP/STORE-FP ***
 # Vector unit-stride load/store insns.
-vle8_v ... 000 . 0 . 000 . 111 @r2_nfvm
+{
+  vle8_v 000 000 . 0 . 000 . 111 @r2_nf_1_vm
+  vlsege8_v  ... 000 . 0 . 000 . 111 @r2_nfvm
+}
 vle16_v... 000 . 0 . 101 . 111 @r2_nfvm
 vle32_v... 000 . 0 . 110 . 111 @r2_nfvm
 vle64_v... 000 . 0 . 111 . 111 @r2_nfvm
-vse8_v ... 000 . 0 . 000 . 0100111 @r2_nfvm
+{
+  vse8_v 000 000 . 0 . 000 . 0100111 @r2_nf_1_vm
+  vssege8_v  ... 000 . 0 . 000 . 0100111 @r2_nfvm
+}
 vse16_v... 000 . 0 . 101 . 0100111 @r2_nfvm
 vse32_v... 000 . 0 . 110 . 0100111 @r2_nfvm
 vse64_v... 000 . 0 . 111 . 0100111 @r2_nfvm
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 3a3896ba06c..1e4fa797a86 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -719,6 +719,40 @@ GEN_VEXT_TRANS(vle16_v, MO_16, r2nfvm, ld_us_op, 
ld_us_check)
 GEN_VEXT_TRANS(vle32_v, MO_32, r2nfvm, ld_us_op, ld_us_check)
 GEN_VEXT_TRANS(vle64_v, MO_64, r2nfvm, ld_us_op, ld_us_check)
 
+static bool ld_us_seg_op(DisasContext *s, arg_r2nfvm *a, uint8_t eew)
+{
+uint32_t data = 0;
+gen_helper_ldst_us *fn;
+static gen_helper_ldst_us * const fns[2][4] = {
+/* masked unit stride load */
+{ gen_helper_vlsege8_v_mask, gen_helper_vle16_v_mask,
+  gen_helper_vle32_v_mask, gen_helper_vle64_v_mask },
+/* unmasked unit stride load */
+{ gen_helper_vlsege8_v, gen_helper_vle16_v,
+  gen_helper_vle32_v, gen_helper_vle64_v }
+};
+
+fn =  fns[a->vm][eew];
+if (fn == NULL) {
+return false;
+}
+
+/*
+ * Vector load/store instructions have the EEW encoded
+ * directly in the instructions. The maximum vector size is
+ * calculated with EMUL rather than LMUL.
+ */
+uint8_t emul = vext_get_emul(s, eew);
+data = FIELD_DP32(data, VDATA, VM, a->vm);
+data = FIELD_DP32(data, VDATA, LMUL, emul);
+data = FIELD_DP32(data, VDATA, NF, a->nf);
+data = FIELD_DP32(data, VDATA, VTA, 

[RFC PATCH v2 4/6] target/riscv: Add check_probe_[read|write] helper functions

2024-05-31 Thread Max Chou
The helper_check_probe_[read|write] functions wrap the probe_pages
function to perform virtual address resolution for continuous vector
load/store instructions.

Signed-off-by: Max Chou 
---
 target/riscv/helper.h|  4 
 target/riscv/vector_helper.c | 12 
 2 files changed, 16 insertions(+)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index aaf68eadfb7..f4bc907e85f 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1,6 +1,10 @@
 /* Exceptions */
 DEF_HELPER_2(raise_exception, noreturn, env, i32)
 
+/* Probe page */
+DEF_HELPER_FLAGS_3(check_probe_read, TCG_CALL_NO_WG, void, env, tl, tl)
+DEF_HELPER_FLAGS_3(check_probe_write, TCG_CALL_NO_WG, void, env, tl, tl)
+
 /* Floating Point - rounding mode */
 DEF_HELPER_FLAGS_2(set_rounding_mode, TCG_CALL_NO_WG, void, env, i32)
 DEF_HELPER_FLAGS_2(set_rounding_mode_chkfrm, TCG_CALL_NO_WG, void, env, i32)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index cb7267c3217..9263ab26b19 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -136,6 +136,18 @@ static void probe_pages(CPURISCVState *env, target_ulong 
addr,
 }
 }
 
+void HELPER(check_probe_read)(CPURISCVState *env, target_ulong addr,
+  target_ulong len)
+{
+probe_pages(env, addr, len, GETPC(), MMU_DATA_LOAD);
+}
+
+void HELPER(check_probe_write)(CPURISCVState *env, target_ulong addr,
+   target_ulong len)
+{
+probe_pages(env, addr, len, GETPC(), MMU_DATA_STORE);
+}
+
 static inline void vext_set_elem_mask(void *v0, int index,
   uint8_t value)
 {
-- 
2.34.1




[RFC PATCH v2 5/6] target/riscv: rvv: Optimize v[l|s]e8.v with limitations

2024-05-31 Thread Max Chou
The vector unit-stride load/store instructions (e.g. vle8.v/vse8.v)
perform continuous load/store. We can replace the corresponding helper
functions by TCG ops to copy more data at a time with following
assumptions:

* Perform virtual address resolution once for entire vector at beginning
* Without mask
* Without tail agnostic
* Both host and target are little endian

Signed-off-by: Max Chou 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 197 +++-
 1 file changed, 195 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 1e4fa797a86..bbac73bb12b 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -714,7 +714,105 @@ static bool ld_us_check(DisasContext *s, arg_r2nfvm* a, 
uint8_t eew)
vext_check_load(s, a->rd, a->nf, a->vm, eew);
 }
 
-GEN_VEXT_TRANS(vle8_v,  MO_8,  r2nfvm, ld_us_op, ld_us_check)
+static bool trans_vle8_v(DisasContext *s, arg_r2nfvm * a)
+{
+
+if (ld_us_check(s, a, MO_8)) {
+if (!HOST_BIG_ENDIAN && s->vstart_eq_zero && s->vta == 0 && a->vm) {
+uint32_t vofs = vreg_ofs(s, a->rd);
+uint32_t midx = s->mem_idx;
+
+TCGv_i64 t0, t1;
+TCGv_i128 t16;
+TCGv_ptr tp;
+TCGv_ptr i = tcg_temp_new_ptr();
+TCGv len_remain = tcg_temp_new();
+TCGv rs1 = get_gpr(s, a->rs1, EXT_NONE);
+TCGv addr = tcg_temp_new();
+
+TCGLabel *loop_128 = gen_new_label();
+TCGLabel *remain_64 = gen_new_label();
+TCGLabel *remain_32 = gen_new_label();
+TCGLabel *remain_16 = gen_new_label();
+TCGLabel *remain_8 = gen_new_label();
+TCGLabel *over = gen_new_label();
+
+tcg_gen_mov_tl(addr, rs1);
+tcg_gen_mov_tl(len_remain, cpu_vl);
+tcg_gen_muli_tl(len_remain, len_remain, a->nf);
+tcg_gen_movi_ptr(i, 0);
+
+tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);
+gen_helper_check_probe_read(tcg_env, addr, len_remain);
+
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 16, remain_64);
+
+gen_set_label(loop_128);
+
+t16 = tcg_temp_new_i128();
+tcg_gen_qemu_ld_i128(t16, addr, midx,
+ MO_LE | MO_128 | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 16);
+
+tp = tcg_temp_new_ptr();
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 16);
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+tcg_gen_extr_i128_i64(t0, t1, t16);
+
+tcg_gen_st_i64(t0, tp, vofs);
+tcg_gen_st_i64(t1, tp, vofs + 8);
+tcg_gen_subi_tl(len_remain, len_remain, 16);
+
+tcg_gen_brcondi_tl(TCG_COND_GEU, len_remain, 16, loop_128);
+
+gen_set_label(remain_64);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 8, remain_32);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUQ | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 8);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 8);
+tcg_gen_st_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 8);
+
+gen_set_label(remain_32);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 4, remain_16);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 4);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 4);
+tcg_gen_st32_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 4);
+
+gen_set_label(remain_16);
+tcg_gen_brcondi_tl(TCG_COND_LTU, len_remain, 2, remain_8);
+tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUW | MO_ATOM_NONE);
+tcg_gen_addi_tl(addr, addr, 2);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_addi_ptr(i, i, 2);
+tcg_gen_st16_i64(t0, tp, vofs);
+tcg_gen_subi_tl(len_remain, len_remain, 2);
+
+gen_set_label(remain_8);
+tcg_gen_brcondi_tl(TCG_COND_EQ, len_remain, 0, over);
+tcg_gen_qemu_ld_i64(t0, addr, midx,
+MO_LE | MO_8 | MO_ATOM_NONE);
+tcg_gen_add_ptr(tp, tcg_env, i);
+tcg_gen_st8_i64(t0, tp, vofs);
+
+gen_set_label(over);
+
+finalize_rvv_inst(s);
+} else {
+return ld_us_op(s, a, MO_8);
+}
+return true;
+}
+return false;
+}
+
 GEN_VEXT_TRANS(vle16_v, MO_16, r2nfvm, ld_us_op, ld_us_check)
 GEN_VEXT_TRANS(vle32_v, MO_32, r2nfvm, ld_us_op, ld_us_check)
 GEN_VEXT_TRANS(vle64_v, MO_64, r2nfvm, ld_us_op, ld_us_check)
@@ -785,7 +883,102 @@ static bool st_us_check(DisasContext *s, arg_r2nfvm* a, 
uint8_t eew)
   

[RFC PATCH v2 3/6] target/riscv: Inline vext_ldst_us and corresponding function for performance

2024-05-31 Thread Max Chou
In the vector unit-stride load/store helper functions. the vext_ldst_us
function corresponding most of the execution time. Inline the functions
can avoid the function call overhead to improve the helper function
performance.

Signed-off-by: Max Chou 
Reviewed-by: Richard Henderson 
---
 target/riscv/vector_helper.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 440c33c141b..cb7267c3217 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -149,25 +149,27 @@ static inline void vext_set_elem_mask(void *v0, int index,
 typedef void vext_ldst_elem_fn(CPURISCVState *env, abi_ptr addr,
uint32_t idx, void *vd, uintptr_t retaddr);
 
-#define GEN_VEXT_LD_ELEM(NAME, ETYPE, H, LDSUF)\
-static void NAME(CPURISCVState *env, abi_ptr addr, \
- uint32_t idx, void *vd, uintptr_t retaddr)\
-{  \
-ETYPE *cur = ((ETYPE *)vd + H(idx));   \
-*cur = cpu_##LDSUF##_data_ra(env, addr, retaddr);  \
-}  \
+#define GEN_VEXT_LD_ELEM(NAME, ETYPE, H, LDSUF) \
+static inline QEMU_ALWAYS_INLINE\
+void NAME(CPURISCVState *env, abi_ptr addr, \
+  uint32_t idx, void *vd, uintptr_t retaddr)\
+{   \
+ETYPE *cur = ((ETYPE *)vd + H(idx));\
+*cur = cpu_##LDSUF##_data_ra(env, addr, retaddr);   \
+}   \
 
 GEN_VEXT_LD_ELEM(lde_b, int8_t,  H1, ldsb)
 GEN_VEXT_LD_ELEM(lde_h, int16_t, H2, ldsw)
 GEN_VEXT_LD_ELEM(lde_w, int32_t, H4, ldl)
 GEN_VEXT_LD_ELEM(lde_d, int64_t, H8, ldq)
 
-#define GEN_VEXT_ST_ELEM(NAME, ETYPE, H, STSUF)\
-static void NAME(CPURISCVState *env, abi_ptr addr, \
- uint32_t idx, void *vd, uintptr_t retaddr)\
-{  \
-ETYPE data = *((ETYPE *)vd + H(idx));  \
-cpu_##STSUF##_data_ra(env, addr, data, retaddr);   \
+#define GEN_VEXT_ST_ELEM(NAME, ETYPE, H, STSUF) \
+static inline QEMU_ALWAYS_INLINE\
+void NAME(CPURISCVState *env, abi_ptr addr, \
+  uint32_t idx, void *vd, uintptr_t retaddr)\
+{   \
+ETYPE data = *((ETYPE *)vd + H(idx));   \
+cpu_##STSUF##_data_ra(env, addr, data, retaddr);\
 }
 
 GEN_VEXT_ST_ELEM(ste_b, int8_t,  H1, stb)
@@ -291,7 +293,7 @@ GEN_VEXT_ST_STRIDE(vsse64_v, int64_t, ste_d)
  */
 
 /* unmasked unit-stride load and store operation */
-static void
+static inline QEMU_ALWAYS_INLINE void
 vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
  vext_ldst_elem_fn *ldst_elem, uint32_t log2_esz, uint32_t evl,
  uintptr_t ra)
-- 
2.34.1




[RFC PATCH v2 0/6] Improve the performance of RISC-V vector unit-stride/whole register ld/st instructions

2024-05-31 Thread Max Chou
Hi,

This RFC patch set tries to fix the issue of
https://gitlab.com/qemu-project/qemu/-/issues/2137.

In this new version, we added patches that try to load/store more data
at a time in part of vector continuous load/store (unit-stride/whole
register) instructions with some assumptions (e.g. no masking, no tail
agnostic, perform virtual address resolution once for the entire vector,
etc.) as suggested by Richard Henderson.

This version can improve the performance of the test case provided in
https://gitlab.com/qemu-project/qemu/-/issues/2137#note_1757501369 (from
~13.5 sec to ~1.5 sec) on QEMU user mode.

PS: This RFC patch set only focuses on the vle8.v/vse8.v/vl8re8.v/vs8r.v
instructions. The next version will try to complete other instructions.

Series based on riscv-to-apply.next branch (commit 1806da7).

Max Chou (6):
  target/riscv: Separate vector segment ld/st instructions
  accel/tcg: Avoid unnecessary call overhead from
qemu_plugin_vcpu_mem_cb
  target/riscv: Inline vext_ldst_us and corresponding function for
performance
  target/riscv: Add check_probe_[read|write] helper functions
  target/riscv: rvv: Optimize v[l|s]e8.v with limitations
  target/riscv: rvv: Optimize vl8re8.v/vs8r.v with limitations

 accel/tcg/ldst_common.c.inc |   8 +-
 target/riscv/helper.h   |   8 +
 target/riscv/insn32.decode  |  11 +-
 target/riscv/insn_trans/trans_rvv.c.inc | 454 +++-
 target/riscv/vector_helper.c| 142 ++--
 5 files changed, 591 insertions(+), 32 deletions(-)

-- 
2.34.1




[RFC PATCH v2 2/6] accel/tcg: Avoid unnecessary call overhead from qemu_plugin_vcpu_mem_cb

2024-05-31 Thread Max Chou
If there are not any QEMU plugin memory callback functions, checking
before calling the qemu_plugin_vcpu_mem_cb function can reduce the
function call overhead.

Signed-off-by: Max Chou 
---
 accel/tcg/ldst_common.c.inc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/ldst_common.c.inc b/accel/tcg/ldst_common.c.inc
index c82048e377e..87ceb954873 100644
--- a/accel/tcg/ldst_common.c.inc
+++ b/accel/tcg/ldst_common.c.inc
@@ -125,7 +125,9 @@ void helper_st_i128(CPUArchState *env, uint64_t addr, 
Int128 val, MemOpIdx oi)
 
 static void plugin_load_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
 {
-qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
+if (cpu_plugin_mem_cbs_enabled(env_cpu(env))) {
+qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R);
+}
 }
 
 uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra)
@@ -188,7 +190,9 @@ Int128 cpu_ld16_mmu(CPUArchState *env, abi_ptr addr,
 
 static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi)
 {
-qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
+if (cpu_plugin_mem_cbs_enabled(env_cpu(env))) {
+qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W);
+}
 }
 
 void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t val,
-- 
2.34.1




Re: [PATCH 0/6] host/i386: require x86-64-v2 ISA

2024-05-31 Thread Richard Henderson

On 5/31/24 02:14, Paolo Bonzini wrote:

Paolo Bonzini (6):
   host/i386: nothing looks at CPUINFO_SSE4
   meson: assume x86-64-v2 baseline ISA
   host/i386: assume presence of CMOV
   host/i386: assume presence of SSE2
   host/i386: assume presence of SSSE3
   host/i386: assume presence of POPCNT


Reviewed-by: Richard Henderson 


r~



Re: [PATCH v6 5/8] hw/misc/stm32l4x5_rcc: Handle Register Updates

2024-05-31 Thread Peter Maydell
On Sun, 3 Mar 2024 at 14:08, Arnaud Minier
 wrote:
>
> Update the RCC state and propagate frequency changes when writing to the
> RCC registers. Currently, ICSCR, CIER, the reset registers and the stop
> mode registers are not implemented.
>
> Some fields  have not been implemented due to uncertainty about
> how to handle them (Like the clock security system or bypassing
> mecanisms).
>
> Signed-off-by: Arnaud Minier 
> Signed-off-by: Inès Varhol 

Hi; somebody has reported a bug in this change, which they found
using a fuzzer:

https://gitlab.com/qemu-project/qemu/-/issues/2356

> +static void rcc_update_cfgr_register(Stm32l4x5RccState *s)
> +{
> +uint32_t val;
> +/* MCOPRE */
> +val = FIELD_EX32(s->cfgr, CFGR, MCOPRE);
> +assert(val <= 0b100);

You can't assert() things about guest register values,
because then if the guest writes that value QEMU will fall over.
For "this is something the spec says is invalid", the right
thing to do in a device model is to qemu_log_mask(LOG_GUEST_ERROR, ...)
the situation, and proceed as best you can (eg treat the value
as if it was some valid one, or disable the clock entirely).

> +clock_mux_set_factor(>clock_muxes[RCC_CLOCK_MUX_MCO],
> + 1, 1 << val);
> +
> +/* MCOSEL */
> +val = FIELD_EX32(s->cfgr, CFGR, MCOSEL);
> +assert(val <= 0b111);

Similarly here. (The obvious behaviour for "invalid clock
source selected" would be "treat as clock disabled".)

> +if (val == 0) {
> +clock_mux_set_enable(>clock_muxes[RCC_CLOCK_MUX_MCO], false);
> +} else {
> +clock_mux_set_enable(>clock_muxes[RCC_CLOCK_MUX_MCO], true);
> +clock_mux_set_source(>clock_muxes[RCC_CLOCK_MUX_MCO],
> + val - 1);
> +}

thanks
-- PMM



Re: [PATCH v4 00/31] Add AMD Secure Nested Paging (SEV-SNP) support

2024-05-31 Thread Paolo Bonzini
On Fri, May 31, 2024 at 1:20 PM Paolo Bonzini  wrote:
>
> On Thu, May 30, 2024 at 1:16 PM Pankaj Gupta  wrote:
> >
> > These patches implement SEV-SNP base support along with CPUID enforcement
> > support for QEMU, and are also available at:
> >
> > https://github.com/pagupta/qemu/tree/snp_v4
> >
> > Latest version of kvm changes are posted here [2] and also queued in 
> > kvm/next.
> >
> > Patch Layout
> > 
> > 01-03: 'error_setg' independent fix, kvm/next header sync & patch from
> >Xiaoyao's TDX v5 patchset.
> > 04-29: Introduction of sev-snp-guest object and various configuration
> >requirements for SNP. Support for creating a cryptographic "launch" 
> > context
> >and populating various OVMF metadata pages, BIOS regions, and 
> > vCPU/VMSA
> >pages with the initial encrypted/measured/validated launch data 
> > prior to
> >launching the SNP guest.
> > 30-31: Handling for KVM_HC_MAP_GPA_RANGE hypercall for userspace VMEXIT.
>
> These patches are more or less okay, with only a few nits, and I can
> queue them already:

Hey,

please check if branch qemu-coco-queue of
https://gitlab.com/bonzini/qemu works for you!

I tested it successfully on CentOS 9 Stream with kernel from kvm/next
and firmware from edk2-ovmf-20240524-1.fc41.noarch.

Paolo

> i386/sev: Replace error_report with error_setg
> linux-headers: Update to current kvm/next
> i386/sev: Introduce "sev-common" type to encapsulate common SEV state
> i386/sev: Move sev_launch_update to separate class method
> i386/sev: Move sev_launch_finish to separate class method
> i386/sev: Introduce 'sev-snp-guest' object
> i386/sev: Add a sev_snp_enabled() helper
> i386/sev: Add sev_kvm_init() override for SEV class
> i386/sev: Add snp_kvm_init() override for SNP class
> i386/cpu: Set SEV-SNP CPUID bit when SNP enabled
> i386/sev: Don't return launch measurements for SEV-SNP guests
> i386/sev: Add a class method to determine KVM VM type for SNP guests
> i386/sev: Update query-sev QAPI format to handle SEV-SNP
> i386/sev: Add the SNP launch start context
> i386/sev: Add handling to encrypt/finalize guest launch data
> i386/sev: Set CPU state to protected once SNP guest payload is finalized
> hw/i386/sev: Add function to get SEV metadata from OVMF header
> i386/sev: Add support for populating OVMF metadata pages
> i386/sev: Add support for SNP CPUID validation
> i386/sev: Invoke launch_updata_data() for SEV class
> i386/sev: Invoke launch_updata_data() for SNP class
> i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
> i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
> i386/sev: Extract build_kernel_loader_hashes
> i386/sev: Reorder struct declarations
> i386/sev: Allow measured direct kernel boot on SNP
> hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
> memory: Introduce memory_region_init_ram_guest_memfd()
>
> These patches need a small prerequisite that I'll post soon:
>
> hw/i386/sev: Use guest_memfd for legacy ROMs
> hw/i386: Add support for loading BIOS using guest_memfd
>
> This one definitely requires more work:
>
> hw/i386/sev: Allow use of pflash in conjunction with -bios
>
>
> Paolo




Re: [PATCH] accel/tcg: nochain should disable goto_ptr

2024-05-31 Thread Richard Henderson

On 5/31/24 03:17, NiuGenen wrote:

Signed-off-by: NiuGenen 
---
  accel/tcg/cpu-exec.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2972f75b96..084fa645c7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -173,7 +173,7 @@ uint32_t curr_cflags(CPUState *cpu)
  } else if (qatomic_read(_insn_per_tb)) {
  cflags |= CF_NO_GOTO_TB | 1;
  } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-cflags |= CF_NO_GOTO_TB;
+cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
  }
  
  return cflags;


Why?

The original intent of nochain was so that -d exec would log all blocks, which requires 
excluding goto_tb.  There is exec logging in helper_lookup_goto_ptr, so there is no need 
to avoid goto_ptr.


You must provide a rationale, at minimum.


r~



Re: [PATCH v3 5/6] iotests: Filter out `vvfat` fmt from failing tests

2024-05-31 Thread Kevin Wolf
Am 26.05.2024 um 11:56 hat Amjad Alsharafi geschrieben:
> `vvfat` is a special format and not all tests (even generic) can run
> without crashing.  So, added `unsupported_fmt: vvfat` to all failling
> tests.
> 
> Also added `vvfat` format into `meson.build`, vvfaat tests can be run
> on the `block-thorough` suite.
> 
> Signed-off-by: Amjad Alsharafi 

I think the better approach is just not counting vvfat as generic. It's
technically not even a format, but a protocol, though I think I agree
with adding it as a format anyway because you can't store a normal image
inside of it.

This should do the trick and avoid most of the changes in this patch:

diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 588f30a4f1..4053d29de4 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -250,7 +250,7 @@ def __init__(self, source_dir: str, build_dir: str,
 self.qemu_img_options = os.getenv('QEMU_IMG_OPTIONS')
 self.qemu_nbd_options = os.getenv('QEMU_NBD_OPTIONS')

-is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg']
+is_generic = self.imgfmt not in ['bochs', 'cloop', 'dmg', 'vvfat']
 self.imgfmt_generic = 'true' if is_generic else 'false'

 self.qemu_io_options = f'--cache {self.cachemode} --aio {self.aiomode}'

Kevin




Re: [PATCH v3 0/6] vvfat: Fix write bugs for large files and add iotests

2024-05-31 Thread Kevin Wolf
Am 26.05.2024 um 11:56 hat Amjad Alsharafi geschrieben:
> These patches fix some bugs found when modifying files in vvfat.
> First, there was a bug when writing to the cluster 2 or above of a file, it
> will copy the cluster before it instead, so, when writing to cluster=2, the
> content of cluster=1 will be copied into disk instead in its place.
> 
> Another issue was modifying the clusters of a file and adding new
> clusters, this showed 2 issues:
> - If the new cluster is not immediately after the last cluster, it will
> cause issues when reading from this file in the future.
> - Generally, the usage of info.file.offset was incorrect, and the
> system would crash on abort() when the file is modified and a new
> cluster was added.
> 
> Also, added some iotests for vvfat, covering the this fix and also
> general behavior such as reading, writing, and creating files on the 
> filesystem.
> Including tests for reading/writing the first cluster which
> would pass even before this patch.

I was wondering how to reproduce the bugs that patches 2 and 3 fix. So I
tried to run your iotests case, and while it does catch the bug that
patch 1 fixes, it passes even without the other two fixes.

Is this expected? If so, can we add more tests that trigger the problems
the other two patches address?

Kevin




Re: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before x86_cpu_filter_features

2024-05-31 Thread Chen, Zide



On 5/30/2024 11:30 PM, Zhao Liu wrote:
> Hi Zide,
> 
> On Fri, May 24, 2024 at 01:00:16PM -0700, Zide Chen wrote:
>> Date: Fri, 24 May 2024 13:00:16 -0700
>> From: Zide Chen 
>> Subject: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before
>>  x86_cpu_filter_features
>> X-Mailer: git-send-email 2.34.1
>>
>> cpu_exec_realizefn which calls the accel-specific realizefn may expand
>> features.  e.g., some accel-specific options may require extra features
>> to be enabled, and it's appropriate to expand these features in accel-
>> specific realizefn.
>>
>> One such example is the cpu-pm option, which may add CPUID_EXT_MONITOR.
>>
>> Thus, call cpu_exec_realizefn before x86_cpu_filter_features to ensure
>> that it won't expose features not supported by the host.
>>
>> Fixes: 662175b91ff2 ("i386: reorder call to cpu_exec_realizefn")
>> Suggested-by: Xiaoyao Li 
>> Signed-off-by: Zide Chen 
>> ---
>>  target/i386/cpu.c | 24 
>>  target/i386/kvm/kvm-cpu.c |  1 -
>>  2 files changed, 12 insertions(+), 13 deletions(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index bc2dceb647fa..a1c1c785bd2f 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -7604,6 +7604,18 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
>> **errp)
>>  }
>>  }
>>  
>> +/*
>> + * note: the call to the framework needs to happen after feature 
>> expansion,
>> + * but before the checks/modifications to ucode_rev, mwait, phys_bits.
>> + * These may be set by the accel-specific code,
>> + * and the results are subsequently checked / assumed in this function.
>> + */
>> +cpu_exec_realizefn(cs, _err);
>> +if (local_err != NULL) {
>> +error_propagate(errp, local_err);
>> +return;
>> +}
>> +
>>  x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
> 
> For your case, which sets cpu-pm=on via overcommit, then
> x86_cpu_filter_features() will complain that mwait is not supported.
> 
> Such warning is not necessary, because the purpose of overcommit (from
> code) is only to support mwait when possible, not to commit to support
> mwait in Guest.
> 
> Additionally, I understand x86_cpu_filter_features() is primarily
> intended to filter features configured by the user, 

Yes, that's why this patches intends to let x86_cpu_filter_features()
filter out the MWAIT bit which is set from the overcommit option.

> and the changes of
> CPUID after x86_cpu_filter_features() should by default be regarded like
> "QEMU knows what it is doing".

Sure, we can add feature bits after x86_cpu_filter_features(), but I
think moving cpu_exec_realizefn() before x86_cpu_filter_features() is
more generic, and actually this is what QEMU did before commit 662175b91ff2.

- Less redundant code. Specifically, no need to call
x86_cpu_get_supported_feature_word() again.
- Potentially there could be other features could be added from the
accel-specific realizefn, kvm_cpu_realizefn() for example.  And these
features need to be checked against the host availability.

> 
> I feel adding a check for the CPUID mwait bit in host_cpu_realizefn()
> is enough, after all, this bit should be present if host supports mwait
> and enable_cpu_pm (in kvm_arch_get_supported_cpuid()).

Besides the above reasons, it seems to me expanding env->features in
host-cpu.c is confusing.

> Thanks,
> Zhao
> 



Re: [PATCH V2 3/3] target/i386: Move host_cpu_enable_cpu_pm into kvm_cpu_realizefn()

2024-05-31 Thread Chen, Zide



On 5/30/2024 11:53 PM, Zhao Liu wrote:
> On Fri, May 24, 2024 at 01:00:17PM -0700, Zide Chen wrote:
>> Date: Fri, 24 May 2024 13:00:17 -0700
>> From: Zide Chen 
>> Subject: [PATCH V2 3/3] target/i386: Move host_cpu_enable_cpu_pm into
>>  kvm_cpu_realizefn()
>> X-Mailer: git-send-email 2.34.1
>>
>> It seems not a good idea to expand features in host_cpu_realizefn,
>> which is for host CPU only. 
> 
> It is restricted by max_features and should be part of the "max" CPU,
> and the current target/i386/host-cpu.c should only serve the “host” CPU.

Yes, since this file only serves for "host" CPU, that's why I proposed
to move this code to kvm_cpu_realizefn().

> 
>> Additionally, cpu-pm option is KVM
>> specific, and it's cleaner to put it in kvm_cpu_realizefn(), together
>> with the WAITPKG code.
>>
>> Fixes: f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using 
>> AccelCPUClass")
>> Signed-off-by: Zide Chen 
>> ---
>>  target/i386/host-cpu.c| 12 
>>  target/i386/kvm/kvm-cpu.c | 11 +--
>>  2 files changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/target/i386/host-cpu.c b/target/i386/host-cpu.c
>> index 280e427c017c..8b8bf5afeccf 100644
>> --- a/target/i386/host-cpu.c
>> +++ b/target/i386/host-cpu.c
>> @@ -42,15 +42,6 @@ static uint32_t host_cpu_phys_bits(void)
>>  return host_phys_bits;
>>  }
>>  
>> -static void host_cpu_enable_cpu_pm(X86CPU *cpu)
>> -{
>> -CPUX86State *env = >env;
>> -
>> -host_cpuid(5, 0, >mwait.eax, >mwait.ebx,
>> -   >mwait.ecx, >mwait.edx);
>> -env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR;
>> -}
>> -
>>  static uint32_t host_cpu_adjust_phys_bits(X86CPU *cpu)
>>  {
>>  uint32_t host_phys_bits = host_cpu_phys_bits();
>> @@ -83,9 +74,6 @@ bool host_cpu_realizefn(CPUState *cs, Error **errp)
>>  X86CPU *cpu = X86_CPU(cs);
>>  CPUX86State *env = >env;
>>  
>> -if (cpu->max_features && enable_cpu_pm) {
>> -host_cpu_enable_cpu_pm(cpu);
>> -}
>>  if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
>>  uint32_t phys_bits = host_cpu_adjust_phys_bits(cpu);
>>  
>> diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
>> index 3adcedf0dbc3..197c892da89a 100644
>> --- a/target/i386/kvm/kvm-cpu.c
>> +++ b/target/i386/kvm/kvm-cpu.c
>> @@ -64,9 +64,16 @@ static bool kvm_cpu_realizefn(CPUState *cs, Error **errp)
>>   *   cpu_common_realizefn() (via xcc->parent_realize)
>>   */
>>  if (cpu->max_features) {
>> -if (enable_cpu_pm && kvm_has_waitpkg()) {
>> -env->features[FEAT_7_0_ECX] |= CPUID_7_0_ECX_WAITPKG;
>> +if (enable_cpu_pm) {
>> +if (kvm_has_waitpkg()) {
>> +env->features[FEAT_7_0_ECX] |= CPUID_7_0_ECX_WAITPKG;
>> +}
> 
> If you agree with my comment in patch 2, here we need a mwait bit check.

I still think that take advantaged of x86_cpu_filter_features() to check
host availability is a better choice.

> 
>> +host_cpuid(5, 0, >mwait.eax, >mwait.ebx,
>> +   >mwait.ecx, >mwait.edx);
>> +env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR;
>>  }
>> +
>>  if (cpu->ucode_rev == 0) {
>>  cpu->ucode_rev =
>>  kvm_arch_get_supported_msr_feature(kvm_state,
>> -- 
>> 2.34.1
>>
>>



[PATCH] accel/kvm: Fix two lines with hard-coded tabs

2024-05-31 Thread Peter Maydell
In kvm-all.c, two lines have been accidentally indented with
hard-coded tabs rather than spaces. Normalise to match the rest
of the file.

Signed-off-by: Peter Maydell 
---
 accel/kvm/kvm-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index c0be9f5eedb..009b49de447 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2893,7 +2893,7 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool 
to_private)
 !memory_region_is_ram_device(mr) &&
 !memory_region_is_rom(mr) &&
 !memory_region_is_romd(mr)) {
-   ret = 0;
+ret = 0;
 } else {
 error_report("Convert non guest_memfd backed memory region "
 "(0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
@@ -2964,7 +2964,7 @@ int kvm_cpu_exec(CPUState *cpu)
 
 kvm_arch_pre_run(cpu, run);
 if (qatomic_read(>exit_request)) {
-   trace_kvm_interrupt_exit_request();
+trace_kvm_interrupt_exit_request();
 /*
  * KVM requires us to reenter the kernel after IO exits to complete
  * instruction emulation. This self-signal will ensure that we
-- 
2.34.1




Re: [PATCH 5/7] target/i386: Split out gdb-internal.h

2024-05-31 Thread Alex Bennée
Richard Henderson  writes:

> Signed-off-by: Richard Henderson 

Reviewed-by: Alex Bennée 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH 3/7] accel/tcg: Return the TranslationBlock from cpu_unwind_state_data

2024-05-31 Thread Alex Bennée
Richard Henderson  writes:

> Fix the i386 get_memio_eip function to use tb->cflags
> instead of cs->tcg_cflags.
>
> Signed-off-by: Richard Henderson 
> ---
>  include/exec/cpu-common.h | 9 +
>  accel/tcg/translate-all.c | 9 +
>  target/i386/helper.c  | 6 --
>  3 files changed, 14 insertions(+), 10 deletions(-)
>

>  
>  /* Per x86_restore_state_to_opc. */
> -if (cs->tcg_cflags & CF_PCREL) {
> +if (tb->cflags & CF_PCREL) {
>  return (env->eip & TARGET_PAGE_MASK) | data[0];

this has a merge conflict with subsequent changes.

>  } else {
>  return data[0] - env->segs[R_CS].base;

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH 2/7] accel/tcg: Set CPUState.plugin_ra before all plugin callbacks

2024-05-31 Thread Alex Bennée
Richard Henderson  writes:

We really could do with a description of why we are setting plugin_ra
and what we mean to achieve by it. I think it is so we can then do the
same PC/other register recovery as we do at synchronous faulting
exceptions be it generated TCG code or a helper. However we should make
that clear in the commit (and possible some function comments).


> Signed-off-by: Richard Henderson 
> ---
>  include/hw/core/cpu.h  |  1 +
>  accel/tcg/plugin-gen.c | 50 +-
>  2 files changed, 46 insertions(+), 5 deletions(-)
>
> diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
> index 10cd492aff..f4af37c13d 100644
> --- a/include/hw/core/cpu.h
> +++ b/include/hw/core/cpu.h
> @@ -350,6 +350,7 @@ typedef union IcountDecr {
>  typedef struct CPUNegativeOffsetState {
>  CPUTLB tlb;
>  #ifdef CONFIG_PLUGIN
> +uintptr_t plugin_ra;
>  GArray *plugin_mem_cbs;
>  #endif
>  IcountDecr icount_decr;
> diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
> index 36e9134a5d..f96b49cce6 100644
> --- a/accel/tcg/plugin-gen.c
> +++ b/accel/tcg/plugin-gen.c
> @@ -37,6 +37,12 @@ enum plugin_gen_from {
>  PLUGIN_GEN_AFTER_TB,
>  };
>  
> +enum plugin_gen_ra {
> +GEN_RA_DONE,
> +GEN_RA_FROM_TB,
> +GEN_RA_FROM_INSN,
> +};
> +
>  /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
>  void plugin_gen_disable_mem_helpers(void)
>  {
> @@ -151,11 +157,38 @@ static void gen_mem_cb(struct qemu_plugin_dyn_cb *cb,
>  tcg_temp_free_i32(cpu_index);
>  }
>  
> -static void inject_cb(struct qemu_plugin_dyn_cb *cb)
> +static void inject_ra(enum plugin_gen_ra *gen_ra)
> +{
> +TCGv_ptr ra;
> +
> +switch (*gen_ra) {
> +case GEN_RA_DONE:
> +return;
> +case GEN_RA_FROM_TB:
> +ra = tcg_constant_ptr(NULL);
> +break;
> +case GEN_RA_FROM_INSN:
> +ra = tcg_temp_ebb_new_ptr();
> +tcg_gen_plugin_pc(ra);
> +break;
> +default:
> +g_assert_not_reached();
> +}
> +
> +tcg_gen_st_ptr(ra, tcg_env,
> +   offsetof(CPUState, neg.plugin_ra) -
> +   offsetof(ArchCPU, env));
> +tcg_temp_free_ptr(ra);
> +*gen_ra = GEN_RA_DONE;
> +}
> +
> +static void inject_cb(struct qemu_plugin_dyn_cb *cb,
> +  enum plugin_gen_ra *gen_ra)
>  
>  {
>  switch (cb->type) {
>  case PLUGIN_CB_REGULAR:
> +inject_ra(gen_ra);
>  gen_udata_cb(cb);
>  break;
>  case PLUGIN_CB_INLINE:
> @@ -167,16 +200,18 @@ static void inject_cb(struct qemu_plugin_dyn_cb *cb)
>  }
>  
>  static void inject_mem_cb(struct qemu_plugin_dyn_cb *cb,
> +  enum plugin_gen_ra *gen_ra,
>enum qemu_plugin_mem_rw rw,
>qemu_plugin_meminfo_t meminfo, TCGv_i64 addr)
>  {
>  if (cb->rw & rw) {
>  switch (cb->type) {
>  case PLUGIN_CB_MEM_REGULAR:
> +inject_ra(gen_ra);
>  gen_mem_cb(cb, meminfo, addr);
>  break;
>  default:
> -inject_cb(cb);
> +inject_cb(cb, gen_ra);
>  break;
>  }
>  }
> @@ -186,6 +221,7 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
> *plugin_tb)
>  {
>  TCGOp *op, *next;
>  int insn_idx = -1;
> +enum plugin_gen_ra gen_ra;
>  
>  if (unlikely(qemu_loglevel_mask(LOG_TB_OP_PLUGIN)
>   && qemu_log_in_addr_range(plugin_tb->vaddr))) {
> @@ -205,10 +241,12 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
> *plugin_tb)
>   */
>  memset(tcg_ctx->free_temps, 0, sizeof(tcg_ctx->free_temps));
>  
> +gen_ra = GEN_RA_FROM_TB;
>  QTAILQ_FOREACH_SAFE(op, _ctx->ops, link, next) {
>  switch (op->opc) {
>  case INDEX_op_insn_start:
>  insn_idx++;
> +gen_ra = GEN_RA_FROM_INSN;
>  break;
>  
>  case INDEX_op_plugin_cb:
> @@ -244,7 +282,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
> *plugin_tb)
>  cbs = plugin_tb->cbs;
>  for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
>  inject_cb(
> -_array_index(cbs, struct qemu_plugin_dyn_cb, i));
> +_array_index(cbs, struct qemu_plugin_dyn_cb, i),
> +_ra);
>  }
>  break;
>  
> @@ -256,7 +295,8 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
> *plugin_tb)
>  cbs = insn->insn_cbs;
>  for (i = 0, n = (cbs ? cbs->len : 0); i < n; i++) {
>  inject_cb(
> -_array_index(cbs, struct qemu_plugin_dyn_cb, i));
> +_array_index(cbs, struct qemu_plugin_dyn_cb, i),
> +_ra);
>  }
>  break;
>  
> @@ -288,7 +328,7 @@ static void plugin_gen_inject(struct qemu_plugin_tb 
> 

Re: [PATCH v4 18/31] hw/i386/sev: Add function to get SEV metadata from OVMF header

2024-05-31 Thread Liam Merwick via
On 31/05/2024 16:41, Paolo Bonzini wrote:
> On Fri, May 31, 2024 at 5:20 PM Liam Merwick  wrote:
>>> +metadata = (OvmfSevMetadata *)(flash_ptr + flash_size - data->offset);
>>> +if (memcmp(metadata->signature, "ASEV", 4) != 0) {
>>> +return;
>>> +}
>>> +
>>> +ovmf_sev_metadata_table = g_malloc(metadata->len);
>>
>> There should be a bounds check on metadata->len before using it.
> 
> You mean like:
> 
>  if (metadata->len <= flash_size - data->offset) {
>  ovmf_sev_metadata_table = g_memdup2(metadata, metadata->len);
>  }
> 

Yeah, and maybe before that

if (metadata->len < sizeof(OvmfSevMetadata))
  return

But the main thing would be checking the upper bound to avoid
allocating a huge amount of memory 

Regards,
Liam



Re: [PULL 0/2] NBD patches for 2024-05-30

2024-05-31 Thread Peter Maydell
On Fri, 31 May 2024 at 17:17, Eric Blake  wrote:
>
> On Thu, May 30, 2024 at 07:22:16AM GMT, Eric Blake wrote:
> > The following changes since commit 3b2fe44bb7f605f179e5e7feb2c13c2eb3abbb80:
> >
> >   Merge tag 'pull-request-2024-05-29' of https://gitlab.com/thuth/qemu into 
> > staging (2024-05-29 08:38:20 -0700)
> >
> > are available in the Git repository at:
> >
> >   https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2024-05-30
> >
> > for you to fetch changes up to 109c59fa3a1cf6be88c2a39b4699a0041c64821f:
> >
> >   iotests: test NBD+TLS+iothread (2024-05-30 07:18:42 -0500)
> >
> > 
> > NBD patches for 2024-05-30
> >
> > - Fix AioContext assertion with NBD+TLS
> >
> > 
> > Eric Blake (2):
> >   qio: Inherit follow_coroutine_ctx across TLS
> >   iotests: test NBD+TLS+iothread
>
> Patch 2/2 only works on machines with /home/eblake; I will fix that in
> a v2 pull request.

Alternatively we could add that to our release notes
as a build requirement ? :-)

-- PMM



Re: [PATCH] hw/cxl: Fix read from bogus memory

2024-05-31 Thread Peter Maydell
On Fri, 31 May 2024 at 17:22, Ira Weiny  wrote:
>
> Peter and coverity report:
>
> We've passed '' to address_space_write(), which means "read
> from the address on the stack where the function argument 'data'
> lives", so instead of writing 64 bytes of data to the guest ,
> we'll write 64 bytes which start with a host pointer value and
> then continue with whatever happens to be on the host stack
> after that.
>
> Indeed the intention was to write 64 bytes of data at the address given.
>
> Fix the parameter to address_space_write().
>

Coverity CID: 1544772

> Reported-by: Peter Maydell 
> Link: 
> https://lore.kernel.org/all/cafeaca-u4sytgwtksb__y+_+0o2-wwarntm3x8wnhvl1wfh...@mail.gmail.com/
> Fixes: 6bda41a69bdc ("hw/cxl: Add clear poison mailbox command support.")
> Cc: Jonathan Cameron 
> Signed-off-by: Ira Weiny 
> ---
> Compile tested only.  Jonathan please double check me.
> ---
>  hw/mem/cxl_type3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 3e42490b6ce8..582412d9925f 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1025,7 +1025,7 @@ static bool set_cacheline(CXLType3Dev *ct3d, uint64_t 
> dpa_offset, uint8_t *data)
>  as = >hostpmem_as;
>  }
>
> -address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, ,
> +address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, data,
>  CXL_CACHE_LINE_SIZE);
>  return true;
>  }

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [PATCH v3 0/6] Implement icount=auto using TCG Plugins

2024-05-31 Thread Alex Bennée
Pierrick Bouvier  writes:

> The goal here is to be able to scale temporally execution of qemu-user/system,
> using a given number of instructions per second.
>
> We define a virtual clock, that can be late or in advance compared to real 
> time.
> When we are in advance, we slow execution (by sleeping) until catching real
> time.
>
> Finally, we should be able to cleanup icount=auto mode completely, and keep
> icount usage for determistic purposes only.
>
> It is built upon new TCG Plugins inline ops (store + conditional callbacks), 
> now
> merged on master.
>
> Example in user-mode:
>
> - Retrieve number of instructions to execute /bin/true
> $ ./build/qemu-x86_64 -plugin ./build/tests/plugin/libinsn.so -d plugin 
> /bin/true
> cpu 0 insns: 120546
> total insns: 120546
> - Slow execution to match 5 seconds
> $ time ./build/qemu-x86_64 -plugin 
> ./build/contrib/plugins/libips,ips=$((120546/5)) /bin/true
> real0m4.985s
>
> Tested in system mode by booting a full debian system, and using:
> $ sysbench cpu run
> Performance decrease linearly with the given number of ips.

Queued to plugins/next, thanks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH] plugins: Ensure register handles are not NULL

2024-05-31 Thread Alex Bennée
Akihiko Odaki  writes:

> Ensure register handles are not NULL so that a plugin can assume NULL is
> invalid as a register handle.
>
> Signed-off-by: Akihiko Odaki 

Queued to plugins/next, thanks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH 0/4] hw/s390x: Alias @dump-skeys -> @dump-s390-skey and deprecate

2024-05-31 Thread Thomas Huth

On 31/05/2024 16.02, Dr. David Alan Gilbert wrote:

* Thomas Huth (th...@redhat.com) wrote:

On 30/05/2024 09.45, Philippe Mathieu-Daudé wrote:

We are trying to unify all qemu-system-FOO to a single binary.
In order to do that we need to remove QAPI target specific code.

@dump-skeys is only available on qemu-system-s390x. This series
rename it as @dump-s390-skey, making it available on other
binaries. We take care of backward compatibility via deprecation.

Philippe Mathieu-Daudé (4):
hw/s390x: Introduce the @dump-s390-skeys QMP command
hw/s390x: Introduce the 'dump_s390_skeys' HMP command
hw/s390x: Deprecate the HMP 'dump_skeys' command
hw/s390x: Deprecate the QMP @dump-skeys command


Why do we have to rename the command? Just for the sake of it? I think
renaming HMP commands is maybe ok, but breaking the API in QMP is something
you should consider twice.

And even if we decide to rename ... maybe we should discuss whether it makes
sense to come up with a generic command instead: As far as I know, ARM also
has something similar, called MTE. Maybe we also want to dump MTE keys one
day? So the new command should maybe be called "dump-memory-keys" instead?


I think there are at least two different concepts; but I agree it would be
nice to keep a single command for matching concepts across different 
architectures;
I can't say I know the details of any, but:

   a) Page table things - I think x86 PKRU/PKEY (???) is a page table thing
 where pages marked a special way are associated with keys.
 That sounds similar to what the skeys are???


Sounds a little bit similar, but s390 storage keys are independent from page 
tables. It's rather that each page (4096 bytes) of RAM has an additional 
7-bit value that contains the storage key and some additional bits. It's 
also usable when page tables are still disabled.


> I'm not sure the two fit in the same command.

Does it make sense to dump all the MTE or x86 keys all at once? If so, we 
could maybe come up with an unified command. Otherwise it might not make 
sense, indeed.


 Thomas




Re: [PULL 04/53] hw/cxl: Add clear poison mailbox command support.

2024-05-31 Thread Ira Weiny
Peter Maydell wrote:
> Ping! This looks like it should be an easy one-liner fix
> for a Coverity-detected read-from-bogus-memory bug --
> could one of the CXL folks have a look at it and send
> a patch, please ?

Done.  Jonathan could you double check I only compile tested.

I think you are correct and apologies for not seeing your report earlier.

Ira



[PATCH] hw/cxl: Fix read from bogus memory

2024-05-31 Thread Ira Weiny
Peter and coverity report:

We've passed '' to address_space_write(), which means "read
from the address on the stack where the function argument 'data'
lives", so instead of writing 64 bytes of data to the guest ,
we'll write 64 bytes which start with a host pointer value and
then continue with whatever happens to be on the host stack
after that.

Indeed the intention was to write 64 bytes of data at the address given.

Fix the parameter to address_space_write().

Reported-by: Peter Maydell 
Link: 
https://lore.kernel.org/all/cafeaca-u4sytgwtksb__y+_+0o2-wwarntm3x8wnhvl1wfh...@mail.gmail.com/
Fixes: 6bda41a69bdc ("hw/cxl: Add clear poison mailbox command support.")
Cc: Jonathan Cameron 
Signed-off-by: Ira Weiny 
---
Compile tested only.  Jonathan please double check me.
---
 hw/mem/cxl_type3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 3e42490b6ce8..582412d9925f 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1025,7 +1025,7 @@ static bool set_cacheline(CXLType3Dev *ct3d, uint64_t 
dpa_offset, uint8_t *data)
 as = >hostpmem_as;
 }
 
-address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, ,
+address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, data,
 CXL_CACHE_LINE_SIZE);
 return true;
 }

---
base-commit: 3b2fe44bb7f605f179e5e7feb2c13c2eb3abbb80
change-id: 20240531-fix-poison-set-cacheline-e32bc1e74b27

Best regards,
-- 
Ira Weiny 




Re: [PULL 0/2] NBD patches for 2024-05-30

2024-05-31 Thread Eric Blake
On Thu, May 30, 2024 at 07:22:16AM GMT, Eric Blake wrote:
> The following changes since commit 3b2fe44bb7f605f179e5e7feb2c13c2eb3abbb80:
> 
>   Merge tag 'pull-request-2024-05-29' of https://gitlab.com/thuth/qemu into 
> staging (2024-05-29 08:38:20 -0700)
> 
> are available in the Git repository at:
> 
>   https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2024-05-30
> 
> for you to fetch changes up to 109c59fa3a1cf6be88c2a39b4699a0041c64821f:
> 
>   iotests: test NBD+TLS+iothread (2024-05-30 07:18:42 -0500)
> 
> 
> NBD patches for 2024-05-30
> 
> - Fix AioContext assertion with NBD+TLS
> 
> 
> Eric Blake (2):
>   qio: Inherit follow_coroutine_ctx across TLS
>   iotests: test NBD+TLS+iothread

Patch 2/2 only works on machines with /home/eblake; I will fix that in
a v2 pull request.

(And this should spur me to write a local commit hook that makes sure
I don't leak stuff like that in future commits...)

> 
>  io/channel-tls.c  |  26 ++--
>  io/channel-websock.c  |   1 +
>  tests/qemu-iotests/tests/nbd-tls-iothread | 168 
> ++
>  tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 +
>  4 files changed, 238 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
>  create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out
> 
> -- 
> 2.45.1
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH v3 1/6] vvfat: Fix bug in writing to middle of file

2024-05-31 Thread Kevin Wolf
Am 26.05.2024 um 11:56 hat Amjad Alsharafi geschrieben:
> Before this commit, the behavior when calling `commit_one_file` for
> example with `offset=0x2000` (second cluster), what will happen is that
> we won't fetch the next cluster from the fat, and instead use the first
> cluster for the read operation.
> 
> This is due to off-by-one error here, where `i=0x2000 !< offset=0x2000`,
> thus not fetching the next cluster.
> 
> Signed-off-by: Amjad Alsharafi 

Reviewed-by: Kevin Wolf 
Tested-by: Kevin Wolf 

> diff --git a/block/vvfat.c b/block/vvfat.c
> index 9d050ba3ae..ab342f0743 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -2525,7 +2525,7 @@ commit_one_file(BDRVVVFATState* s, int dir_index, 
> uint32_t offset)
>  return -1;
>  }
>  
> -for (i = s->cluster_size; i < offset; i += s->cluster_size)
> +for (i = s->cluster_size; i <= offset; i += s->cluster_size)
>  c = modified_fat_get(s, c);

While your change results in the correct behaviour, I think I would
prefer the code to be changed like this so that at the start of each
loop iteration, 'i' always refers to the offset that matches 'c':

for (i = 0; i < offset; i += s->cluster_size) {
c = modified_fat_get(s, c);
}

I'm also adding braces here to make the code conform with the QEMU
coding style. You can use scripts/checkpatch.pl to make sure that all
code you add has the correct style. Much of the vvfat code predates the
coding style, so you'll often have to change style when you touch a
line. (Which is good, because it slowly fixes the existing mess.)

You can keep my Reviewed/Tested-by if you change this.

Kevin




Re: [PATCH v2 2/2] iotests: test NBD+TLS+iothread

2024-05-31 Thread Eric Blake
On Fri, May 31, 2024 at 04:36:20PM GMT, Kevin Wolf wrote:
> Am 18.05.2024 um 04:50 hat Eric Blake geschrieben:
> > Prevent regressions when using NBD with TLS in the presence of
> > iothreads, adding coverage the fix to qio channels made in the
> > previous patch.
> > 
> > CC: qemu-sta...@nongnu.org
> > Signed-off-by: Eric Blake 
> 
> > diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread.out 
> > b/tests/qemu-iotests/tests/nbd-tls-iothread.out
> > new file mode 100644
> > index 000..b5e12dcbe7a
> > --- /dev/null
> > +++ b/tests/qemu-iotests/tests/nbd-tls-iothread.out

> > +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
> > +Formatting 
> > '/home/eblake/qemu/build/tests/qemu-iotests/scratch/qcow2-file-nbd-tls-iothread/dst.qcow2',
> >  fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib 
> > size=1073741824 lazy_refcounts=off refcount_bits=16
> 
> /home/eblake is suboptimal in reference output. :-)

Indeed. Will fix, which means I need a v2 pull request.


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: Unexpected error in rme_configure_one() at ../target/arm/kvm-rme.c:159

2024-05-31 Thread Jean-Philippe Brucker
On Fri, May 31, 2024 at 11:16:30PM +0900, Itaru Kitayama wrote:
> Thanks! I wasn’t aware of it The good news is that after whole day of try and 
> error attempts I was able to
> bring up a Realm VM on FVP. Here’s my version of overlay yaml, cca-v2.yaml:

That is good news, thanks for the update

> build:
>   linux:
> repo:
>   revision: cca-full/v2
> 
> #  kvmtool:
> #repo:
> #  revision: cca/v2
> 
>   rmm:
> repo:
>   revision: main
> 
> 
> 
>   tfa:
> repo:
>   revision: master
> 
>   kvm-unit-tests:
> repo:
>   revision: cca/v2
> 
> … and the QEMU options are below:
> 
> qemu-system-aarch64 -M 'virt,acpi=off,gic-version=3' \
> -cpu host -enable-kvm -smp 2 -m 512M -overcommit 'mem-lock=on' \
> -M 'confidential-guest-support=rme0' \
> -object 
> 'rme-guest,id=rme0,measurement-algo=sha512,num-pmu-counters=6,sve-vector-length=256'
>  \
> -kernel Image -initrd rootfs.cpio \
> -append 'earycon console=ttyAMA0 rdinit=/sbin/init' -nographic -net none

Note that the cca/v2 branch of QEMU would reject 'num-pmu-counters' and
'sve-vector-length' arguments, so if this works it means an older version
of the QEMU patches is being used (which also means an older Linux branch
is being used). It's possible that shrinkwrap is caching all the build
files, so I'd remove the ~/.shrinkwrap/ files to start fresh

Thanks,
Jean




Re: [PATCH v2 4/4] tests/qtest/migration-test: Add a postcopy memfile test

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 30, 2024 at 07:54:07PM +1000, Nicholas Piggin wrote:
>> Postcopy requires userfaultfd support, which requires tmpfs if a memory
>> file is used.
>> 
>> This adds back support for /dev/shm memory files, but adds preallocation
>> to skip environments where that mount is limited in size.
>> 
>> Signed-off-by: Nicholas Piggin 
>
> Thanks for doing this regardless.
>
>> ---
>>  tests/qtest/migration-test.c | 77 
>>  1 file changed, 69 insertions(+), 8 deletions(-)
>> 
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 86eace354e..5078033ded 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -11,6 +11,7 @@
>>   */
>>  
>>  #include "qemu/osdep.h"
>> +#include "qemu/cutils.h"
>>  
>>  #include "libqtest.h"
>>  #include "qapi/qmp/qdict.h"
>> @@ -553,6 +554,7 @@ typedef struct {
>>   */
>>  bool hide_stderr;
>>  bool use_memfile;
>> +bool use_shm_memfile;
>
> Nitpick: when having both, it's slightly confusing on the name, e.g. not
> clear whether use_memfile needs to be set to true too if use_shm_memfile=true.
>
> Maybe "use_tmpfs_memfile" and "use_shm_memfile"?
>
>>  /* only launch the target process */
>>  bool only_target;
>>  /* Use dirty ring if true; dirty logging otherwise */
>> @@ -739,7 +741,62 @@ static int test_migrate_start(QTestState **from, 
>> QTestState **to,
>>  ignore_stderr = "";
>>  }
>>  
>> -if (args->use_memfile) {
>> +if (!qtest_has_machine(machine_alias)) {
>> +g_autofree char *msg = g_strdup_printf("machine %s not supported",
>> +   machine_alias);
>> +g_test_skip(msg);
>> +return -1;
>> +}
>> +
>> +if (args->use_shm_memfile) {
>> +#if defined(__NR_userfaultfd) && defined(__linux__)
>
> IIUC we only need defined(__linux__) as there's nothing to do with uffd yet?
>
>> +int fd;
>> +uint64_t size;
>> +
>> +if (getenv("GITLAB_CI")) {
>> +/*
>> + * Gitlab runners are limited to 64MB shm size and despite
>> + * pre-allocation there is concern that concurrent tests
>> + * could result in nondeterministic failures. Until all shm
>> + * usage in all CI tests is found to fail gracefully on
>> + * ENOSPC, it is safer to avoid large allocations for now.
>> + *
>> + * https://lore.kernel.org/qemu-devel/875xuwg4mx@suse.de/
>> + */
>> +g_test_skip("shm tests are not supported in Gitlab CI 
>> environment");
>> +return -1;
>> +}
>
> I'm not sure whether this is Fabiano's intention.  I'm wondering whether we
> can drop this and just let it still run there.
>

It was my intention. But I overlooked the fact that the current shm
cannot even run one migration test already.

> Other tests not detecting avaiablility of shmem looks like a separate issue
> to be fixed to me, regardless of this.
>
> My wild guess is since we're doing memory_size+64K below then if test won't
> fail others won't either, as normally the shmem quota should normally be
> power of 2 anyway.. then it should always fit another few MBs if this one.
> While this test is ready to fail gracefully now.
>

I agree. Let's drop this part then.




Re: [PATCH v2 07/18] monitor: Simplify fdset and fd removal

2024-05-31 Thread Peter Xu
On Thu, May 23, 2024 at 04:05:37PM -0300, Fabiano Rosas wrote:
> Remove fds right away instead of setting the ->removed flag. We don't
> need the extra complexity of having a cleanup function reap the
> removed entries at a later time.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v2 13/18] monitor: fdset: Match against O_DIRECT

2024-05-31 Thread Peter Xu
On Fri, May 31, 2024 at 12:42:05PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Thu, May 23, 2024 at 04:05:43PM -0300, Fabiano Rosas wrote:
> >> We're about to enable the use of O_DIRECT in the migration code and
> >> due to the alignment restrictions imposed by filesystems we need to
> >> make sure the flag is only used when doing aligned IO.
> >> 
> >> The migration will do parallel IO to different regions of a file, so
> >> we need to use more than one file descriptor. Those cannot be obtained
> >> by duplicating (dup()) since duplicated file descriptors share the
> >> file status flags, including O_DIRECT. If one migration channel does
> >> unaligned IO while another sets O_DIRECT to do aligned IO, the
> >> filesystem would fail the unaligned operation.
> >> 
> >> The add-fd QMP command along with the fdset code are specifically
> >> designed to allow the user to pass a set of file descriptors with
> >> different access flags into QEMU to be later fetched by code that
> >> needs to alternate between those flags when doing IO.
> >> 
> >> Extend the fdset matching to behave the same with the O_DIRECT flag.
> >> 
> >> Signed-off-by: Fabiano Rosas 
> >
> > Reviewed-by: Peter Xu 
> >
> > One thing I am confused but totally irrelevant to this specific change.
> >
> > I wonder why do we need dupfds at all, and why client needs to remove-fd at
> > all.
> 
> This answer was lost to time.
> 
> >
> > It's about what would go wrong if qmp client only add-fd, then if it's
> > consumed by anything, it gets moved from "fds" list to "dupfds" list.  The
> > thing is I don't expect the client should pass over any fd that will not be
> > consumed.  Then if it's always consumed, why bother dup() at all, and why
> > bother an explicit remove-fd.
> 
> With the lack of documentation, I can only imagine the code was
> initially written to be more flexible, but we ended up never needing the
> extra flexibility. Maybe we could change that transparently in the
> future and deprecate qmp_remove_fd(). I'm thinking, even for the
> mapped-ram work, we might not need libvirt to call that function ever.

Good to know I'm not the only one thinking that.

It's okay - after your cleanup it's much better at least to me.  The only
thing to avoid these, AFAIU, is more careful design level reviews in the
future, and more documents state the facts and keep in history (and
obviously why remove-fd was needed was not documented).  Now we carry them.

-- 
Peter Xu




Re: [PATCH v2 06/18] monitor: Stop removing non-duplicated fds

2024-05-31 Thread Peter Xu
On Fri, May 31, 2024 at 12:25:52PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote:
> >> We've been up until now cleaning up any file descriptors that have
> >> been passed into QEMU and never duplicated[1,2]. A file descriptor
> >> without duplicates indicates that no part of QEMU has made use of
> >> it. This approach is starting to show some cracks now that we're
> >> starting to consume fds from the migration code:
> >> 
> >> - Doing cleanup every time the last monitor connection closes works to
> >>   reap unused fds, but also has the side effect of forcing the
> >>   management layer to pass the file descriptors again in case of a
> >>   disconnect/re-connect, if that happened to be the only monitor
> >>   connection.
> >> 
> >>   Another side effect is that removing an fd with qmp_remove_fd() is
> >>   effectively delayed until the last monitor connection closes.
> >> 
> >>   The reliance on mon_refcount is also problematic because it's racy.
> >> 
> >> - Checking runstate_is_running() skips the cleanup unless the VM is
> >>   running and avoids premature cleanup of the fds, but also has the
> >>   side effect of blocking the legitimate removal of an fd via
> >>   qmp_remove_fd() if the VM happens to be in another state.
> >> 
> >>   This affects qmp_remove_fd() and qmp_query_fdsets() in particular
> >>   because requesting a removal at a bad time (guest stopped) might
> >>   cause an fd to never be removed, or to be removed at a much later
> >>   point in time, causing the query command to continue showing the
> >>   supposedly removed fd/fdset.
> >> 
> >> Note that file descriptors that *have* been duplicated are owned by
> >> the code that uses them and will be removed after qemu_close() is
> >> called. Therefore we've decided that the best course of action to
> >> avoid the undesired side-effects is to stop managing non-duplicated
> >> file descriptors.
> >> 
> >> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
> >> 2- ebe52b592d ("monitor: Prevent removing fd from set during init")
> >> 
> >> Signed-off-by: Fabiano Rosas 
> >> ---
> >>  monitor/fds.c  | 15 ---
> >>  monitor/hmp.c  |  2 --
> >>  monitor/monitor-internal.h |  1 -
> >>  monitor/qmp.c  |  2 --
> >>  4 files changed, 8 insertions(+), 12 deletions(-)
> >> 
> >> diff --git a/monitor/fds.c b/monitor/fds.c
> >> index 101e21720d..f7b98814fa 100644
> >> --- a/monitor/fds.c
> >> +++ b/monitor/fds.c
> >> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, 
> >> Error **errp)
> >>  
> >>  static void monitor_fdset_free(MonFdset *mon_fdset)
> >>  {
> >> +/*
> >> + * Only remove an empty fdset. The fds are owned by the user and
> >> + * should have been removed with qmp_remove_fd(). The dup_fds are
> >> + * owned by QEMU and should have been removed with qemu_close().
> >> + */
> >>  if (QLIST_EMPTY(_fdset->fds) && QLIST_EMPTY(_fdset->dup_fds)) 
> >> {
> >>  QLIST_REMOVE(mon_fdset, next);
> >>  g_free(mon_fdset);
> >> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
> >>  MonFdsetFd *mon_fdset_fd_next;
> >>  
> >>  QLIST_FOREACH_SAFE(mon_fdset_fd, _fdset->fds, next, 
> >> mon_fdset_fd_next) {
> >> -if ((mon_fdset_fd->removed ||
> >> -(QLIST_EMPTY(_fdset->dup_fds) && mon_refcount == 0)) 
> >> &&
> >> -runstate_is_running()) {
> >> +if (mon_fdset_fd->removed) {
> >
> > I don't know the code well, but I like it.
> >
> >>  monitor_fdset_fd_free(mon_fdset_fd);
> >>  }
> >>  }
> >> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void)
> >>  
> >>  QEMU_LOCK_GUARD(_fdsets_lock);
> >>  QLIST_FOREACH_SAFE(mon_fdset, _fdsets, next, mon_fdset_next) {
> >> -monitor_fdset_cleanup(mon_fdset);
> >> +monitor_fdset_free(mon_fdset);
> >>  }
> >>  }
> >>  
> >> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd)
> >>  if (mon_fdset_fd_dup->fd == dup_fd) {
> >>  QLIST_REMOVE(mon_fdset_fd_dup, next);
> >>  g_free(mon_fdset_fd_dup);
> >> -if (QLIST_EMPTY(_fdset->dup_fds)) {
> >> -monitor_fdset_cleanup(mon_fdset);
> >> -}
> >> +monitor_fdset_free(mon_fdset);
> >
> > This and above changes are not crystal clear to me where the _cleanup()
> > does extra check "removed" and clean those fds.
> >
> > I think it'll make it easier for me to understand if this one and the next
> > are squashed together.  But maybe it's simply because I didn't fully
> > understand.
> 
> monitor_fdsets_cleanup() was doing three things previously:
> 
> 1- Remove the removed=true fds. This is weird, but ok.
> 
> 2- Remove fds from an fdset that has an empty dup_fds list, but only if
> the guest is not running and the monitor is closed. This is 

Re: [PATCH v2 2/4] tests/qtest/migration-test: Enable test_mode_reboot

2024-05-31 Thread Fabiano Rosas
Nicholas Piggin  writes:

> Fabiano pointed out this test probably is not flaky, just that it could
> not run under Gitlab CI due to very small shm filesystem size in that
> environment.
>
> Now that it has moved to use /tmp instead of /dev/shm files, enable it.
>
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Fabiano Rosas 



Re: [PATCH v3 1/4] osdep: Make qemu_madvise() to set errno in all cases

2024-05-31 Thread Philippe Mathieu-Daudé

On 31/5/24 17:10, Michal Privoznik wrote:

The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means, we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.

Signed-off-by: Michal Privoznik 
---
  util/osdep.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index e996c4744a..1345238a5c 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -57,7 +57,19 @@ int qemu_madvise(void *addr, size_t len, int advice)
  #if defined(CONFIG_MADVISE)
  return madvise(addr, len, advice);
  #elif defined(CONFIG_POSIX_MADVISE)
-return posix_madvise(addr, len, advice);
+/*
+ * On Darwin posix_madvise() has the same return semantics as
+ * plain madvise, i.e. errno is set and -1 is returned. Otherwise,
+ * a positive error number is returned.
+ */


Alternative is to guard with #ifdef CONFIG_DARWIN ... #else ... #endif
which might be clearer.

Although this approach seems reasonable, so:
Reviewed-by: Philippe Mathieu-Daudé 


+int rc = posix_madvise(addr, len, advice);
+if (rc) {
+if (rc > 0) {
+errno = rc;
+}
+return -1;
+}
+return 0;
  #else
  errno = EINVAL;
  return -1;





Re: [PATCH v2 13/18] monitor: fdset: Match against O_DIRECT

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 23, 2024 at 04:05:43PM -0300, Fabiano Rosas wrote:
>> We're about to enable the use of O_DIRECT in the migration code and
>> due to the alignment restrictions imposed by filesystems we need to
>> make sure the flag is only used when doing aligned IO.
>> 
>> The migration will do parallel IO to different regions of a file, so
>> we need to use more than one file descriptor. Those cannot be obtained
>> by duplicating (dup()) since duplicated file descriptors share the
>> file status flags, including O_DIRECT. If one migration channel does
>> unaligned IO while another sets O_DIRECT to do aligned IO, the
>> filesystem would fail the unaligned operation.
>> 
>> The add-fd QMP command along with the fdset code are specifically
>> designed to allow the user to pass a set of file descriptors with
>> different access flags into QEMU to be later fetched by code that
>> needs to alternate between those flags when doing IO.
>> 
>> Extend the fdset matching to behave the same with the O_DIRECT flag.
>> 
>> Signed-off-by: Fabiano Rosas 
>
> Reviewed-by: Peter Xu 
>
> One thing I am confused but totally irrelevant to this specific change.
>
> I wonder why do we need dupfds at all, and why client needs to remove-fd at
> all.

This answer was lost to time.

>
> It's about what would go wrong if qmp client only add-fd, then if it's
> consumed by anything, it gets moved from "fds" list to "dupfds" list.  The
> thing is I don't expect the client should pass over any fd that will not be
> consumed.  Then if it's always consumed, why bother dup() at all, and why
> bother an explicit remove-fd.

With the lack of documentation, I can only imagine the code was
initially written to be more flexible, but we ended up never needing the
extra flexibility. Maybe we could change that transparently in the
future and deprecate qmp_remove_fd(). I'm thinking, even for the
mapped-ram work, we might not need libvirt to call that function ever.



Re: [PATCH v4 18/31] hw/i386/sev: Add function to get SEV metadata from OVMF header

2024-05-31 Thread Paolo Bonzini
On Fri, May 31, 2024 at 5:20 PM Liam Merwick  wrote:
> > +metadata = (OvmfSevMetadata *)(flash_ptr + flash_size - data->offset);
> > +if (memcmp(metadata->signature, "ASEV", 4) != 0) {
> > +return;
> > +}
> > +
> > +ovmf_sev_metadata_table = g_malloc(metadata->len);
>
> There should be a bounds check on metadata->len before using it.

You mean like:

if (metadata->len <= flash_size - data->offset) {
ovmf_sev_metadata_table = g_memdup2(metadata, metadata->len);
}

?

Paolo




Re: [PATCH v3 4/4] backends/hostmem: Report error when memory size is unaligned

2024-05-31 Thread Philippe Mathieu-Daudé

On 31/5/24 17:10, Michal Privoznik wrote:

If memory-backend-{file,ram} has a size that's not aligned to
underlying page size it is not only wasteful, but also may lead
to hard to debug behaviour. For instance, in case
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, page is the smallest unit they can work with. And
even though an error is reported, the root cause it not very
clear:

   qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': 
Invalid argument

After this commit:

   qemu-system-x86_64: backend 'memory-backend-file' memory size must be 
multiple of 2 MiB


Thanks!

Reviewed-by: Philippe Mathieu-Daudé 


Signed-off-by: Michal Privoznik 
---
  backends/hostmem.c | 10 ++
  1 file changed, 10 insertions(+)





Re: [PATCH v4 02/31] linux-headers: Update to current kvm/next

2024-05-31 Thread Paolo Bonzini
On Fri, May 31, 2024 at 4:38 PM Liam Merwick 
wrote:
> > --- a/linux-headers/asm-x86/kvm.h
> > +++ b/linux-headers/asm-x86/kvm.h
> > @@ -870,5 +919,6 @@ struct kvm_hyperv_eventfd {
> >   #define KVM_X86_SW_PROTECTED_VM 1
> >   #define KVM_X86_SEV_VM  2
> >   #define KVM_X86_SEV_ES_VM   3
> > +#define KVM_X86_SNP_VM   4
>
> I'm not sure which is the best patch, but there needs to be an array entry
> added for vm_type_name[KVM_X86_SNP_VM] in target/i386/kvm/kvm.c

Probably "i386/sev: Add a class method to determine KVM VM type for
SNP guests", which is where KVM_X86_SNP_VM appears in the code.


Paolo




Re: [PATCH v2 11/18] migration/multifd: Add direct-io support

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 23, 2024 at 04:05:41PM -0300, Fabiano Rosas wrote:
>> When multifd is used along with mapped-ram, we can take benefit of a
>> filesystem that supports the O_DIRECT flag and perform direct I/O in
>> the multifd threads. This brings a significant performance improvement
>> because direct-io writes bypass the page cache which would otherwise
>> be thrashed by the multifd data which is unlikely to be needed again
>> in a short period of time.
>> 
>> To be able to use a multifd channel opened with O_DIRECT, we must
>> ensure that a certain aligment is used. Filesystems usually require a
>> block-size alignment for direct I/O. The way to achieve this is by
>> enabling the mapped-ram feature, which already aligns its I/O properly
>> (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c).
>> 
>> By setting O_DIRECT on the multifd channels, all writes to the same
>> file descriptor need to be aligned as well, even the ones that come
>> from outside multifd, such as the QEMUFile I/O from the main migration
>> code. This makes it impossible to use the same file descriptor for the
>> QEMUFile and for the multifd channels. The various flags and metadata
>> written by the main migration code will always be unaligned by virtue
>> of their small size. To workaround this issue, we'll require a second
>> file descriptor to be used exclusively for direct I/O.
>> 
>> The second file descriptor can be obtained by QEMU by re-opening the
>> migration file (already possible), or by being provided by the user or
>> management application (support to be added in future patches).
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  migration/file.c  | 31 ++-
>>  migration/file.h  |  1 -
>>  migration/migration.c | 23 +++
>>  3 files changed, 49 insertions(+), 6 deletions(-)
>> 
>> diff --git a/migration/file.c b/migration/file.c
>> index ba5b5c44ff..ac4d206492 100644
>> --- a/migration/file.c
>> +++ b/migration/file.c
>> @@ -50,12 +50,31 @@ void file_cleanup_outgoing_migration(void)
>>  outgoing_args.fname = NULL;
>>  }
>>  
>> +static void file_enable_direct_io(int *flags)
>> +{
>> +#ifdef O_DIRECT
>> +if (migrate_direct_io()) {
>> +*flags |= O_DIRECT;
>> +}
>> +#else
>> +/* it should have been rejected when setting the parameter */
>> +g_assert_not_reached();
>> +#endif
>> +}
>> +
>>  bool file_send_channel_create(gpointer opaque, Error **errp)
>>  {
>>  QIOChannelFile *ioc;
>>  int flags = O_WRONLY;
>>  bool ret = true;
>>  
>> +/*
>> + * Attempt to enable O_DIRECT for the secondary channels. These
>> + * are used for sending ram pages and writes should be guaranteed
>> + * to be aligned to at least page size.
>> + */
>> +file_enable_direct_io();
>
> Call this only if enabled?  That looks clearer, IMHO:
>
>if (migrate_direct_io()) {
>file_enable_direct_io();
>}

Sure




Re: [PATCH v2 06/18] monitor: Stop removing non-duplicated fds

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote:
>> We've been up until now cleaning up any file descriptors that have
>> been passed into QEMU and never duplicated[1,2]. A file descriptor
>> without duplicates indicates that no part of QEMU has made use of
>> it. This approach is starting to show some cracks now that we're
>> starting to consume fds from the migration code:
>> 
>> - Doing cleanup every time the last monitor connection closes works to
>>   reap unused fds, but also has the side effect of forcing the
>>   management layer to pass the file descriptors again in case of a
>>   disconnect/re-connect, if that happened to be the only monitor
>>   connection.
>> 
>>   Another side effect is that removing an fd with qmp_remove_fd() is
>>   effectively delayed until the last monitor connection closes.
>> 
>>   The reliance on mon_refcount is also problematic because it's racy.
>> 
>> - Checking runstate_is_running() skips the cleanup unless the VM is
>>   running and avoids premature cleanup of the fds, but also has the
>>   side effect of blocking the legitimate removal of an fd via
>>   qmp_remove_fd() if the VM happens to be in another state.
>> 
>>   This affects qmp_remove_fd() and qmp_query_fdsets() in particular
>>   because requesting a removal at a bad time (guest stopped) might
>>   cause an fd to never be removed, or to be removed at a much later
>>   point in time, causing the query command to continue showing the
>>   supposedly removed fd/fdset.
>> 
>> Note that file descriptors that *have* been duplicated are owned by
>> the code that uses them and will be removed after qemu_close() is
>> called. Therefore we've decided that the best course of action to
>> avoid the undesired side-effects is to stop managing non-duplicated
>> file descriptors.
>> 
>> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
>> 2- ebe52b592d ("monitor: Prevent removing fd from set during init")
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  monitor/fds.c  | 15 ---
>>  monitor/hmp.c  |  2 --
>>  monitor/monitor-internal.h |  1 -
>>  monitor/qmp.c  |  2 --
>>  4 files changed, 8 insertions(+), 12 deletions(-)
>> 
>> diff --git a/monitor/fds.c b/monitor/fds.c
>> index 101e21720d..f7b98814fa 100644
>> --- a/monitor/fds.c
>> +++ b/monitor/fds.c
>> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, 
>> Error **errp)
>>  
>>  static void monitor_fdset_free(MonFdset *mon_fdset)
>>  {
>> +/*
>> + * Only remove an empty fdset. The fds are owned by the user and
>> + * should have been removed with qmp_remove_fd(). The dup_fds are
>> + * owned by QEMU and should have been removed with qemu_close().
>> + */
>>  if (QLIST_EMPTY(_fdset->fds) && QLIST_EMPTY(_fdset->dup_fds)) {
>>  QLIST_REMOVE(mon_fdset, next);
>>  g_free(mon_fdset);
>> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
>>  MonFdsetFd *mon_fdset_fd_next;
>>  
>>  QLIST_FOREACH_SAFE(mon_fdset_fd, _fdset->fds, next, 
>> mon_fdset_fd_next) {
>> -if ((mon_fdset_fd->removed ||
>> -(QLIST_EMPTY(_fdset->dup_fds) && mon_refcount == 0)) &&
>> -runstate_is_running()) {
>> +if (mon_fdset_fd->removed) {
>
> I don't know the code well, but I like it.
>
>>  monitor_fdset_fd_free(mon_fdset_fd);
>>  }
>>  }
>> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void)
>>  
>>  QEMU_LOCK_GUARD(_fdsets_lock);
>>  QLIST_FOREACH_SAFE(mon_fdset, _fdsets, next, mon_fdset_next) {
>> -monitor_fdset_cleanup(mon_fdset);
>> +monitor_fdset_free(mon_fdset);
>>  }
>>  }
>>  
>> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd)
>>  if (mon_fdset_fd_dup->fd == dup_fd) {
>>  QLIST_REMOVE(mon_fdset_fd_dup, next);
>>  g_free(mon_fdset_fd_dup);
>> -if (QLIST_EMPTY(_fdset->dup_fds)) {
>> -monitor_fdset_cleanup(mon_fdset);
>> -}
>> +monitor_fdset_free(mon_fdset);
>
> This and above changes are not crystal clear to me where the _cleanup()
> does extra check "removed" and clean those fds.
>
> I think it'll make it easier for me to understand if this one and the next
> are squashed together.  But maybe it's simply because I didn't fully
> understand.

monitor_fdsets_cleanup() was doing three things previously:

1- Remove the removed=true fds. This is weird, but ok.

2- Remove fds from an fdset that has an empty dup_fds list, but only if
the guest is not running and the monitor is closed. This is problematic.

3- Remove/free fdsets that have become empty due to the above
removals. This is ok.

This patch covers (2), because that is the only change that has a
complex reasoning behind it and we need to document that without
conflating it with the rest of the changes.

Since (3) is still a 

Re: Unexpected error in rme_configure_one() at ../target/arm/kvm-rme.c:159

2024-05-31 Thread Ard Biesheuvel
On Fri, 31 May 2024 at 17:09, Jean-Philippe Brucker
 wrote:
>
> Hi Gavin,
>
> On Fri, May 31, 2024 at 04:23:13PM +1000, Gavin Shan wrote:
> > I got a chance to try CCA software components, suggested by [1]. However, 
> > the edk2
> > is stuck somewhere. I didn't reach to stage of loading guest kernel yet. 
> > I'm replying
> > to see if anyone has a idea.
> ...
> > INFO:BL31: Preparing for EL3 exit to normal world
> > INFO:Entry point address = 0x6000
> > INFO:SPSR = 0x3c9
> > UEFI firmware (version  built at 01:31:23 on May 31 2024)
> >
> > The boot is stuck and no more output after that. I tried adding more 
> > verbose output
> > from edk2 and found it's stuck at the following point.
> >
> >
> > ArmVirtPkg/PrePi/PrePi.c::PrePiMain
> > rmVirtPkg/Library/PlatformPeiLib/PlatformPeiLib.c::PlatformPeim
> >
> >  #ifdef MDE_CPU_AARCH64
> >   //
> >   // Set the SMCCC conduit to SMC if executing at EL2, which is typically 
> > the
> >   // exception level that services HVCs rather than the one that invokes 
> > them.
> >   //
> >   if (ArmReadCurrentEL () == AARCH64_EL2) {
> > Status = PcdSetBoolS (PcdMonitorConduitHvc, FALSE);   // The 
> > function is never returned in my case
> > ASSERT_EFI_ERROR (Status);
> >   }
> >  #endif
>
> I'm able to reproduce this even without RME. This code was introduced
> recently by c98f7f755089 ("ArmVirtPkg: Use dynamic PCD to set the SMCCC
> conduit"). Maybe Ard (Cc'd) knows what could be going wrong here.
>
> A slightly reduced reproducer:
>
> $ cd edk2/
> $ build -b DEBUG -a AARCH64 -t GCC5 -p ArmVirtPkg/ArmVirtQemuKernel.dsc
> $ cd ..
>
> $ git clone https://github.com/ARM-software/arm-trusted-firmware.git tf-a
> $ cd tf-a/
> $ make -j CROSS_COMPILE=aarch64-linux-gnu- PLAT=qemu DEBUG=1 LOG_LEVEL=40 
> QEMU_USE_GIC_DRIVER=QEMU_GICV3 
> BL33=../edk2/Build/ArmVirtQemuKernel-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd all 
> fip && \
>   dd if=build/qemu/debug/bl1.bin of=flash.bin && \
>   dd if=build/qemu/debug/fip.bin of=flash.bin seek=64 bs=4096
> $ qemu-system-aarch64 -M virt,virtualization=on,secure=on,gic-version=3 -cpu 
> max -m 2G -smp 8 -monitor none -serial mon:stdio -nographic -bios flash.bin
>

Hmm, this is not something I anticipated.

The problem here is that ArmVirtQemuKernel does not actually support
dynamic PCDs, so instead, the PCD here is 'patchable', which means
that the underlying value is just overwritten in the binary image, and
does not propagate to the rest of the firmware. I assume the write
ends up targettng a location that does not tolerate this.

Running ArmVirtQemu or ArmVirtQemuKernel at EL2 has really only ever
worked by accident, it was simply never intended for that. The fix in
question was a last minute tweak to prevent some CVE fixes pushed by
MicroSoft from breaking network boot entirely, and now that the
release has been made, I guess we should revisit this and fix it
properly.

So the underlying issue here is that on these platforms, we need to
decide at runtime whether to use HVC or SMC instructions for SMCCC
calls. This code attempts to record this into a dynamic PCD once at
boot, in a way that permits other users of the same library to simply
hardcode this in the platform definition (given that bare metal
platforms never need this flexibility).



Re: [PATCH v4 18/31] hw/i386/sev: Add function to get SEV metadata from OVMF header

2024-05-31 Thread Liam Merwick via
On 30/05/2024 12:16, Pankaj Gupta wrote:
> From: Brijesh Singh 
> 
> A recent version of OVMF expanded the reset vector GUID list to add
> SEV-specific metadata GUID. The SEV metadata describes the reserved
> memory regions such as the secrets and CPUID page used during the SEV-SNP
> guest launch.
> 
> The pc_system_get_ovmf_sev_metadata_ptr() is used to retieve the SEV

typo: retieve


> metadata pointer from the OVMF GUID list.
> 
> Signed-off-by: Brijesh Singh 
> Signed-off-by: Michael Roth 
> Signed-off-by: Pankaj Gupta 
> ---
>   hw/i386/pc_sysfw.c   |  4 
>   include/hw/i386/pc.h | 26 ++
>   target/i386/sev.c| 31 +++
>   target/i386/sev.h|  2 ++
>   4 files changed, 63 insertions(+)
> 
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index ac88ad4eb9..048d0919c1 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -260,6 +260,10 @@ void x86_firmware_configure(void *ptr, int size)
>   pc_system_parse_ovmf_flash(ptr, size);
>   
>   if (sev_enabled()) {
> +
> +/* Copy the SEV metadata table (if exist) */

Maybe s/exist/it exists/


> +pc_system_parse_sev_metadata(ptr, size);
> +
>   ret = sev_es_save_reset_vector(ptr, size);
>   if (ret) {
>   error_report("failed to locate and/or save reset vector");
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index ad9c3d9ba8..c653b8eeb2 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -164,6 +164,32 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
> level);
>   #define PCI_HOST_ABOVE_4G_MEM_SIZE "above-4g-mem-size"
>   #define PCI_HOST_PROP_SMM_RANGES   "smm-ranges"
>   
> +typedef enum {
> +SEV_DESC_TYPE_UNDEF,
> +/* The section contains the region that must be validated by the VMM. */
> +SEV_DESC_TYPE_SNP_SEC_MEM,
> +/* The section contains the SNP secrets page */
> +SEV_DESC_TYPE_SNP_SECRETS,
> +/* The section contains address that can be used as a CPUID page */
> +SEV_DESC_TYPE_CPUID,
> +
> +} ovmf_sev_metadata_desc_type;
> +
> +typedef struct __attribute__((__packed__)) OvmfSevMetadataDesc {
> +uint32_t base;
> +uint32_t len;
> +ovmf_sev_metadata_desc_type type;
> +} OvmfSevMetadataDesc;
> +
> +typedef struct __attribute__((__packed__)) OvmfSevMetadata {
> +uint8_t signature[4];
> +uint32_t len;
> +uint32_t version;
> +uint32_t num_desc;
> +OvmfSevMetadataDesc descs[];
> +} OvmfSevMetadata;
> +
> +OvmfSevMetadata *pc_system_get_ovmf_sev_metadata_ptr(void);
>   
>   void pc_pci_as_mapping_init(MemoryRegion *system_memory,
>   MemoryRegion *pci_address_space);
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 2ca9a86bf3..d9d1d97f0c 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -611,6 +611,37 @@ SevCapability *qmp_query_sev_capabilities(Error **errp)
>   return sev_get_capabilities(errp);
>   }
>   
> +static OvmfSevMetadata *ovmf_sev_metadata_table;
> +
> +#define OVMF_SEV_META_DATA_GUID "dc886566-984a-4798-A75e-5585a7bf67cc"
> +typedef struct __attribute__((__packed__)) OvmfSevMetadataOffset {
> +uint32_t offset;
> +} OvmfSevMetadataOffset;
> +
> +OvmfSevMetadata *pc_system_get_ovmf_sev_metadata_ptr(void)
> +{
> +return ovmf_sev_metadata_table;
> +}
> +
> +void pc_system_parse_sev_metadata(uint8_t *flash_ptr, size_t flash_size)
> +{
> +OvmfSevMetadata *metadata;
> +OvmfSevMetadataOffset  *data;
> +
> +if (!pc_system_ovmf_table_find(OVMF_SEV_META_DATA_GUID, (uint8_t 
> **),
> +   NULL)) {
> +return;
> +}
> +
> +metadata = (OvmfSevMetadata *)(flash_ptr + flash_size - data->offset);
> +if (memcmp(metadata->signature, "ASEV", 4) != 0) {
> +return;
> +}
> +
> +ovmf_sev_metadata_table = g_malloc(metadata->len);

There should be a bounds check on metadata->len before using it.


> +memcpy(ovmf_sev_metadata_table, metadata, metadata->len);
> +}
> +
>   static SevAttestationReport *sev_get_attestation_report(const char *mnonce,
>   Error **errp)
>   {
> diff --git a/target/i386/sev.h b/target/i386/sev.h
> index 5dc4767b1e..cc12824dd6 100644
> --- a/target/i386/sev.h
> +++ b/target/i386/sev.h
> @@ -66,4 +66,6 @@ int sev_inject_launch_secret(const char *hdr, const char 
> *secret,
>   int sev_es_save_reset_vector(void *flash_ptr, uint64_t flash_size);
>   void sev_es_set_reset_vector(CPUState *cpu);
>   
> +void pc_system_parse_sev_metadata(uint8_t *flash_ptr, size_t flash_size);
> +
>   #endif




Re: [PATCH] scripts/coverity-scan/COMPONENTS.md: Update paths to match gitlab CI

2024-05-31 Thread Philippe Mathieu-Daudé

Hi Peter,

On 31/5/24 16:21, Peter Maydell wrote:

Since commit 83aa1baa069c we have been running the build for Coverity
Scan as a Gitlab CI job, rather than the old setup where it was run
on a local developer's machine.  This is working well, but the
absolute paths of files are different for the Gitlab CI job, which
means that the regexes we use to identify Coverity components no
longer work. With Gitlab CI builds the file paths are of the form
  /builds/qemu-project/qemu/accel/kvm/kvm-all.c

rather than the old
  /qemu/accel/kvm/kvm-all.c

and our regexes all don't match.

Update all the regexes to start with .*/qemu/ . This will hopefully
avoid the need to change them again in future if the build path
changes again.

This change was made with a search-and-replace of (/qemu)?
to .*/qemu .

Signed-off-by: Peter Maydell 
---
As usual with COMPONENTS.md changes, somebody with Coverity admin
access needs to make all the changes by hand in the GUI once this
has gone through code review :-(

If there are any other changes we want to make to our component
regexes, now would be a great time to suggest them, because this
change is going to involve "delete every existing component and
recreate"...
---
  scripts/coverity-scan/COMPONENTS.md | 104 ++--
  1 file changed, 52 insertions(+), 52 deletions(-)




diff --git a/scripts/coverity-scan/COMPONENTS.md 
b/scripts/coverity-scan/COMPONENTS.md
index 1537e49cd5a..98d4bcd6a50 100644
--- a/scripts/coverity-scan/COMPONENTS.md
+++ b/scripts/coverity-scan/COMPONENTS.md
@@ -1,157 +1,157 @@
  This is the list of currently configured Coverity components:




  block
-  ~ 
(/qemu)?(/block.*|(/include?)/(block|storage-daemon)/.*|(/include)?/hw/(block|ide|nvme)/.*|/qemu-(img|io).*|/util/(aio|async|thread-pool).*)
+  ~ 
.*/qemu(/block.*|(/include?)/(block|storage-daemon)/.*|(/include)?/hw/(block|ide|nvme)/.*|/qemu-(img|io).*|/util/(aio|async|thread-pool).*)


util/block-helpers.[ch]

I'd put hw/block/ to another bucket that the block subsystem.


  char
-  ~ (/qemu)?(/qemu-char\.c|/include/sysemu/char\.h|(/include)?/hw/char/.*)
+  ~ .*/qemu(/qemu-char\.c|/include/sysemu/char\.h|(/include)?/hw/char/.*)


Is 'char' the same as 'chardev'?


  crypto
-  ~ 
(/qemu)?((/include)?/crypto/.*|/hw/.*/.*crypto.*|(/include/sysemu|/backends)/cryptodev.*)
+  ~ 
.*/qemu((/include)?/crypto/.*|/hw/.*/.*crypto.*|(/include/sysemu|/backends)/cryptodev.*)


Maybe worth covering host/include/*/host/crypto/?


  disas
-  ~ (/qemu)?((/include)?/disas.*)
+  ~ .*/qemu((/include)?/disas.*)


Missing:

target/avr/disas.c
target/loongarch/disas.c
target/openrisc/disas.c
target/rx/disas.c


  migration
-  ~ (/qemu)?((/include)?/migration/.*)
+  ~ .*/qemu((/include)?/migration/.*)


Not sure about:

hw/vfio/migration.c


  monitor
-  ~ (/qemu)?(/qapi.*|/qobject/.*|/monitor\..*|/[hq]mp\..*)
+  ~ .*/qemu(/qapi.*|/qobject/.*|/monitor\..*|/[hq]mp\..*)


Apparently the pattern is now foo-[hq]mp-cmds.[ch].

Not matched:
hmp-commands-info.hx
hmp-commands.hx
hw/virtio/virtio-qmp.[ch]
include/qapi/qmp-event.h
job-qmp.c
monitor/qemu-config-qmp.c
python/qemu/qmp/events.py
scripts/python_qmp_updater.py
tests/unit/test-qmp-event.c

Not sure about (not covered in testlibs):

tests/qtest/libqmp.c
tests/qtest/libqmp.h


  trace
-  ~ (/qemu)?(/.*trace.*\.[ch])
+  ~ .*/qemu(/.*trace.*\.[ch])



  user
-  ~ 
(/qemu)?(/linux-user/.*|/bsd-user/.*|/user-exec\.c|/thunk\.c|/include/user/.*)
+  ~ 
.*/qemu(/linux-user/.*|/bsd-user/.*|/user-exec\.c|/thunk\.c|/include/user/.*)



  xen
-  ~ (/qemu)?(.*/xen.*)
+  ~ .*/qemu(.*/xen.*)


We could match .*xen.* like trace (ditto other accelerators).


  hvf
-  ~ (/qemu)?(.*/hvf.*)
+  ~ .*/qemu(.*/hvf.*)
  
  kvm

-  ~ (/qemu)?(.*/kvm.*)
+  ~ .*/qemu(.*/kvm.*)
  
  tcg

-  ~ (/qemu)?(/accel/tcg|/replay|/tcg)/.*
+  ~ .*/qemu(/accel/tcg|/replay|/tcg)/.*
  
  sysemu

-  ~ (/qemu)?(/system/.*|/accel/.*)
+  ~ .*/qemu(/system/.*|/accel/.*)






Re: [PATCH v3 00/27] qemu-img: refersh options and --help handling, cleanups

2024-05-31 Thread Michael Tokarev

A friendly ping?

It took me quite some time and energy for all this.  It'd be sad if
it gets lost.

/mjt

24.04.2024 11:50, Michael Tokarev wrote:

Quite big patchset trying to implement normal, readable qemu-img --help
(and qemu-img COMMAND --help) output with readable descriptions, and
adding many long options in the process.

In the end I stopped using qemu-img-opts.hx in qemu-img.c, perhaps
this can be avoided, with only list of commands and their desrciptions
kept there, but I don't see big advantage here.  The same list should
be included in docs/tools/qemu-img.rst, - this is not done now.

Also each command syntax isn't reflected in the doc for now, because
I want to give good names for options first, - and there, we've quite
some inconsistences and questions.  For example, measure --output=OFMT
-O OFMT, - this is priceless :)  I've no idea why we have this ugly
--output=json thing, why not have --json? ;)  I gave the desired
format long name --target-format to avoid clash with --output.

For rebase, src vs tgt probably should be renamed in local variables
too, and I'm not even sure I've got the caches right. For caches,
the thing is inconsistent across commands.

For compare, I used --a-format/--b-format (for -f/-F), - this can
be made --souce-format and --target-format, to compare source (file1)
with target (file2).

For bitmap, things are scary, I'm not sure what -b SRC_FILENAME
really means, - for now I gave it --source option, but this does
not make it more clear, suggestions welcome.

There are many other inconsistencies, I can't fix them all in one go.

Changes since v2:

  - added Dan's R-Bs
  - refined couple cvtnum conversions
  - dropped "stop printing error twice in a few places"

Michael Tokarev (27):
   qemu-img: measure: convert img_size to signed, simplify handling
   qemu-img: create: convert img_size to signed, simplify handling
   qemu-img: global option processing and error printing
   qemu-img: pass current cmd info into command handlers
   qemu-img: create: refresh options/--help
   qemu-img: factor out parse_output_format() and use it in the code
   qemu-img: check: refresh options/--help
   qemu-img: simplify --repair error message
   qemu-img: commit: refresh options/--help
   qemu-img: compare: refresh options/--help
   qemu-img: convert: refresh options/--help
   qemu-img: info: refresh options/--help
   qemu-img: map: refresh options/--help
   qemu-img: snapshot: allow specifying -f fmt
   qemu-img: snapshot: make -l (list) the default, simplify option
 handling
   qemu-img: snapshot: refresh options/--help
   qemu-img: rebase: refresh options/--help
   qemu-img: resize: do not always eat last argument
   qemu-img: resize: refresh options/--help
   qemu-img: amend: refresh options/--help
   qemu-img: bench: refresh options/--help
   qemu-img: bitmap: refresh options/--help
   qemu-img: dd: refresh options/--help
   qemu-img: measure: refresh options/--help
   qemu-img: implement short --help, remove global help() function
   qemu-img: inline list of supported commands, remove qemu-img-cmds.h
 include
   qemu-img: extend cvtnum() and use it in more places

  docs/tools/qemu-img.rst|4 +-
  qemu-img-cmds.hx   |4 +-
  qemu-img.c | 1311 ++--
  tests/qemu-iotests/049.out |9 +-
  4 files changed, 821 insertions(+), 507 deletions(-)



--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E  9D8B E14E 3F2A 9DD7 9199  28F1 61AD 3D98 
ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5  6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 
8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt




[PATCH v3 0/4] backends/hostmem: Report more errors on failures

2024-05-31 Thread Michal Privoznik
v3 of:

diff to v2:
- In 1/4 I've reworked setting errno because of how posix_madvise()
  behaves on Darwin,
- In 4/4 I'm now using size_to_str() to print the page size, oh, and the
  error message now contains the backend name too.

Michal Privoznik (4):
  osdep: Make qemu_madvise() to set errno in all cases
  osdep: Make qemu_madvise() return ENOSYS on unsupported OSes
  backends/hostmem: Report error on qemu_madvise() failures
  backends/hostmem: Report error when memory size is unaligned

 backends/hostmem.c | 46 ++
 util/osdep.c   | 16 ++--
 2 files changed, 52 insertions(+), 10 deletions(-)

-- 
2.44.1




[PATCH v3 4/4] backends/hostmem: Report error when memory size is unaligned

2024-05-31 Thread Michal Privoznik
If memory-backend-{file,ram} has a size that's not aligned to
underlying page size it is not only wasteful, but also may lead
to hard to debug behaviour. For instance, in case
memory-backend-file and hugepages, madvise() and mbind() fail.
Rightfully so, page is the smallest unit they can work with. And
even though an error is reported, the root cause it not very
clear:

  qemu-system-x86_64: Couldn't set property 'dump' on 'memory-backend-file': 
Invalid argument

After this commit:

  qemu-system-x86_64: backend 'memory-backend-file' memory size must be 
multiple of 2 MiB

Signed-off-by: Michal Privoznik 
---
 backends/hostmem.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 012a8c190f..4d6c69fe4d 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -20,6 +20,7 @@
 #include "qom/object_interfaces.h"
 #include "qemu/mmap-alloc.h"
 #include "qemu/madvise.h"
+#include "qemu/cutils.h"
 #include "hw/qdev-core.h"
 
 #ifdef CONFIG_NUMA
@@ -337,6 +338,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
 void *ptr;
 uint64_t sz;
+size_t pagesize;
 bool async = !phase_check(PHASE_LATE_BACKENDS_CREATED);
 
 if (!bc->alloc) {
@@ -348,6 +350,14 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 
 ptr = memory_region_get_ram_ptr(>mr);
 sz = memory_region_size(>mr);
+pagesize = qemu_ram_pagesize(backend->mr.ram_block);
+
+if (!QEMU_IS_ALIGNED(sz, pagesize)) {
+g_autofree char *pagesize_str = size_to_str(pagesize);
+error_setg(errp, "backend '%s' memory size must be multiple of %s",
+   object_get_typename(OBJECT(uc)), pagesize_str);
+return;
+}
 
 if (backend->merge &&
 qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {
-- 
2.44.1




[PATCH v3 2/4] osdep: Make qemu_madvise() return ENOSYS on unsupported OSes

2024-05-31 Thread Michal Privoznik
Not every OS is capable of madvise() or posix_madvise() even. In
that case, errno should be set to ENOSYS as it reflects the cause
better.

Signed-off-by: Michal Privoznik 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: David Hildenbrand 
---
 util/osdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index 1345238a5c..4a8920ba93 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -71,7 +71,7 @@ int qemu_madvise(void *addr, size_t len, int advice)
 }
 return 0;
 #else
-errno = EINVAL;
+errno = ENOSYS;
 return -1;
 #endif
 }
-- 
2.44.1




[PATCH v3 1/4] osdep: Make qemu_madvise() to set errno in all cases

2024-05-31 Thread Michal Privoznik
The unspoken premise of qemu_madvise() is that errno is set on
error. And it is mostly the case except for posix_madvise() which
is documented to return either zero (on success) or a positive
error number. This means, we must set errno ourselves. And while
at it, make the function return a negative value on error, just
like other error paths do.

Signed-off-by: Michal Privoznik 
---
 util/osdep.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/util/osdep.c b/util/osdep.c
index e996c4744a..1345238a5c 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -57,7 +57,19 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #if defined(CONFIG_MADVISE)
 return madvise(addr, len, advice);
 #elif defined(CONFIG_POSIX_MADVISE)
-return posix_madvise(addr, len, advice);
+/*
+ * On Darwin posix_madvise() has the same return semantics as
+ * plain madvise, i.e. errno is set and -1 is returned. Otherwise,
+ * a positive error number is returned.
+ */
+int rc = posix_madvise(addr, len, advice);
+if (rc) {
+if (rc > 0) {
+errno = rc;
+}
+return -1;
+}
+return 0;
 #else
 errno = EINVAL;
 return -1;
-- 
2.44.1




[PATCH v3 3/4] backends/hostmem: Report error on qemu_madvise() failures

2024-05-31 Thread Michal Privoznik
If user sets .merge or .dump attributes qemu_madvise() is called
with corresponding advice. But it is never checked for failure
which may mislead users into thinking the attribute is set
correctly. Report an appropriate error.

Signed-off-by: Michal Privoznik 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: David Hildenbrand 
---
 backends/hostmem.c | 36 
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index eb9682b4a8..012a8c190f 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -178,8 +178,14 @@ static void host_memory_backend_set_merge(Object *obj, 
bool value, Error **errp)
 void *ptr = memory_region_get_ram_ptr(>mr);
 uint64_t sz = memory_region_size(>mr);
 
-qemu_madvise(ptr, sz,
- value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE);
+if (qemu_madvise(ptr, sz,
+ value ? QEMU_MADV_MERGEABLE : QEMU_MADV_UNMERGEABLE)) 
{
+error_setg_errno(errp, errno,
+ "Couldn't change property 'merge' on '%s'",
+ object_get_typename(obj));
+return;
+}
+
 backend->merge = value;
 }
 }
@@ -204,8 +210,14 @@ static void host_memory_backend_set_dump(Object *obj, bool 
value, Error **errp)
 void *ptr = memory_region_get_ram_ptr(>mr);
 uint64_t sz = memory_region_size(>mr);
 
-qemu_madvise(ptr, sz,
- value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP);
+if (qemu_madvise(ptr, sz,
+ value ? QEMU_MADV_DODUMP : QEMU_MADV_DONTDUMP)) {
+error_setg_errno(errp, errno,
+ "Couldn't change property 'dump' on '%s'",
+ object_get_typename(obj));
+return;
+}
+
 backend->dump = value;
 }
 }
@@ -337,11 +349,19 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 ptr = memory_region_get_ram_ptr(>mr);
 sz = memory_region_size(>mr);
 
-if (backend->merge) {
-qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
+if (backend->merge &&
+qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE)) {
+error_setg_errno(errp, errno,
+ "Couldn't set property 'merge' on '%s'",
+ object_get_typename(OBJECT(uc)));
+return;
 }
-if (!backend->dump) {
-qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);
+if (!backend->dump &&
+qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP)) {
+error_setg_errno(errp, errno,
+ "Couldn't set property 'dump' on '%s'",
+ object_get_typename(OBJECT(uc)));
+return;
 }
 #ifdef CONFIG_NUMA
 unsigned long lastbit = find_last_bit(backend->host_nodes, MAX_NODES);
-- 
2.44.1




Re: Unexpected error in rme_configure_one() at ../target/arm/kvm-rme.c:159

2024-05-31 Thread Jean-Philippe Brucker
Hi Gavin,

On Fri, May 31, 2024 at 04:23:13PM +1000, Gavin Shan wrote:
> I got a chance to try CCA software components, suggested by [1]. However, the 
> edk2
> is stuck somewhere. I didn't reach to stage of loading guest kernel yet. I'm 
> replying
> to see if anyone has a idea.
...
> INFO:BL31: Preparing for EL3 exit to normal world
> INFO:Entry point address = 0x6000
> INFO:SPSR = 0x3c9
> UEFI firmware (version  built at 01:31:23 on May 31 2024)
> 
> The boot is stuck and no more output after that. I tried adding more verbose 
> output
> from edk2 and found it's stuck at the following point.
> 
> 
> ArmVirtPkg/PrePi/PrePi.c::PrePiMain
> rmVirtPkg/Library/PlatformPeiLib/PlatformPeiLib.c::PlatformPeim
> 
>  #ifdef MDE_CPU_AARCH64
>   //
>   // Set the SMCCC conduit to SMC if executing at EL2, which is typically the
>   // exception level that services HVCs rather than the one that invokes them.
>   //
>   if (ArmReadCurrentEL () == AARCH64_EL2) {
> Status = PcdSetBoolS (PcdMonitorConduitHvc, FALSE);   // The function 
> is never returned in my case
> ASSERT_EFI_ERROR (Status);
>   }
>  #endif

I'm able to reproduce this even without RME. This code was introduced
recently by c98f7f755089 ("ArmVirtPkg: Use dynamic PCD to set the SMCCC
conduit"). Maybe Ard (Cc'd) knows what could be going wrong here.

A slightly reduced reproducer:

$ cd edk2/
$ build -b DEBUG -a AARCH64 -t GCC5 -p ArmVirtPkg/ArmVirtQemuKernel.dsc
$ cd ..

$ git clone https://github.com/ARM-software/arm-trusted-firmware.git tf-a
$ cd tf-a/
$ make -j CROSS_COMPILE=aarch64-linux-gnu- PLAT=qemu DEBUG=1 LOG_LEVEL=40 
QEMU_USE_GIC_DRIVER=QEMU_GICV3 
BL33=../edk2/Build/ArmVirtQemuKernel-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd all fip 
&& \
  dd if=build/qemu/debug/bl1.bin of=flash.bin && \
  dd if=build/qemu/debug/fip.bin of=flash.bin seek=64 bs=4096
$ qemu-system-aarch64 -M virt,virtualization=on,secure=on,gic-version=3 -cpu 
max -m 2G -smp 8 -monitor none -serial mon:stdio -nographic -bios flash.bin

Thanks,
Jean



Re: [PATCH v2 05/18] monitor: Introduce monitor_fdset_*free

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 23, 2024 at 04:05:35PM -0300, Fabiano Rosas wrote:
>> Introduce two new functions to remove and free no longer used fds and
>> fdsets.
>> 
>> We need those to decouple the remove/free routines from
>> monitor_fdset_cleanup() which will go away in the next patches.
>> 
>> The new functions:
>> 
>> - monitor_fdset_free() will be used when a monitor connection closes
>>   and when an fd is removed to cleanup any fdset that is now empty.
>> 
>> - monitor_fdset_fd_free() will be used to remove one or more fds that
>>   have been explicitly targeted by qmp_remove_fd().
>> 
>> Signed-off-by: Fabiano Rosas 
>
> Reviewed-by: Peter Xu 
>
> One nitpick below.
>
>> ---
>>  monitor/fds.c | 26 ++
>>  1 file changed, 18 insertions(+), 8 deletions(-)
>> 
>> diff --git a/monitor/fds.c b/monitor/fds.c
>> index fb9f58c056..101e21720d 100644
>> --- a/monitor/fds.c
>> +++ b/monitor/fds.c
>> @@ -167,6 +167,22 @@ int monitor_get_fd(Monitor *mon, const char *fdname, 
>> Error **errp)
>>  return -1;
>>  }
>>  
>> +static void monitor_fdset_free(MonFdset *mon_fdset)
>> +{
>> +if (QLIST_EMPTY(_fdset->fds) && QLIST_EMPTY(_fdset->dup_fds)) {
>> +QLIST_REMOVE(mon_fdset, next);
>> +g_free(mon_fdset);
>> +}
>> +}
>
> Would monitor_fdset_free_if_empty() (or similar) slightly better?

Yes, I'll use that.

>
> static void monitor_fdset_free(MonFdset *mon_fdset)
> {
> QLIST_REMOVE(mon_fdset, next);
> g_free(mon_fdset);
> }
>
> static void monitor_fdset_free_if_empty(MonFdset *mon_fdset)
> {
> if (QLIST_EMPTY(_fdset->fds) && QLIST_EMPTY(_fdset->dup_fds)) {
> monitor_fdset_free(mon_fdset);
> }
> }
>
>> +
>> +static void monitor_fdset_fd_free(MonFdsetFd *mon_fdset_fd)
>> +{
>> +close(mon_fdset_fd->fd);
>> +g_free(mon_fdset_fd->opaque);
>> +QLIST_REMOVE(mon_fdset_fd, next);
>> +g_free(mon_fdset_fd);
>> +}
>> +
>>  static void monitor_fdset_cleanup(MonFdset *mon_fdset)
>>  {
>>  MonFdsetFd *mon_fdset_fd;
>> @@ -176,17 +192,11 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
>>  if ((mon_fdset_fd->removed ||
>>  (QLIST_EMPTY(_fdset->dup_fds) && mon_refcount == 0)) &&
>>  runstate_is_running()) {
>> -close(mon_fdset_fd->fd);
>> -g_free(mon_fdset_fd->opaque);
>> -QLIST_REMOVE(mon_fdset_fd, next);
>> -g_free(mon_fdset_fd);
>> +monitor_fdset_fd_free(mon_fdset_fd);
>>  }
>>  }
>>  
>> -if (QLIST_EMPTY(_fdset->fds) && QLIST_EMPTY(_fdset->dup_fds)) {
>> -QLIST_REMOVE(mon_fdset, next);
>> -g_free(mon_fdset);
>> -}
>> +monitor_fdset_free(mon_fdset);
>>  }
>>  
>>  void monitor_fdsets_cleanup(void)
>> -- 
>> 2.35.3
>> 



Re: [PATCH v2 01/18] migration: Fix file migration with fdset

2024-05-31 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, May 23, 2024 at 04:05:31PM -0300, Fabiano Rosas wrote:
>> When the "file:" migration support was added we missed the special
>> case in the qemu_open_old implementation that allows for a particular
>> file name format to be used to refer to a set of file descriptors that
>> have been previously provided to QEMU via the add-fd QMP command.
>> 
>> When using this fdset feature, we should not truncate the migration
>> file because being given an fd means that the management layer is in
>> control of the file and will likely already have some data written to
>> it. This is further indicated by the presence of the 'offset'
>> argument, which indicates the start of the region where QEMU is
>> allowed to write.
>> 
>> Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
>> call, which will take the offset into consideration.
>> 
>> Fixes: 385f510df5 ("migration: file URI offset")
>> Suggested-by: Daniel P. Berrangé 
>> Signed-off-by: Fabiano Rosas 
>
> Reviewed-by: Peter Xu 

Thanks

>
> Right below the change, did we forget to free the ioc if
> qio_channel_io_seek() failed?
>

Yep, looks like it. I'll fix it.

>> ---
>>  migration/file.c | 11 +--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>> 
>> diff --git a/migration/file.c b/migration/file.c
>> index ab18ba505a..ba5b5c44ff 100644
>> --- a/migration/file.c
>> +++ b/migration/file.c
>> @@ -84,12 +84,19 @@ void file_start_outgoing_migration(MigrationState *s,
>>  
>>  trace_migration_file_outgoing(filename);
>>  
>> -fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
>> - 0600, errp);
>> +fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY, 0600, 
>> errp);
>>  if (!fioc) {
>>  return;
>>  }
>>  
>> +if (ftruncate(fioc->fd, offset)) {
>> +error_setg_errno(errp, errno,
>> + "failed to truncate migration file to offset %" 
>> PRIx64,
>> + offset);
>> +object_unref(OBJECT(fioc));
>> +return;
>> +}
>> +
>>  outgoing_args.fname = g_strdup(filename);
>>  
>>  ioc = QIO_CHANNEL(fioc);
>> -- 
>> 2.35.3
>> 



Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust

2024-05-31 Thread Alex Bennée
Daniel P. Berrangé  writes:

> On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
>> Hi maintainers and list,
>> 
>> This RFC series attempts to re-implement simpletrace.py with Rust, which
>> is the 1st task of Paolo's GSoC 2024 proposal.
>> 
>> There are two motivations for this work:
>> 1. This is an open chance to discuss how to integrate Rust into QEMU.
>
> I don't think this proposal really triggers that discussion to any
> great extent, because 'simpletrace.py' is not a part of the QEMU
> codebase in any meaningul way. It is a standalone python script
> that just happens to live in the qemu.git repository.
>
> The difficult questions around Rust integration will arise when we
> want to have Rust used to /replace/ some existing non-optional
> functionality. IOW, if Rust were to become a mandatory dep of QEMU
> that could not be avoided.

We hope to post some PoC device models in Rust soon with exactly that
debate in mind.

> In fact I kinda wonder whether this Rust simpletrace code could
> simply be its own git repo under gitlab.com/qemu-project/,
> rather than put into the monolithic QEMU repo ? That just makes
> it a "normal" Rust project and no questions around integration
> with QEMU's traditional build system would arise.

Do we export the necessary bits for external projects to use? I don't
think we've had the equivalent of a qemu-devel package yet and doing so
start implying externally visible versioning and deprecation policies
for QEMU APIs and data formats.

TCG plugins have a header based API but currently everyone builds
against their local checkout and we are fairly liberal about bumping the
API versions.


>
>
> With regards,
> Daniel

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PATCH v4 02/31] linux-headers: Update to current kvm/next

2024-05-31 Thread Liam Merwick via
On 30/05/2024 12:16, Pankaj Gupta wrote:
> Co-developed-by: Michael Roth 
> Signed-off-by: Michael Roth 
> Signed-off-by: Pankaj Gupta 
> ---
>   linux-headers/asm-loongarch/bitsperlong.h | 23 ++
>   linux-headers/asm-loongarch/kvm.h |  4 ++
>   linux-headers/asm-loongarch/mman.h|  9 
>   linux-headers/asm-riscv/kvm.h |  1 +
>   linux-headers/asm-riscv/mman.h| 36 +++-
>   linux-headers/asm-s390/mman.h | 36 +++-
>   linux-headers/asm-x86/kvm.h   | 52 ++-
>   linux-headers/linux/vhost.h   | 15 ---
>   8 files changed, 166 insertions(+), 10 deletions(-)
>

> --- a/linux-headers/asm-x86/kvm.h
> +++ b/linux-headers/asm-x86/kvm.h

> @@ -870,5 +919,6 @@ struct kvm_hyperv_eventfd {
>   #define KVM_X86_SW_PROTECTED_VM 1
>   #define KVM_X86_SEV_VM  2
>   #define KVM_X86_SEV_ES_VM   3
> +#define KVM_X86_SNP_VM   4

I'm not sure which is the best patch, but there needs to be an array entry
added for vm_type_name[KVM_X86_SNP_VM] in target/i386/kvm/kvm.c

Regards,
Liam




Re: [PATCH v2 2/2] iotests: test NBD+TLS+iothread

2024-05-31 Thread Kevin Wolf
Am 18.05.2024 um 04:50 hat Eric Blake geschrieben:
> Prevent regressions when using NBD with TLS in the presence of
> iothreads, adding coverage the fix to qio channels made in the
> previous patch.
> 
> CC: qemu-sta...@nongnu.org
> Signed-off-by: Eric Blake 

> diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread.out 
> b/tests/qemu-iotests/tests/nbd-tls-iothread.out
> new file mode 100644
> index 000..b5e12dcbe7a
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/nbd-tls-iothread.out
> @@ -0,0 +1,54 @@
> +QA output created by nbd-tls-iothread
> +
> +== preparing TLS creds and spare port ==
> +picked unused port
> +Generating a self signed certificate...
> +Generating a signed certificate...
> +Generating a signed certificate...
> +
> +== preparing image ==
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
> +Formatting 
> '/home/eblake/qemu/build/tests/qemu-iotests/scratch/qcow2-file-nbd-tls-iothread/dst.qcow2',
>  fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib 
> size=1073741824 lazy_refcounts=off refcount_bits=16

/home/eblake is suboptimal in reference output. :-)

Kevin




Re: [PATCH v2 0/6] Rework x86 page table walks

2024-05-31 Thread Peter Maydell
On Fri, 31 May 2024 at 14:48, Peter Maydell  wrote:
>
> On Fri, 24 May 2024 at 18:08, Don Porter  wrote:
> >
> > This version of the 'info pg' command adopts Peter Maydell's request
> > to write some guest-agnostic page table iterator and accessor code,
> > along with architecture-specific hooks.  The first patch in this
> > series contributes a generic page table iterator and an x86
> > instantiation.  As a client, we first introduce an 'info pg' monitor
> > command, as well as a compressing callback hook for creating succinct
> > page table representations.
> >
> > After this, each successive patch replaces an exisitng x86 page table
> > walker with a use of common iterator code.
> >
> > I could use advice on how to ensure this is sufficiently well tested.
> > I used 'make check' and 'make check-avocado', which both pass; what is
> > the typical standard for testing something like a page table related
> > change?
> >
> > As far as generality, I have only attempted this on x86, but I expect
> > the design would work for any similar radix-tree style page table.
> >
> > I am still new enough to the code base that I wasn't certain about
> > where to put the generic code, as well as naming conventions.
> >
> > Per David Gilbert's suggestion, I was careful to ensure that monitor
> > calls do not perturb TLB state (see the read-only flag in some
> > functions).
> >
> > I appreciate Nadav's suggestion about other ways to pursue the same
> > goal: I ended up deciding I would like to try my hand at consolidating
> > the x86 page table code.
> >
> > Don Porter (6):
> >   Add an "info pg" command that prints the current page tables
> >   Convert 'info tlb' to use generic iterator
> >   Convert 'info mem' to use generic iterator
> >   Convert x86_cpu_get_memory_mapping() to use generic iterators
> >   Move tcg implementation of x86 get_physical_address into common helper
> > code.
> >   Convert x86_mmu_translate() to use common code.
>
> Thanks for doing this work -- I like the diffstats for patches 2 and
> 3 a lot :-)
>
> I have been digging out from under a big backlog of unreviewed
> patches, but in an ideal world I might get to this next week.
> Hopefully the x86 maintainers will have a look at it too.

PS: forgot to mention, but if you eventually send a v2 of this
series, could you send it as its own fresh email, not as
a reply to an existing email like this series? Patches and
patchsets that are replies to existing threads tend to get
overlooked, and our automated patch tooling doesn't spot them
either.

thanks
-- PMM



[PATCH] scripts/coverity-scan/COMPONENTS.md: Update paths to match gitlab CI

2024-05-31 Thread Peter Maydell
Since commit 83aa1baa069c we have been running the build for Coverity
Scan as a Gitlab CI job, rather than the old setup where it was run
on a local developer's machine.  This is working well, but the
absolute paths of files are different for the Gitlab CI job, which
means that the regexes we use to identify Coverity components no
longer work. With Gitlab CI builds the file paths are of the form
 /builds/qemu-project/qemu/accel/kvm/kvm-all.c

rather than the old
 /qemu/accel/kvm/kvm-all.c

and our regexes all don't match.

Update all the regexes to start with .*/qemu/ . This will hopefully
avoid the need to change them again in future if the build path
changes again.

This change was made with a search-and-replace of (/qemu)?
to .*/qemu .

Signed-off-by: Peter Maydell 
---
As usual with COMPONENTS.md changes, somebody with Coverity admin
access needs to make all the changes by hand in the GUI once this
has gone through code review :-(

If there are any other changes we want to make to our component
regexes, now would be a great time to suggest them, because this
change is going to involve "delete every existing component and
recreate"...
---
 scripts/coverity-scan/COMPONENTS.md | 104 ++--
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/scripts/coverity-scan/COMPONENTS.md 
b/scripts/coverity-scan/COMPONENTS.md
index 1537e49cd5a..98d4bcd6a50 100644
--- a/scripts/coverity-scan/COMPONENTS.md
+++ b/scripts/coverity-scan/COMPONENTS.md
@@ -1,157 +1,157 @@
 This is the list of currently configured Coverity components:
 
 alpha
-  ~ (/qemu)?((/include)?/hw/alpha/.*|/target/alpha/.*)
+  ~ .*/qemu((/include)?/hw/alpha/.*|/target/alpha/.*)
 
 arm
-  ~ 
(/qemu)?((/include)?/hw/arm/.*|(/include)?/hw/.*/(arm|allwinner-a10|bcm28|digic|exynos|imx|omap|stellaris|pxa2xx|versatile|zynq|cadence).*|/hw/net/xgmac.c|/hw/ssi/xilinx_spips.c|/target/arm/.*)
+  ~ 
.*/qemu((/include)?/hw/arm/.*|(/include)?/hw/.*/(arm|allwinner-a10|bcm28|digic|exynos|imx|omap|stellaris|pxa2xx|versatile|zynq|cadence).*|/hw/net/xgmac.c|/hw/ssi/xilinx_spips.c|/target/arm/.*)
 
 avr
-  ~ (/qemu)?((/include)?/hw/avr/.*|/target/avr/.*)
+  ~ .*/qemu((/include)?/hw/avr/.*|/target/avr/.*)
 
 cris
-  ~ (/qemu)?((/include)?/hw/cris/.*|/target/cris/.*)
+  ~ .*/qemu((/include)?/hw/cris/.*|/target/cris/.*)
 
 hexagon-gen (component should be ignored in analysis)
-  ~ (/qemu)?(/target/hexagon/.*generated.*)
+  ~ .*/qemu(/target/hexagon/.*generated.*)
 
 hexagon
-  ~ (/qemu)?(/target/hexagon/.*)
+  ~ .*/qemu(/target/hexagon/.*)
 
 hppa
-  ~ (/qemu)?((/include)?/hw/hppa/.*|/target/hppa/.*)
+  ~ .*/qemu((/include)?/hw/hppa/.*|/target/hppa/.*)
 
 i386
-  ~ (/qemu)?((/include)?/hw/i386/.*|/target/i386/.*|/hw/intc/[^/]*apic[^/]*\.c)
+  ~ .*/qemu((/include)?/hw/i386/.*|/target/i386/.*|/hw/intc/[^/]*apic[^/]*\.c)
 
 loongarch
-  ~ (/qemu)?((/include)?/hw/(loongarch/.*|.*/loongarch.*)|/target/loongarch/.*)
+  ~ .*/qemu((/include)?/hw/(loongarch/.*|.*/loongarch.*)|/target/loongarch/.*)
 
 m68k
-  ~ 
(/qemu)?((/include)?/hw/m68k/.*|/target/m68k/.*|(/include)?/hw(/.*)?/mcf.*|(/include)?/hw/nubus/.*)
+  ~ 
.*/qemu((/include)?/hw/m68k/.*|/target/m68k/.*|(/include)?/hw(/.*)?/mcf.*|(/include)?/hw/nubus/.*)
 
 microblaze
-  ~ (/qemu)?((/include)?/hw/microblaze/.*|/target/microblaze/.*)
+  ~ .*/qemu((/include)?/hw/microblaze/.*|/target/microblaze/.*)
 
 mips
-  ~ (/qemu)?((/include)?/hw/mips/.*|/target/mips/.*)
+  ~ .*/qemu((/include)?/hw/mips/.*|/target/mips/.*)
 
 openrisc
-  ~ (/qemu)?((/include)?/hw/openrisc/.*|/target/openrisc/.*)
+  ~ .*/qemu((/include)?/hw/openrisc/.*|/target/openrisc/.*)
 
 ppc
-  ~ 
(/qemu)?((/include)?/hw/ppc/.*|/target/ppc/.*|/hw/pci-host/(uninorth.*|dec.*|prep.*|ppc.*)|/hw/misc/macio/.*|(/include)?/hw/.*/(xics|openpic|spapr).*)
+  ~ 
.*/qemu((/include)?/hw/ppc/.*|/target/ppc/.*|/hw/pci-host/(uninorth.*|dec.*|prep.*|ppc.*)|/hw/misc/macio/.*|(/include)?/hw/.*/(xics|openpic|spapr).*)
 
 riscv
-  ~ 
(/qemu)?((/include)?/hw/riscv/.*|/target/riscv/.*|/hw/.*/(riscv_|ibex_|sifive_).*)
+  ~ 
.*/qemu((/include)?/hw/riscv/.*|/target/riscv/.*|/hw/.*/(riscv_|ibex_|sifive_).*)
 
 rx
-  ~ (/qemu)?((/include)?/hw/rx/.*|/target/rx/.*)
+  ~ .*/qemu((/include)?/hw/rx/.*|/target/rx/.*)
 
 s390
-  ~ (/qemu)?((/include)?/hw/s390x/.*|/target/s390x/.*|/hw/.*/s390_.*)
+  ~ .*/qemu((/include)?/hw/s390x/.*|/target/s390x/.*|/hw/.*/s390_.*)
 
 sh4
-  ~ (/qemu)?((/include)?/hw/sh4/.*|/target/sh4/.*)
+  ~ .*/qemu((/include)?/hw/sh4/.*|/target/sh4/.*)
 
 sparc
-  ~ 
(/qemu)?((/include)?/hw/sparc(64)?.*|/target/sparc/.*|/hw/.*/grlib.*|/hw/display/cg3.c)
+  ~ 
.*/qemu((/include)?/hw/sparc(64)?.*|/target/sparc/.*|/hw/.*/grlib.*|/hw/display/cg3.c)
 
 tricore
-  ~ (/qemu)?((/include)?/hw/tricore/.*|/target/tricore/.*)
+  ~ .*/qemu((/include)?/hw/tricore/.*|/target/tricore/.*)
 
 xtensa
-  ~ (/qemu)?((/include)?/hw/xtensa/.*|/target/xtensa/.*)
+  ~ .*/qemu((/include)?/hw/xtensa/.*|/target/xtensa/.*)
 
 9pfs
-  ~ (/qemu)?(/hw/9pfs/.*|/fsdev/.*)
+  ~ 

Re: [PATCH v2 2/6] Convert 'info tlb' to use generic iterator

2024-05-31 Thread Dr. David Alan Gilbert
* Don Porter (por...@cs.unc.edu) wrote:
> Signed-off-by: Don Porter 

If this changes the output of 'info tlb' could you add a before/after
to the commit message please.

Also, have a look at glib's g_printf and friends, you might find they're
easier;
https://www.manpagez.com/html/glib/glib-2.52.3/glib-String-Utility-Functions.php#g-printf

Dave

> ---
>  target/i386/monitor.c | 203 ++
>  1 file changed, 28 insertions(+), 175 deletions(-)
> 
> diff --git a/target/i386/monitor.c b/target/i386/monitor.c
> index d7aae99c73..adf95edfb4 100644
> --- a/target/i386/monitor.c
> +++ b/target/i386/monitor.c
> @@ -430,201 +430,54 @@ void hmp_info_pg(Monitor *mon, const QDict *qdict)
>  }
>  
>  static void print_pte(Monitor *mon, CPUArchState *env, hwaddr addr,
> -  hwaddr pte, hwaddr mask)
> +  hwaddr pte)
>  {
> -addr = addr_canonical(env, addr);
> -
> -monitor_printf(mon, HWADDR_FMT_plx ": " HWADDR_FMT_plx
> -   " %c%c%c%c%c%c%c%c%c\n",
> -   addr,
> -   pte & mask,
> -   pte & PG_NX_MASK ? 'X' : '-',
> -   pte & PG_GLOBAL_MASK ? 'G' : '-',
> -   pte & PG_PSE_MASK ? 'P' : '-',
> -   pte & PG_DIRTY_MASK ? 'D' : '-',
> -   pte & PG_ACCESSED_MASK ? 'A' : '-',
> -   pte & PG_PCD_MASK ? 'C' : '-',
> -   pte & PG_PWT_MASK ? 'T' : '-',
> -   pte & PG_USER_MASK ? 'U' : '-',
> -   pte & PG_RW_MASK ? 'W' : '-');
> -}
> +char buf[128];
> +char *pos = buf;
>  
> -static void tlb_info_32(Monitor *mon, CPUArchState *env)
> -{
> -unsigned int l1, l2;
> -uint32_t pgd, pde, pte;
> +addr = addr_canonical(env, addr);
>  
> -pgd = env->cr[3] & ~0xfff;
> -for(l1 = 0; l1 < 1024; l1++) {
> -cpu_physical_memory_read(pgd + l1 * 4, , 4);
> -pde = le32_to_cpu(pde);
> -if (pde & PG_PRESENT_MASK) {
> -if ((pde & PG_PSE_MASK) && (env->cr[4] & CR4_PSE_MASK)) {
> -/* 4M pages */
> -print_pte(mon, env, (l1 << 22), pde, ~((1 << 21) - 1));
> -} else {
> -for(l2 = 0; l2 < 1024; l2++) {
> -cpu_physical_memory_read((pde & ~0xfff) + l2 * 4, , 
> 4);
> -pte = le32_to_cpu(pte);
> -if (pte & PG_PRESENT_MASK) {
> -print_pte(mon, env, (l1 << 22) + (l2 << 12),
> -  pte & ~PG_PSE_MASK,
> -  ~0xfff);
> -}
> -}
> -}
> -}
> -}
> -}
> +pos += sprintf(pos, HWADDR_FMT_plx ": " HWADDR_FMT_plx " ", addr,
> +   (hwaddr) (pte & PG_ADDRESS_MASK));
>  
> -static void tlb_info_pae32(Monitor *mon, CPUArchState *env)
> -{
> -unsigned int l1, l2, l3;
> -uint64_t pdpe, pde, pte;
> -uint64_t pdp_addr, pd_addr, pt_addr;
> +pos += sprintf(pos, " %s", pg_bits(pte));
>  
> -pdp_addr = env->cr[3] & ~0x1f;
> -for (l1 = 0; l1 < 4; l1++) {
> -cpu_physical_memory_read(pdp_addr + l1 * 8, , 8);
> -pdpe = le64_to_cpu(pdpe);
> -if (pdpe & PG_PRESENT_MASK) {
> -pd_addr = pdpe & 0x3f000ULL;
> -for (l2 = 0; l2 < 512; l2++) {
> -cpu_physical_memory_read(pd_addr + l2 * 8, , 8);
> -pde = le64_to_cpu(pde);
> -if (pde & PG_PRESENT_MASK) {
> -if (pde & PG_PSE_MASK) {
> -/* 2M pages with PAE, CR4.PSE is ignored */
> -print_pte(mon, env, (l1 << 30) + (l2 << 21), pde,
> -  ~((hwaddr)(1 << 20) - 1));
> -} else {
> -pt_addr = pde & 0x3f000ULL;
> -for (l3 = 0; l3 < 512; l3++) {
> -cpu_physical_memory_read(pt_addr + l3 * 8, , 
> 8);
> -pte = le64_to_cpu(pte);
> -if (pte & PG_PRESENT_MASK) {
> -print_pte(mon, env, (l1 << 30) + (l2 << 21)
> -  + (l3 << 12),
> -  pte & ~PG_PSE_MASK,
> -  ~(hwaddr)0xfff);
> -}
> -}
> -}
> -}
> -}
> -}
> +/* Trim line to fit screen */
> +if (pos - buf > 79) {
> +strcpy(buf + 77, "..");
>  }
> -}
>  
> -#ifdef TARGET_X86_64
> -static void tlb_info_la48(Monitor *mon, CPUArchState *env,
> -uint64_t l0, uint64_t pml4_addr)
> -{
> -uint64_t l1, l2, l3, l4;
> -uint64_t pml4e, pdpe, pde, pte;
> -uint64_t pdp_addr, pd_addr, pt_addr;
> -
> -for (l1 = 0; l1 < 512; l1++) {
> -

Re: Unexpected error in rme_configure_one() at ../target/arm/kvm-rme.c:159

2024-05-31 Thread Itaru Kitayama
Hi Jean,

> On May 31, 2024, at 19:21, Jean-Philippe Brucker  
> wrote:
> 
> Hi Itaru,
> 
> On Fri, May 31, 2024 at 10:57:13AM +0100, Peter Maydell wrote:
>> On Fri, 31 May 2024 at 05:20, Itaru Kitayama  
>> wrote:
>>> 
>>> 
>>> 
 On May 30, 2024, at 22:30, Philippe Mathieu-Daudé  
 wrote:
 
 Cc'ing more developers
 
 On 30/5/24 06:30, Itaru Kitayama wrote:
> Hi,
> When I see a Realm VM creation fails with:
> Unexpected error in rme_configure_one() at ../target/arm/kvm-rme.c:159:
> qemu-system-aarch64: RME: failed to configure SVE: Invalid argument
> test.sh: line 8:  2502 Aborted qemu-system-aarch64 -M 
> 'virt,acpi=off,gic-version=3' -cpu host -enable-kvm -smp 2 -m 512M 
> -overcommit 'mem-lock=on' -M 'confidential-guest-support=rme0' -object 
> 'rme-guest,id=rme0,measurement-algo=sha512,num-pmu-counters=6,sve-vector-length=256'
>  -kernel Image -initrd rootfs.cpio -append 'earycon console=ttyAMA0 
> rdinit=/sbin/init' -nographic -net none
> do I need to suspect first the VMM, QEMU, or the Image? The kernel is 
> built with LLVM, does it matter?
> Thanks,
> Itaru.
 
>>> 
>>> I’m testing Jean’s repo at:
>>> 
>>> https://git.codelinaro.org/linaro/dcap/qemu/-/tree/cca/v2?ref_type=heads
> 
> Thanks again for testing, you can report issues by replying directly to
> my posting, so I can get to them quicker. If you want I can Cc you on the
> next one. The latest is:
> 
> [PATCH v2 00/22] arm: Run CCA VMs with KVM
> https://lore.kernel.org/qemu-devel/20240419155709.318866-2-jean-phili...@linaro.org/

Thanks! I wasn’t aware of it The good news is that after whole day of try and 
error attempts I was able to
bring up a Realm VM on FVP. Here’s my version of overlay yaml, cca-v2.yaml:

build:
  linux:
repo:
  revision: cca-full/v2

#  kvmtool:
#repo:
#  revision: cca/v2

  rmm:
repo:
  revision: main



  tfa:
repo:
  revision: master

  kvm-unit-tests:
repo:
  revision: cca/v2

… and the QEMU options are below:

qemu-system-aarch64 -M 'virt,acpi=off,gic-version=3' \
-cpu host -enable-kvm -smp 2 -m 512M -overcommit 'mem-lock=on' \
-M 'confidential-guest-support=rme0' \
-object 
'rme-guest,id=rme0,measurement-algo=sha512,num-pmu-counters=6,sve-vector-length=256'
 \
-kernel Image -initrd rootfs.cpio \
-append 'earycon console=ttyAMA0 rdinit=/sbin/init' -nographic -net none

Thanks,
Itaru.

> 
> That does sound like the KVM host doesn't support SVE, but the QEMU VMM
> version is also too old: in the latest series 'sve-vector-length' was
> removed and we use the existing -cpu parameters to configure SVE. Please
> make sure that the QEMU branch is cca/v2 to match the Linux KVM branch,
> because the older QEMU patches doesn't work with the newest KVM patches.
> You'll need to update the command-line as well, because paramaters have
> changed for cca/v2.
> 
> This may be the case of older build directories that aren't properly
> synchronized. They can be removed manually but the quicker way is usually
> to remove all source and build directories and start anew.
> 
> Thanks,
> Jean
> 
> 
>> 
>> OK, we should cc Jean-Philippe then.
>> 
>> I'm wondering if this is as simple as "RME via KVM doesn't support SVE yet",
>> perhaps.
>> 
>> thanks
>> -- PMM





Re: [PATCH v2 1/6] Add an "info pg" command that prints the current page tables

2024-05-31 Thread Dr. David Alan Gilbert
* Don Porter (por...@cs.unc.edu) wrote:
> The new "info pg" monitor command prints the current page table,
> including virtual address ranges, flag bits, and snippets of physical
> page numbers.  Completely filled regions of the page table with
> compatible flags are "folded", with the result that the complete
> output for a freshly booted x86-64 Linux VM can fit in a single
> terminal window.  The output looks like this:
> 
> VPN range Entry FlagsPhysical page
> [7f000-7f000] PML4[0fe] ---DA--UWP
>   [7f28c-7f28f]  PDP[0a3] ---DA--UWP
> [7f28c4600-7f28c47ff]  PDE[023] ---DA--UWP
>   [7f28c4655-7f28c4656]  PTE[055-056] X--D---U-P 007f14-007f15
>   [7f28c465b-7f28c465b]  PTE[05b] A--U-P 001cfc
> ...
> [ff800-ff800] PML4[1ff] ---DA--UWP
>   [8-b]  PDP[1fe] ---DA---WP
> [81000-81dff]  PDE[008-00e] -GSDA---WP 001000-001dff
>   [c-f]  PDP[1ff] ---DA--UWP
> [ff400-ff5ff]  PDE[1fa] ---DA--UWP
>   [ff5fb-ff5fc]  PTE[1fb-1fc] XG-DACT-WP 0fec00 0fee00
> [ff600-ff7ff]  PDE[1fb] ---DA--UWP
>   [ff600-ff600]  PTE[000] -G-DA--U-P 001467
> 
> This draws heavy inspiration from Austin Clements' original patch.
> 
> This also adds a generic page table walker, which other monitor
> and execution commands will be migrated to in subsequent patches.
> 
> Signed-off-by: Don Porter 
> ---
>  hmp-commands-info.hx  |  26 ++
>  include/monitor/hmp-target.h  |   1 +
>  target/i386/arch_memory_mapping.c | 486 +-
>  target/i386/cpu.h |  16 +
>  target/i386/monitor.c | 380 +++
>  5 files changed, 908 insertions(+), 1 deletion(-)
> 
> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> index 20a9835ea8..918b82015c 100644
> --- a/hmp-commands-info.hx
> +++ b/hmp-commands-info.hx
> @@ -237,6 +237,32 @@ ERST
>  .cmd= hmp_info_mtree,
>  },
>  
> +#if defined(TARGET_I386)
> +{
> +.name   = "pg",
> +.args_type  = "",
> +.params = "",
> +.help   = "show the page table",
> +.cmd= hmp_info_pg,
> +},
> +#endif

So that looks OK

> +
> +SRST 
>   |
> +  ``info pg``
>   |
> +Show the active page table.  
>   |
> +ERST

Are those |'s over ->in the actual patch?

> +{
> +.name   = "mtree",
> +.args_type  = "flatview:-f,dispatch_tree:-d,owner:-o,disabled:-D",
> +.params = "[-f][-d][-o][-D]",
> +.help   = "show memory tree (-f: dump flat view for address 
> spaces;"
> +  "-d: dump dispatch tree, valid with -f only);"
> +  "-o: dump region owners/parents;"
> +  "-D: dump disabled regions",
> +.cmd= hmp_info_mtree,
> +},

Hmm this looks like a copy-pasteism ?

Dave

>  SRST
>``info mtree``
>  Show memory tree.
> diff --git a/include/monitor/hmp-target.h b/include/monitor/hmp-target.h
> index b679aaebbf..9af72ea58d 100644
> --- a/include/monitor/hmp-target.h
> +++ b/include/monitor/hmp-target.h
> @@ -50,6 +50,7 @@ CPUState *mon_get_cpu(Monitor *mon);
>  void hmp_info_mem(Monitor *mon, const QDict *qdict);
>  void hmp_info_tlb(Monitor *mon, const QDict *qdict);
>  void hmp_mce(Monitor *mon, const QDict *qdict);
> +void hmp_info_pg(Monitor *mon, const QDict *qdict);
>  void hmp_info_local_apic(Monitor *mon, const QDict *qdict);
>  void hmp_info_sev(Monitor *mon, const QDict *qdict);
>  void hmp_info_sgx(Monitor *mon, const QDict *qdict);
> diff --git a/target/i386/arch_memory_mapping.c 
> b/target/i386/arch_memory_mapping.c
> index d1ff659128..00bf2a2116 100644
> --- a/target/i386/arch_memory_mapping.c
> +++ b/target/i386/arch_memory_mapping.c
> @@ -15,6 +15,491 @@
>  #include "cpu.h"
>  #include "sysemu/memory_mapping.h"
>  
> +/**
> + ** code hook implementations for x86 ***
> + */
> +
> +#define PML4_ADDR_MASK 0xff000ULL /* selects bits 51:12 */
> +
> +/**
> + * mmu_page_table_root - Given a CPUState, return the physical address
> + *   of the current page table root, as well as
> + *   write the height of the tree into *height.
> + *
> + * @cs - CPU state
> + * @height - a pointer to an integer, to store the page table tree height
> + *
> + * Returns a hardware address on success.  Should not fail (i.e., caller is
> + * responsible to ensure that a page table is actually present).
> + */
> +static hwaddr mmu_page_table_root(CPUState *cs, int *height)
> +{
> +X86CPU *cpu = X86_CPU(cs);
> +CPUX86State *env = >env;
> +/*
> + 

Re: [PATCH] hw/dma/xlnx_dpdma: Read descriptor into buffer, not into pointer-to-buffer

2024-05-31 Thread Edgar E. Iglesias
On Fri, May 31, 2024 at 2:46 PM Peter Maydell 
wrote:

> In fdf029762f501 we factored out the handling of reading and writing
> DMA descriptors from guest memory.  Unfortunately we accidentally
> made the descriptor-read read the descriptor into the address of the
> buffer rather than into the buffer, because we didn't notice we
> needed to update the arguments to the dma_memory_read() call. Before
> the refactoring, "" is the address of a local struct DPDMADescriptor
> variable in xlnx_dpdma_start_operation(), which is the correct target
> for the guest-memory-read. But after the refactoring 'desc' is the
> "DPDMADescriptor *desc" argument to the new function, and so it is
> already an address.
>
> This bug is an overrun of a stack variable, since a pointer is at
> most 8 bytes long and we try to read 64 bytes, as well as being
> incorrect behaviour.
>
> Pass 'desc' rather than '' as the dma_memory_read() argument
> to fix this.
>
> (The same bug is not present in xlnx_dpdma_write_descriptor(),
> because there we are writing the descriptor from a local struct
> variable "DPDMADescriptor tmp_desc" and so passing _desc to
> dma_memory_write() is correct.)
>
> Spotted by Coverity: CID 1546649
>
>
+ CC Fred.

Reviewed-by: Edgar E. Iglesias 




> Fixes: fdf029762f50101 ("xlnx_dpdma: fix descriptor endianness bug")
> Signed-off-by: Peter Maydell 
> ---
>  hw/dma/xlnx_dpdma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c
> index dde4aeca401..a685bd28bb8 100644
> --- a/hw/dma/xlnx_dpdma.c
> +++ b/hw/dma/xlnx_dpdma.c
> @@ -619,7 +619,7 @@ static MemTxResult
> xlnx_dpdma_read_descriptor(XlnxDPDMAState *s,
>DPDMADescriptor *desc)
>  {
>  MemTxResult res = dma_memory_read(_space_memory, desc_addr,
> -  , sizeof(DPDMADescriptor),
> +  desc, sizeof(DPDMADescriptor),
>MEMTXATTRS_UNSPECIFIED);
>  if (res) {
>  return res;
> --
> 2.34.1
>
>


Re: [PATCH] hw/usb/hcd-ohci: Fix ohci_service_td: accept valid TDs

2024-05-31 Thread Peter Maydell
On Tue, 21 May 2024 at 00:26, David Hubbard  wrote:
>
> From: Cord Amfmgm 
>
> This changes the way the ohci emulation handles a Transfer Descriptor with
> "Current Buffer Pointer" set to "Buffer End" + 1.
>
> The OHCI spec 4.3.1.2 Table 4-2 allows td.cbp to be one byte more than td.be
> to signal the buffer has zero length. Currently qemu only accepts zero-length
> Transfer Descriptors if the td.cbp is equal to 0, while actual OHCI hardware
> accepts both cases.
>
> The qemu ohci emulation has a regression in ohci_service_td. Version 4.2
> and earlier matched the spec. (I haven't taken the time to bisect exactly
> where the logic was changed.)
>
> With a tiny OS[1] that boots and executes a test, the issue can be seen:
>
> * OS that sends USB requests to a USB mass storage device
>   but sends td.cbp = td.be + 1
> * qemu 4.2
> * qemu HEAD (4e66a0854)
> * Actual OHCI controller (hardware)
>
> Command line:
> qemu-system-x86_64 -m 20 \
>  -device pci-ohci,id=ohci \
>  -drive if=none,format=raw,id=d,file=testmbr.raw \
>  -device usb-storage,bus=ohci.0,drive=d \
>  --trace "usb_*" --trace "ohci_*" -D qemu.log
>
> Results are:
>
>  qemu 4.2   | qemu HEAD  | actual HW
> ++
>  works fine | ohci_die() | works fine
>
> Tip: if the flags "-serial pty -serial stdio" are added to the command line
> the test will output USB requests like this:
>
> Testing qemu HEAD:
>
> > Free mem 2M ohci port2 conn FS
> > setup { 80 6 0 1 0 0 8 0 }
> > ED info=8 { mps=8 en=0 d=0 } tail=c20920
> >   td0 c20880 nxt=c20960 f200 setup cbp=c20900 be=c20907
> >   td1 c20960 nxt=c20980 f314in cbp=c20908 be=c2090f
> >   td2 c20980 nxt=c20920 f308   out cbp=c20910 be=c2090f ohci20 host err
> > usb stopped
>
> And in qemu.log:
>
> usb_ohci_iso_td_bad_cc_overrun ISO_TD start_offset=0x00c20910 > 
> next_offset=0x00c2090f
>
> Testing qemu 4.2:
>
> > Free mem 2M ohci port2 conn FS
> > setup { 80 6 0 1 0 0 8 0 }
> > ED info=8 { mps=8 en=0 d=0 } tail=620920
> >   td0 620880 nxt=620960 f200 setup cbp=620900 be=620907   cbp=0 
> > be=620907
> >   td1 620960 nxt=620980 f314in cbp=620908 be=62090f   cbp=0 
> > be=62090f
> >   td2 620980 nxt=620920 f308   out cbp=620910 be=62090f   cbp=0 
> > be=62090f
> >rx { 12 1 0 2 0 0 0 8 }
> > setup { 0 5 1 0 0 0 0 0 } tx {}
> > ED info=8 { mps=8 en=0 d=0 } tail=620880
> >   td0 620920 nxt=620960 f200 setup cbp=620900 be=620907   cbp=0 
> > be=620907
> >   td1 620960 nxt=620880 f310in cbp=620908 be=620907   cbp=0 
> > be=620907
> > setup { 80 6 0 1 0 0 12 0 }
> > ED info=80001 { mps=8 en=0 d=1 } tail=620960
> >   td0 620880 nxt=6209c0 f200 setup cbp=620920 be=620927   cbp=0 
> > be=620927
> >   td1 6209c0 nxt=6209e0 f314in cbp=620928 be=620939   cbp=0 
> > be=620939
> >   td2 6209e0 nxt=620960 f308   out cbp=62093a be=620939   cbp=0 
> > be=620939
> >rx { 12 1 0 2 0 0 0 8 f4 46 1 0 0 0 1 2 3 1 }
> > setup { 80 6 0 2 0 0 0 1 }
> > ED info=80001 { mps=8 en=0 d=1 } tail=620880
> >   td0 620960 nxt=6209a0 f200 setup cbp=620a20 be=620a27   cbp=0 
> > be=620a27
> >   td1 6209a0 nxt=6209c0 f3140004in cbp=620a28 be=620b27   
> > cbp=620a48 be=620b27
> >   td2 6209c0 nxt=620880 f308   out cbp=620b28 be=620b27   cbp=0 
> > be=620b27
> >rx { 9 2 20 0 1 1 4 c0 0 9 4 0 0 2 8 6 50 0 7 5 81 2 40 0 0 7 5 2 2 40 0 
> > 0 }
> > setup { 0 9 1 0 0 0 0 0 } tx {}
> > ED info=80001 { mps=8 en=0 d=1 } tail=620900
> >   td0 620880 nxt=620940 f200 setup cbp=620a00 be=620a07   cbp=0 
> > be=620a07
> >   td1 620940 nxt=620900 f310in cbp=620a08 be=620a07   cbp=0 
> > be=620a07
>
> [1] The OS disk image has been emailed to phi...@linaro.org, m...@tls.msk.ru,
> and kra...@redhat.com:
>
> * testCbpOffBy1.img.xz
> * sha256: f87baddcb86de845de12f002c698670a426affb40946025cc32694f9daa3abed
>
> Signed-off-by: Cord Amfmgm 
> ---
>  hw/usb/hcd-ohci.c   | 4 ++--
>  hw/usb/trace-events | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> index acd6016980..71b54914d3 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -941,8 +941,8 @@ static int ohci_service_td(OHCIState *ohci, struct 
> ohci_ed *ed)
>  if ((td.cbp & 0xf000) != (td.be & 0xf000)) {
>  len = (td.be & 0xfff) + 0x1001 - (td.cbp & 0xfff);
>  } else {
> -if (td.cbp > td.be) {
> -trace_usb_ohci_iso_td_bad_cc_overrun(td.cbp, td.be);
> +if (td.cbp - 1 > td.be) {  /* rely on td.cbp != 0 */
> +trace_usb_ohci_td_bad_buf(td.cbp, td.be);
>  ohci_die(ohci);
>  return 1;
>  }

This patch seems to me to do what the commit message sets out to
do, and it looks unlikely to have any unintended side effects
because it turns a "USB controller flags an error" case into
a "treat as zero length packet" case, and I 

Re: [PATCH 0/4] hw/s390x: Alias @dump-skeys -> @dump-s390-skey and deprecate

2024-05-31 Thread Dr. David Alan Gilbert
* Thomas Huth (th...@redhat.com) wrote:
> On 30/05/2024 09.45, Philippe Mathieu-Daudé wrote:
> > We are trying to unify all qemu-system-FOO to a single binary.
> > In order to do that we need to remove QAPI target specific code.
> > 
> > @dump-skeys is only available on qemu-system-s390x. This series
> > rename it as @dump-s390-skey, making it available on other
> > binaries. We take care of backward compatibility via deprecation.
> > 
> > Philippe Mathieu-Daudé (4):
> >hw/s390x: Introduce the @dump-s390-skeys QMP command
> >hw/s390x: Introduce the 'dump_s390_skeys' HMP command
> >hw/s390x: Deprecate the HMP 'dump_skeys' command
> >hw/s390x: Deprecate the QMP @dump-skeys command
> 
> Why do we have to rename the command? Just for the sake of it? I think
> renaming HMP commands is maybe ok, but breaking the API in QMP is something
> you should consider twice.
> 
> And even if we decide to rename ... maybe we should discuss whether it makes
> sense to come up with a generic command instead: As far as I know, ARM also
> has something similar, called MTE. Maybe we also want to dump MTE keys one
> day? So the new command should maybe be called "dump-memory-keys" instead?

I think there are at least two different concepts; but I agree it would be
nice to keep a single command for matching concepts across different 
architectures;
I can't say I know the details of any, but:

  a) Page table things - I think x86 PKRU/PKEY (???) is a page table thing
where pages marked a special way are associated with keys.
That sounds similar to what the skeys are???

  b) Upper bit things - where you steal a few bits from the virtual address
and then use that to associate some security; I think that's closer
to what MTE is isn't it?

I'm not sure the two fit in the same command.

Dave

> Or should it maybe rather be an option to the existing "dump-guest-memory"
> command instead?
> 
>  Thomas
> 
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/



Re: [PATCH 2/5] cpu: move Qemu[Thread|Cond] setup into common code

2024-05-31 Thread Reinoud Zandijk
On Thu, May 30, 2024 at 03:29:41PM -0700, Pierrick Bouvier wrote:
> On 5/30/24 12:42, Alex Bennée wrote:
> > Aside from the round robin threads this is all common code. By
> > moving the halt_cond setup we also no longer need hacks to work around
> > the race between QOM object creation and thread creation.
> > 
> > It is a little ugly to free stuff up for the round robin thread but
> > better it deal with its own specialises than making the other
> > accelerators jump through hoops.
> > 
> > Signed-off-by: Alex Bennée 
...
> > diff --git a/target/i386/nvmm/nvmm-accel-ops.c 
> > b/target/i386/nvmm/nvmm-accel-ops.c
> > index 6b2bfd9b9c..0ba31201e2 100644
> > --- a/target/i386/nvmm/nvmm-accel-ops.c
> > +++ b/target/i386/nvmm/nvmm-accel-ops.c
> > @@ -64,9 +64,6 @@ static void nvmm_start_vcpu_thread(CPUState *cpu)
> >   {
> >   char thread_name[VCPU_THREAD_NAME_SIZE];
> > -cpu->thread = g_new0(QemuThread, 1);
> > -cpu->halt_cond = g_new0(QemuCond, 1);
> > -qemu_cond_init(cpu->halt_cond);
> >   snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/NVMM",
> >cpu->cpu_index);
> >   qemu_thread_create(cpu->thread, thread_name, qemu_nvmm_cpu_thread_fn,

I haven't tested it since I don't have a recent qemu build but I doubt it will
give issues as its main qemu stuff.

Reinoud




Re: [PATCH v2 0/6] Rework x86 page table walks

2024-05-31 Thread Peter Maydell
On Fri, 24 May 2024 at 18:08, Don Porter  wrote:
>
> This version of the 'info pg' command adopts Peter Maydell's request
> to write some guest-agnostic page table iterator and accessor code,
> along with architecture-specific hooks.  The first patch in this
> series contributes a generic page table iterator and an x86
> instantiation.  As a client, we first introduce an 'info pg' monitor
> command, as well as a compressing callback hook for creating succinct
> page table representations.
>
> After this, each successive patch replaces an exisitng x86 page table
> walker with a use of common iterator code.
>
> I could use advice on how to ensure this is sufficiently well tested.
> I used 'make check' and 'make check-avocado', which both pass; what is
> the typical standard for testing something like a page table related
> change?
>
> As far as generality, I have only attempted this on x86, but I expect
> the design would work for any similar radix-tree style page table.
>
> I am still new enough to the code base that I wasn't certain about
> where to put the generic code, as well as naming conventions.
>
> Per David Gilbert's suggestion, I was careful to ensure that monitor
> calls do not perturb TLB state (see the read-only flag in some
> functions).
>
> I appreciate Nadav's suggestion about other ways to pursue the same
> goal: I ended up deciding I would like to try my hand at consolidating
> the x86 page table code.
>
> Don Porter (6):
>   Add an "info pg" command that prints the current page tables
>   Convert 'info tlb' to use generic iterator
>   Convert 'info mem' to use generic iterator
>   Convert x86_cpu_get_memory_mapping() to use generic iterators
>   Move tcg implementation of x86 get_physical_address into common helper
> code.
>   Convert x86_mmu_translate() to use common code.

Thanks for doing this work -- I like the diffstats for patches 2 and
3 a lot :-)

I have been digging out from under a big backlog of unreviewed
patches, but in an ideal world I might get to this next week.
Hopefully the x86 maintainers will have a look at it too.

thanks
-- PMM



Re: [PATCH] target/arm: fix MPIDR value for ARM CPUs with SMT

2024-05-31 Thread Dorjoy Chowdhury
Hi Peter,

On Fri, May 31, 2024, 6:53 PM Peter Maydell 
wrote:

> On Fri, 3 May 2024 at 17:53, Dorjoy Chowdhury 
> wrote:
> >
> > On Fri, May 3, 2024 at 10:28 PM Peter Maydell 
> wrote:
> > >
> > > On Fri, 19 Apr 2024 at 19:31, Dorjoy Chowdhury 
> wrote:
> > > >
> > > > Some ARM CPUs advertise themselves as SMT by having the MT[24] bit
> set
> > > > to 1 in the MPIDR register. These CPUs have the thread id in
> Aff0[7:0]
> > > > bits, CPU id in Aff1[15:8] bits and cluster id in Aff2[23:16] bits in
> > > > MPIDR.
> > > >
> > > > On the other hand, ARM CPUs without SMT have the MT[24] bit set to 0,
> > > > CPU id in Aff0[7:0] bits and cluster id in Aff1[15:8] bits in MPIDR.
> > > >
> > > > The mpidr_read_val() function always reported non-SMT i.e., MT=0
> style
> > > > MPIDR value which means it was wrong for the following CPUs with SMT
> > > > supported by QEMU:
> > > > - cortex-a55
> > > > - cortex-a76
> > > > - cortex-a710
> > > > - neoverse-v1
> > > > - neoverse-n1
> > > > - neoverse-n2
> > >
> > > This has definitely turned out to be rather more complicated
> > > than I thought it would be when I wrote up the original issue
> > > in gitlab, so sorry about that.
> > >
> > > I still need to think through how we should deal with the
> > > interaction between what the CPU type implies about the MPIDR
> > > format and the topology information provided by the user.
> > > I probably won't get to that next week, because I'm on holiday
> > > for most of it, but I will see if I can at least make a start.
> > >
> >
> > No problem at all. Just let me know when you get to it. I will see if
> > I can fix it or ask if I need help then. Please enjoy your holidays.
>
> Hi -- this is a note to say that I haven't forgotten about this
> patch and the related issues, but I still haven't been able to
> make the time to think through how it ought to work yet :-(
>

No problem at all. Thanks for letting me know.

Regards,
Dorjoy

>


Re: [PATCH v2 4/4] tests/qtest/migration-test: Add a postcopy memfile test

2024-05-31 Thread Peter Xu
On Thu, May 30, 2024 at 07:54:07PM +1000, Nicholas Piggin wrote:
> Postcopy requires userfaultfd support, which requires tmpfs if a memory
> file is used.
> 
> This adds back support for /dev/shm memory files, but adds preallocation
> to skip environments where that mount is limited in size.
> 
> Signed-off-by: Nicholas Piggin 

Thanks for doing this regardless.

> ---
>  tests/qtest/migration-test.c | 77 
>  1 file changed, 69 insertions(+), 8 deletions(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 86eace354e..5078033ded 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/cutils.h"
>  
>  #include "libqtest.h"
>  #include "qapi/qmp/qdict.h"
> @@ -553,6 +554,7 @@ typedef struct {
>   */
>  bool hide_stderr;
>  bool use_memfile;
> +bool use_shm_memfile;

Nitpick: when having both, it's slightly confusing on the name, e.g. not
clear whether use_memfile needs to be set to true too if use_shm_memfile=true.

Maybe "use_tmpfs_memfile" and "use_shm_memfile"?

>  /* only launch the target process */
>  bool only_target;
>  /* Use dirty ring if true; dirty logging otherwise */
> @@ -739,7 +741,62 @@ static int test_migrate_start(QTestState **from, 
> QTestState **to,
>  ignore_stderr = "";
>  }
>  
> -if (args->use_memfile) {
> +if (!qtest_has_machine(machine_alias)) {
> +g_autofree char *msg = g_strdup_printf("machine %s not supported",
> +   machine_alias);
> +g_test_skip(msg);
> +return -1;
> +}
> +
> +if (args->use_shm_memfile) {
> +#if defined(__NR_userfaultfd) && defined(__linux__)

IIUC we only need defined(__linux__) as there's nothing to do with uffd yet?

> +int fd;
> +uint64_t size;
> +
> +if (getenv("GITLAB_CI")) {
> +/*
> + * Gitlab runners are limited to 64MB shm size and despite
> + * pre-allocation there is concern that concurrent tests
> + * could result in nondeterministic failures. Until all shm
> + * usage in all CI tests is found to fail gracefully on
> + * ENOSPC, it is safer to avoid large allocations for now.
> + *
> + * https://lore.kernel.org/qemu-devel/875xuwg4mx@suse.de/
> + */
> +g_test_skip("shm tests are not supported in Gitlab CI 
> environment");
> +return -1;
> +}

I'm not sure whether this is Fabiano's intention.  I'm wondering whether we
can drop this and just let it still run there.

Other tests not detecting avaiablility of shmem looks like a separate issue
to be fixed to me, regardless of this.

My wild guess is since we're doing memory_size+64K below then if test won't
fail others won't either, as normally the shmem quota should normally be
power of 2 anyway.. then it should always fit another few MBs if this one.
While this test is ready to fail gracefully now.

> +
> +if (!g_file_test("/dev/shm", G_FILE_TEST_IS_DIR)) {
> +g_test_skip("/dev/shm does not exist or is not a directory");
> +return -1;
> +}
> +
> +/*
> + * Pre-create and allocate the file here, because /dev/shm/
> + * is known to be limited in size in some places (e.g., Gitlab CI).
> + */
> +memfile_path = g_strdup_printf("/dev/shm/qemu-%d", getpid());
> +fd = open(memfile_path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | 
> S_IWUSR);
> +if (fd == -1) {
> +g_test_skip("/dev/shm file could not be created");
> +return -1;
> +}
> +
> +g_assert(qemu_strtosz(memory_size, NULL, ) == 0);
> +size += 64*1024; /* QEMU may map a bit more memory for a guard page 
> */
> +
> +if (fallocate(fd, 0, 0, size) == -1) {
> +unlink(memfile_path);
> +perror("could not alloc"); exit(1);
> +g_test_skip("Could not allocate machine memory in /dev/shm");
> +return -1;
> +}
> +close(fd);
> +#else
> +g_test_skip("userfaultfd is not supported");

"/dev/shm not available" instead?

> +#endif
> +} else if (args->use_memfile) {
>  memfile_path = g_strdup_printf("/%s/qemu-%d", tmpfs, getpid());
>  memfile_opts = g_strdup_printf(
>  "-object memory-backend-file,id=mem0,size=%s"
> @@ -751,12 +808,6 @@ static int test_migrate_start(QTestState **from, 
> QTestState **to,
>  kvm_opts = ",dirty-ring-size=4096";
>  }
>  
> -if (!qtest_has_machine(machine_alias)) {
> -g_autofree char *msg = g_strdup_printf("machine %s not supported", 
> machine_alias);
> -g_test_skip(msg);
> -return -1;
> -}
> -
>  machine = resolve_machine_version(machine_alias, QEMU_ENV_SRC,
> 

Re: [PATCH v4 0/3] hw/riscv/virt: pflash improvements

2024-05-31 Thread Sunil V L
On Thu, May 30, 2024 at 04:37:56AM -0700, Andrea Bolognani wrote:
> On Mon, Nov 20, 2023 at 08:06:19PM GMT, Sunil V L wrote:
> > On Mon, Nov 20, 2023 at 02:29:28PM +, Andrea Bolognani wrote:
> > > On Fri, May 26, 2023 at 11:10:12AM +0200, Andrew Jones wrote:
> > > > > > > So, are edk2 users the only ones who would (temporarily) need to
> > > > > > > manually turn ACPI off if virt-manager started enabling it by
> > > > > > > default?
> > > > > >
> > > > > > I assume so, but I'm not tracking firmware status. If the firmware
> > > > > > doesn't extract the ACPI tables from QEMU and present them to the
> > > > > > guest (afaik only edk2 does that), then the guest kernel falls back
> > > > > > to DT, which is why it's working for you.
> > > > > >
> > > > > > I suppose we should wait until Linux merges the ACPI patches, before
> > > > > > adding RISC-V to the libvirt capabilities ACPI list.
> > > > >
> > > > > That sounds reasonable to me, but note that 1) the libvirt change
> > > > > might take a while to propagate to distros and 2) someone will have
> > > > > to remind me to prepare such a patch when the time comes ;)
> > > >
> > > > Initial ACPI support will probably be merged for 6.4. So maybe it is
> > > > time to get the libvirt side of things going.
> > >
> > > Randomly remembered about this. Did ACPI support make it into 6.4
> > > after all? Is now a good time to change libvirt?
> >
> > Hi Andrea,
> >
> > Not yet. While basic ACPI changes are merged, the interrupt controller
> > support is still going on. Looks like it will take few merge windows to
> > get ACPI fully supported. So, we still need to wait for libvirt change.
> 
> Hey,
> 
> I've been working on making RISC-V support a bit smoother across the
> virtualization stack recently, and I just so happened to remember
> that this topic was still pending.
> 
> I've tried manually switching ACPI on for an existing Fedora RISC-V
> guest running under TCG and booting via UEFI, which promptly made it
> stop working, so I assume the necessary bits haven't made it into the
> kernel yet.
> 
> Is anyone actually tracking that work? We've been waiting for it to
> land for a fairly long time at this point...
> 
Hi Andrea,

It is still WIP. DT patches (which was a dependency for ACPI)  for AIA
interrupt controllers got merged recently and available in 6.10-rc1. So,
I hope ACPI patches will be merged within next couple of merge windows
provided maintainers of different subsystems will have sufficient time
to review.

Thanks,
Sunil



Re: [PATCH] physmem: allow debug writes to MMIO regions

2024-05-31 Thread Peter Maydell
On Mon, 20 May 2024 at 19:48, Perry Hung  wrote:
>
> Philippe, Peter,
>
> Thank you for the comments. I am not even sure what the semantics of
> putting a breakpoint or watchpoint
> on device regions are supposed to be. I would imagine it is
> architecture-specific as to whether this is even allowed.
>
> It appears for example, that armv8-a allows watchpoints to be set on any
> type of memory. armv7-a prohibits
> watchpoints on Device or Strongly-ordered memory that might be accessed
> by instructions multiple times
> (e.g LDM and LDC instructions).
>
> What is the current behavior for QEMU and what should
> breakpoints/watchpoints do when placed on IO memory?

Personally I don't think it matters very much, because the
user shouldn't really be doing something silly like that
in the first place. If they do, they get to deal with
whatever problems result.

My feeling is that the neater place to put the handling of
memory regions that aren't RAM/ROM is probably in
address_space_write_rom_internal(). But doing that makes me
nervous, because that's in the file-loading path and I
bet there are dubious guest images out there that put
data in some area covered by a device and currently rely
on it being ignored when the image is loaded...

thanks
-- PMM



Re: [PATCH 1/6] host/i386: nothing looks at CPUINFO_SSE4

2024-05-31 Thread Philippe Mathieu-Daudé

On 31/5/24 11:14, Paolo Bonzini wrote:

The only user was the SSE4.1 variant of buffer_is_zero, which has
been removed; code to compute CPUINFO_SSE4 is dead.

Signed-off-by: Paolo Bonzini 
---
  host/include/i386/host/cpuinfo.h | 1 -
  util/cpuinfo-i386.c  | 1 -
  2 files changed, 2 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] hw/dma/xlnx_dpdma: Read descriptor into buffer, not into pointer-to-buffer

2024-05-31 Thread Philippe Mathieu-Daudé

On 31/5/24 14:46, Peter Maydell wrote:

In fdf029762f501 we factored out the handling of reading and writing
DMA descriptors from guest memory.  Unfortunately we accidentally
made the descriptor-read read the descriptor into the address of the
buffer rather than into the buffer, because we didn't notice we
needed to update the arguments to the dma_memory_read() call. Before
the refactoring, "" is the address of a local struct DPDMADescriptor
variable in xlnx_dpdma_start_operation(), which is the correct target
for the guest-memory-read. But after the refactoring 'desc' is the
"DPDMADescriptor *desc" argument to the new function, and so it is
already an address.

This bug is an overrun of a stack variable, since a pointer is at
most 8 bytes long and we try to read 64 bytes, as well as being
incorrect behaviour.

Pass 'desc' rather than '' as the dma_memory_read() argument
to fix this.

(The same bug is not present in xlnx_dpdma_write_descriptor(),
because there we are writing the descriptor from a local struct
variable "DPDMADescriptor tmp_desc" and so passing _desc to
dma_memory_write() is correct.)

Spotted by Coverity: CID 1546649

Fixes: fdf029762f50101 ("xlnx_dpdma: fix descriptor endianness bug")
Signed-off-by: Peter Maydell 
---
  hw/dma/xlnx_dpdma.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] target/arm: fix MPIDR value for ARM CPUs with SMT

2024-05-31 Thread Peter Maydell
On Fri, 3 May 2024 at 17:53, Dorjoy Chowdhury  wrote:
>
> On Fri, May 3, 2024 at 10:28 PM Peter Maydell  
> wrote:
> >
> > On Fri, 19 Apr 2024 at 19:31, Dorjoy Chowdhury  
> > wrote:
> > >
> > > Some ARM CPUs advertise themselves as SMT by having the MT[24] bit set
> > > to 1 in the MPIDR register. These CPUs have the thread id in Aff0[7:0]
> > > bits, CPU id in Aff1[15:8] bits and cluster id in Aff2[23:16] bits in
> > > MPIDR.
> > >
> > > On the other hand, ARM CPUs without SMT have the MT[24] bit set to 0,
> > > CPU id in Aff0[7:0] bits and cluster id in Aff1[15:8] bits in MPIDR.
> > >
> > > The mpidr_read_val() function always reported non-SMT i.e., MT=0 style
> > > MPIDR value which means it was wrong for the following CPUs with SMT
> > > supported by QEMU:
> > > - cortex-a55
> > > - cortex-a76
> > > - cortex-a710
> > > - neoverse-v1
> > > - neoverse-n1
> > > - neoverse-n2
> >
> > This has definitely turned out to be rather more complicated
> > than I thought it would be when I wrote up the original issue
> > in gitlab, so sorry about that.
> >
> > I still need to think through how we should deal with the
> > interaction between what the CPU type implies about the MPIDR
> > format and the topology information provided by the user.
> > I probably won't get to that next week, because I'm on holiday
> > for most of it, but I will see if I can at least make a start.
> >
>
> No problem at all. Just let me know when you get to it. I will see if
> I can fix it or ask if I need help then. Please enjoy your holidays.

Hi -- this is a note to say that I haven't forgotten about this
patch and the related issues, but I still haven't been able to
make the time to think through how it ought to work yet :-(

-- PMM



  1   2   3   >