date:20210804

Re: [PATCH v4 02/14] doc: update AMD SEV to include Live migration flow

2021-08-04 Thread Dov Murik




On 04/08/2021 14:53, Ashish Kalra wrote:
> From: Brijesh Singh 
> 
> Signed-off-by: Brijesh Singh 
> Signed-off-by: Ashish Kalra 
> ---
>  docs/amd-memory-encryption.txt | 46 +-
>  1 file changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/amd-memory-encryption.txt b/docs/amd-memory-encryption.txt
> index 12ca25180e..0d9184532a 100644
> --- a/docs/amd-memory-encryption.txt
> +++ b/docs/amd-memory-encryption.txt
> @@ -126,7 +126,51 @@ TODO
> 
>  Live Migration
>  
> -TODO
> +AMD SEV encrypts the memory of VMs and because a different key is used
> +in each VM, the hypervisor will be unable to simply copy the
> +ciphertext from one VM to another to migrate the VM. Instead the AMD SEV Key
> +Management API provides sets of function which the hypervisor can use
> +to package a guest page for migration, while maintaining the confidentiality
> +provided by AMD SEV.
> +
> +SEV guest VMs have the concept of private and shared memory. The private
> +memory is encrypted with the guest-specific key, while shared memory may
> +be encrypted with the hypervisor key. The migration APIs provided by the
> +SEV API spec should be used for migrating the private pages.
> +
> +The KVM_HC_MAP_GPA_RANGE hypercall is used by the SEV guest to notify a
> +change in the page encryption status to the hypervisor. The hypercall
> +is invoked when the encryption attribute is changed from encrypted -> 
> decrypted
> +and vice versa. By default all guest pages are considered encrypted.
> +
> +This hypercall exits to qemu via KVM_EXIT_HYPERCALL to manage the guest
> +shared regions and integrate with the qemu's migration code. The shared
> +region list can be used to check if the given guest page is private or 
> shared.
> +
> +Before initiating the migration, we need to know the targets machine's public

s/targets/target/

> +Diffie-Hellman key (PDH) and certificate chain. It can be retrieved
> +with the 'query-sev-capabilities' QMP command or using the sev-tool. The
> +migrate-set-parameter can be used to pass the target machine's PDH and
> +certificate chain.

It's better to clarify that you use query-sev-capabilities QMP command
on the *target* VM (to get its PDF and cert) when it's ready, and then
use migrate-set-parameter QMP command on the *source* so it can prepare
the migration for that specific target.


> +
> +During the migration flow, the SEND_START is called on the source hypervisor
> +to create an outgoing encryption context. The SEV guest policy dictates 
> whether
> +the certificate passed through the migrate-sev-set-info command will be

Here you say migrate-sev-set-info but above you said
migrate-set-parameter.  Which one is it?


-Dov

> +validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> +After migration is completed, SEND_FINISH is called to destroy the encryption
> +context and make the VM non-runnable to protect it against cloning.
> +
> +On the target machine, RECEIVE_START is called first to create an
> +incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> +the received encrypted page into guest memory. After migration has
> +completed, RECEIVE_FINISH is called to make the VM runnable.
> +
> +For more information about the migration see SEV API Appendix A
> +Usage flow (Live migration section).
> +
> +NOTE:
> +To protect against the memory clone SEV APIs are designed to make the VM
> +unrunnable in case of the migration failure.
> 
>  References
>  -
>

Re: [RFC PATCH] vhost-vdpa: Do not send empty IOTLB update batches

2021-08-04 Thread Jason Wang

On Wed, Aug 4, 2021 at 10:44 PM Eugenio Pérez  wrote:
>
> With the introduction of the batch hinting, meaningless batches can be
> created with no IOTLB updates if the memory region was skipped by
> vhost_vdpa_listener_skipped_section. This is the case of host notifiers
> memory regions, but others could fall on this category. This caused the
> vdpa device to receive dma mapping settings with no changes, a possibly
> expensive operation for nothing.
>
> To avoid that, VHOST_IOTLB_BATCH_BEGIN hint is delayed until we have a
> meaningful (not skipped section) mapping or unmapping operation, and
> VHOST_IOTLB_BATCH_END is not written unless at least one of _UPDATE /
> _INVALIDATE has been issued.

Hi Eugeni:

This should work but it looks to me it's better to optimize the kernel.

E.g to make sure we don't send set_map() if there is no real updating
between batch start and end.

This makes sure that it can work for all kinds of userspace (not only for Qemu).

Another optimization is to batch the memory region transaction by adding:

memory_region_transaction_begin() and memory_region_transaction_end() in

both vhost_vdpa_host_notifiers_init() and vhost_vdpa_host_notifiers_uninit().

This can make sure only at least one memory transactions when
adding/removing host notifier regions.

Thanks

>
> Signed-off-by: Eugenio Pérez 
> ---
>  include/hw/virtio/vhost-vdpa.h |  1 +
>  hw/virtio/vhost-vdpa.c | 38 +++---
>  2 files changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index e98e327f12..0a7edbe4eb 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -23,6 +23,7 @@ typedef struct vhost_vdpa {
>  int device_fd;
>  int index;
>  uint32_t msg_type;
> +size_t n_iotlb_sent_batch;
>  MemoryListener listener;
>  struct vhost_dev *dev;
>  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 6ce94a1f4d..2517fc6103 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -89,19 +89,13 @@ static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, 
> hwaddr iova,
>  return ret;
>  }
>
> -static void vhost_vdpa_listener_begin(MemoryListener *listener)
> +static void vhost_vdpa_listener_begin_batch(struct vhost_vdpa *v)
>  {
> -struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, 
> listener);
> -struct vhost_dev *dev = v->dev;
> -struct vhost_msg_v2 msg = {};
>  int fd = v->device_fd;
> -
> -if (!(dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH))) {
> -return;
> -}
> -
> -msg.type = v->msg_type;
> -msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
> +struct vhost_msg_v2 msg = {
> +.type = v->msg_type,
> +.iotlb.type = VHOST_IOTLB_BATCH_BEGIN,
> +};
>
>  if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
>  error_report("failed to write, fd=%d, errno=%d (%s)",
> @@ -120,6 +114,11 @@ static void vhost_vdpa_listener_commit(MemoryListener 
> *listener)
>  return;
>  }
>
> +if (v->n_iotlb_sent_batch == 0) {
> +return;
> +}
> +
> +v->n_iotlb_sent_batch = 0;
>  msg.type = v->msg_type;
>  msg.iotlb.type = VHOST_IOTLB_BATCH_END;
>
> @@ -170,6 +169,14 @@ static void 
> vhost_vdpa_listener_region_add(MemoryListener *listener,
>
>  llsize = int128_sub(llend, int128_make64(iova));
>
> +if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
> +if (v->n_iotlb_sent_batch == 0) {
> +vhost_vdpa_listener_begin_batch(v);
> +}
> +
> +v->n_iotlb_sent_batch++;
> +}
> +
>  ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
>   vaddr, section->readonly);
>  if (ret) {
> @@ -221,6 +228,14 @@ static void 
> vhost_vdpa_listener_region_del(MemoryListener *listener,
>
>  llsize = int128_sub(llend, int128_make64(iova));
>
> +if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
> +if (v->n_iotlb_sent_batch == 0) {
> +vhost_vdpa_listener_begin_batch(v);
> +}
> +
> +v->n_iotlb_sent_batch++;
> +}
> +
>  ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
>  if (ret) {
>  error_report("vhost_vdpa dma unmap error!");
> @@ -234,7 +249,6 @@ static void vhost_vdpa_listener_region_del(MemoryListener 
> *listener,
>   * depends on the addnop().
>   */
>  static const MemoryListener vhost_vdpa_memory_listener = {
> -.begin = vhost_vdpa_listener_begin,
>  .commit = vhost_vdpa_listener_commit,
>  .region_add = vhost_vdpa_listener_region_add,
>  .region_del = vhost_vdpa_listener_region_del,
> --
> 2.27.0
>

Re: [PATCH v2 4/4] hw/riscv: virt: Add optional ACLINT support to virt machine

2021-08-04 Thread Alistair Francis

On Thu, Aug 5, 2021 at 4:09 PM Alistair Francis  wrote:
>
> On Sat, Jul 24, 2021 at 10:27 PM Anup Patel  wrote:
> >
> > We extend virt machine to emulate ACLINT devices only when "aclint=on"
> > parameter is passed along with machine name in QEMU command-line.
> >
> > Signed-off-by: Anup Patel 
> > ---
> >  hw/riscv/virt.c | 113 +++-
> >  include/hw/riscv/virt.h |   2 +
> >  2 files changed, 114 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> > index 48c8b4aeb2..7259057a74 100644
> > --- a/hw/riscv/virt.c
> > +++ b/hw/riscv/virt.c
> > @@ -49,6 +49,7 @@ static const MemMapEntry virt_memmap[] = {
> >  [VIRT_TEST] ={   0x10,0x1000 },
> >  [VIRT_RTC] = {   0x101000,0x1000 },
> >  [VIRT_CLINT] =   {  0x200,   0x1 },
> > +[VIRT_ACLINT_SSWI] = {  0x2F0,0x4000 },
>
> Couldn't we use the same address as the current CLINT?

Whoops, nevermind.

Reviewed-by: Alistair Francis 

Alistair

Re: [PATCH v2 4/4] hw/riscv: virt: Add optional ACLINT support to virt machine

2021-08-04 Thread Alistair Francis

On Sat, Jul 24, 2021 at 10:27 PM Anup Patel  wrote:
>
> We extend virt machine to emulate ACLINT devices only when "aclint=on"
> parameter is passed along with machine name in QEMU command-line.
>
> Signed-off-by: Anup Patel 
> ---
>  hw/riscv/virt.c | 113 +++-
>  include/hw/riscv/virt.h |   2 +
>  2 files changed, 114 insertions(+), 1 deletion(-)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 48c8b4aeb2..7259057a74 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -49,6 +49,7 @@ static const MemMapEntry virt_memmap[] = {
>  [VIRT_TEST] ={   0x10,0x1000 },
>  [VIRT_RTC] = {   0x101000,0x1000 },
>  [VIRT_CLINT] =   {  0x200,   0x1 },
> +[VIRT_ACLINT_SSWI] = {  0x2F0,0x4000 },

Couldn't we use the same address as the current CLINT?

Alistair

>  [VIRT_PCIE_PIO] ={  0x300,   0x1 },
>  [VIRT_PLIC] ={  0xc00, VIRT_PLIC_SIZE(VIRT_CPUS_MAX * 2) },
>  [VIRT_UART0] =   { 0x1000, 0x100 },
> @@ -282,6 +283,82 @@ static void create_fdt_socket_clint(RISCVVirtState *s,
>  g_free(clint_cells);
>  }
>
> +static void create_fdt_socket_aclint(RISCVVirtState *s,
> + const MemMapEntry *memmap, int socket,
> + uint32_t *intc_phandles)
> +{
> +int cpu;
> +char *name;
> +unsigned long addr;
> +uint32_t aclint_cells_size;
> +uint32_t *aclint_mswi_cells;
> +uint32_t *aclint_sswi_cells;
> +uint32_t *aclint_mtimer_cells;
> +MachineState *mc = MACHINE(s);
> +
> +aclint_mswi_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
> +aclint_mtimer_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
> +aclint_sswi_cells = g_new0(uint32_t, s->soc[socket].num_harts * 2);
> +
> +for (cpu = 0; cpu < s->soc[socket].num_harts; cpu++) {
> +aclint_mswi_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
> +aclint_mswi_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_SOFT);
> +aclint_mtimer_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
> +aclint_mtimer_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_TIMER);
> +aclint_sswi_cells[cpu * 2 + 0] = cpu_to_be32(intc_phandles[cpu]);
> +aclint_sswi_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_S_SOFT);
> +}
> +aclint_cells_size = s->soc[socket].num_harts * sizeof(uint32_t) * 2;
> +
> +addr = memmap[VIRT_CLINT].base + (memmap[VIRT_CLINT].size * socket);
> +name = g_strdup_printf("/soc/mswi@%lx", addr);
> +qemu_fdt_add_subnode(mc->fdt, name);
> +qemu_fdt_setprop_string(mc->fdt, name, "compatible", 
> "riscv,aclint-mswi");
> +qemu_fdt_setprop_cells(mc->fdt, name, "reg",
> +0x0, addr, 0x0, RISCV_ACLINT_SWI_SIZE);
> +qemu_fdt_setprop(mc->fdt, name, "interrupts-extended",
> +aclint_mswi_cells, aclint_cells_size);
> +qemu_fdt_setprop(mc->fdt, name, "interrupt-controller", NULL, 0);
> +qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells", 0);
> +riscv_socket_fdt_write_id(mc, mc->fdt, name, socket);
> +g_free(name);
> +
> +addr = memmap[VIRT_CLINT].base + RISCV_ACLINT_SWI_SIZE +
> +(memmap[VIRT_CLINT].size * socket);
> +name = g_strdup_printf("/soc/mtimer@%lx", addr);
> +qemu_fdt_add_subnode(mc->fdt, name);
> +qemu_fdt_setprop_string(mc->fdt, name, "compatible",
> +"riscv,aclint-mtimer");
> +qemu_fdt_setprop_cells(mc->fdt, name, "reg",
> +0x0, addr + RISCV_ACLINT_DEFAULT_MTIME,
> +0x0, memmap[VIRT_CLINT].size - RISCV_ACLINT_SWI_SIZE -
> + RISCV_ACLINT_DEFAULT_MTIME,
> +0x0, addr + RISCV_ACLINT_DEFAULT_MTIMECMP,
> +0x0, RISCV_ACLINT_DEFAULT_MTIME);
> +qemu_fdt_setprop(mc->fdt, name, "interrupts-extended",
> +aclint_mtimer_cells, aclint_cells_size);
> +riscv_socket_fdt_write_id(mc, mc->fdt, name, socket);
> +g_free(name);
> +
> +addr = memmap[VIRT_ACLINT_SSWI].base +
> +(memmap[VIRT_ACLINT_SSWI].size * socket);
> +name = g_strdup_printf("/soc/sswi@%lx", addr);
> +qemu_fdt_add_subnode(mc->fdt, name);
> +qemu_fdt_setprop_string(mc->fdt, name, "compatible", 
> "riscv,aclint-sswi");
> +qemu_fdt_setprop_cells(mc->fdt, name, "reg",
> +0x0, addr, 0x0, memmap[VIRT_ACLINT_SSWI].size);
> +qemu_fdt_setprop(mc->fdt, name, "interrupts-extended",
> +aclint_sswi_cells, aclint_cells_size);
> +qemu_fdt_setprop(mc->fdt, name, "interrupt-controller", NULL, 0);
> +qemu_fdt_setprop_cell(mc->fdt, name, "#interrupt-cells", 0);
> +riscv_socket_fdt_write_id(mc, mc->fdt, name, socket);
> +g_free(name);
> +
> +g_free(aclint_mswi_cells);
> +g_free(aclint_mtimer_cells);
> +g_free(aclint_sswi_cells);
> +}
> +
>  static void create_fdt_socket_plic(RISCVVirtState *s,
> const MemMapEntry *memmap, int socket,
>

Re: [PATCH v2 3/4] hw/riscv: virt: Re-factor FDT generation

2021-08-04 Thread Alistair Francis

On Sat, Jul 24, 2021 at 10:25 PM Anup Patel  wrote:
>
> We re-factor and break the FDT generation into smaller functions
> so that it is easier to modify FDT generation for different
> configurations of virt machine.
>
> Signed-off-by: Anup Patel 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/virt.c | 521 ++--
>  1 file changed, 324 insertions(+), 197 deletions(-)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 3cbb4cd47f..48c8b4aeb2 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -177,214 +177,262 @@ static void create_pcie_irq_map(void *fdt, char 
> *nodename,
> 0x1800, 0, 0, 0x7);
>  }
>
> -static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap,
> -   uint64_t mem_size, const char *cmdline, bool 
> is_32_bit)
> +static void create_fdt_socket_cpus(RISCVVirtState *s, int socket,
> +   char *clust_name, uint32_t *phandle,
> +   bool is_32_bit, uint32_t *intc_phandles)
>  {
> -void *fdt;
> -int i, cpu, socket;
> +int cpu;
> +uint32_t cpu_phandle;
>  MachineState *mc = MACHINE(s);
> +char *name, *cpu_name, *core_name, *intc_name;
> +
> +for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
> +cpu_phandle = (*phandle)++;
> +
> +cpu_name = g_strdup_printf("/cpus/cpu@%d",
> +s->soc[socket].hartid_base + cpu);
> +qemu_fdt_add_subnode(mc->fdt, cpu_name);
> +qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
> +(is_32_bit) ? "riscv,sv32" : "riscv,sv48");
> +name = riscv_isa_string(&s->soc[socket].harts[cpu]);
> +qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
> +g_free(name);
> +qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
> +qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
> +qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
> +s->soc[socket].hartid_base + cpu);
> +qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
> +riscv_socket_fdt_write_id(mc, mc->fdt, cpu_name, socket);
> +qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
> +
> +intc_phandles[cpu] = (*phandle)++;
> +
> +intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
> +qemu_fdt_add_subnode(mc->fdt, intc_name);
> +qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
> +intc_phandles[cpu]);
> +qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
> +"riscv,cpu-intc");
> +qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 
> 0);
> +qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
> +
> +core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
> +qemu_fdt_add_subnode(mc->fdt, core_name);
> +qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
> +
> +g_free(core_name);
> +g_free(intc_name);
> +g_free(cpu_name);
> +}
> +}
> +
> +static void create_fdt_socket_memory(RISCVVirtState *s,
> + const MemMapEntry *memmap, int socket)
> +{
> +char *mem_name;
>  uint64_t addr, size;
> -uint32_t *clint_cells, *plic_cells;
> -unsigned long clint_addr, plic_addr;
> -uint32_t plic_phandle[MAX_NODES];
> -uint32_t cpu_phandle, intc_phandle, test_phandle;
> -uint32_t phandle = 1, plic_mmio_phandle = 1;
> -uint32_t plic_pcie_phandle = 1, plic_virtio_phandle = 1;
> -char *mem_name, *cpu_name, *core_name, *intc_name;
> -char *name, *clint_name, *plic_name, *clust_name;
> -hwaddr flashsize = virt_memmap[VIRT_FLASH].size / 2;
> -hwaddr flashbase = virt_memmap[VIRT_FLASH].base;
> +MachineState *mc = MACHINE(s);
> +
> +addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(mc, socket);
> +size = riscv_socket_mem_size(mc, socket);
> +mem_name = g_strdup_printf("/memory@%lx", (long)addr);
> +qemu_fdt_add_subnode(mc->fdt, mem_name);
> +qemu_fdt_setprop_cells(mc->fdt, mem_name, "reg",
> +addr >> 32, addr, size >> 32, size);
> +qemu_fdt_setprop_string(mc->fdt, mem_name, "device_type", "memory");
> +riscv_socket_fdt_write_id(mc, mc->fdt, mem_name, socket);
> +g_free(mem_name);
> +}
> +
> +static void create_fdt_socket_clint(RISCVVirtState *s,
> +const MemMapEntry *memmap, int socket,
> +uint32_t *intc_phandles)
> +{
> +int cpu;
> +char *clint_name;
> +uint32_t *clint_cells;
> +unsigned long clint_addr;
> +MachineState *mc = MACHINE(s);
>  static const char * const clint_compat[2] = {
>  "sifive,clint0", "riscv,clint0"
>  };
> +
> +clint_cells = g_new0(uint32_t, s->soc[socket].num_harts * 4);
> +
> +for

Re: [RFC PATCH 00/13] Support UXL field in mstatus

2021-08-04 Thread Alistair Francis

On Thu, Aug 5, 2021 at 12:55 PM LIU Zhiwei  wrote:
>
> This patch set implements UXL field in mstatus register. Programmer can change
> UXLEN by writting to this field. So that you can run a 32 bit program
> on a 64 bit CPU.

Awesome! Do you have any steps for building a rootFS to test this?

Alistair

>
> This patch set depends on one patch set by Richard Henderson
> https://lists.gnu.org/archive/html/qemu-riscv/2021-07/msg00059.html.
>
> LIU Zhiwei (13):
>   target/riscv: Add UXL to tb flags
>   target/riscv: Support UXL32 for branch instructions
>   target/riscv: Support UXL32 on 64-bit cpu for load/store
>   target/riscv: Support UXL32 for slit/sltiu
>   target/riscv: Support UXL32 for shift instruction
>   target/riscv: Fix div instructions
>   target/riscv: Support UXL32 for RVM
>   target/riscv: Support UXL32 for vector instructions
>   target/riscv: Support UXL32 for atomic instructions
>   target/riscv: Support UXL32 for float instructions
>   target/riscv: Fix srow
>   target/riscv: Support UXL32 for RVB
>   target/riscv: Changing the width of U-mode CSR
>
>  target/riscv/cpu.h  |  18 +++
>  target/riscv/csr.c  |  42 +-
>  target/riscv/insn_trans/trans_rva.c.inc |  36 -
>  target/riscv/insn_trans/trans_rvb.c.inc |  51 +--
>  target/riscv/insn_trans/trans_rvd.c.inc |   4 +-
>  target/riscv/insn_trans/trans_rvf.c.inc |   4 +-
>  target/riscv/insn_trans/trans_rvi.c.inc |  62 ++--
>  target/riscv/insn_trans/trans_rvm.c.inc |  24 ++-
>  target/riscv/insn_trans/trans_rvv.c.inc |  44 +++---
>  target/riscv/translate.c| 186 
>  target/riscv/vector_helper.c|  54 +--
>  11 files changed, 414 insertions(+), 111 deletions(-)
>
> --
> 2.17.1
>
>

Re: [RFC PATCH 01/13] target/riscv: Add UXL to tb flags

2021-08-04 Thread Alistair Francis

On Thu, Aug 5, 2021 at 12:55 PM LIU Zhiwei  wrote:
>
> For 32-bit applications run on 64-bit cpu, it may share some code
> with other 64-bit applictions. Thus we should distinguish the translated
> cache of the share code with a tb flag.
>
> Signed-off-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.h   | 15 +++
>  target/riscv/translate.c |  3 +++
>  2 files changed, 18 insertions(+)
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index bf1c899c00..2b3ba21a78 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -394,9 +394,20 @@ FIELD(TB_FLAGS, SEW, 5, 3)
>  FIELD(TB_FLAGS, VILL, 8, 1)
>  /* Is a Hypervisor instruction load/store allowed? */
>  FIELD(TB_FLAGS, HLSX, 9, 1)
> +FIELD(TB_FLAGS, UXL, 10, 2)
>
>  bool riscv_cpu_is_32bit(CPURISCVState *env);
>
> +static inline bool riscv_cpu_is_uxl32(CPURISCVState *env)
> +{
> +#ifndef CONFIG_USER_ONLY
> +return (get_field(env->mstatus, MSTATUS64_UXL) == 1) &&
> +   !riscv_cpu_is_32bit(env) &&
> +   (env->priv == PRV_U);
> +#endif
> +return false;
> +}
> +
>  /*
>   * A simplification for VLMAX
>   * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
> @@ -451,6 +462,10 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState 
> *env, target_ulong *pc,
>  flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
>  }
>  }
> +if (riscv_cpu_is_uxl32(env)) {
> +flags = FIELD_DP32(flags, TB_FLAGS, UXL,
> +   get_field(env->mstatus, MSTATUS64_UXL));
> +}
>  #endif
>
>  *pflags = flags;
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 076f28b9c1..ac4a545da8 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -67,6 +67,8 @@ typedef struct DisasContext {
>  CPUState *cs;
>  TCGv zero;
>  TCGv sink;
> +/* UXLEN is 32 bit for 64-bit CPU */
> +bool uxl32;
>  } DisasContext;
>
>  static inline bool has_ext(DisasContext *ctx, uint32_t ext)
> @@ -912,6 +914,7 @@ static void riscv_tr_init_disas_context(DisasContextBase 
> *dcbase, CPUState *cs)
>  ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
>  ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
>  ctx->cs = cs;
> +ctx->uxl32 = FIELD_EX32(tb_flags, TB_FLAGS, UXL) == 1;
>  }
>
>  static void riscv_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
> --
> 2.17.1
>
>

Re: [PATCH 0/3] Disable vhost device IOTLB is IOMMU is not enabled

2021-08-04 Thread Peter Xu

On Thu, Aug 05, 2021 at 09:46:10AM +0800, Jason Wang wrote:
> > For the long term we may need to think about making device creation to be 
> > not
> > ordered by user cmdline input but still has a priority specified in the code
> > itself.
> 
> I fully agree.

I'll see whether I can work on that in the near future.  Before that, no
intention to block this series. :)

Acked-by: Peter Xu 

Will keep you in the loop if there will be something.  Thanks,

-- 
Peter Xu

[RFC PATCH 12/13] target/riscv: Support UXL32 for RVB

2021-08-04 Thread LIU Zhiwei

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvb.c.inc | 47 +++--
 target/riscv/translate.c|  8 +
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc 
b/target/riscv/insn_trans/trans_rvb.c.inc
index 0bae0a2bbf..5de24c3a24 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -20,19 +20,19 @@
 static bool trans_clz(DisasContext *ctx, arg_clz *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_unary(ctx, a, gen_clz);
+return gen_unary(ctx, a, ctx->uxl32 ? gen_clzw : gen_clz);
 }
 
 static bool trans_ctz(DisasContext *ctx, arg_ctz *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_unary(ctx, a, gen_ctz);
+return gen_unary(ctx, a, ctx->uxl32 ? gen_ctzw : gen_ctz);
 }
 
 static bool trans_cpop(DisasContext *ctx, arg_cpop *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_unary(ctx, a, tcg_gen_ctpop_tl);
+return gen_unary(ctx, a, ctx->uxl32 ? gen_cpopw : tcg_gen_ctpop_tl);
 }
 
 static bool trans_andn(DisasContext *ctx, arg_andn *a)
@@ -56,43 +56,43 @@ static bool trans_xnor(DisasContext *ctx, arg_xnor *a)
 static bool trans_pack(DisasContext *ctx, arg_pack *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, gen_pack);
+return gen_arith(ctx, a, ctx->uxl32 ? gen_packw : gen_pack);
 }
 
 static bool trans_packu(DisasContext *ctx, arg_packu *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, gen_packu);
+return gen_arith(ctx, a, ctx->uxl32 ? gen_packuw : gen_packu);
 }
 
 static bool trans_packh(DisasContext *ctx, arg_packh *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, gen_packh);
+return gen_arith(ctx, a, ctx->uxl32 ? gen_packhw : gen_packh);
 }
 
 static bool trans_min(DisasContext *ctx, arg_min *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, tcg_gen_smin_tl);
+return gen_arith_s(ctx, a, tcg_gen_smin_tl);
 }
 
 static bool trans_max(DisasContext *ctx, arg_max *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, tcg_gen_smax_tl);
+return gen_arith_s(ctx, a, tcg_gen_smax_tl);
 }
 
 static bool trans_minu(DisasContext *ctx, arg_minu *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, tcg_gen_umin_tl);
+return gen_arith_u(ctx, a, tcg_gen_umin_tl);
 }
 
 static bool trans_maxu(DisasContext *ctx, arg_maxu *a)
 {
 REQUIRE_EXT(ctx, RVB);
-return gen_arith(ctx, a, tcg_gen_umax_tl);
+return gen_arith_u(ctx, a, tcg_gen_umax_tl);
 }
 
 static bool trans_sext_b(DisasContext *ctx, arg_sext_b *a)
@@ -170,36 +170,54 @@ static bool trans_sloi(DisasContext *ctx, arg_sloi *a)
 static bool trans_sro(DisasContext *ctx, arg_sro *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_srow(ctx, a);
+}
 return gen_shift(ctx, a, gen_sro);
 }
 
 static bool trans_sroi(DisasContext *ctx, arg_sroi *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_sroiw(ctx, a);
+}
 return gen_shifti(ctx, a, gen_sro);
 }
 
 static bool trans_ror(DisasContext *ctx, arg_ror *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_rorw(ctx, a);
+}
 return gen_shift(ctx, a, tcg_gen_rotr_tl);
 }
 
 static bool trans_rori(DisasContext *ctx, arg_rori *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_roriw(ctx, a);
+}
 return gen_shifti(ctx, a, tcg_gen_rotr_tl);
 }
 
 static bool trans_rol(DisasContext *ctx, arg_rol *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_rolw(ctx, a);
+}
 return gen_shift(ctx, a, tcg_gen_rotl_tl);
 }
 
 static bool trans_grev(DisasContext *ctx, arg_grev *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_grevw(ctx, a);
+}
 return gen_shift(ctx, a, gen_helper_grev);
 }
 
@@ -207,6 +225,9 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
 {
 REQUIRE_EXT(ctx, RVB);
 
+if (ctx->uxl32) {
+return trans_grevi(ctx, a);
+}
 if (a->shamt >= TARGET_LONG_BITS) {
 return false;
 }
@@ -217,12 +238,18 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
 static bool trans_gorc(DisasContext *ctx, arg_gorc *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_gorcw(ctx, a);
+}
 return gen_shift(ctx, a, gen_helper_gorc);
 }
 
 static bool trans_gorci(DisasContext *ctx, arg_gorci *a)
 {
 REQUIRE_EXT(ctx, RVB);
+if (ctx->uxl32) {
+return trans_gorciw(ctx, a);
+}
 return gen_shifti(ctx, a, gen_helper_gorc);
 }
 
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 5ee0feac4b..f4b2f75812 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -742,6 +742,14 @@ static void gen_packuw(TCGv ret, TCGv arg1, TCGv arg2)
 tcg_temp_free(t);
 }
 
+static void gen_packhw(TCGv ret, TCGv arg1, TCGv arg2)
+{
+TCGv t = tcg_temp_

[RFC PATCH 11/13] target/riscv: Fix srow

2021-08-04 Thread LIU Zhiwei

Always fill MSB 32 bits with 1s in source register before calling
gen_sro. Otherwise it may not only shift in 1s.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvb.c.inc | 4 ++--
 target/riscv/translate.c| 7 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc 
b/target/riscv/insn_trans/trans_rvb.c.inc
index 58921f3224..0bae0a2bbf 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -339,14 +339,14 @@ static bool trans_srow(DisasContext *ctx, arg_srow *a)
 {
 REQUIRE_64BIT(ctx);
 REQUIRE_EXT(ctx, RVB);
-return gen_shiftw(ctx, a, gen_sro);
+return gen_shiftw(ctx, a, gen_srow);
 }
 
 static bool trans_sroiw(DisasContext *ctx, arg_sroiw *a)
 {
 REQUIRE_64BIT(ctx);
 REQUIRE_EXT(ctx, RVB);
-return gen_shiftiw(ctx, a, gen_sro);
+return gen_shiftiw(ctx, a, gen_srow);
 }
 
 static bool trans_rorw(DisasContext *ctx, arg_rorw *a)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 160a2df629..5ee0feac4b 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -669,6 +669,13 @@ static void gen_sro(TCGv ret, TCGv arg1, TCGv arg2)
 tcg_gen_not_tl(ret, ret);
 }
 
+static void gen_srow(TCGv ret, TCGv arg1, TCGv arg2)
+{
+TCGv ones = tcg_constant_tl(UINT32_MAX);
+tcg_gen_deposit_tl(arg1, arg1, ones, 32, 32);
+gen_sro(ret, arg1, arg2);
+}
+
 static bool gen_grevi(DisasContext *ctx, arg_grevi *a)
 {
 TCGv dest = gpr_dst(ctx, a->rd);
-- 
2.17.1

[RFC PATCH 10/13] target/riscv: Support UXL32 for float instructions

2021-08-04 Thread LIU Zhiwei

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvd.c.inc | 4 ++--
 target/riscv/insn_trans/trans_rvf.c.inc | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvd.c.inc 
b/target/riscv/insn_trans/trans_rvd.c.inc
index 9bb15fdc12..fb033ccd97 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -23,7 +23,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
 
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv temp = NULL;
 
 if (a->imm) {
@@ -46,7 +46,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVD);
 
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv temp = NULL;
 
 if (a->imm) {
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc 
b/target/riscv/insn_trans/trans_rvf.c.inc
index ff8e942199..4576072124 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -28,7 +28,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
 
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv temp = NULL;
 
 if (a->imm) {
@@ -53,7 +53,7 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 REQUIRE_FPU;
 REQUIRE_EXT(ctx, RVF);
 
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv temp = NULL;
 
 if (a->imm) {
-- 
2.17.1

[RFC PATCH 13/13] target/riscv: Changing the width of U-mode CSR

2021-08-04 Thread LIU Zhiwei

Signed-off-by: LIU Zhiwei 
---
 target/riscv/csr.c | 42 +-
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 9a4ed18ac5..dc9807c0ff 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -297,7 +297,7 @@ static RISCVException read_vxrm(CPURISCVState *env, int 
csrno,
 static RISCVException write_vxrm(CPURISCVState *env, int csrno,
  target_ulong val)
 {
-env->vxrm = val;
+env->vxrm = riscv_cpu_is_uxl32(env) ? val & UINT32_MAX : val;
 return RISCV_EXCP_NONE;
 }
 
@@ -311,7 +311,7 @@ static RISCVException read_vxsat(CPURISCVState *env, int 
csrno,
 static RISCVException write_vxsat(CPURISCVState *env, int csrno,
   target_ulong val)
 {
-env->vxsat = val;
+env->vxsat = riscv_cpu_is_uxl32(env) ? val & UINT32_MAX : val;
 return RISCV_EXCP_NONE;
 }
 
@@ -325,7 +325,7 @@ static RISCVException read_vstart(CPURISCVState *env, int 
csrno,
 static RISCVException write_vstart(CPURISCVState *env, int csrno,
target_ulong val)
 {
-env->vstart = val;
+env->vstart = riscv_cpu_is_uxl32(env) ? val & UINT32_MAX : val;
 return RISCV_EXCP_NONE;
 }
 
@@ -493,6 +493,36 @@ static int validate_vm(CPURISCVState *env, target_ulong vm)
 }
 }
 
+static void uxl32_switch(CPURISCVState *env, target_ulong val)
+{
+uint32_t old = get_field(env->mstatus, MSTATUS64_UXL);
+uint32_t new = get_field(val, MSTATUS64_UXL);
+
+if (old == new) {
+return;
+}
+
+/*
+ * For the read-only bits of the previous-width CSR, the bits at the
+ * same positions in the temporary register are set to zeros.
+ */
+if (env->misa & RVV) {
+env->vl = 0;
+env->vtype = 0;
+}
+
+/*
+ * If the new width W is narrower than the previous width, the
+ * least-significant W bits of the temporary register are retained and
+ * the more-significant bits are discarded.
+ */
+if ((old == 2) && (new == 1)) {
+if (env->misa & RVV) {
+env->vtype &= UINT32_MAX;
+}
+}
+}
+
 static RISCVException write_mstatus(CPURISCVState *env, int csrno,
 target_ulong val)
 {
@@ -502,13 +532,13 @@ static RISCVException write_mstatus(CPURISCVState *env, 
int csrno,
 
 /* flush tlb on mstatus fields that affect VM */
 if ((val ^ mstatus) & (MSTATUS_MXR | MSTATUS_MPP | MSTATUS_MPV |
-MSTATUS_MPRV | MSTATUS_SUM)) {
+MSTATUS_MPRV | MSTATUS_SUM | MSTATUS64_UXL)) {
 tlb_flush(env_cpu(env));
 }
 mask = MSTATUS_SIE | MSTATUS_SPIE | MSTATUS_MIE | MSTATUS_MPIE |
 MSTATUS_SPP | MSTATUS_FS | MSTATUS_MPRV | MSTATUS_SUM |
 MSTATUS_MPP | MSTATUS_MXR | MSTATUS_TVM | MSTATUS_TSR |
-MSTATUS_TW;
+MSTATUS_TW | MSTATUS64_UXL;
 
 if (!riscv_cpu_is_32bit(env)) {
 /*
@@ -518,6 +548,8 @@ static RISCVException write_mstatus(CPURISCVState *env, int 
csrno,
 mask |= MSTATUS_MPV | MSTATUS_GVA;
 }
 
+uxl32_switch(env, val);
+
 mstatus = (mstatus & ~mask) | (val & mask);
 
 dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
-- 
2.17.1

[RFC PATCH 09/13] target/riscv: Support UXL32 for atomic instructions

2021-08-04 Thread LIU Zhiwei

Only load or store 32 bits data for atomic instructions when UXL32.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rva.c.inc | 36 -
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rva.c.inc 
b/target/riscv/insn_trans/trans_rva.c.inc
index 5bb5bbd09c..07c94416e5 100644
--- a/target/riscv/insn_trans/trans_rva.c.inc
+++ b/target/riscv/insn_trans/trans_rva.c.inc
@@ -20,12 +20,19 @@
 
 static bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp mop)
 {
-TCGv src1 = gpr_src(ctx, a->rs1);
+TCGv src1 = gpr_src_u(ctx, a->rs1);
 
 /* Put addr in load_res, data in load_val.  */
 if (a->rl) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
 }
+if (ctx->uxl32) {
+TCGv_i32 val = tcg_temp_new_i32();
+tcg_gen_qemu_ld_i32(val, src1, ctx->mem_idx, mop);
+tcg_gen_extu_i32_tl(load_val, val);
+} else {
+tcg_gen_qemu_ld_tl(load_val, src1, ctx->mem_idx, mop);
+}
 tcg_gen_qemu_ld_tl(load_val, src1, ctx->mem_idx, mop);
 if (a->aq) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
@@ -39,8 +46,8 @@ static bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp 
mop)
 static bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp mop)
 {
 TCGv dest = gpr_dst(ctx, a->rd);
-TCGv src1 = gpr_src(ctx, a->rs1);
-TCGv src2 = gpr_src(ctx, a->rs2);
+TCGv src1 = gpr_src_u(ctx, a->rs1);
+TCGv src2 = gpr_src_u(ctx, a->rs2);
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
 
@@ -50,8 +57,25 @@ static bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp 
mop)
  * Note that the TCG atomic primitives are SC,
  * so we can ignore AQ/RL along this path.
  */
-tcg_gen_atomic_cmpxchg_tl(dest, load_res, load_val, src2,
-  ctx->mem_idx, mop);
+if (ctx->uxl32) {
+TCGv_i32 retv, cmpv, newv;
+retv = tcg_temp_new_i32();
+cmpv = tcg_temp_new_i32();
+newv = tcg_temp_new_i32();
+tcg_gen_trunc_tl_i32(cmpv, load_val);
+tcg_gen_trunc_tl_i32(newv, src2);
+
+tcg_gen_atomic_cmpxchg_i32(retv, load_res, cmpv, newv,
+   ctx->mem_idx, mop);
+
+tcg_gen_extu_i32_tl(dest, retv);
+tcg_temp_free_i32(retv);
+tcg_temp_free_i32(cmpv);
+tcg_temp_free_i32(newv);
+} else {
+tcg_gen_atomic_cmpxchg_tl(dest, load_res, load_val, src2,
+  ctx->mem_idx, mop);
+}
 tcg_gen_setcond_tl(TCG_COND_NE, dest, dest, load_val);
 tcg_gen_br(l2);
 
@@ -78,7 +102,7 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a,
 MemOp mop)
 {
 TCGv dest = gpr_dst(ctx, a->rd);
-TCGv src1 = gpr_src(ctx, a->rs1);
+TCGv src1 = gpr_src_u(ctx, a->rs1);
 TCGv src2 = gpr_src(ctx, a->rs2);
 
 (*func)(dest, src1, src2, ctx->mem_idx, mop);
-- 
2.17.1

[RFC PATCH 08/13] target/riscv: Support UXL32 for vector instructions

2021-08-04 Thread LIU Zhiwei

For integer operations, the scalar can be taken from the scalar x register
specified by rs1. If XLEN
---
 target/riscv/cpu.h  |  3 ++
 target/riscv/insn_trans/trans_rvv.c.inc | 44 
 target/riscv/vector_helper.c| 54 +
 3 files changed, 68 insertions(+), 33 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 2b3ba21a78..9c96a1e818 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -111,6 +111,9 @@ FIELD(VTYPE, VEDIV, 5, 2)
 FIELD(VTYPE, RESERVED, 7, sizeof(target_ulong) * 8 - 9)
 FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 
+FIELD(VTYPE, RESERVED_UXL32, 7, 23)
+FIELD(VTYPE, VILL_UXL32, 31, 1)
+
 struct CPURISCVState {
 target_ulong gpr[32];
 uint64_t fpr[32]; /* assume both F and D extensions */
diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 84a45fac38..732b8ab460 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -35,7 +35,7 @@ static bool trans_vsetvl(DisasContext *ctx, arg_vsetvl *a)
 /* As the mask is at least one bit, RV_VLEN_MAX is >= VLMAX */
 s1 = tcg_constant_tl(RV_VLEN_MAX);
 } else {
-s1 = gpr_src(ctx, a->rs1);
+s1 = gpr_src_u(ctx, a->rs1);
 }
 gen_helper_vsetvl(dst, cpu_env, s1, s2);
 
@@ -61,7 +61,7 @@ static bool trans_vsetvli(DisasContext *ctx, arg_vsetvli *a)
 /* As the mask is at least one bit, RV_VLEN_MAX is >= VLMAX */
 s1 = tcg_constant_tl(RV_VLEN_MAX);
 } else {
-s1 = gpr_src(ctx, a->rs1);
+s1 = gpr_src_u(ctx, a->rs1);
 }
 gen_helper_vsetvl(dst, cpu_env, s1, s2);
 
@@ -163,7 +163,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
 
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
-base = gpr_src(s, rs1);
+base = gpr_src_u(s, rs1);
 
 /*
  * As simd_desc supports at most 256 bytes, and in this implementation,
@@ -318,8 +318,8 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, 
uint32_t rs2,
 
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
-base = gpr_src(s, rs1);
-stride = gpr_src(s, rs2);
+base = gpr_src_u(s, rs1);
+stride = gpr_src_s(s, rs2);
 desc = tcg_constant_i32(simd_desc(s->vlen / 8, s->vlen / 8, data));
 
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
@@ -442,7 +442,7 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2,
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
 index = tcg_temp_new_ptr();
-base = gpr_src(s, rs1);
+base = gpr_src_u(s, rs1);
 desc = tcg_constant_i32(simd_desc(s->vlen / 8, s->vlen / 8, data));
 
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
@@ -571,7 +571,7 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t 
data,
 
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
-base = gpr_src(s, rs1);
+base = gpr_src_u(s, rs1);
 desc = tcg_constant_i32(simd_desc(s->vlen / 8, s->vlen / 8, data));
 
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
@@ -645,7 +645,7 @@ static bool amo_trans(uint32_t vd, uint32_t rs1, uint32_t 
vs2,
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
 index = tcg_temp_new_ptr();
-base = gpr_src(s, rs1);
+base = gpr_src_u(s, rs1);
 desc = tcg_constant_i32(simd_desc(s->vlen / 8, s->vlen / 8, data));
 
 tcg_gen_addi_ptr(dest, cpu_env, vreg_ofs(s, vd));
@@ -731,12 +731,13 @@ static bool amo_op(DisasContext *s, arg_rwdvm *a, uint8_t 
seq)
  */
 static bool amo_check(DisasContext *s, arg_rwdvm* a)
 {
-return (!s->vill && has_ext(s, RVA) &&
-(!a->wd || vext_check_overlap_mask(s, a->rd, a->vm, false)) &&
-vext_check_reg(s, a->rd, false) &&
-vext_check_reg(s, a->rs2, false) &&
-((1 << s->sew) <= sizeof(target_ulong)) &&
-((1 << s->sew) >= 4));
+return !s->vill && has_ext(s, RVA) &&
+   (!a->wd || vext_check_overlap_mask(s, a->rd, a->vm, false)) &&
+   vext_check_reg(s, a->rd, false) &&
+   vext_check_reg(s, a->rs2, false) &&
+   (s->uxl32 ? ((1 << s->sew) == 4) :
+   (((1 << s->sew) <= sizeof(target_ulong)) &&
+((1 << s->sew) >= 4)));
 }
 
 static bool amo_check64(DisasContext *s, arg_rwdvm* a)
@@ -840,7 +841,7 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t 
vs2, uint32_t vm,
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
 src2 = tcg_temp_new_ptr();
-src1 = gpr_src(s, rs1);
+src1 = gpr_src_s(s, rs1);
 
 data = FIELD_DP32(data, VDATA, MLEN, s->mlen);
 data = FIELD_DP32(data, VDATA, VM, vm);
@@ -882,7 +883,7 @@ do_opivx_gvec(DisasContext *s, arg_rmrr *a, GVecGen2sFn 
*gvec_fn,
 if (a->vm && s->vl_eq_vlmax) {
 TCGv_i64 src1 = tcg_temp_new_i64();
 
-tcg_gen_ext_tl_i64(src1, gpr_src(s, a->rs1

[RFC PATCH 07/13] target/riscv: Support UXL32 for RVM

2021-08-04 Thread LIU Zhiwei

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvm.c.inc | 24 ---
 target/riscv/translate.c| 56 +++--
 2 files changed, 62 insertions(+), 18 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvm.c.inc 
b/target/riscv/insn_trans/trans_rvm.c.inc
index 34220b824d..121d592351 100644
--- a/target/riscv/insn_trans/trans_rvm.c.inc
+++ b/target/riscv/insn_trans/trans_rvm.c.inc
@@ -28,43 +28,55 @@ static bool trans_mul(DisasContext *ctx, arg_mul *a)
 static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, gen_mulh);
+return gen_arith_s(ctx, a, gen_mulh);
 }
 
 static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, &gen_mulhsu);
+return gen_arith_su(ctx, a, &gen_mulhsu);
 }
 
 static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, gen_mulhu);
+return gen_arith_u(ctx, a, gen_mulhu);
 }
 
 static bool trans_div(DisasContext *ctx, arg_div *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, &gen_div);
+if (ctx->uxl32) {
+return trans_divw(ctx, a);
+}
+return gen_arith_div(ctx, a, &gen_div);
 }
 
 static bool trans_divu(DisasContext *ctx, arg_divu *a)
 {
 REQUIRE_EXT(ctx, RVM);
+if (ctx->uxl32) {
+return trans_divuw(ctx, a);
+}
 return gen_arith(ctx, a, &gen_divu);
 }
 
 static bool trans_rem(DisasContext *ctx, arg_rem *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, &gen_rem);
+if (ctx->uxl32) {
+return trans_remw(ctx, a);
+}
+return gen_arith_div(ctx, a, &gen_rem);
 }
 
 static bool trans_remu(DisasContext *ctx, arg_remu *a)
 {
 REQUIRE_EXT(ctx, RVM);
-return gen_arith(ctx, a, &gen_remu);
+if (ctx->uxl32) {
+return trans_remuw(ctx, a);
+}
+return gen_arith_u(ctx, a, &gen_remu);
 }
 
 static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 2892eaa9a7..160a2df629 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -254,16 +254,14 @@ static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
 TCGv rh = tcg_temp_new();
 
 tcg_gen_mulu2_tl(rl, rh, arg1, arg2);
-/* fix up for one negative */
-tcg_gen_sari_tl(rl, arg1, TARGET_LONG_BITS - 1);
-tcg_gen_and_tl(rl, rl, arg2);
-tcg_gen_sub_tl(ret, rh, rl);
+tcg_gen_sub_tl(rl, rh, arg2);
+tcg_gen_movcond_tl(TCG_COND_LT, ret, arg1, tcg_constant_tl(0), rl, rh);
 
 tcg_temp_free(rl);
 tcg_temp_free(rh);
 }
 
-static void gen_div(TCGv ret, TCGv source1, TCGv source2)
+static void gen_div(DisasContext *ctx, TCGv ret, TCGv source1, TCGv source2)
 {
 TCGv cond1, cond2, zeroreg, resultopt1, t1, t2;
 /*
@@ -280,8 +278,14 @@ static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 
 tcg_gen_movi_tl(resultopt1, (target_ulong)-1);
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, (target_ulong)(~0L));
-tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1,
-((target_ulong)1) << (TARGET_LONG_BITS - 1));
+
+if (ctx->uxl32) {
+tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1, INT32_MIN);
+} else {
+tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1,
+((target_ulong)1) << (TARGET_LONG_BITS - 1));
+}
+
 tcg_gen_and_tl(cond1, cond1, cond2); /* cond1 = overflow */
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, 0); /* cond2 = div 0 */
 /* if div by zero, set source1 to -1, otherwise don't change */
@@ -322,7 +326,7 @@ static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
 tcg_temp_free(t2);
 }
 
-static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
+static void gen_rem(DisasContext *ctx, TCGv ret, TCGv source1, TCGv source2)
 {
 TCGv cond1, cond2, zeroreg, resultopt1, t2;
 
@@ -334,8 +338,14 @@ static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 
 tcg_gen_movi_tl(resultopt1, 1L);
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, (target_ulong)-1);
-tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1,
-(target_ulong)1 << (TARGET_LONG_BITS - 1));
+
+if (ctx->uxl32) {
+tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1, INT32_MIN);
+} else {
+tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source1,
+((target_long)1) << (TARGET_LONG_BITS - 1));
+}
+
 tcg_gen_and_tl(cond2, cond1, cond2); /* cond1 = overflow */
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source2, 0); /* cond2 = div 0 */
 /* if overflow or div by zero, set source2 to 1, else don't change */
@@ -541,7 +551,7 @@ static void gen_mulw(TCGv ret, TCGv arg1, TCGv arg2)
 }
 
 static bool gen_arith_div_w(DisasContext *ctx, arg_r *a,
-void(*func)(TCGv, TCGv, TCGv))
+void(*func)(D

[RFC PATCH 05/13] target/riscv: Support UXL32 for shift instruction

2021-08-04 Thread LIU Zhiwei

Reuse 32-bit right shift instructions.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvi.c.inc | 12 
 1 file changed, 12 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc 
b/target/riscv/insn_trans/trans_rvi.c.inc
index 6201c07795..698a28731e 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -303,11 +303,17 @@ static bool trans_slli(DisasContext *ctx, arg_slli *a)
 
 static bool trans_srli(DisasContext *ctx, arg_srli *a)
 {
+if (ctx->uxl32) {
+return trans_srliw(ctx, a);
+}
 return gen_shifti(ctx, a, tcg_gen_shr_tl);
 }
 
 static bool trans_srai(DisasContext *ctx, arg_srai *a)
 {
+if (ctx->uxl32) {
+return trans_sraiw(ctx, a);
+}
 return gen_shifti(ctx, a, tcg_gen_sar_tl);
 }
 
@@ -343,11 +349,17 @@ static bool trans_xor(DisasContext *ctx, arg_xor *a)
 
 static bool trans_srl(DisasContext *ctx, arg_srl *a)
 {
+if (ctx->uxl32) {
+return trans_srlw(ctx, a);
+}
 return gen_shift(ctx, a, &tcg_gen_shr_tl);
 }
 
 static bool trans_sra(DisasContext *ctx, arg_sra *a)
 {
+if (ctx->uxl32) {
+return trans_sraw(ctx, a);
+}
 return gen_shift(ctx, a, &tcg_gen_sar_tl);
 }
 
-- 
2.17.1

[RFC PATCH 06/13] target/riscv: Fix div instructions

2021-08-04 Thread LIU Zhiwei

Don't overwrite global source register after
https://lists.gnu.org/archive/html/qemu-riscv/2021-07/msg00058.html.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/translate.c | 46 +++-
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 912e5f1061..2892eaa9a7 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -265,7 +265,7 @@ static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
 
 static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 {
-TCGv cond1, cond2, zeroreg, resultopt1;
+TCGv cond1, cond2, zeroreg, resultopt1, t1, t2;
 /*
  * Handle by altering args to tcg_gen_div to produce req'd results:
  * For overflow: want source1 in source1 and 1 in source2
@@ -275,6 +275,8 @@ static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 cond2 = tcg_temp_new();
 zeroreg = tcg_constant_tl(0);
 resultopt1 = tcg_temp_new();
+t1 = tcg_temp_new();
+t2 = tcg_temp_new();
 
 tcg_gen_movi_tl(resultopt1, (target_ulong)-1);
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, (target_ulong)(~0L));
@@ -283,49 +285,52 @@ static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 tcg_gen_and_tl(cond1, cond1, cond2); /* cond1 = overflow */
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, 0); /* cond2 = div 0 */
 /* if div by zero, set source1 to -1, otherwise don't change */
-tcg_gen_movcond_tl(TCG_COND_EQ, source1, cond2, zeroreg, source1,
-resultopt1);
+tcg_gen_movcond_tl(TCG_COND_EQ, t1, cond2, zeroreg, source1, resultopt1);
 /* if overflow or div by zero, set source2 to 1, else don't change */
 tcg_gen_or_tl(cond1, cond1, cond2);
 tcg_gen_movi_tl(resultopt1, (target_ulong)1);
-tcg_gen_movcond_tl(TCG_COND_EQ, source2, cond1, zeroreg, source2,
-resultopt1);
-tcg_gen_div_tl(ret, source1, source2);
+tcg_gen_movcond_tl(TCG_COND_EQ, t2, cond1, zeroreg, source2, resultopt1);
+tcg_gen_div_tl(ret, t1, t2);
 
 tcg_temp_free(cond1);
 tcg_temp_free(cond2);
 tcg_temp_free(resultopt1);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
 }
 
 static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
 {
-TCGv cond1, zeroreg, resultopt1;
+TCGv cond1, zeroreg, resultopt1, t1, t2;
 cond1 = tcg_temp_new();
 
 zeroreg = tcg_constant_tl(0);
 resultopt1 = tcg_temp_new();
+t1 = tcg_temp_new();
+t2 = tcg_temp_new();
 
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source2, 0);
 tcg_gen_movi_tl(resultopt1, (target_ulong)-1);
-tcg_gen_movcond_tl(TCG_COND_EQ, source1, cond1, zeroreg, source1,
-resultopt1);
+tcg_gen_movcond_tl(TCG_COND_EQ, t1, cond1, zeroreg, source1, resultopt1);
 tcg_gen_movi_tl(resultopt1, (target_ulong)1);
-tcg_gen_movcond_tl(TCG_COND_EQ, source2, cond1, zeroreg, source2,
-resultopt1);
-tcg_gen_divu_tl(ret, source1, source2);
+tcg_gen_movcond_tl(TCG_COND_EQ, t2, cond1, zeroreg, source2, resultopt1);
+tcg_gen_divu_tl(ret, t1, t2);
 
 tcg_temp_free(cond1);
 tcg_temp_free(resultopt1);
+tcg_temp_free(t1);
+tcg_temp_free(t2);
 }
 
 static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 {
-TCGv cond1, cond2, zeroreg, resultopt1;
+TCGv cond1, cond2, zeroreg, resultopt1, t2;
 
 cond1 = tcg_temp_new();
 cond2 = tcg_temp_new();
 zeroreg = tcg_constant_tl(0);
 resultopt1 = tcg_temp_new();
+t2 = tcg_temp_new();
 
 tcg_gen_movi_tl(resultopt1, 1L);
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond2, source2, (target_ulong)-1);
@@ -335,9 +340,8 @@ static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source2, 0); /* cond2 = div 0 */
 /* if overflow or div by zero, set source2 to 1, else don't change */
 tcg_gen_or_tl(cond2, cond1, cond2);
-tcg_gen_movcond_tl(TCG_COND_EQ, source2, cond2, zeroreg, source2,
-resultopt1);
-tcg_gen_rem_tl(resultopt1, source1, source2);
+tcg_gen_movcond_tl(TCG_COND_EQ, t2, cond2, zeroreg, source2, resultopt1);
+tcg_gen_rem_tl(resultopt1, source1, t2);
 /* if div by zero, just return the original dividend */
 tcg_gen_movcond_tl(TCG_COND_EQ, ret, cond1, zeroreg, resultopt1,
 source1);
@@ -345,26 +349,28 @@ static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 tcg_temp_free(cond1);
 tcg_temp_free(cond2);
 tcg_temp_free(resultopt1);
+tcg_temp_free(t2);
 }
 
 static void gen_remu(TCGv ret, TCGv source1, TCGv source2)
 {
-TCGv cond1, zeroreg, resultopt1;
+TCGv cond1, zeroreg, resultopt1, t2;
 cond1 = tcg_temp_new();
 zeroreg = tcg_constant_tl(0);
 resultopt1 = tcg_temp_new();
+t2 = tcg_temp_new();
 
 tcg_gen_movi_tl(resultopt1, (target_ulong)1);
 tcg_gen_setcondi_tl(TCG_COND_EQ, cond1, source2, 0);
-tcg_gen_movcond_tl(TCG_COND_EQ, source2, cond1, zeroreg, source2,
-resultopt

[RFC PATCH 03/13] target/riscv: Support UXL32 on 64-bit cpu for load/store

2021-08-04 Thread LIU Zhiwei

Get the LSB 32 bits and zero-extend as the base address.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvi.c.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc 
b/target/riscv/insn_trans/trans_rvi.c.inc
index ea41d1de2d..6823a6b3e0 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -163,7 +163,7 @@ static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)
 static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
 {
 TCGv dest = gpr_dst(ctx, a->rd);
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv temp = NULL;
 
 if (a->imm) {
@@ -207,7 +207,7 @@ static bool trans_lhu(DisasContext *ctx, arg_lhu *a)
 
 static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
 {
-TCGv addr = gpr_src(ctx, a->rs1);
+TCGv addr = gpr_src_u(ctx, a->rs1);
 TCGv data = gpr_src(ctx, a->rs2);
 TCGv temp = NULL;
 
-- 
2.17.1

[RFC PATCH 04/13] target/riscv: Support UXL32 for slit/sltiu

2021-08-04 Thread LIU Zhiwei

For slitu, the imm is sign-extend before unsigned compare. Thus we
should only use the LSB 32 bits of the imm for UXL32.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvi.c.inc |  8 ++---
 target/riscv/translate.c| 44 +
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc 
b/target/riscv/insn_trans/trans_rvi.c.inc
index 6823a6b3e0..6201c07795 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -276,12 +276,12 @@ static void gen_sltu(TCGv ret, TCGv s1, TCGv s2)
 
 static bool trans_slti(DisasContext *ctx, arg_slti *a)
 {
-return gen_arith_imm_tl(ctx, a, &gen_slt);
+return gen_arith_simm_tl(ctx, a, &gen_slt);
 }
 
 static bool trans_sltiu(DisasContext *ctx, arg_sltiu *a)
 {
-return gen_arith_imm_tl(ctx, a, &gen_sltu);
+return gen_arith_uimm_tl(ctx, a, &gen_sltu);
 }
 
 static bool trans_xori(DisasContext *ctx, arg_xori *a)
@@ -328,12 +328,12 @@ static bool trans_sll(DisasContext *ctx, arg_sll *a)
 
 static bool trans_slt(DisasContext *ctx, arg_slt *a)
 {
-return gen_arith(ctx, a, &gen_slt);
+return gen_arith_s(ctx, a, &gen_slt);
 }
 
 static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
 {
-return gen_arith(ctx, a, &gen_sltu);
+return gen_arith_u(ctx, a, &gen_sltu);
 }
 
 static bool trans_xor(DisasContext *ctx, arg_xor *a)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index d334a9db86..912e5f1061 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -494,6 +494,28 @@ static bool gen_arith_imm_tl(DisasContext *ctx, arg_i *a,
 return true;
 }
 
+static bool gen_arith_simm_tl(DisasContext *ctx, arg_i *a,
+  void (*func)(TCGv, TCGv, TCGv))
+{
+TCGv dest = gpr_dst(ctx, a->rd);
+TCGv src1 = gpr_src_s(ctx, a->rs1);
+TCGv src2 = tcg_constant_tl(a->imm);
+
+(*func)(dest, src1, src2);
+return true;
+}
+
+static bool gen_arith_uimm_tl(DisasContext *ctx, arg_i *a,
+  void (*func)(TCGv, TCGv, TCGv))
+{
+TCGv dest = gpr_dst(ctx, a->rd);
+TCGv src1 = gpr_src_u(ctx, a->rs1);
+TCGv src2 = tcg_constant_tl(ctx->uxl32 ? a->imm & UINT32_MAX : a->imm);
+
+(*func)(dest, src1, src2);
+return true;
+}
+
 static void gen_addw(TCGv ret, TCGv arg1, TCGv arg2)
 {
 tcg_gen_add_tl(ret, arg1, arg2);
@@ -779,6 +801,28 @@ static bool gen_arith(DisasContext *ctx, arg_r *a,
 return true;
 }
 
+static bool gen_arith_u(DisasContext *ctx, arg_r *a,
+void(*func)(TCGv, TCGv, TCGv))
+{
+TCGv dest = gpr_dst(ctx, a->rd);
+TCGv src1 = gpr_src_u(ctx, a->rs1);
+TCGv src2 = gpr_src_u(ctx, a->rs2);
+
+(*func)(dest, src1, src2);
+return true;
+}
+
+static bool gen_arith_s(DisasContext *ctx, arg_r *a,
+void(*func)(TCGv, TCGv, TCGv))
+{
+TCGv dest = gpr_dst(ctx, a->rd);
+TCGv src1 = gpr_src_s(ctx, a->rs1);
+TCGv src2 = gpr_src_s(ctx, a->rs2);
+
+(*func)(dest, src1, src2);
+return true;
+}
+
 static bool gen_shift(DisasContext *ctx, arg_r *a,
 void(*func)(TCGv, TCGv, TCGv))
 {
-- 
2.17.1

[RFC PATCH 02/13] target/riscv: Support UXL32 for branch instructions

2021-08-04 Thread LIU Zhiwei

When UXLEN is 32 on 64-bit CPU, only use the LSB 32 bits of source
registers and sign-extend or zero-extend it according to different
operations.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/insn_trans/trans_rvi.c.inc | 38 -
 target/riscv/translate.c| 22 ++
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc 
b/target/riscv/insn_trans/trans_rvi.c.inc
index 3705aad380..ea41d1de2d 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -84,11 +84,11 @@ static bool trans_jalr(DisasContext *ctx, arg_jalr *a)
 return true;
 }
 
-static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
+static bool gen_branch_internal(DisasContext *ctx, arg_b *a,
+TCGCond cond,
+TCGv src1, TCGv src2)
 {
 TCGLabel *l = gen_new_label();
-TCGv src1 = gpr_src(ctx, a->rs1);
-TCGv src2 = gpr_src(ctx, a->rs2);
 
 tcg_gen_brcond_tl(cond, src1, src2, l);
 gen_goto_tb(ctx, 1, ctx->pc_succ_insn);
@@ -106,6 +106,30 @@ static bool gen_branch(DisasContext *ctx, arg_b *a, 
TCGCond cond)
 return true;
 }
 
+static bool gen_branch_s(DisasContext *ctx, arg_b *a, TCGCond cond)
+{
+TCGv src1 = gpr_src_s(ctx, a->rs1);
+TCGv src2 = gpr_src_s(ctx, a->rs2);
+
+return gen_branch_internal(ctx, a, cond, src1, src2);
+}
+
+static bool gen_branch_u(DisasContext *ctx, arg_b *a, TCGCond cond)
+{
+TCGv src1 = gpr_src_u(ctx, a->rs1);
+TCGv src2 = gpr_src_u(ctx, a->rs2);
+
+return gen_branch_internal(ctx, a, cond, src1, src2);
+}
+
+static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
+{
+TCGv src1 = gpr_src(ctx, a->rs1);
+TCGv src2 = gpr_src(ctx, a->rs2);
+
+return gen_branch_internal(ctx, a, cond, src1, src2);
+}
+
 static bool trans_beq(DisasContext *ctx, arg_beq *a)
 {
 return gen_branch(ctx, a, TCG_COND_EQ);
@@ -118,22 +142,22 @@ static bool trans_bne(DisasContext *ctx, arg_bne *a)
 
 static bool trans_blt(DisasContext *ctx, arg_blt *a)
 {
-return gen_branch(ctx, a, TCG_COND_LT);
+return gen_branch_s(ctx, a, TCG_COND_LT);
 }
 
 static bool trans_bge(DisasContext *ctx, arg_bge *a)
 {
-return gen_branch(ctx, a, TCG_COND_GE);
+return gen_branch_s(ctx, a, TCG_COND_GE);
 }
 
 static bool trans_bltu(DisasContext *ctx, arg_bltu *a)
 {
-return gen_branch(ctx, a, TCG_COND_LTU);
+return gen_branch_u(ctx, a, TCG_COND_LTU);
 }
 
 static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)
 {
-return gen_branch(ctx, a, TCG_COND_GEU);
+return gen_branch_u(ctx, a, TCG_COND_GEU);
 }
 
 static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index ac4a545da8..d334a9db86 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -176,6 +176,28 @@ static void gen_goto_tb(DisasContext *ctx, int n, 
target_ulong dest)
 }
 }
 
+static TCGv gpr_src_u(DisasContext *ctx, int reg_num)
+{
+if (reg_num == 0) {
+return ctx->zero;
+}
+if (ctx->uxl32) {
+tcg_gen_ext32u_tl(cpu_gpr[reg_num], cpu_gpr[reg_num]);
+}
+return cpu_gpr[reg_num];
+}
+
+static TCGv gpr_src_s(DisasContext *ctx, int reg_num)
+{
+if (reg_num == 0) {
+return ctx->zero;
+}
+if (ctx->uxl32) {
+tcg_gen_ext32s_tl(cpu_gpr[reg_num], cpu_gpr[reg_num]);
+}
+return cpu_gpr[reg_num];
+}
+
 /* Wrapper for getting reg values - need to check of reg is zero since
  * cpu_gpr[0] is not actually allocated
  */
-- 
2.17.1

[RFC PATCH 01/13] target/riscv: Add UXL to tb flags

2021-08-04 Thread LIU Zhiwei

For 32-bit applications run on 64-bit cpu, it may share some code
with other 64-bit applictions. Thus we should distinguish the translated
cache of the share code with a tb flag.

Signed-off-by: LIU Zhiwei 
---
 target/riscv/cpu.h   | 15 +++
 target/riscv/translate.c |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index bf1c899c00..2b3ba21a78 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -394,9 +394,20 @@ FIELD(TB_FLAGS, SEW, 5, 3)
 FIELD(TB_FLAGS, VILL, 8, 1)
 /* Is a Hypervisor instruction load/store allowed? */
 FIELD(TB_FLAGS, HLSX, 9, 1)
+FIELD(TB_FLAGS, UXL, 10, 2)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env);
 
+static inline bool riscv_cpu_is_uxl32(CPURISCVState *env)
+{
+#ifndef CONFIG_USER_ONLY
+return (get_field(env->mstatus, MSTATUS64_UXL) == 1) &&
+   !riscv_cpu_is_32bit(env) &&
+   (env->priv == PRV_U);
+#endif
+return false;
+}
+
 /*
  * A simplification for VLMAX
  * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
@@ -451,6 +462,10 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState 
*env, target_ulong *pc,
 flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
 }
 }
+if (riscv_cpu_is_uxl32(env)) {
+flags = FIELD_DP32(flags, TB_FLAGS, UXL,
+   get_field(env->mstatus, MSTATUS64_UXL));
+}
 #endif
 
 *pflags = flags;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 076f28b9c1..ac4a545da8 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -67,6 +67,8 @@ typedef struct DisasContext {
 CPUState *cs;
 TCGv zero;
 TCGv sink;
+/* UXLEN is 32 bit for 64-bit CPU */
+bool uxl32;
 } DisasContext;
 
 static inline bool has_ext(DisasContext *ctx, uint32_t ext)
@@ -912,6 +914,7 @@ static void riscv_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cs)
 ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
 ctx->cs = cs;
+ctx->uxl32 = FIELD_EX32(tb_flags, TB_FLAGS, UXL) == 1;
 }
 
 static void riscv_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
-- 
2.17.1

[RFC PATCH 00/13] Support UXL field in mstatus

2021-08-04 Thread LIU Zhiwei

This patch set implements UXL field in mstatus register. Programmer can change
UXLEN by writting to this field. So that you can run a 32 bit program
on a 64 bit CPU.

This patch set depends on one patch set by Richard Henderson
https://lists.gnu.org/archive/html/qemu-riscv/2021-07/msg00059.html.

LIU Zhiwei (13):
  target/riscv: Add UXL to tb flags
  target/riscv: Support UXL32 for branch instructions
  target/riscv: Support UXL32 on 64-bit cpu for load/store
  target/riscv: Support UXL32 for slit/sltiu
  target/riscv: Support UXL32 for shift instruction
  target/riscv: Fix div instructions
  target/riscv: Support UXL32 for RVM
  target/riscv: Support UXL32 for vector instructions
  target/riscv: Support UXL32 for atomic instructions
  target/riscv: Support UXL32 for float instructions
  target/riscv: Fix srow
  target/riscv: Support UXL32 for RVB
  target/riscv: Changing the width of U-mode CSR

 target/riscv/cpu.h  |  18 +++
 target/riscv/csr.c  |  42 +-
 target/riscv/insn_trans/trans_rva.c.inc |  36 -
 target/riscv/insn_trans/trans_rvb.c.inc |  51 +--
 target/riscv/insn_trans/trans_rvd.c.inc |   4 +-
 target/riscv/insn_trans/trans_rvf.c.inc |   4 +-
 target/riscv/insn_trans/trans_rvi.c.inc |  62 ++--
 target/riscv/insn_trans/trans_rvm.c.inc |  24 ++-
 target/riscv/insn_trans/trans_rvv.c.inc |  44 +++---
 target/riscv/translate.c| 186 
 target/riscv/vector_helper.c|  54 +--
 11 files changed, 414 insertions(+), 111 deletions(-)

-- 
2.17.1

Re: [PATCH 0/3] Disable vhost device IOTLB is IOMMU is not enabled

2021-08-04 Thread Jason Wang




在 2021/8/5 上午12:08, Peter Xu 写道:

On Wed, Aug 04, 2021 at 11:48:00AM +0800, Jason Wang wrote:

Hi:

We currently try to enable device IOTLB when iommu_platform is
set. This may lead unnecessary trasnsactions between qemu and vhost
when vIOMMU is not used (which is the typical case for the encrypted
VM).

So patch tries to use transport specific method to detect the enalbing
of vIOMMU and enable the device IOTLB only if vIOMMU is enalbed.

Just to mention that there's also the ordering requirement for e.g. vfio-pci
and the iommu device so far because vfio_realize() depends on vIOMMU being
realized too, so specifying "-device vfio-pci" before "-device intel-iommu"
will stop working, I think (see the similar pci_device_iommu_address_space()
call in vfio_realize()).



Right, I actually try to google and end up with one patch that you 
posted to make sure vtd is initialized first.





Do you think vhost can do the same to assume vIOMMU must be specified before
vhost?



See below and I think it's not user friendly. I think maybe there should 
a general way to handle the order/dependency of device initialization in 
Qemu. But until that is implemented, I tend to use the "workaround" like 
this.




Then vhost can call pci_device_iommu_address_space() freely.  It'll be
more gentle for vhost even when it's used wrong: instead of not working at all
(vfio-pci), vhost runs slower.



It's not about the slower, if pci_device_iommu_address_space() is used 
we will end up a wrong dma_as which breaks virtio DMA.





Currently libvirt should be able to guarantee that ordering so libvirt users
shouldn't need to bother.  I think libvirt should also guarantee the vIOMMU
device to be created before all the rest devices, including virtio/vhost.  But
need to check.  If that's the case libvirt will naturally work for vhost too.

For the long term we may need to think about making device creation to be not
ordered by user cmdline input but still has a priority specified in the code
itself.



I fully agree.



  Migration has something like that (MigrationPriority).  I think we
just need something similar for general device realizations.  Since vhost
raised the same need, I think that priority should bump up too.



Yes.



The other concern is right now vhost has vhost_dev.dma_as but now we're not
using it for vhost_dev_has_iommu().



If I understand correctly, both of us means using 
pci_device_iommu_address_space() in get_dma_as. If this is true, it's 
not he vhost_dev.dma_as but virtio_dev.dma_as.


So it breaks virtio actually (see above).




   It's just a bit confusing as when to use
which.



Yes, but I don't think a better approach.

Thanks




What do you think?

Thanks,

Re: [PATCH v2 0/2] s390x: improve subchannel error handling (vfio)

2021-08-04 Thread Matthew Rosato


On 8/4/21 8:30 PM, Jared Rossi wrote:

I've exercised the error paths and it appears to all work correctly.

On 7/19/21 11:09 AM, Jared Rossi wrote:

I will take a look and see if I can exercise the error paths.

Regards,

Jared Rossi


Thanks Jared!  So, with that I'd suggest a

Tested-by: Jared Rossi 

and as I said earlier the code LGTM -- so for the series:

Reviewed-by: Matthew Rosato 



On 7/19/21 10:16 AM, Matthew Rosato wrote:

On 7/5/21 12:39 PM, Cornelia Huck wrote:

This is a followup on the first version (which I had sent out in May,
and which kind of fell through the cracks.) While the first patch
is mostly unchanged, I added a second patch to address some possible
problems with the generated unit exceptions; non-vfio subchannels
are not affected by this.

As before, this works on the good path, and I have not managed to
actually get my system to exercise the error path :(


Sorry for the silence, was out of office for a bit and Eric is 
unavailable -- Anyway the code LGTM and matches what I see in the 
POPs, I'd be willing to ACK but I'd feel better if we could exercise 
the error paths before merging.


@Jared/@Mike, you've both had eyes on this area of code recently, 
would one of you be willing to take a crack at a tested-by (non-zero 
CCs on HSCH/CSCH + also drive the sch_gen_unit_exception path)?




v1->v2:
- add comments regarding -ENODEV/-EACCES handling
- add second patch

Cornelia Huck (2):
   vfio-ccw: forward halt/clear errors
   css: fix actl handling for unit exceptions

  hw/s390x/css.c | 38 ++
  hw/vfio/ccw.c  |  4 ++--
  include/hw/s390x/css.h |  3 ++-
  3 files changed, 38 insertions(+), 7 deletions(-)

[PATCH 2/2] docs/sphinx: change default role to "any"

2021-08-04 Thread John Snow

This interprets single-backtick syntax in all of our Sphinx docs as a
cross-reference to *something*, including Python symbols. If it doesn't
resolve, or resolves to too more than one thing, Sphinx will emit a
warning and the build will fail.

Signed-off-by: John Snow 
---
 docs/conf.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/conf.py b/docs/conf.py
index ff6e92c6e2e..acaf883704a 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -85,6 +85,9 @@
 # The master toctree document.
 master_doc = 'index'
 
+# Interpret `this` to be a cross-reference to "anything".
+default_role = 'any'
+
 # General information about the project.
 project = u'QEMU'
 copyright = u'2021, The QEMU Project Developers'
-- 
2.31.1

[PATCH 1/2] docs: remove non-reference uses of single backticks

2021-08-04 Thread John Snow

The single backtick markup in ReST is the "default role". Currently,
Sphinx's default role is called "content". Sphinx suggests you can use
the "Any" role instead to turn any single-backtick enclosed item into a
cross-reference.

This is useful for things like autodoc for Python docstrings, where it's
often nicer to reference other types with `foo` instead of the more
laborious :py:meth:`foo`.

Before we do that, though, we'll need to turn all existing usages of the
"content" role to inline verbatim markup wherever it does not correctly
resolve into a cross-refernece by using double backticks instead.

Signed-off-by: John Snow 
---
 docs/devel/fuzzing.rst | 10 ++
 docs/interop/live-block-operations.rst |  2 +-
 docs/system/guest-loader.rst   |  2 +-
 qapi/block-core.json   |  4 ++--
 include/qemu/module.h  |  6 +++---
 5 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/docs/devel/fuzzing.rst b/docs/devel/fuzzing.rst
index 2749bb9bed3..5f735bb1e91 100644
--- a/docs/devel/fuzzing.rst
+++ b/docs/devel/fuzzing.rst
@@ -182,10 +182,12 @@ The output should contain a complete list of matched 
MemoryRegions.
 
 OSS-Fuzz
 
-QEMU is continuously fuzzed on `OSS-Fuzz` 
__(https://github.com/google/oss-fuzz).
-By default, the OSS-Fuzz build will try to fuzz every fuzz-target. Since the
-generic-fuzz target requires additional information provided in environment
-variables, we pre-define some generic-fuzz configs in
+
+QEMU is continuously fuzzed on `OSS-Fuzz
+`_. By default, the OSS-Fuzz build
+will try to fuzz every fuzz-target. Since the generic-fuzz target
+requires additional information provided in environment variables, we
+pre-define some generic-fuzz configs in
 ``tests/qtest/fuzz/generic_fuzz_configs.h``. Each config must specify:
 
 - ``.name``: To identify the fuzzer config
diff --git a/docs/interop/live-block-operations.rst 
b/docs/interop/live-block-operations.rst
index 9e3635b2338..814c29bbe1d 100644
--- a/docs/interop/live-block-operations.rst
+++ b/docs/interop/live-block-operations.rst
@@ -640,7 +640,7 @@ at this point:
 (QEMU) block-job-complete device=job0
 
 In either of the above cases, if you once again run the
-`query-block-jobs` command, there should not be any active block
+``query-block-jobs`` command, there should not be any active block
 operation.
 
 Comparing 'commit' and 'mirror': In both then cases, the overlay images
diff --git a/docs/system/guest-loader.rst b/docs/system/guest-loader.rst
index 4320d1183f7..9ef9776bf07 100644
--- a/docs/system/guest-loader.rst
+++ b/docs/system/guest-loader.rst
@@ -51,4 +51,4 @@ The full syntax of the guest-loader is::
 
 ``bootargs=``
   This is an optional field for kernel blobs which will pass command
-  like via the `/chosen/module@/bootargs` node.
+  like via the ``/chosen/module@/bootargs`` node.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 675d8265ebf..4246a44da71 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -446,11 +446,11 @@
 # @granularity: granularity of the dirty bitmap in bytes (since 1.4)
 #
 # @recording: true if the bitmap is recording new writes from the guest.
-# Replaces `active` and `disabled` statuses. (since 4.0)
+# Replaces ``active`` and ``disabled`` statuses. (since 4.0)
 #
 # @busy: true if the bitmap is in-use by some operation (NBD or jobs)
 #and cannot be modified via QMP or used by another operation.
-#Replaces `locked` and `frozen` statuses. (since 4.0)
+#Replaces ``locked`` and ``frozen`` statuses. (since 4.0)
 #
 # @persistent: true if the bitmap was stored on disk, is scheduled to be stored
 #  on disk, or both. (since 4.0)
diff --git a/include/qemu/module.h b/include/qemu/module.h
index 3deac0078b9..5fcc323b2a7 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -77,14 +77,14 @@ void module_allow_arch(const char *arch);
 /**
  * DOC: module info annotation macros
  *
- * `scripts/modinfo-collect.py` will collect module info,
+ * ``scripts/modinfo-collect.py`` will collect module info,
  * using the preprocessor and -DQEMU_MODINFO.
  *
- * `scripts/modinfo-generate.py` will create a module meta-data database
+ * ``scripts/modinfo-generate.py`` will create a module meta-data database
  * from the collected information so qemu knows about module
  * dependencies and QOM objects implemented by modules.
  *
- * See `*.modinfo` and `modinfo.c` in the build directory to check the
+ * See ``*.modinfo`` and ``modinfo.c`` in the build directory to check the
  * script results.
  */
 #ifdef QEMU_MODINFO
-- 
2.31.1

[PATCH 0/2] docs/sphinx: change default `role` to "any"

2021-08-04 Thread John Snow

The first patch that I've been carrying for quite a while got real small
recently ... Apologies to Peter Maydell who re-did all that work.

John Snow (2):
  docs: remove non-reference uses of single backticks
  docs/sphinx: change default role to "any"

 docs/conf.py   |  3 +++
 docs/devel/fuzzing.rst | 10 ++
 docs/interop/live-block-operations.rst |  2 +-
 docs/system/guest-loader.rst   |  2 +-
 qapi/block-core.json   |  4 ++--
 include/qemu/module.h  |  6 +++---
 6 files changed, 16 insertions(+), 11 deletions(-)

-- 
2.31.1

Re: [PATCH RFC 1/1] accel/tcg: Clear PAGE_WRITE before translation

2021-08-04 Thread Richard Henderson


On 8/4/21 12:46 PM, Ilya Leoshkevich wrote:

translate_insn() implementations fetch instruction bytes piecemeal,
which can cause qemu-user to generate inconsistent translations if
another thread modifies them concurrently [1].

Fix by marking translation block pages non-writable earlier.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00644.html

Signed-off-by: Ilya Leoshkevich 
---
  accel/tcg/translate-all.c| 59 +---
  accel/tcg/translator.c   | 26 ++--
  include/exec/translate-all.h |  1 +
  3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index bbfcfb698c..fb9ebfad9e 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1297,31 +1297,8 @@ static inline void tb_page_add(PageDesc *p, 
TranslationBlock *tb,
  invalidate_page_bitmap(p);
  
  #if defined(CONFIG_USER_ONLY)

-if (p->flags & PAGE_WRITE) {
-target_ulong addr;
-PageDesc *p2;
-int prot;
-
-/* force the host page as non writable (writes will have a
-   page fault + mprotect overhead) */
-page_addr &= qemu_host_page_mask;
-prot = 0;
-for (addr = page_addr; addr < page_addr + qemu_host_page_size;
-addr += TARGET_PAGE_SIZE) {
-
-p2 = page_find(addr >> TARGET_PAGE_BITS);
-if (!p2) {
-continue;
-}
-prot |= p2->flags;
-p2->flags &= ~PAGE_WRITE;
-  }
-mprotect(g2h_untagged(page_addr), qemu_host_page_size,
- (prot & PAGE_BITS) & ~PAGE_WRITE);
-if (DEBUG_TB_INVALIDATE_GATE) {
-printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
-}
-}
+/* translator_loop() must have made all TB pages non-writable */
+assert(!(p->flags & PAGE_WRITE));
  #else
  /* if some code is already present, then the pages are already
 protected. So we handle the case where only the first TB is
@@ -2394,6 +2371,38 @@ int page_check_range(target_ulong start, target_ulong 
len, int flags)
  return 0;
  }
  
+void page_protect(tb_page_addr_t page_addr)

+{
+target_ulong addr;
+PageDesc *p;
+int prot;
+
+p = page_find(page_addr >> TARGET_PAGE_BITS);
+if (p && (p->flags & PAGE_WRITE)) {
+/*
+ * Force the host page as non writable (writes will have a page fault +
+ * mprotect overhead).
+ */
+page_addr &= qemu_host_page_mask;
+prot = 0;
+for (addr = page_addr; addr < page_addr + qemu_host_page_size;
+ addr += TARGET_PAGE_SIZE) {
+
+p = page_find(addr >> TARGET_PAGE_BITS);
+if (!p) {
+continue;
+}
+prot |= p->flags;
+p->flags &= ~PAGE_WRITE;
+}
+mprotect(g2h_untagged(page_addr), qemu_host_page_size,
+ (prot & PAGE_BITS) & ~PAGE_WRITE);
+if (DEBUG_TB_INVALIDATE_GATE) {
+printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
+}
+}
+}
+
  /* called from signal handler: invalidate the code and unprotect the
   * page. Return 0 if the fault was not handled, 1 if it was handled,
   * and 2 if it was handled but the caller must cause the TB to be
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index c53a7f8e44..bfbe7d7ccf 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -14,6 +14,7 @@
  #include "exec/exec-all.h"
  #include "exec/gen-icount.h"
  #include "exec/log.h"
+#include "exec/translate-all.h"
  #include "exec/translator.h"
  #include "exec/plugin-gen.h"
  #include "sysemu/replay.h"
@@ -47,6 +48,10 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
  {
  uint32_t cflags = tb_cflags(tb);
  bool plugin_enabled;
+bool stop = false;
+#ifdef CONFIG_USER_ONLY
+target_ulong page_addr = -1;
+#endif
  
  /* Initialize DisasContext */

  db->tb = tb;
@@ -71,6 +76,21 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
  plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
  
  while (true) {

+#ifdef CONFIG_USER_ONLY
+/*
+ * Make the page containing the next instruction non-writable in order
+ * to get a consistent translation if another thread is modifying the
+ * code while translate_insn() fetches the instruction bytes piecemeal.
+ * Writer threads will wait for mmap_lock() in page_unprotect().
+ */
+if ((db->pc_next & TARGET_PAGE_MASK) != page_addr) {
+page_addr = db->pc_next & TARGET_PAGE_MASK;
+page_protect(page_addr);
+}
+#endif
+if (stop) {
+break;
+}


So... I think this isn't quite right.

(1) If we stop exactly at the page break, this protects the *next* page 
unnecessarily.

(2) This only protects the

Re: [PATCH v2 0/2] s390x: improve subchannel error handling (vfio)

2021-08-04 Thread Jared Rossi


I've exercised the error paths and it appears to all work correctly.

On 7/19/21 11:09 AM, Jared Rossi wrote:

I will take a look and see if I can exercise the error paths.

Regards,

Jared Rossi

On 7/19/21 10:16 AM, Matthew Rosato wrote:

On 7/5/21 12:39 PM, Cornelia Huck wrote:

This is a followup on the first version (which I had sent out in May,
and which kind of fell through the cracks.) While the first patch
is mostly unchanged, I added a second patch to address some possible
problems with the generated unit exceptions; non-vfio subchannels
are not affected by this.

As before, this works on the good path, and I have not managed to
actually get my system to exercise the error path :(


Sorry for the silence, was out of office for a bit and Eric is 
unavailable -- Anyway the code LGTM and matches what I see in the 
POPs, I'd be willing to ACK but I'd feel better if we could exercise 
the error paths before merging.


@Jared/@Mike, you've both had eyes on this area of code recently, 
would one of you be willing to take a crack at a tested-by (non-zero 
CCs on HSCH/CSCH + also drive the sch_gen_unit_exception path)?




v1->v2:
- add comments regarding -ENODEV/-EACCES handling
- add second patch

Cornelia Huck (2):
   vfio-ccw: forward halt/clear errors
   css: fix actl handling for unit exceptions

  hw/s390x/css.c | 38 ++
  hw/vfio/ccw.c  |  4 ++--
  include/hw/s390x/css.h |  3 ++-
  3 files changed, 38 insertions(+), 7 deletions(-)

Re: [PATCH for-6.2 01/10] docs: qom: Replace old GTK-Doc #symbol syntax with `symbol`

2021-08-04 Thread John Snow

On Wed, Aug 4, 2021 at 5:00 PM Eduardo Habkost  wrote:

> On Wed, Aug 04, 2021 at 09:42:24PM +0100, Peter Maydell wrote:
> > On Wed, 4 Aug 2021 at 21:31, Eduardo Habkost 
> wrote:
> > >
> > > On Mon, Aug 02, 2021 at 01:14:57PM +0100, Peter Maydell wrote:
> > > > On Thu, 29 Jul 2021 at 19:00, Eduardo Habkost 
> wrote:
> > > > > diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
> > > > > index e5fe3597cd8..9c1be5d7fc2 100644
> > > > > --- a/docs/devel/qom.rst
> > > > > +++ b/docs/devel/qom.rst
> > > > > @@ -3,6 +3,7 @@ The QEMU Object Model (QOM)
> > > > >  ===
> > > > >
> > > > >  .. highlight:: c
> > > > > +.. default-role:: any
> > > > >
> > > > >  The QEMU Object Model provides a framework for registering user
> creatable
> > > > >  types and instantiating objects from those types.  QOM provides
> the following
> > > > > @@ -42,8 +43,8 @@ features:
> > > > >
> > > > > type_init(my_device_register_types)
> > > > >
> > > > > -In the above example, we create a simple type that is described
> by #TypeInfo.
> > > > > -#TypeInfo describes information about the type including what it
> inherits
> > > > > +In the above example, we create a simple type that is described
> by `TypeInfo`.
> > > > > +`TypeInfo` describes information about the type including what it
> inherits
> > > >
> > > > I've just gone through all of docs/ finding the places where we had
> `foo` and
> > > > probably meant ``foo``, so please don't add any new ones. I would
> suggest
> > > > that you either use the ``double-backtick`` syntax to render as
> fixed-width
> > > > font, or use an explicit role tag so readers of the rST source can
> tell that
> > > > that's what you meant to use, ie avoid "default-role".
> > >
> > > I don't understand why that would be a reason to not use
> > > default-role.  With default-role, we get an error when misusing
> > > `foo`.  Without default-role, misuse won't be detected at all
> > > (except by manual inspection).
> >
> > Ah, I didn't realize that we got an error if we set the default-role
> > to 'any'. That certainly makes it nicer than the default default
> > of 'cite'.
> >
> > Is there a sensible default-role we can use as the default for the whole
> manual,
> > rather than only declaring it in individual .rst files ?  One of the
> > things I don't
> > like about the change here is that it means that `thing` in this
> individual .rst
> > file is different from `thing` in every other .rst file in our docs.
>
> I believe "any" would be a very sensible default role for all
> documents, but I don't know how to set default-role globally.
> I'll try to find out.
>
> --
> Eduardo
>

Oh, I actually fixed that issue I referenced there back in May -- I keep a
patchset up to date whenever I work on modernizing the QAPI python code
that actually DOES switch our default role to "any" ... I updated it just
today, in fact. I will send it to the list if there's an appetite for it
now.

--js

Re: [PATCH v2 0/6] qapi: static typing conversion, pt5b

2021-08-04 Thread John Snow


On 5/20/21 6:57 PM, John Snow wrote:

This is part five (b), and focuses on QAPIDoc in parser.py.

gitlab: https://gitlab.com/jsnow/qemu/-/commits/python-qapi-cleanup-pt5b

Requirements:
- Python 3.6+
- mypy >= 0.770
- pylint >= 2.6.0 (2.7.0+ when using Python 3.9+)

Every commit should pass with:
  - `isort -c qapi/`
  - `flake8 qapi/`
  - `pylint --rcfile=qapi/pylintrc qapi/`
  - `mypy --config-file=qapi/mypy.ini qapi/`

V2:
  - Changed patch 01 to fix error message.
  - Add a TODO for fixing the cycle in patch 03.
  - Changed some commit messages, patch names

John Snow (6):
   qapi/parser: fix unused check_args_section arguments
   qapi/parser: Allow empty QAPIDoc Sections
   qapi/parser: add type hint annotations (QAPIDoc)
   qapi/parser: enable mypy checks
   qapi/parser: Silence too-few-public-methods warning
   qapi/parser: enable pylint checks

  scripts/qapi/mypy.ini |  5 --
  scripts/qapi/parser.py| 98 +--
  scripts/qapi/pylintrc |  3 +-
  tests/qapi-schema/doc-bad-feature.err |  2 +-
  4 files changed, 64 insertions(+), 44 deletions(-)



From memory, where we left off was:

- What is our plan for eliminating the cycle in QAPIDoc?
- What's the actual situation with these "empty" sections?

And my answer to both problems as of today is:

... I'm not really sure yet, but I have a lot of preliminary work 
building up on implementing a cross-referenceable QAPI domain for 
sphinx, which might necessitate some heavier changes to how QAPIDoc 
information is parsed and stored.


I'd like to cover fixing both design problems of QAPIDoc at that time if 
you'll let me sweep the dirt under the mat until then. I can add FIXME 
comments to the code -- at the moment, the configuration of ./python/ 
does not tolerate TODO nor FIXME comments, and I do intend to move 
./scripts/qapi to ./python/qemu/qapi once we finish typing it, so you 
can be assured that I'll have to address those to do the thing I want.


In the meantime it means a blemish in the form of TYPE_CHECKING, but it 
lets us get on with everything else in the meantime.


Worst case scenario: A meteorite pierces my skull and the work goes 
unfinished. parser.py type checks but has some FIXME comments jangling 
around that Someone:tm: has to fix, but the Heat Death Of The Universe 
occurs first. Nobody has any energy left to be dissatisfied with the way 
things wound up.


--js

Re: [PATCH v3 4/7] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source

2021-08-04 Thread Peter Xu

On Fri, Jul 30, 2021 at 10:52:46AM +0200, David Hildenbrand wrote:
> We don't want to migrate memory that corresponds to discarded ranges as
> managed by a RamDiscardManager responsible for the mapped memory region of
> the RAMBlock. The content of these pages is essentially stale and
> without any guarantees for the VM ("logically unplugged").
> 
> Depending on the underlying memory type, even reading memory might populate
> memory on the source, resulting in an undesired memory consumption. Of
> course, on the destination, even writing a zeropage consumes memory,
> which we also want to avoid (similar to free page hinting).
> 
> Currently, virtio-mem tries achieving that goal (not migrating "unplugged"
> memory that was discarded) by going via qemu_guest_free_page_hint() - but
> it's hackish and incomplete.
> 
> For example, background snapshots still end up reading all memory, as
> they don't do bitmap syncs. Postcopy recovery code will re-add
> previously cleared bits to the dirty bitmap and migrate them.
> 
> Let's consult the RamDiscardManager after setting up our dirty bitmap
> initially and when postcopy recovery code reinitializes it: clear
> corresponding bits in the dirty bitmaps (e.g., of the RAMBlock and inside
> KVM). It's important to fixup the dirty bitmap *after* our initial bitmap
> sync, such that the corresponding dirty bits in KVM are actually cleared.
> 
> As colo is incompatible with discarding of RAM and inhibits it, we don't
> have to bother.
> 
> Note: if a misbehaving guest would use discarded ranges after migration
> started we would still migrate that memory: however, then we already
> populated that memory on the migration source.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v3 3/7] migration/ram: Don't passs RAMState to migration_clear_memory_region_dirty_bitmap_*()

2021-08-04 Thread Peter Xu

On Fri, Jul 30, 2021 at 10:52:45AM +0200, David Hildenbrand wrote:
> The parameter is unused, let's drop it.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v3 6/7] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination

2021-08-04 Thread Peter Xu

On Fri, Jul 30, 2021 at 10:52:48AM +0200, David Hildenbrand wrote:
> Currently, when someone (i.e., the VM) accesses discarded parts inside a
> RAMBlock with a RamDiscardManager managing the corresponding mapped memory
> region, postcopy will request migration of the corresponding page from the
> source. The source, however, will never answer, because it refuses to
> migrate such pages with undefined content ("logically unplugged"): the
> pages are never dirty, and get_queued_page() will consequently skip
> processing these postcopy requests.
> 
> Especially reading discarded ("logically unplugged") ranges is supposed to
> work in some setups (for example with current virtio-mem), although it
> barely ever happens: still, not placing a page would currently stall the
> VM, as it cannot make forward progress.
> 
> Let's check the state via the RamDiscardManager (the state e.g.,
> of virtio-mem is migrated during precopy) and avoid sending a request
> that will never get answered. Place a fresh zero page instead to keep
> the VM working. This is the same behavior that would happen
> automatically without userfaultfd being active, when accessing virtual
> memory regions without populated pages -- "populate on demand".
> 
> For now, there are valid cases (as documented in the virtio-mem spec) where
> a VM might read discarded memory; in the future, we will disallow that.
> Then, we might want to handle that case differently, e.g., warning the
> user that the VM seems to be mis-behaving.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  migration/postcopy-ram.c | 31 +++
>  migration/ram.c  | 21 +
>  migration/ram.h  |  1 +
>  3 files changed, 49 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 2e9697bdd2..38cdfc09c3 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -671,6 +671,29 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
>  return ret;
>  }
>  
> +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
> + ram_addr_t start, uint64_t haddr)
> +{
> +void *aligned = (void *)(uintptr_t)(haddr & -qemu_ram_pagesize(rb));
> +
> +/*
> + * Discarded pages (via RamDiscardManager) are never migrated. On 
> unlikely
> + * access, place a zeropage, which will also set the relevant bits in the
> + * recv_bitmap accordingly, so we won't try placing a zeropage twice.
> + *
> + * Checking a single bit is sufficient to handle pagesize > TPS as either
> + * all relevant bits are set or not.
> + */
> +assert(QEMU_IS_ALIGNED(start, qemu_ram_pagesize(rb)));

Is this check for ramblock_page_is_discarded()?  If so, shall we move this into
it, e.g. after memory_region_has_ram_discard_manager() returned true?

Other than that it looks good to me, thanks.

> +if (ramblock_page_is_discarded(rb, start)) {
> +bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
> +
> +return received ? 0 : postcopy_place_page_zero(mis, aligned, rb);
> +}
> +
> +return migrate_send_rp_req_pages(mis, rb, start, haddr);
> +}
> +
>  /*
>   * Callback from shared fault handlers to ask for a page,
>   * the page must be specified by a RAMBlock and an offset in that rb
> @@ -690,7 +713,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, 
> RAMBlock *rb,
>  qemu_ram_get_idstr(rb), rb_offset);
>  return postcopy_wake_shared(pcfd, client_addr, rb);
>  }
> -migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr);
> +postcopy_request_page(mis, rb, aligned_rbo, client_addr);
>  return 0;
>  }
>  
> @@ -984,8 +1007,8 @@ retry:
>   * Send the request to the source - we want to request one
>   * of our host page sizes (which is >= TPS)
>   */
> -ret = migrate_send_rp_req_pages(mis, rb, rb_offset,
> -msg.arg.pagefault.address);
> +ret = postcopy_request_page(mis, rb, rb_offset,
> +msg.arg.pagefault.address);
>  if (ret) {
>  /* May be network failure, try to wait for recovery */
>  if (ret == -EIO && postcopy_pause_fault_thread(mis)) {
> @@ -993,7 +1016,7 @@ retry:
>  goto retry;
>  } else {
>  /* This is a unavoidable fault */
> -error_report("%s: migrate_send_rp_req_pages() get %d",
> +error_report("%s: postcopy_request_page() get %d",
>   __func__, ret);
>  break;
>  }
> diff --git a/migration/ram.c b/migration/ram.c
> index 9776919faa..01cea01774 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -912,6 +912,27 @@ static uint64_t 
> ramblock_dirty

Re: [PATCH 2/2] target/hppa: Clean up DisasCond

2021-08-04 Thread Richard Henderson


On 8/4/21 11:01 AM, Philippe Mathieu-Daudé wrote:

On 7/8/21 10:58 PM, Richard Henderson wrote:

The a0_is_n flag is redundant with comparing a0 to cpu_psw_n.


Preferably split as 2 patches:
Reviewed-by: Philippe Mathieu-Daudé 


Already in mainline for 3 weeks.


r~

[PATCH v8] tests/tcg/s390x: Test SIGILL and SIGSEGV handling

2021-08-04 Thread Ilya Leoshkevich

Verify that s390x-specific uc_mcontext.psw.addr is reported correctly
and that signal handling interacts properly with debugging.

Signed-off-by: Ilya Leoshkevich 
---

v7: https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00463.html
v7 -> v8: Another rebase needed due to the conflict with Jonathan's
  50e36dd61652.

 tests/tcg/s390x/Makefile.target   |  17 +-
 tests/tcg/s390x/gdbstub/test-signals-s390x.py |  76 
 tests/tcg/s390x/signals-s390x.c   | 165 ++
 3 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/s390x/gdbstub/test-signals-s390x.py
 create mode 100644 tests/tcg/s390x/signals-s390x.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index bd084c7840..cc64dd32d2 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -1,4 +1,5 @@
-VPATH+=$(SRC_PATH)/tests/tcg/s390x
+S390X_SRC=$(SRC_PATH)/tests/tcg/s390x
+VPATH+=$(S390X_SRC)
 CFLAGS+=-march=zEC12 -m64
 TESTS+=hello-s390x
 TESTS+=csst
@@ -9,3 +10,17 @@ TESTS+=pack
 TESTS+=mvo
 TESTS+=mvc
 TESTS+=trap
+TESTS+=signals-s390x
+
+ifneq ($(HAVE_GDB_BIN),)
+GDB_SCRIPT=$(SRC_PATH)/tests/guest-debug/run-test.py
+
+run-gdbstub-signals-s390x: signals-s390x
+   $(call run-test, $@, $(GDB_SCRIPT) \
+   --gdb $(HAVE_GDB_BIN) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin $< --test $(S390X_SRC)/gdbstub/test-signals-s390x.py, \
+   "mixing signals and debugging on s390x")
+
+EXTRA_RUNS += run-gdbstub-signals-s390x
+endif
diff --git a/tests/tcg/s390x/gdbstub/test-signals-s390x.py 
b/tests/tcg/s390x/gdbstub/test-signals-s390x.py
new file mode 100644
index 00..80a284b475
--- /dev/null
+++ b/tests/tcg/s390x/gdbstub/test-signals-s390x.py
@@ -0,0 +1,76 @@
+from __future__ import print_function
+
+#
+# Test that signals and debugging mix well together on s390x.
+#
+# This is launched via tests/guest-debug/run-test.py
+#
+
+import gdb
+import sys
+
+failcount = 0
+
+
+def report(cond, msg):
+"""Report success/fail of test"""
+if cond:
+print("PASS: %s" % (msg))
+else:
+print("FAIL: %s" % (msg))
+global failcount
+failcount += 1
+
+
+def run_test():
+"""Run through the tests one by one"""
+illegal_op = gdb.Breakpoint("illegal_op")
+stg = gdb.Breakpoint("stg")
+mvc_8 = gdb.Breakpoint("mvc_8")
+
+# Expect the following events:
+# 1x illegal_op breakpoint
+# 2x stg breakpoint, segv, breakpoint
+# 2x mvc_8 breakpoint, segv, breakpoint
+for _ in range(14):
+gdb.execute("c")
+report(illegal_op.hit_count == 1, "illegal_op.hit_count == 1")
+report(stg.hit_count == 4, "stg.hit_count == 4")
+report(mvc_8.hit_count == 4, "mvc_8.hit_count == 4")
+
+# The test must succeed.
+gdb.Breakpoint("_exit")
+gdb.execute("c")
+status = int(gdb.parse_and_eval("$r2"))
+report(status == 0, "status == 0");
+
+
+#
+# This runs as the script it sourced (via -x, via run-test.py)
+#
+try:
+inferior = gdb.selected_inferior()
+arch = inferior.architecture()
+print("ATTACHED: %s" % arch.name())
+except (gdb.error, AttributeError):
+print("SKIPPING (not connected)", file=sys.stderr)
+exit(0)
+
+if gdb.parse_and_eval("$pc") == 0:
+print("SKIP: PC not set")
+exit(0)
+
+try:
+# These are not very useful in scripts
+gdb.execute("set pagination off")
+gdb.execute("set confirm off")
+
+# Run the actual tests
+run_test()
+except (gdb.error):
+print("GDB Exception: %s" % (sys.exc_info()[0]))
+failcount += 1
+pass
+
+print("All tests complete: %d failures" % failcount)
+exit(failcount)
diff --git a/tests/tcg/s390x/signals-s390x.c b/tests/tcg/s390x/signals-s390x.c
new file mode 100644
index 00..dc2f8ee59a
--- /dev/null
+++ b/tests/tcg/s390x/signals-s390x.c
@@ -0,0 +1,165 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Various instructions that generate SIGILL and SIGSEGV. They could have been
+ * defined in a separate .s file, but this would complicate the build, so the
+ * inline asm is used instead.
+ */
+
+void illegal_op(void);
+void after_illegal_op(void);
+asm(".globl\tillegal_op\n"
+"illegal_op:\t.byte\t0x00,0x00\n"
+"\t.globl\tafter_illegal_op\n"
+"after_illegal_op:\tbr\t%r14");
+
+void stg(void *dst, unsigned long src);
+asm(".globl\tstg\n"
+"stg:\tstg\t%r3,0(%r2)\n"
+"\tbr\t%r14");
+
+void mvc_8(void *dst, void *src);
+asm(".globl\tmvc_8\n"
+"mvc_8:\tmvc\t0(8,%r2),0(%r3)\n"
+"\tbr\t%r14");
+
+static void safe_puts(const char *s)
+{
+write(0, s, strlen(s));
+write(0, "\n", 1);
+}
+
+enum exception {
+exception_operation,
+exception_translation,
+exception_protection,
+};
+
+static struct {
+int sig;
+void *addr;
+unsigned long psw_addr;
+enum exception exception;
+} expected;
+
+static void handle_signal(int sig, siginf

[PATCH RFC 1/1] accel/tcg: Clear PAGE_WRITE before translation

2021-08-04 Thread Ilya Leoshkevich

translate_insn() implementations fetch instruction bytes piecemeal,
which can cause qemu-user to generate inconsistent translations if
another thread modifies them concurrently [1].

Fix by marking translation block pages non-writable earlier.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00644.html

Signed-off-by: Ilya Leoshkevich 
---
 accel/tcg/translate-all.c| 59 +---
 accel/tcg/translator.c   | 26 ++--
 include/exec/translate-all.h |  1 +
 3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index bbfcfb698c..fb9ebfad9e 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1297,31 +1297,8 @@ static inline void tb_page_add(PageDesc *p, 
TranslationBlock *tb,
 invalidate_page_bitmap(p);
 
 #if defined(CONFIG_USER_ONLY)
-if (p->flags & PAGE_WRITE) {
-target_ulong addr;
-PageDesc *p2;
-int prot;
-
-/* force the host page as non writable (writes will have a
-   page fault + mprotect overhead) */
-page_addr &= qemu_host_page_mask;
-prot = 0;
-for (addr = page_addr; addr < page_addr + qemu_host_page_size;
-addr += TARGET_PAGE_SIZE) {
-
-p2 = page_find(addr >> TARGET_PAGE_BITS);
-if (!p2) {
-continue;
-}
-prot |= p2->flags;
-p2->flags &= ~PAGE_WRITE;
-  }
-mprotect(g2h_untagged(page_addr), qemu_host_page_size,
- (prot & PAGE_BITS) & ~PAGE_WRITE);
-if (DEBUG_TB_INVALIDATE_GATE) {
-printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
-}
-}
+/* translator_loop() must have made all TB pages non-writable */
+assert(!(p->flags & PAGE_WRITE));
 #else
 /* if some code is already present, then the pages are already
protected. So we handle the case where only the first TB is
@@ -2394,6 +2371,38 @@ int page_check_range(target_ulong start, target_ulong 
len, int flags)
 return 0;
 }
 
+void page_protect(tb_page_addr_t page_addr)
+{
+target_ulong addr;
+PageDesc *p;
+int prot;
+
+p = page_find(page_addr >> TARGET_PAGE_BITS);
+if (p && (p->flags & PAGE_WRITE)) {
+/*
+ * Force the host page as non writable (writes will have a page fault +
+ * mprotect overhead).
+ */
+page_addr &= qemu_host_page_mask;
+prot = 0;
+for (addr = page_addr; addr < page_addr + qemu_host_page_size;
+ addr += TARGET_PAGE_SIZE) {
+
+p = page_find(addr >> TARGET_PAGE_BITS);
+if (!p) {
+continue;
+}
+prot |= p->flags;
+p->flags &= ~PAGE_WRITE;
+}
+mprotect(g2h_untagged(page_addr), qemu_host_page_size,
+ (prot & PAGE_BITS) & ~PAGE_WRITE);
+if (DEBUG_TB_INVALIDATE_GATE) {
+printf("protecting code page: 0x" TB_PAGE_ADDR_FMT "\n", 
page_addr);
+}
+}
+}
+
 /* called from signal handler: invalidate the code and unprotect the
  * page. Return 0 if the fault was not handled, 1 if it was handled,
  * and 2 if it was handled but the caller must cause the TB to be
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index c53a7f8e44..bfbe7d7ccf 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -14,6 +14,7 @@
 #include "exec/exec-all.h"
 #include "exec/gen-icount.h"
 #include "exec/log.h"
+#include "exec/translate-all.h"
 #include "exec/translator.h"
 #include "exec/plugin-gen.h"
 #include "sysemu/replay.h"
@@ -47,6 +48,10 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 {
 uint32_t cflags = tb_cflags(tb);
 bool plugin_enabled;
+bool stop = false;
+#ifdef CONFIG_USER_ONLY
+target_ulong page_addr = -1;
+#endif
 
 /* Initialize DisasContext */
 db->tb = tb;
@@ -71,6 +76,21 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
 
 while (true) {
+#ifdef CONFIG_USER_ONLY
+/*
+ * Make the page containing the next instruction non-writable in order
+ * to get a consistent translation if another thread is modifying the
+ * code while translate_insn() fetches the instruction bytes piecemeal.
+ * Writer threads will wait for mmap_lock() in page_unprotect().
+ */
+if ((db->pc_next & TARGET_PAGE_MASK) != page_addr) {
+page_addr = db->pc_next & TARGET_PAGE_MASK;
+page_protect(page_addr);
+}
+#endif
+if (stop) {
+break;
+}
 db->num_insns++;
 ops->insn_start(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
@@ -95,7 +115,8 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 
 /*

[PATCH RFC 0/1] accel/tcg: Clear PAGE_WRITE before translation

2021-08-04 Thread Ilya Leoshkevich

Hello,

As discussed on IRC, here is the tentative fix for concurrent code
patching. It helps with the x86_64 .NET app on s390x and survives
check-tcg.

Bug report: 
https://lists.nongnu.org/archive/html/qemu-devel/2021-08/msg00644.html

IRC log:

 iii: my initial thoughts are there is a race in tb_page_add because 
while we will have flushed all the old pages this new corrupted page gets added 
the new corrupted one gets in
 stsquad: I think you are right that it can be considered a tb_page_add 
race. Would it be reasonable to solve it by marking the page read-only before 
translation and then making sure that it doesn't get its PAGE_WRITE back until 
translation is complete?
 iii, stsquad: 
https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg07995.html
 iii: yes, making the page read-only early is the fix, i think.  i believe 
we already hold the mmap_lock around translation, so that should make a writer 
fault and then wait on the mmap_lock.
 rth: Thanks, let me give it a try. I'll post whatever I come up with as 
an RFC patch to qemu-devel.
 thanks
 rth: doesn't that serialise all translation again?
 rth: we could page lock instead?
 stsquad: i thought we were talking about user-only, where translation is 
always serial.
 stsquad: the link from january is a system-mode bug of the same kind, 
where, yes, we need to hold the page lock or something.
 rth: ahh yes because we don't have zoned translation caches...

Ilya Leoshkevich (1):
  accel/tcg: Clear PAGE_WRITE before translation

 accel/tcg/translate-all.c| 59 +---
 accel/tcg/translator.c   | 26 ++--
 include/exec/translate-all.h |  1 +
 3 files changed, 59 insertions(+), 27 deletions(-)

-- 
2.31.1

Re: [PATCH v5 05/10] ACPI ERST: support for ACPI ERST feature

2021-08-04 Thread Eric DeVolder





On 7/27/21 7:52 AM, Igor Mammedov wrote:

On Mon, 26 Jul 2021 15:01:05 -0500
Eric DeVolder  wrote:


On 7/26/21 5:42 AM, Igor Mammedov wrote:

On Wed, 21 Jul 2021 11:07:40 -0500
Eric DeVolder  wrote:
   

On 7/20/21 7:17 AM, Igor Mammedov wrote:

On Wed, 30 Jun 2021 15:07:16 -0400
Eric DeVolder  wrote:
  

[..]

+
+static uint64_t erst_rd_reg64(hwaddr addr,
+uint64_t reg, unsigned size)
+{
+uint64_t rdval;
+uint64_t mask;
+unsigned shift;
+
+if (size == sizeof(uint64_t)) {
+/* 64b access */
+mask = 0xUL;
+shift = 0;
+} else {
+/* 32b access */
+mask = 0xUL;
+shift = ((addr & 0x4) == 0x4) ? 32 : 0;
+}
+
+rdval = reg;
+rdval >>= shift;
+rdval &= mask;
+
+return rdval;
+}
+
+static uint64_t erst_wr_reg64(hwaddr addr,
+uint64_t reg, uint64_t val, unsigned size)
+{
+uint64_t wrval;
+uint64_t mask;
+unsigned shift;
+
+if (size == sizeof(uint64_t)) {
+/* 64b access */
+mask = 0xUL;
+shift = 0;
+} else {
+/* 32b access */
+mask = 0xUL;
+shift = ((addr & 0x4) == 0x4) ? 32 : 0;
+}
+
+val &= mask;
+val <<= shift;
+mask <<= shift;
+wrval = reg;
+wrval &= ~mask;
+wrval |= val;
+
+return wrval;
+}

(I see in next patch it's us defining access width in the ACPI tables)
so question is: do we have to have mixed register width access?
can't all register accesses be 64-bit?


Initially I attempted to just use 64-bit exclusively. The problem is that,
for reasons I don't understand, the OSPM on Linux, even x86_64, breaks a 64b
register access into two. Here's an example of reading the exchange buffer
address, which is coded as a 64b access:

acpi_erst_reg_write addr: 0x <== 0x000d (size: 4)
acpi_erst_reg_read  addr: 0x0008 ==> 0xc101 (size: 4)
acpi_erst_reg_read  addr: 0x000c ==> 0x (size: 4)

So I went ahead and made ACTION register accesses 32b, else there would
be two reads of 32-bts, of which the second is useless.


could you post here decompiled ERST table?

As it is long, I posted it to the end of this message.


RHEL8 or Fedora 34 says that erst is invalid table,
so I can't help tracing what's going on there.

You'll have to figure out why access is not 64 bit on your own.


Today I downloaded Fedora 34 Server and created a guest. Using my
qemu-6 branch with ERST, it appears to work just fine. I was able to
create entries into it.


[0.010215] ACPI: ERST 0x7F9F3000 000390 (v01 BOCHS  BXPC 
0001 BXPC 0001)
[0.010250] ACPI: Reserving ERST table memory at [mem 0x7f9f3000-0x7f9f338f]
[1.056244] ERST: Error Record Serialization Table (ERST) support is 
initialized.
[1.056279] pstore: Registered erst as persistent store backend

total 0
drwxr-x---.  2 root root 0 Aug  4 18:05 .
drwxr-xr-x. 10 root root 0 Aug  4 18:05 ..
-r--r--r--.  1 root root 17700 Aug  4 17:54 dmesg-erst-6992696633267847169
-r--r--r--.  1 root root 17714 Aug  4 17:54 dmesg-erst-6992696633267847170


It appears to Linux OSPM is taking the 64-bit register access and breaking it
into two 32-bit accesses. If this is the case, then the fix would be in
Linux and not this code.

Pending your response to this finding, I have v6 ready to go.
Thanks
eric




[...]


Obtained via a running guest with:
iasl -d -vt /sys/firmware/acpi/tables/ERST

/*
   * Intel ACPI Component Architecture
   * AML/ASL+ Disassembler version 20180105 (64-bit version)
   * Copyright (c) 2000 - 2018 Intel Corporation
   *
   * Disassembly of ERST.blob, Mon Jul 26 14:31:21 2021
   *
   * ACPI Data Table [ERST]
   *
   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
   */

[000h    4]Signature : "ERST"[Error Record 
Serialization Table]
[004h 0004   4] Table Length : 0390
[008h 0008   1] Revision : 01
[009h 0009   1] Checksum : C8
[00Ah 0010   6]   Oem ID : "BOCHS "
[010h 0016   8] Oem Table ID : "BXPC"
[018h 0024   4] Oem Revision : 0001
[01Ch 0028   4]  Asl Compiler ID : "BXPC"
[020h 0032   4]Asl Compiler Revision : 0001

[024h 0036   4]  Serialization Header Length : 0030
[028h 0040   4] Reserved : 
[02Ch 0044   4]  Instruction Entry Count : 001B

[030h 0048   1]   Action : 00 [Begin Write Operation]
[031h 0049   1]  Instruction : 03 [Write Register Value]
[032h 0050   1]Flags (decoded below) : 00
Preserve Register Bits : 0
[033h 0051   1] Reserved : 00

[034h 0052  12]  Register Region : [Generic Address Structure]
[034h 0052   1] Space ID : 00 [SystemMemory]
[035h 0053   1]

Re: [PATCH for 6.1] tests: filter out TLS distinguished name in certificate checks

2021-08-04 Thread Eric Blake

On Wed, Aug 04, 2021 at 07:03:30PM +0100, Daniel P. Berrangé wrote:
> The version of GNUTLS in Fedora 34 has changed the order in which encodes
> fields when generating new TLS certificates. This in turn changes the
> order seen when querying the distinguished name. This ultimately breaks
> the expected output in the NBD TLS iotests. We don't need to be
> comparing the exact distinguished name text for the purpose of the test
> though, so it is fine to filter it out.
> 
> Reported-by: Eric Blake 
> Signed-off-by: Daniel P. Berrangé 
> ---

Reviewed-by: Eric Blake 
Tested-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH-for-6.2 v4] memory: Directly dispatch alias accesses on origin memory region

2021-08-04 Thread Philippe Mathieu-Daudé

Let's be honest, this won't make 6.1, so update the subject to 6.2,
as some scan the list for "6.2" in the subject when 6.1 release
is out.

On 7/7/21 3:58 PM, Philippe Mathieu-Daudé wrote:
> Peter Xu already reviewed, but Cc'ing Peter Maydell too due to
> his last comment on v3:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg800482.html
> 
> On 4/18/21 7:57 AM, Philippe Mathieu-Daudé wrote:
>> Since commit 2cdfcf272d ("memory: assign MemoryRegionOps to all
>> regions"), all newly created regions are assigned with
>> unassigned_mem_ops (which might be then overwritten).
>>
>> When using aliased container regions, and there is no region mapped
>> at address 0 in the container, the memory_region_dispatch_read()
>> and memory_region_dispatch_write() calls incorrectly return the
>> container unassigned_mem_ops, because the alias offset is not used.
>>
>> Consider the following setup:
>>
>> ++ < - - - - - - - - - - - +
>> | Container  |  mr
>> |  (unassigned_mem)  | |
>> ||
>> || |
>> ||  alias_offset
>> ++ <- - - - - - +--+-+
>> | ++ |  ||
>> | |  MemoryRegion0 | |  ||
>> | ++ |  |   Alias|  addr1
>> | |  MemoryRegion1 | | <~ ~  ~  ~ ~ || <~~
>> | ++ |  ||
>> ||  ++
>> ||
>> ||
>> ||
>> ||
>> | ++ |
>> | |  MemoryRegionX | |
>> | ++ |
>> | |  MemoryRegionY | |
>> | ++ |
>> | |  MemoryRegionZ | |
>> | ++ |
>> ++
>>
>> The memory_region_init_alias() flow is:
>>
>>   memory_region_init_alias()
>>   -> memory_region_init()
>>  -> object_initialize(TYPE_MEMORY_REGION)
>> -> memory_region_initfn()
>>-> mr->ops = &unassigned_mem_ops;
>>
>> Later when accessing offset=addr1 via the alias, we expect to hit
>> MemoryRegion1. The memory_region_dispatch_read() flow is:
>>
>>   memory_region_dispatch_read(addr1)
>>   -> memory_region_access_valid(mr)   <- addr1 offset is ignored
>>  -> mr->ops->valid.accepts()
>> -> unassigned_mem_accepts()
>> <- false
>>  <- false
>><- MEMTX_DECODE_ERROR
>>
>> The caller gets a MEMTX_DECODE_ERROR while the access is OK.
>>
>> Fix by dispatching aliases recursively, accessing its origin region
>> after adding the alias offset.
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>> v4:
>> - added ASCII schema
>> v3:
>> - reworded, mentioning the "alias to container" case
>> - use recursive call instead of while(), because easier when debugging
>>   therefore reset Richard R-b tag.
>> v2:
>> - use while()
>> ---
>>  softmmu/memory.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>> index d4493ef9e43..b899ca6a6b7 100644
>> --- a/softmmu/memory.c
>> +++ b/softmmu/memory.c
>> @@ -1442,6 +1442,11 @@ MemTxResult memory_region_dispatch_read(MemoryRegion 
>> *mr,
>>  unsigned size = memop_size(op);
>>  MemTxResult r;
>>  
>> +if (mr->alias) {
>> +return memory_region_dispatch_read(mr->alias,
>> +   mr->alias_offset + addr,
>> +   pval, op, attrs);
>> +}
>>  if (!memory_region_access_valid(mr, addr, size, false, attrs)) {
>>  *pval = unassigned_mem_read(mr, addr, size);
>>  return MEMTX_DECODE_ERROR;
>> @@ -1486,6 +1491,11 @@ MemTxResult memory_region_dispatch_write(MemoryRegion 
>> *mr,
>>  {
>>  unsigned size = memop_size(op);
>>  
>> +if (mr->alias) {
>> +return memory_region_dispatch_write(mr->alias,
>> +mr->alias_offset + addr,
>> +data, op, attrs);
>> +}
>>  if (!memory_region_access_valid(mr, addr, size, true, attrs)) {
>>  unassigned_mem_write(mr, addr, data, size);
>>  return MEMTX_DECODE_ERROR;
>>
>

Re: [PATCH] include/qemu: Use builtins for bswap

2021-08-04 Thread Philippe Mathieu-Daudé

On 7/8/21 8:17 PM, Richard Henderson wrote:
> All supported compilers have builtins for this.
> Drop all of the complicated system detection stuff.
> 
> Signed-off-by: Richard Henderson 
> ---
>  meson.build  |  6 -
>  include/qemu/bswap.h | 53 +++-
>  2 files changed, 3 insertions(+), 56 deletions(-)
> 
> diff --git a/meson.build b/meson.build
> index 7e12de01be..6024f2d6fb 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1290,8 +1290,6 @@ config_host_data.set('HAVE_STRCHRNUL', 
> cc.has_function('strchrnul'))
>  config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', 
> prefix: '#include '))
>  
>  # has_header_symbol
> -config_host_data.set('CONFIG_BYTESWAP_H',
> - cc.has_header_symbol('byteswap.h', 'bswap_32'))
>  config_host_data.set('CONFIG_EPOLL_CREATE1',
>   cc.has_header_symbol('sys/epoll.h', 'epoll_create1'))
>  config_host_data.set('CONFIG_HAS_ENVIRON',
> @@ -1311,10 +1309,6 @@ config_host_data.set('CONFIG_INOTIFY',
>   cc.has_header_symbol('sys/inotify.h', 'inotify_init'))
>  config_host_data.set('CONFIG_INOTIFY1',
>   cc.has_header_symbol('sys/inotify.h', 'inotify_init1'))
> -config_host_data.set('CONFIG_MACHINE_BSWAP_H',
> - cc.has_header_symbol('machine/bswap.h', 'bswap32',
> -  prefix: '''#include 
> - #include 
> '''))
>  config_host_data.set('CONFIG_PRCTL_PR_SET_TIMERSLACK',
>   cc.has_header_symbol('sys/prctl.h', 
> 'PR_SET_TIMERSLACK'))
>  config_host_data.set('CONFIG_RTNETLINK',
> diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
> index 2d3bb8bbed..9e12bd8073 100644
> --- a/include/qemu/bswap.h
> +++ b/include/qemu/bswap.h
> @@ -1,73 +1,26 @@
>  #ifndef BSWAP_H
>  #define BSWAP_H
>  
> -#ifdef CONFIG_MACHINE_BSWAP_H
> -# include 
> -# include 
> -#elif defined(__FreeBSD__)
> -# include 
> -#elif defined(__HAIKU__)
> -# include 
> -#elif defined(CONFIG_BYTESWAP_H)
> -# include 
> -#define BSWAP_FROM_BYTESWAP
> -# else
> -#define BSWAP_FROM_FALLBACKS
> -#endif /* ! CONFIG_MACHINE_BSWAP_H */
> -
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
>  
>  #include "fpu/softfloat-types.h"
>  
> -#ifdef BSWAP_FROM_BYTESWAP
>  static inline uint16_t bswap16(uint16_t x)
>  {
> -return bswap_16(x);
> +return __builtin_bswap16(x);
>  }
>  
>  static inline uint32_t bswap32(uint32_t x)
>  {
> -return bswap_32(x);
> +return __builtin_bswap32(x);
>  }
>  
>  static inline uint64_t bswap64(uint64_t x)
>  {
> -return bswap_64(x);
> +return __builtin_bswap64(x);
>  }
> -#endif
> -
> -#ifdef BSWAP_FROM_FALLBACKS
> -static inline uint16_t bswap16(uint16_t x)
> -{
> -return (((x & 0x00ff) << 8) |
> -((x & 0xff00) >> 8));
> -}
> -
> -static inline uint32_t bswap32(uint32_t x)
> -{
> -return (((x & 0x00ffU) << 24) |
> -((x & 0xff00U) <<  8) |
> -((x & 0x00ffU) >>  8) |
> -((x & 0xff00U) >> 24));
> -}
> -
> -static inline uint64_t bswap64(uint64_t x)
> -{
> -return (((x & 0x00ffULL) << 56) |
> -((x & 0xff00ULL) << 40) |
> -((x & 0x00ffULL) << 24) |
> -((x & 0xff00ULL) <<  8) |
> -((x & 0x00ffULL) >>  8) |
> -((x & 0xff00ULL) >> 24) |
> -((x & 0x00ffULL) >> 40) |
> -((x & 0xff00ULL) >> 56));
> -}
> -#endif
> -
> -#undef BSWAP_FROM_BYTESWAP
> -#undef BSWAP_FROM_FALLBACKS
>  
>  static inline void bswap16s(uint16_t *s)
>  {
> 

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 2/4] target/alpha: Use dest_sink for HW_RET temporary

2021-08-04 Thread Philippe Mathieu-Daudé

On 7/8/21 8:25 PM, Richard Henderson wrote:
> This temp is automatically freed, just like ctx->lit.
> But we're about to remove ctx->lit, so use sink instead.
> 
> Signed-off-by: Richard Henderson 
> ---
>  target/alpha/translate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 2/2] target/hppa: Clean up DisasCond

2021-08-04 Thread Philippe Mathieu-Daudé

On 7/8/21 10:58 PM, Richard Henderson wrote:
> The a0_is_n flag is redundant with comparing a0 to cpu_psw_n.

Preferably split as 2 patches:
Reviewed-by: Philippe Mathieu-Daudé 

> The a1_is_0 flag can be removed by initializing a1 to $0,
> which also means that cond_prep can be removed entirely.
> 
> Signed-off-by: Richard Henderson 
> ---
>  target/hppa/translate.c | 43 +
>  1 file changed, 9 insertions(+), 34 deletions(-)

Re: [PATCH for-6.2 01/10] docs: qom: Replace old GTK-Doc #symbol syntax with `symbol`

2021-08-04 Thread Eduardo Habkost

On Wed, Aug 04, 2021 at 09:42:24PM +0100, Peter Maydell wrote:
> On Wed, 4 Aug 2021 at 21:31, Eduardo Habkost  wrote:
> >
> > On Mon, Aug 02, 2021 at 01:14:57PM +0100, Peter Maydell wrote:
> > > On Thu, 29 Jul 2021 at 19:00, Eduardo Habkost  wrote:
> > > > diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
> > > > index e5fe3597cd8..9c1be5d7fc2 100644
> > > > --- a/docs/devel/qom.rst
> > > > +++ b/docs/devel/qom.rst
> > > > @@ -3,6 +3,7 @@ The QEMU Object Model (QOM)
> > > >  ===
> > > >
> > > >  .. highlight:: c
> > > > +.. default-role:: any
> > > >
> > > >  The QEMU Object Model provides a framework for registering user 
> > > > creatable
> > > >  types and instantiating objects from those types.  QOM provides the 
> > > > following
> > > > @@ -42,8 +43,8 @@ features:
> > > >
> > > > type_init(my_device_register_types)
> > > >
> > > > -In the above example, we create a simple type that is described by 
> > > > #TypeInfo.
> > > > -#TypeInfo describes information about the type including what it 
> > > > inherits
> > > > +In the above example, we create a simple type that is described by 
> > > > `TypeInfo`.
> > > > +`TypeInfo` describes information about the type including what it 
> > > > inherits
> > >
> > > I've just gone through all of docs/ finding the places where we had `foo` 
> > > and
> > > probably meant ``foo``, so please don't add any new ones. I would suggest
> > > that you either use the ``double-backtick`` syntax to render as 
> > > fixed-width
> > > font, or use an explicit role tag so readers of the rST source can tell 
> > > that
> > > that's what you meant to use, ie avoid "default-role".
> >
> > I don't understand why that would be a reason to not use
> > default-role.  With default-role, we get an error when misusing
> > `foo`.  Without default-role, misuse won't be detected at all
> > (except by manual inspection).
> 
> Ah, I didn't realize that we got an error if we set the default-role
> to 'any'. That certainly makes it nicer than the default default
> of 'cite'.
> 
> Is there a sensible default-role we can use as the default for the whole 
> manual,
> rather than only declaring it in individual .rst files ?  One of the
> things I don't
> like about the change here is that it means that `thing` in this individual 
> .rst
> file is different from `thing` in every other .rst file in our docs.

I believe "any" would be a very sensible default role for all
documents, but I don't know how to set default-role globally.
I'll try to find out.

-- 
Eduardo

Re: [PATCH 5/9] tests/acceptance: Use image_expand() in test_arm_orangepi_uboot_netbsd9

2021-08-04 Thread Philippe Mathieu-Daudé

Hi Niek,

On 6/23/21 8:00 PM, Philippe Mathieu-Daudé wrote:
> The NetBSD OrangePi image must be at least 2GiB, not less.
> Expand the SD card image to this size before using it.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/acceptance/boot_linux_console.py | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/acceptance/boot_linux_console.py 
> b/tests/acceptance/boot_linux_console.py
> index 61069f0064f..b10f7257503 100644
> --- a/tests/acceptance/boot_linux_console.py
> +++ b/tests/acceptance/boot_linux_console.py
> @@ -868,7 +868,12 @@ def test_arm_orangepi_uboot_netbsd9(self):
>  :avocado: tags=device:sd
>  :avocado: tags=os:netbsd
>  """
> -# This test download a 304MB compressed image and expand it to 2GB
> +# This test download a 304MB compressed image and expand it to 2GB,
> +# which is the minimum card size required by the NetBSD installer:
> +# https://wiki.netbsd.org/ports/evbarm/raspberry_pi/#index7h2
> +# "A 2 GB card is the smallest workable size that the installation
> +# image will fit on."

Do you agree with this comment and the one in the next patch?

> +NETBSD_SDCARD_MINSIZE = 2 * 1024 * 1024 * 1024
>  deb_url = ('http://snapshot.debian.org/archive/debian/'
> '20200108T145233Z/pool/main/u/u-boot/'
> 'u-boot-sunxi_2020.01%2Bdfsg-1_armhf.deb')
> @@ -886,7 +891,7 @@ def test_arm_orangepi_uboot_netbsd9(self):
>  image_path_gz = self.fetch_asset(image_url, asset_hash=image_hash)
>  image_path = os.path.join(self.workdir, 'armv7.img')
>  archive.gzip_uncompress(image_path_gz, image_path)
> -image_pow2ceil_expand(image_path)
> +image_expand(image_path, NETBSD_SDCARD_MINSIZE)
>  image_drive_args = 'if=sd,format=raw,snapshot=on,file=' + image_path
>  
>  # dd if=u-boot-sunxi-with-spl.bin of=armv7.img bs=1K seek=8 
> conv=notrunc
>

Re: [PATCH V5 00/25] Live Update [restart] : fork mode?

2021-08-04 Thread Steven Sistare

On 7/30/2021 9:10 AM, Zheng Chuan wrote:
> Hi, Steve
> I have saw the discussion about the fork+exec mode below:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg815956.html
> 
> And I am still very curious and I want to discuss about the possibility to 
> support both fork+exec and exec in cpr framework.
> 
> 1.Why
> fork+exec could have some advantages and also drawbacks versus execvp() 
> directly.
> Advantages
> i) fork+exec give the chance to fallback to original process even after we do 
> exec which is important for workload seamless if any error happens.
> ii) smaller downtime since we could remove the vm_start() downtime out of the 
> frozen window.
> Drawbacks
> i）need more codes to handle including fork，address/ports conflict between 
> parent and child.
> ii）more complex life cycle management for the two processes.
> 
> 2.How
> The cpr framework is flexible and scalable, and maybe we can make use of most 
> codes to support both execvp and fork+exec mode.
> However, we may need to do more work compared to execvp method.
> i) do fork mode in a thread like migration thread
> ii) make parent and child talk to each other through socket or anonymous pipe
> iii）make use of sharing mechanism of fds in cpr framework including memfd, 
> vfio and devices fds
> iv）deal with the conflict about the socket address and port like vnc （do by 
> reuse port and pass the different args by cprexec）
> v) do life cycle managements for two qemu processes and need parent exit and 
> reconnection for the child at last by the management service
> 
> Please tell me if I am missing something important:)

Hi Zheng, that list sounds about right.  However, additional kernel changes 
would be needed to 
change ownership of the vfio device descriptors. The vfio module records the mm 
of the creating
process, and does not allow other processes to unmap ranges.  Also, the 
mm->locked_vm would 
need to be transferred to the new process.

Fork+exec could be a new mode to the cprsave command.

- Steve

Re: [PATCH for-6.2 06/10] docs: qom: Remove unnecessary class typedefs from example

2021-08-04 Thread Peter Maydell

On Wed, 4 Aug 2021 at 21:40, Eduardo Habkost  wrote:
>
> On Mon, Aug 02, 2021 at 01:19:14PM +0100, Peter Maydell wrote:
> > On Thu, 29 Jul 2021 at 19:01, Eduardo Habkost  wrote:
> > >
> > > When there's no specific class struct used for a QOM type, we
> > > normally don't define a typedef for it.  Remove the typedef from
> > > the minimal example, as it is unnecessary.

> > Reviewed-by: Peter Maydell 
> >
> > though I agree with Daniel that in the long-term we should reverse
> > the structure of the documents so the recommended macros go first
> > and the behind-the-scenes boilerplate last.
>
> I agree 100%, and maybe I will give it a try later.
>
> My immediate goal was to just remove the examples in the docs
> where type checking macros were defined manually.  Since we
> introduced the new QOM helper macros, the number of OBJECT_CHECK
> macros in the QEMU tree almost doubled (from 40 in commit
> 8110fa1d94f2 to 75 in commit bccabb3a5d60).

Yeah, to be clear, I'm all in favour of changes like this patchset
and don't think we should hold them up because we have a theoretical
larger restructuring of the documentation we might one day (or never)
get round to.

-- PMM

Re: [PATCH for-6.2 01/10] docs: qom: Replace old GTK-Doc #symbol syntax with `symbol`

2021-08-04 Thread Peter Maydell

On Wed, 4 Aug 2021 at 21:31, Eduardo Habkost  wrote:
>
> On Mon, Aug 02, 2021 at 01:14:57PM +0100, Peter Maydell wrote:
> > On Thu, 29 Jul 2021 at 19:00, Eduardo Habkost  wrote:
> > > diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
> > > index e5fe3597cd8..9c1be5d7fc2 100644
> > > --- a/docs/devel/qom.rst
> > > +++ b/docs/devel/qom.rst
> > > @@ -3,6 +3,7 @@ The QEMU Object Model (QOM)
> > >  ===
> > >
> > >  .. highlight:: c
> > > +.. default-role:: any
> > >
> > >  The QEMU Object Model provides a framework for registering user creatable
> > >  types and instantiating objects from those types.  QOM provides the 
> > > following
> > > @@ -42,8 +43,8 @@ features:
> > >
> > > type_init(my_device_register_types)
> > >
> > > -In the above example, we create a simple type that is described by 
> > > #TypeInfo.
> > > -#TypeInfo describes information about the type including what it inherits
> > > +In the above example, we create a simple type that is described by 
> > > `TypeInfo`.
> > > +`TypeInfo` describes information about the type including what it 
> > > inherits
> >
> > I've just gone through all of docs/ finding the places where we had `foo` 
> > and
> > probably meant ``foo``, so please don't add any new ones. I would suggest
> > that you either use the ``double-backtick`` syntax to render as fixed-width
> > font, or use an explicit role tag so readers of the rST source can tell that
> > that's what you meant to use, ie avoid "default-role".
>
> I don't understand why that would be a reason to not use
> default-role.  With default-role, we get an error when misusing
> `foo`.  Without default-role, misuse won't be detected at all
> (except by manual inspection).

Ah, I didn't realize that we got an error if we set the default-role
to 'any'. That certainly makes it nicer than the default default
of 'cite'.

Is there a sensible default-role we can use as the default for the whole manual,
rather than only declaring it in individual .rst files ?  One of the
things I don't
like about the change here is that it means that `thing` in this individual .rst
file is different from `thing` in every other .rst file in our docs.

Ccing John, who I have just remembered wrote something about Sphinx roles
a few months back:
https://lore.kernel.org/qemu-devel/9fe98807-7d68-2c86-163a-af374c789...@redhat.com/

-- PMM

Re: [PATCH for-6.2 06/10] docs: qom: Remove unnecessary class typedefs from example

2021-08-04 Thread Eduardo Habkost

On Mon, Aug 02, 2021 at 01:19:14PM +0100, Peter Maydell wrote:
> On Thu, 29 Jul 2021 at 19:01, Eduardo Habkost  wrote:
> >
> > When there's no specific class struct used for a QOM type, we
> > normally don't define a typedef for it.  Remove the typedef from
> > the minimal example, as it is unnecessary.
> >
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  docs/devel/qom.rst | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
> > index 05d045bf570..dee60a64c0a 100644
> > --- a/docs/devel/qom.rst
> > +++ b/docs/devel/qom.rst
> > @@ -20,9 +20,6 @@ features:
> >
> > #define TYPE_MY_DEVICE "my-device"
> >
> > -   // No new virtual functions: we can reuse the typedef for the
> > -   // superclass.
> > -   typedef DeviceClass MyDeviceClass;
> > typedef struct MyDevice
> > {
> > DeviceState parent;
> 
> Reviewed-by: Peter Maydell 
> 
> though I agree with Daniel that in the long-term we should reverse
> the structure of the documents so the recommended macros go first
> and the behind-the-scenes boilerplate last.

I agree 100%, and maybe I will give it a try later.

My immediate goal was to just remove the examples in the docs
where type checking macros were defined manually.  Since we
introduced the new QOM helper macros, the number of OBJECT_CHECK
macros in the QEMU tree almost doubled (from 40 in commit
8110fa1d94f2 to 75 in commit bccabb3a5d60).

-- 
Eduardo

Re: [PATCH for-6.2 01/10] docs: qom: Replace old GTK-Doc #symbol syntax with `symbol`

2021-08-04 Thread Eduardo Habkost

On Mon, Aug 02, 2021 at 01:14:57PM +0100, Peter Maydell wrote:
> On Thu, 29 Jul 2021 at 19:00, Eduardo Habkost  wrote:
> >
> > Replace leftover of GTK-Doc #name syntax with `name`, and use
> > default-role:: any, so we can add references to other functions,
> > types, and macros.
> >
> > There are 3 cases that required extra care:
> > - #TypeInfo.class_init: kernel-doc doesn't generate c:member::
> >   directives, so references to C struct members are not possible
> >   yet.  This was replaced with `TypeInfo`.class_init.
> > - #CPUClass.reset and #DeviceClass.realize: cpu.h and qdev docs are not
> >   rendered using Sphinx yet, so use ``code`` syntax for those.
> >
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  docs/devel/qom.rst | 25 +
> >  1 file changed, 13 insertions(+), 12 deletions(-)
> >
> > diff --git a/docs/devel/qom.rst b/docs/devel/qom.rst
> > index e5fe3597cd8..9c1be5d7fc2 100644
> > --- a/docs/devel/qom.rst
> > +++ b/docs/devel/qom.rst
> > @@ -3,6 +3,7 @@ The QEMU Object Model (QOM)
> >  ===
> >
> >  .. highlight:: c
> > +.. default-role:: any
> >
> >  The QEMU Object Model provides a framework for registering user creatable
> >  types and instantiating objects from those types.  QOM provides the 
> > following
> > @@ -42,8 +43,8 @@ features:
> >
> > type_init(my_device_register_types)
> >
> > -In the above example, we create a simple type that is described by 
> > #TypeInfo.
> > -#TypeInfo describes information about the type including what it inherits
> > +In the above example, we create a simple type that is described by 
> > `TypeInfo`.
> > +`TypeInfo` describes information about the type including what it inherits
> 
> I've just gone through all of docs/ finding the places where we had `foo` and
> probably meant ``foo``, so please don't add any new ones. I would suggest
> that you either use the ``double-backtick`` syntax to render as fixed-width
> font, or use an explicit role tag so readers of the rST source can tell that
> that's what you meant to use, ie avoid "default-role".

I don't understand why that would be a reason to not use
default-role.  With default-role, we get an error when misusing
`foo`.  Without default-role, misuse won't be detected at all
(except by manual inspection).

-- 
Eduardo

Re: [PATCH V5 03/25] cpr: QMP interfaces for reboot

2021-08-04 Thread Steven Sistare

On 8/4/2021 11:48 AM, Eric Blake wrote:
> On Wed, Jul 07, 2021 at 10:20:12AM -0700, Steve Sistare wrote:
>> cprsave calls cprsave().  Syntax:
>>   { 'enum': 'CprMode', 'data': [ 'reboot' ] }
>>   { 'command': 'cprsave', 'data': { 'file': 'str', 'mode': 'CprMode' } }
>>
>> cprload calls cprload().  Syntax:
>>   { 'command': 'cprload', 'data': { 'file': 'str' } }
> 
> Does this also allow the magic "/dev/fdset/NNN" syntax for opening an
> fd already passed in previously?
> /me goes back to patch 2 to check
> Yes, it looks like it should.
> 
>>
>> cprinfo returns a list of supported modes.  Syntax:
>>   { 'struct': 'CprInfo', 'data': { 'modes': [ 'CprMode' ] } }
>>   { 'command': 'cprinfo', 'returns': 'CprInfo' }
> 
> As pointed out elsewhere, relying on introspection seems nicer than
> adding this command.
> 
>>
>> Signed-off-by: Mark Kanda 
>> Signed-off-by: Steve Sistare 
>> ---
> 
>> +++ b/qapi/cpr.json
>> @@ -0,0 +1,74 @@
>> +# -*- Mode: Python -*-
>> +#
>> +# Copyright (c) 2021 Oracle and/or its affiliates.
>> +#
>> +# This work is licensed under the terms of the GNU GPL, version 2.
>> +# See the COPYING file in the top-level directory.
>> +
>> +##
>> +# = CPR
> 
> Might be worth expanding what this acronym stands for here.> 
>> +##
>> +
>> +{ 'include': 'common.json' }
>> +
>> +##
>> +# @CprMode:
>> +#
>> +# @reboot: checkpoint can be cprload'ed after a host kexec reboot.
>> +#
>> +# Since: 6.1
> 
> As this missed 6.1, you'll need to (eventually) rebase the series to
> mention 6.2 everywhere.

Will do for both. (Marc-Andre also asked).  And I'll fix the indentation 
problem in patch 2.

- Steve

Re: [PATCH V5 12/25] cpr: QMP interfaces for restart

2021-08-04 Thread Steven Sistare

On 8/4/2021 12:00 PM, Eric Blake wrote:
> On Wed, Jul 07, 2021 at 10:20:21AM -0700, Steve Sistare wrote:
>> cprexec calls cprexec().  Syntax:
>>   { 'command': 'cprexec', 'data': { 'argv': [ 'str' ] } }
>>
>> Add the restart mode:
>>   { 'enum': 'CprMode', 'data': [ 'reboot', 'restart' ] }
>>
>> Signed-off-by: Steve Sistare 
>> ---
>>  monitor/qmp-cmds.c |  5 +
>>  qapi/cpr.json  | 16 +++-
>>  2 files changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
>> index 1128604..7326f7d 100644
>> --- a/monitor/qmp-cmds.c
>> +++ b/monitor/qmp-cmds.c
>> @@ -179,6 +179,11 @@ void qmp_cprsave(const char *file, CprMode mode, Error 
>> **errp)
>>  cprsave(file, mode, errp);
>>  }
>>  
>> +void qmp_cprexec(strList *args, Error **errp)
>> +{
>> +cprexec(args, errp);
>> +}
> 
> Why do you need both qmp_cprexec() and cprexec()?  Can you just name
> it qmp_cprexec() in cpr.c from the get-go, rather than having to add a
> one-line wrapper in qmp-cmds.c?

Will do.

While I'm at it, I will add an underscore to the function names and a dash to 
the command
names to be consistent with other compound-word commands:

qmp_cpr_save
qmp_cpr_exec
qmp_cpr_load
cpr-save
cpr-exec
cpr-load

- Steve

Re: [PATCH for 6.1] multifd: Unconditionally unregister yank function

2021-08-04 Thread Lukas Straub

On Wed, 4 Aug 2021 15:39:55 -0400
Peter Xu  wrote:

> On Wed, Aug 04, 2021 at 09:26:32PM +0200, Lukas Straub wrote:
> > Unconditionally unregister yank function in multifd_load_cleanup().
> > If it is not unregistered here, it will leak and cause a crash
> > in yank_unregister_instance().  
> 
> Curious whether there was a crash somewhere that you hit?

No, I just noticed this while working on a different patch.

> > Now if the ioc is still in use
> > afterwards, it will only lead to qemu not being able to recover
> > from a hang related to that ioc.
> > 
> > After checking the code, i am pretty sure that ref is always 1
> > when arriving here. So all this currently does is remove the
> > unneeded check.  
> 
> Thanks for checking and removing the 2nd ref==1.  I wanted to do that but I
> didn't dare before I look more closely or test more.
> 
> The patch looks good to me, it's just that I am not sure whether it suites 6.1
> material as we've just released rc2 today.  Maybe 6.2 is more suitable?

Yeah, if my assessment of the code is correct it's more of a cleanup.

> > 
> > Signed-off-by: Lukas Straub 
> > ---
> > 
> > This is similar to Peter Xu's 
> > 39675b3394d44b880d083a214c5e44786170
> > "migration: Move the yank unregister of channel_close out"
> > in that it removes the "OBJECT(p->c)->ref == 1" hack. So it
> > makes sense for 6.1 so these patches are together.
> > 
> >  migration/multifd.c | 5 +
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 377da78f5b..a37805e17e 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -987,10 +987,7 @@ int multifd_load_cleanup(Error **errp)
> >  for (i = 0; i < migrate_multifd_channels(); i++) {
> >  MultiFDRecvParams *p = &multifd_recv_state->params[i];
> >  
> > -if (OBJECT(p->c)->ref == 1) {
> > -migration_ioc_unregister_yank(p->c);
> > -}
> > -
> > +migration_ioc_unregister_yank(p->c);
> >  object_unref(OBJECT(p->c));
> >  p->c = NULL;
> >  qemu_mutex_destroy(&p->mutex);
> > -- 
> > 2.32.0  
> 
> 
> 



-- 



pgpdhmp4PD2Xh.pgp
Description: OpenPGP digital signature

Re: [PATCH for 6.1] multifd: Unconditionally unregister yank function

2021-08-04 Thread Peter Xu

On Wed, Aug 04, 2021 at 09:26:32PM +0200, Lukas Straub wrote:
> Unconditionally unregister yank function in multifd_load_cleanup().
> If it is not unregistered here, it will leak and cause a crash
> in yank_unregister_instance().

Curious whether there was a crash somewhere that you hit?

> Now if the ioc is still in use
> afterwards, it will only lead to qemu not being able to recover
> from a hang related to that ioc.
> 
> After checking the code, i am pretty sure that ref is always 1
> when arriving here. So all this currently does is remove the
> unneeded check.

Thanks for checking and removing the 2nd ref==1.  I wanted to do that but I
didn't dare before I look more closely or test more.

The patch looks good to me, it's just that I am not sure whether it suites 6.1
material as we've just released rc2 today.  Maybe 6.2 is more suitable?

> 
> Signed-off-by: Lukas Straub 
> ---
> 
> This is similar to Peter Xu's 
> 39675b3394d44b880d083a214c5e44786170
> "migration: Move the yank unregister of channel_close out"
> in that it removes the "OBJECT(p->c)->ref == 1" hack. So it
> makes sense for 6.1 so these patches are together.
> 
>  migration/multifd.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 377da78f5b..a37805e17e 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -987,10 +987,7 @@ int multifd_load_cleanup(Error **errp)
>  for (i = 0; i < migrate_multifd_channels(); i++) {
>  MultiFDRecvParams *p = &multifd_recv_state->params[i];
>  
> -if (OBJECT(p->c)->ref == 1) {
> -migration_ioc_unregister_yank(p->c);
> -}
> -
> +migration_ioc_unregister_yank(p->c);
>  object_unref(OBJECT(p->c));
>  p->c = NULL;
>  qemu_mutex_destroy(&p->mutex);
> -- 
> 2.32.0



-- 
Peter Xu

[PATCH] multifd: Implement yank for multifd send side

2021-08-04 Thread Lukas Straub

When introducing yank functionality in the migration code I forgot
to cover the multifd send side.

Signed-off-by: Lukas Straub 
---

@Leonardo Could you check if this fixes your issue?

 migration/multifd.c | 6 +-
 migration/multifd.h | 2 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 377da78f5b..5a4f158f3c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -546,6 +546,9 @@ void multifd_save_cleanup(void)
 MultiFDSendParams *p = &multifd_send_state->params[i];
 Error *local_err = NULL;
 
+if (p->registered_yank) {
+migration_ioc_unregister_yank(p->c);
+}
 socket_send_channel_destroy(p->c);
 p->c = NULL;
 qemu_mutex_destroy(&p->mutex);
@@ -813,7 +816,8 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
 return false;
 }
 } else {
-/* update for tls qio channel */
+migration_ioc_register_yank(ioc);
+p->registered_yank = true;
 p->c = ioc;
 qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
QEMU_THREAD_JOINABLE);
diff --git a/migration/multifd.h b/migration/multifd.h
index 8d6751f5ed..16c4d112d1 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -85,6 +85,8 @@ typedef struct {
 bool running;
 /* should this thread finish */
 bool quit;
+/* is the yank function registered */
+bool registered_yank;
 /* thread has work to do */
 int pending_job;
 /* array of pages to sent */
-- 
2.32.0


pgpDmXO7TErHE.pgp
Description: OpenPGP digital signature

[PATCH for 6.1] multifd: Unconditionally unregister yank function

2021-08-04 Thread Lukas Straub

Unconditionally unregister yank function in multifd_load_cleanup().
If it is not unregistered here, it will leak and cause a crash
in yank_unregister_instance(). Now if the ioc is still in use
afterwards, it will only lead to qemu not being able to recover
from a hang related to that ioc.

After checking the code, i am pretty sure that ref is always 1
when arriving here. So all this currently does is remove the
unneeded check.

Signed-off-by: Lukas Straub 
---

This is similar to Peter Xu's 
39675b3394d44b880d083a214c5e44786170
"migration: Move the yank unregister of channel_close out"
in that it removes the "OBJECT(p->c)->ref == 1" hack. So it
makes sense for 6.1 so these patches are together.

 migration/multifd.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 377da78f5b..a37805e17e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -987,10 +987,7 @@ int multifd_load_cleanup(Error **errp)
 for (i = 0; i < migrate_multifd_channels(); i++) {
 MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
-if (OBJECT(p->c)->ref == 1) {
-migration_ioc_unregister_yank(p->c);
-}
-
+migration_ioc_unregister_yank(p->c);
 object_unref(OBJECT(p->c));
 p->c = NULL;
 qemu_mutex_destroy(&p->mutex);
-- 
2.32.0


pgp1ecaqbRLj8.pgp
Description: OpenPGP digital signature

Re: [ANNOUNCE] QEMU 6.1.0-rc2 is now available

2021-08-04 Thread Michael Roth

Quoting Mark Cave-Ayland (2021-08-04 13:37:43)
> On 04/08/2021 19:05, Michael Roth wrote:
> 
> > Hello,
> > 
> > On behalf of the QEMU Team, I'd like to announce the availability of the
> > third release candidate for the QEMU 6.1 release.  This release is meant
> > for testing purposes and should not be used in a production environment.
> > 
> >http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz
> >http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz.sig
> > 
> > You can help improve the quality of the QEMU 6.1 release by testing this
> > release and reporting bugs on Launchpad:
> > 
> >https://bugs.launchpad.net/qemu/
> 
> Since the project is moving away from reporting bugs on Launchpad, are you 
> able to 
> update the text template to point to the issue tracker on gitlab instead?
> 
>  https://gitlab.com/qemu-project/qemu/-/issues

Certainly! Thanks for the suggestion.

> 
> > The release plan, as well a documented known issues for release
> > candidates, are available at:
> > 
> >http://wiki.qemu.org/Planning/6.1
> > 
> > Please add entries to the ChangeLog for the 6.1 release below:
> > 
> >http://wiki.qemu.org/ChangeLog/6.1
> > 
> > Thank you to everyone involved!
> 
> 
> ATB,
> 
> Mark.
>

Re: [PATCH v3 25/25] python/aqmp: add AsyncProtocol unit tests

2021-08-04 Thread John Snow

On Tue, Aug 3, 2021 at 2:31 PM John Snow  wrote:

> This tests most of protocol.py -- From a hacked up Coverage.py run, it's
> at about 86%. There's a few error cases that aren't very well tested
> yet, they're hard to induce artificially so far. I'm working on it.
>
> Signed-off-by: John Snow 
> ---
>  python/tests/null_proto.py |  70 +
>  python/tests/protocol.py   | 535 +
>  2 files changed, 605 insertions(+)
>  create mode 100644 python/tests/null_proto.py
>  create mode 100644 python/tests/protocol.py
>
> diff --git a/python/tests/null_proto.py b/python/tests/null_proto.py
> new file mode 100644
> index 000..c8cedea5942
> --- /dev/null
> +++ b/python/tests/null_proto.py
>

Whoops, forgot to delete this file after inlining it into the other test.

--js

Re: [ANNOUNCE] QEMU 6.1.0-rc2 is now available

2021-08-04 Thread Mark Cave-Ayland


On 04/08/2021 19:05, Michael Roth wrote:


Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
third release candidate for the QEMU 6.1 release.  This release is meant
for testing purposes and should not be used in a production environment.

   http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz
   http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz.sig

You can help improve the quality of the QEMU 6.1 release by testing this
release and reporting bugs on Launchpad:

   https://bugs.launchpad.net/qemu/


Since the project is moving away from reporting bugs on Launchpad, are you able to 
update the text template to point to the issue tracker on gitlab instead?


https://gitlab.com/qemu-project/qemu/-/issues


The release plan, as well a documented known issues for release
candidates, are available at:

   http://wiki.qemu.org/Planning/6.1

Please add entries to the ChangeLog for the 6.1 release below:

   http://wiki.qemu.org/ChangeLog/6.1

Thank you to everyone involved!



ATB,

Mark.

[ANNOUNCE] QEMU 6.1.0-rc2 is now available

2021-08-04 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
third release candidate for the QEMU 6.1 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz
  http://download.qemu-project.org/qemu-6.1.0-rc2.tar.xz.sig

You can help improve the quality of the QEMU 6.1 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/6.1

Please add entries to the ChangeLog for the 6.1 release below:

  http://wiki.qemu.org/ChangeLog/6.1

Thank you to everyone involved!

Changes since rc1:

bccabb3a5d: Update version for v6.1.0-rc2 release (Peter Maydell)
62a4db5522: Drop _DSM 5 from expected DSDTs on ARM (Michael S. Tsirkin)
40c3472a29: Revert "acpi/gpex: Inform os to keep firmware resource map" 
(Michael S. Tsirkin)
5cd4a8d4e5: arm/acpi: allow DSDT changes (Michael S. Tsirkin)
d7346e614f: acpi: x86: pcihp: add support hotplug on multifunction bridges 
(Igor Mammedov)
e2a6290aab: hw/pcie-root-port: Fix hotplug for PCI devices requiring IO (Marcel 
Apfelbaum)
4ac0b72bae: hw/sd/sdcard: Fix assertion accessing out-of-range addresses with 
CMD30 (Philippe Mathieu-Daudé)
2a0396285d: hw/sd/sdcard: Document out-of-range addresses for SEND_WRITE_PROT 
(Philippe Mathieu-Daudé)
87ab880252: block: Fix in_flight leak in request padding error path (Kevin Wolf)
50e36dd616: tests/tcg: Test that compare-and-trap raises SIGFPE (Jonathan 
Albrecht)
ccb5f2708f: linux-user/s390x: signal with SIGFPE on compare-and-trap (Jonathan 
Albrecht)
54ba2161d8: target/s390x: Fix SIGILL and SIGFPE psw.addr reporting (Ilya 
Leoshkevich)
43f547b73d: Update libslirp to v4.6.1 (Marc-André Lureau)
e300858ed4: qga-win/msi: fix missing libstdc++-6 DLL in MSI installer (Michael 
Roth)
5f2a8b1fc1: qemu-ga/msi: fix w32 libgcc name (Gerd Hoffmann)
24328b7a83: qga-win: Free GMatchInfo properly (Kostiantyn Kostiuk)
ce72f11274: qga-win: Fix handle leak in ga_get_win_product_name() (Basil Salman)
02ac3f4b95: qga-win: Fix build_guest_fsinfo() close of nonexistent (Basil 
Salman)
3d98f9b68d: qga-win: Increase VSS freeze timeout to 60 secs instead of 10 
(Basil Salman)
4a64939db7: docs: Move user-facing barrier docs into system manual (Peter 
Maydell)
399a04775e: ui/input-barrier: Move TODOs from barrier.txt to a comment (Peter 
Maydell)
6cb02f1522: docs: Move the protocol part of barrier.txt into interop (Peter 
Maydell)
bd77bc8b89: docs: Move bootindex.txt into system section and rstify (Peter 
Maydell)
dae257394a: hw/arm/boot: Report error if there is no fw_cfg device in the 
machine (Peter Maydell)
4d6646c7de: docs/tools/virtiofsd.rst: Delete stray backtick (Peter Maydell)
1662ea9f4b: docs/about/removed-features: Fix markup error (Peter Maydell)
6df743dc31: docs: Format literals correctly (Peter Maydell)
8a48a7c2e0: docs/system/arm/cpu-features.rst: Format literals correctly (Peter 
Maydell)
9c372ecfec: docs/system/s390x/protvirt.rst: Format literals correctly (Peter 
Maydell)
1e235edab8: docs/devel: Format literals correctly (Peter Maydell)
4df3a7bf8f: docs/devel/migration.rst: Format literals correctly (Peter Maydell)
f0d7b970ac: docs/devel/ebpf_rss.rst: Format literals correctly (Peter Maydell)
d463f3c79a: docs/devel/build-system.rst: Correct typo in example code (Peter 
Maydell)
35a4ca4031: docs/devel/build-system.rst: Format literals correctly (Peter 
Maydell)
4e0b15c252: docs: Move licence/copyright from HTML output to rST comments 
(Peter Maydell)
199a436305: docs: Remove stale TODO comments about license and version (Peter 
Maydell)
7c6ef61a5c: MAINTAINERS: Don't list Andrzej Zaborowski for various components 
(Peter Maydell)
b1b3e3e3bf: docs: Add documentation of Arm 'imx25-pdk' board (Peter Maydell)
fa6c93944a: docs: Add documentation of Arm 'kzm' board (Peter Maydell)
c9543db4cc: docs: Add documentation of Arm 'mainstone' board (Peter Maydell)
cfe6d6841f: hw/net: e1000e: Don't zero out the VLAN tag in the legacy RX 
descriptor (Christina Wang)
d897056960: hw/net: e1000e: Correct the initial value of VET register 
(Christina Wang)
a1d7e475be: hw/net: e1000: Correct the initial value of VET register (Christina 
Wang)
11744862f2: hw/net/can: sja1000 fix buff2frame_bas and buff2frame_pel when dlc 
is out of std CAN 8 bytes (Pavel Pisa)
9010b0c7a9: hw/net/vmxnet3: Do not abort QEMU if guest specified bad queue 
numbers (Thomas Huth)
236f6709ae: target/nios2: Mark raise_exception() as noreturn (Philippe 
Mathieu-Daudé)
7039e1f604: accel/tcg: Remove double bswap for helper_atomic_sto_*_mmu (Richard 
Henderson)
e17bdaab2b: coverity-model: write models fully for non-array allocation 
functions (Paolo Bonzini)
0da41187df: coverity-model: constrain g_malloc/g_malloc0/g_realloc as never 
returning NULL (Paolo Bonzini)
05ad6857a5: coverity-model: clean up the models fo

[PATCH for 6.1] tests: filter out TLS distinguished name in certificate checks

2021-08-04 Thread Daniel P . Berrangé

The version of GNUTLS in Fedora 34 has changed the order in which encodes
fields when generating new TLS certificates. This in turn changes the
order seen when querying the distinguished name. This ultimately breaks
the expected output in the NBD TLS iotests. We don't need to be
comparing the exact distinguished name text for the purpose of the test
though, so it is fine to filter it out.

Reported-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/233   | 2 +-
 tests/qemu-iotests/233.out   | 4 ++--
 tests/qemu-iotests/common.filter | 5 +
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/233 b/tests/qemu-iotests/233
index da150cd27b..9ca7b68f42 100755
--- a/tests/qemu-iotests/233
+++ b/tests/qemu-iotests/233
@@ -148,7 +148,7 @@ $QEMU_IMG info --image-opts \
 
 echo
 echo "== final server log =="
-cat "$TEST_DIR/server.log"
+cat "$TEST_DIR/server.log" | _filter_authz_check_tls
 rm -f "$TEST_DIR/server.log"
 
 # success, all done
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index c3c344811b..4b1f6a0e15 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -65,6 +65,6 @@ qemu-img: Could not open 
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': F
 == final server log ==
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
-qemu-nbd: option negotiation failed: TLS x509 authz check for 
CN=localhost,O=Cthulhu Dark Lord Enterprises client1,L=R'lyeh,C=South Pacific 
is denied
-qemu-nbd: option negotiation failed: TLS x509 authz check for 
CN=localhost,O=Cthulhu Dark Lord Enterprises client3,L=R'lyeh,C=South Pacific 
is denied
+qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
+qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
 *** done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 268b749e2f..2b2b53946c 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -332,5 +332,10 @@ for fname in fnames:
 sys.stdout.write(result)'
 }
 
+_filter_authz_check_tls()
+{
+$SED -e 's/TLS x509 authz check for .* is denied/TLS x509 authz check for 
DISTINGUISHED-NAME is denied/'
+}
+
 # make sure this script returns success
 true
-- 
2.31.1

Re: [PATCH v7 25/33] iotests.py: VM: add own enter method

2021-08-04 Thread John Snow

On Wed, Aug 4, 2021 at 5:39 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> In superclass __enter__ method is defined with return value type hint
> 'QEMUMachine'. So, mypy thinks that return value of VM.__enter__ is
> QEMUMachine. Let's redefine __enter__ in VM class, to give it correct
> type hint.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/iotests.py | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 89663dac06..025e288ddd 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -568,6 +568,10 @@ def remote_filename(path):
>  class VM(qtest.QEMUQtestMachine):
>  '''A QEMU VM'''
>
> +# Redefine __enter__ with proper type hint
> +def __enter__(self) -> 'VM':
> +return self
> +
>  def __init__(self, path_suffix=''):
>  name = "qemu%s-%d" % (path_suffix, os.getpid())
>  super().__init__(qemu_prog, qemu_opts, name=name,
> --
> 2.29.2
>

First and foremost:

Reviewed-by: John Snow 
Acked-by: John Snow 

A more durable approach might be to annotate QEMUMachine differently such
that subclasses get the right types automatically. See if this following
snippet works instead of adding a new __enter__ method.

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 971ed7e8c6a..2e410103569 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -36,6 +36,7 @@
 Sequence,
 Tuple,
 Type,
+TypeVar,
 )

 from qemu.qmp import (  # pylint: disable=import-error
@@ -73,6 +74,9 @@ class AbnormalShutdown(QEMUMachineError):
 """


+_T = TypeVar('_T', bound='QEMUMachine')
+
+
 class QEMUMachine:
 """
 A QEMU VM.
@@ -166,7 +170,7 @@ def __init__(self,
 self._remove_files: List[str] = []
 self._user_killed = False

-def __enter__(self) -> 'QEMUMachine':
+def __enter__(self: _T) -> _T:
 return self

 def __exit__(self,
@@ -182,8 +186,8 @@ def add_monitor_null(self) -> None:
 self._args.append('-monitor')
 self._args.append('null')

-def add_fd(self, fd: int, fdset: int,
-   opaque: str, opts: str = '') -> 'QEMUMachine':
+def add_fd(self: _T, fd: int, fdset: int,
+   opaque: str, opts: str = '') -> _T:
 """
 Pass a file descriptor to the VM
 """

Re: [PATCH v2] block/io_uring: resubmit when result is -EAGAIN

2021-08-04 Thread Kevin Wolf

Am 04.08.2021 um 16:50 hat Stefano Garzarella geschrieben:
> On Mon, Aug 02, 2021 at 02:40:36PM +0200, Kevin Wolf wrote:
> > Am 29.07.2021 um 11:10 hat Fabian Ebner geschrieben:
> > > Linux SCSI can throw spurious -EAGAIN in some corner cases in its
> > > completion path, which will end up being the result in the completed
> > > io_uring request.
> > > 
> > > Resubmitting such requests should allow block jobs to complete, even
> > > if such spurious errors are encountered.
> > > 
> > > Co-authored-by: Stefan Hajnoczi 
> > > Reviewed-by: Stefano Garzarella 
> > > Signed-off-by: Fabian Ebner 
> > > ---
> > > 
> > > Changes from v1:
> > > * Focus on what's relevant for the patch itself in the commit
> > >   message.
> > > * Add Stefan's comment.
> > > * Add Stefano's R-b tag (I hope that's fine, since there was no
> > >   change code-wise).
> > > 
> > >  block/io_uring.c | 16 +++-
> > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > index 00a3ee9fb8..dfa475cc87 100644
> > > --- a/block/io_uring.c
> > > +++ b/block/io_uring.c
> > > @@ -165,7 +165,21 @@ static void luring_process_completions(LuringState 
> > > *s)
> > >  total_bytes = ret + luringcb->total_read;
> > > 
> > >  if (ret < 0) {
> > > -if (ret == -EINTR) {
> > > +/*
> > > + * Only writev/readv/fsync requests on regular files or host 
> > > block
> > > + * devices are submitted. Therefore -EAGAIN is not expected 
> > > but it's
> > > + * known to happen sometimes with Linux SCSI. Submit again 
> > > and hope
> > > + * the request completes successfully.
> > > + *
> > > + * For more information, see:
> > > + * 
> > > https://lore.kernel.org/io-uring/20210727165811.284510-3-ax...@kernel.dk/T/#u
> > > + *
> > > + * If the code is changed to submit other types of requests 
> > > in the
> > > + * future, then this workaround may need to be extended to 
> > > deal with
> > > + * genuine -EAGAIN results that should not be resubmitted
> > > + * immediately.
> > > + */
> > > +if (ret == -EINTR || ret == -EAGAIN) {
> > >  luring_resubmit(s, luringcb);
> > >  continue;
> > >  }
> > 
> > Reviewed-by: Kevin Wolf 
> > 
> > Question about the preexisting code, though: luring_resubmit() requires
> > that the caller calls ioq_submit() later so that the request doesn't
> > just sit in a queue without getting any attention, but actually gets
> > submitted to the kernel.
> > 
> > In the call chain ioq_submit() -> luring_process_completions() ->
> > luring_resubmit(), who takes care of that?
> 
> Mmm, good point.
> There should be the same problem with ioq_submit() ->
> luring_process_completions() -> luring_resubmit_short_read() ->
> luring_resubmit().
> 
> Should we schedule a BH for example in luring_resubmit() to make sure that
> ioq_submit() is invoked after a resubmission?

Or just loop in ioq_submit() after calling luring_process_completions()
if new requests were added to the queue?

Kevin

Re: [PATCH V5 12/25] cpr: QMP interfaces for restart

2021-08-04 Thread Eric Blake

On Wed, Jul 07, 2021 at 10:20:21AM -0700, Steve Sistare wrote:
> cprexec calls cprexec().  Syntax:
>   { 'command': 'cprexec', 'data': { 'argv': [ 'str' ] } }
> 
> Add the restart mode:
>   { 'enum': 'CprMode', 'data': [ 'reboot', 'restart' ] }
> 
> Signed-off-by: Steve Sistare 
> ---
>  monitor/qmp-cmds.c |  5 +
>  qapi/cpr.json  | 16 +++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index 1128604..7326f7d 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -179,6 +179,11 @@ void qmp_cprsave(const char *file, CprMode mode, Error 
> **errp)
>  cprsave(file, mode, errp);
>  }
>  
> +void qmp_cprexec(strList *args, Error **errp)
> +{
> +cprexec(args, errp);
> +}

Why do you need both qmp_cprexec() and cprexec()?  Can you just name
it qmp_cprexec() in cpr.c from the get-go, rather than having to add a
one-line wrapper in qmp-cmds.c?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 1/1] hw/arm/smmu: Add access flag handling

2021-08-04 Thread Eric Auger

Hi Joe,

On 7/31/21 1:17 AM, Joe Komlodi wrote:
> The SMMU should access fault if AF == 0 in a TTD, and if AFFD == 0 in the CD.
>
> Per the spec, an access fault has a higher priority over a permission fault.
> For instance, a write to a writable clean (temporarily non-writable) page with
> AF == 0 && AFFD == 0 will cause an access fault.
> If AF == 1 in this situation, then a permission fault would occur.
>
> Access flag handling is more complicated if HTTU is supported and HA != 0 in
> the CD, however we currently do not support HTTU.
>
> Signed-off-by: Joe Komlodi 
> ---
>  hw/arm/smmu-common.c | 7 +++
>  hw/arm/smmu-internal.h   | 8 
>  hw/arm/smmuv3-internal.h | 1 +
>  hw/arm/smmuv3.c  | 1 +
>  include/hw/arm/smmu-common.h | 1 +
>  5 files changed, 18 insertions(+)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 0459850..0fcc65c 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -305,6 +305,7 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
>  uint64_t pte, gpa;
>  dma_addr_t pte_addr = baseaddr + offset * sizeof(pte);
>  uint8_t ap;
> +bool af;
>  
>  if (get_pte(baseaddr, offset, &pte, info)) {
>  goto error;
> @@ -341,6 +342,12 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
>   pte_addr, pte, iova, gpa,
>   block_size >> 20);
>  }
> +af = PTE_AF(pte);
> +if (is_access_fault(af, perm)) {
I don't get the perm argument here.
 
Below there is this macro definition

+#define is_access_fault(af, cfg) \
+(!cfg->affd && !af)

Thanks

Eric

> +info->type = SMMU_PTW_ERR_ACCESS;
> +goto error;
> +}
> +
>  ap = PTE_AP(pte);
>  if (is_permission_fault(ap, perm)) {
>  info->type = SMMU_PTW_ERR_PERMISSION;
> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
> index 2d75b31..9d3b22c 100644
> --- a/hw/arm/smmu-internal.h
> +++ b/hw/arm/smmu-internal.h
> @@ -58,6 +58,11 @@
>  ((level == 3) &&\
>   ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_L3_PTE_TYPE_PAGE))
>  
> +/* access flag */
> +
> +#define PTE_AF(pte) \
> +(extract64(pte, 10, 1))
> +
>  /* access permissions */
>  
>  #define PTE_AP(pte) \
> @@ -66,6 +71,9 @@
>  #define PTE_APTABLE(pte) \
>  (extract64(pte, 61, 2))
>  
> +#define is_access_fault(af, cfg) \
> +(!cfg->affd && !af)
> +
>  /*
>   * TODO: At the moment all transactions are considered as privileged (EL1)
>   * as IOMMU translation callback does not pass user/priv attributes.
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index d1885ae..0ccad1d 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -587,6 +587,7 @@ static inline int pa_range(STE *ste)
>  #define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
>  #define CD_ENDI(x)   extract32((x)->word[0], 15, 1)
>  #define CD_IPS(x)extract32((x)->word[1], 0 , 3)
> +#define CD_AFFD(x)   extract32((x)->word[1], 3 , 1)
>  #define CD_TBI(x)extract32((x)->word[1], 6 , 2)
>  #define CD_HD(x) extract32((x)->word[1], 10 , 1)
>  #define CD_HA(x) extract32((x)->word[1], 11 , 1)
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 01b60be..df5a194 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -483,6 +483,7 @@ static int decode_cd(SMMUTransCfg *cfg, CD *cd, 
> SMMUEventInfo *event)
>  cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
>  cfg->tbi = CD_TBI(cd);
>  cfg->asid = CD_ASID(cd);
> +cfg->affd = CD_AFFD(cd);
>  
>  trace_smmuv3_decode_cd(cfg->oas);
>  
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index 706be3c..b0e82ad 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -71,6 +71,7 @@ typedef struct SMMUTransCfg {
>  bool disabled; /* smmu is disabled */
>  bool bypassed; /* translation is bypassed */
>  bool aborted;  /* translation is aborted */
> +bool affd; /* Access Flag Fault Disabled */
>  uint64_t ttb;  /* TT base address */
>  uint8_t oas;   /* output address width */
>  uint8_t tbi;   /* Top Byte Ignore */

Re: [PATCH 0/3] Disable vhost device IOTLB is IOMMU is not enabled

2021-08-04 Thread Peter Xu

On Wed, Aug 04, 2021 at 11:48:00AM +0800, Jason Wang wrote:
> Hi:
> 
> We currently try to enable device IOTLB when iommu_platform is
> set. This may lead unnecessary trasnsactions between qemu and vhost
> when vIOMMU is not used (which is the typical case for the encrypted
> VM).
> 
> So patch tries to use transport specific method to detect the enalbing
> of vIOMMU and enable the device IOTLB only if vIOMMU is enalbed.

Just to mention that there's also the ordering requirement for e.g. vfio-pci
and the iommu device so far because vfio_realize() depends on vIOMMU being
realized too, so specifying "-device vfio-pci" before "-device intel-iommu"
will stop working, I think (see the similar pci_device_iommu_address_space()
call in vfio_realize()).

Do you think vhost can do the same to assume vIOMMU must be specified before
vhost?  Then vhost can call pci_device_iommu_address_space() freely.  It'll be
more gentle for vhost even when it's used wrong: instead of not working at all
(vfio-pci), vhost runs slower.

Currently libvirt should be able to guarantee that ordering so libvirt users
shouldn't need to bother.  I think libvirt should also guarantee the vIOMMU
device to be created before all the rest devices, including virtio/vhost.  But
need to check.  If that's the case libvirt will naturally work for vhost too.

For the long term we may need to think about making device creation to be not
ordered by user cmdline input but still has a priority specified in the code
itself.  Migration has something like that (MigrationPriority).  I think we
just need something similar for general device realizations.  Since vhost
raised the same need, I think that priority should bump up too.

The other concern is right now vhost has vhost_dev.dma_as but now we're not
using it for vhost_dev_has_iommu().  It's just a bit confusing as when to use
which.

What do you think?

Thanks,

-- 
Peter Xu

Re: [PULL 0/5] pc,pci: bugfixes

2021-08-04 Thread Peter Maydell

On Tue, 3 Aug 2021 at 21:52, Michael S. Tsirkin  wrote:
>
> The following changes since commit f2da205cb4142259d9bc6b9d4596ebbe2426fe49:
>
>   Update version for v6.1.0-rc1 release (2021-07-27 18:07:52 +0100)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
>
> for you to fetch changes up to 62a4db5522cbbd8b5309a2949c22f00e5b0138e3:
>
>   Drop _DSM 5 from expected DSDTs on ARM (2021-08-03 16:32:35 -0400)
>
> 
> pc,pci: bugfixes
>
> Small bugfixes all over the place.
>
> Signed-off-by: Michael S. Tsirkin 
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM

Re: [RESEND RFC] hw/arm/smmuv3: add device properties to disable cached iotlb

2021-08-04 Thread Eric Auger

Hi Chenxiang,

On 8/4/21 10:49 AM, chenxiang wrote:
> From: Xiang Chen 
>
> It splits invalidations into ^2 range invalidations in the patch
> 6d9cd115b(" hw/arm/smmuv3: Enforce invalidation on a power of two range").
> So for some scenarios such as the size of invalidation is not ^2 range
> invalidation, it costs more time to invalidate.
this ^² split is not only necessary for internal TLB management but also
for IOMMU MR notifier calls (which use a mask), ie. IOTLB unmap
notifications used for both vhost and vfio integrations.
So you can disable the internal IOTLB but we can't simply remove the pow
of 2 split. See below.

internal TLB could be disabled through a property but I would rather set
it as an "x-" experimental property for debug purpose. Until recently
this was indeed helpful to debug bugs related to internal IOTLB
management (RIL support) ;-) I hope this period is over though ;-)
> Currently smmuv3_translate is rarely used (i only see it is used when
> binding msi), so i think maybe we can disable cached iotlb to promote
> efficiency of invalidation. So add device property disable_cached_iotlb
> to disable cached iotlb, and then we can send non-^2 range invalidation
> directly.
> Use tool dma_map_benchmark to have a test on the latency of unmap,
> and we can see it promotes much on unmap when the size of invalidation
> is not ^2 range invalidation (such as g = 7/15/31/511):
>
> t = 1(thread = 1)
>   before opt(us)   after opt(us)
> g=1(4K size)  0.2/7.6 0.2/7.5
> g=4(8K size)  0.4/7.9 0.4/7.9
> g=7(28K size) 0.6/10.20.6/8.2
> g=8(32K size) 0.6/8.3 0.6/8.3
> g=15(60K size)1.1/12.11.1/9.1
> g=16(64K size)1.1/9.2 1.1/9.1
> g=31(124K size)   2.0/14.82.0/10.7
> g=32(128K size)   2.1/14.82.1/10.7
> g=511(2044K size) 30.9/65.1   31.1/55.9
> g=512(2048K size) 0.3/32.10.3/32.1
> t = 10(thread = 10)
>   before opt(us)   after opt(us)
> g=1(4K size)  0.2/39.90.2/39.1
> g=4(8K size)  0.5/42.60.5/42.4
> g=7(28K size) 0.6/66.40.6/45.3
> g=8(32K size) 0.7/45.80.7/46.1
> g=15(60K size)1.1/80.51.1/49.6
> g=16(64K size)1.1/49.81.1/50.2
> g=31(124K size)   2.0/98.32.1/58.0
> g=32(128K size)   2.1/57.72.1/58.2
> g=511(2044K size) 35.2/322.2  35.3/236.7
> g=512(2048K size) 0.8/238.2   0.9/240.3
>
> Note: i test it based on VSMMU enabled with the patchset
> ("vSMMUv3/pSMMUv3 2 stage VFIO integration").
>
> Signed-off-by: Xiang Chen 
> ---
>  hw/arm/smmuv3.c | 77 
> -
>  include/hw/arm/smmuv3.h |  1 +
>  2 files changed, 52 insertions(+), 26 deletions(-)
>
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 01b60be..7ae668f 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -19,6 +19,7 @@
>  #include "qemu/osdep.h"
>  #include "qemu/bitops.h"
>  #include "hw/irq.h"
> +#include "hw/qdev-properties.h"
>  #include "hw/sysbus.h"
>  #include "migration/vmstate.h"
>  #include "hw/qdev-core.h"
> @@ -682,19 +683,21 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion 
> *mr, hwaddr addr,
>  page_mask = (1ULL << (tt->granule_sz)) - 1;
>  aligned_addr = addr & ~page_mask;
>  
> -cached_entry = smmu_iotlb_lookup(bs, cfg, tt, aligned_addr);
> -if (cached_entry) {
> -if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) {
> -status = SMMU_TRANS_ERROR;
> -if (event.record_trans_faults) {
> -event.type = SMMU_EVT_F_PERMISSION;
> -event.u.f_permission.addr = addr;
> -event.u.f_permission.rnw = flag & 0x1;
> +if (s->disable_cached_iotlb) {
> +cached_entry = smmu_iotlb_lookup(bs, cfg, tt, aligned_addr);
> +if (cached_entry) {
> +if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) 
> {
> +status = SMMU_TRANS_ERROR;
> +if (event.record_trans_faults) {
> +event.type = SMMU_EVT_F_PERMISSION;
> +event.u.f_permission.addr = addr;
> +event.u.f_permission.rnw = flag & 0x1;
> +}
> +} else {
> +status = SMMU_TRANS_SUCCESS;
>  }
> -} else {
> -status = SMMU_TRANS_SUCCESS;
> +goto epilogue;
>  }
> -goto epilogue;
>  }
>  
>  cached_entry = g_new0(SMMUTLBEntry, 1);
> @@ -742,7 +745,9 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion 
> *mr, hwaddr addr,
>  }
>  status = SMMU_TRANS_ERROR;
>  } else {
> -smmu_iotlb_insert(bs, cfg, cached_entry);
> +if (s->disable_cached_iotlb) {
> +smmu_iotlb_insert(bs, cfg, cac

[PATCH v2 07/11] chardev: fix qemu_chr_open_fd() being called with fd=-1

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

The "file" chardev may call qemu_chr_open_fd() with fd_in=-1. This may
cause invalid system calls, as the QIOChannel is assumed to be properly
initialized later on.

Signed-off-by: Marc-André Lureau 
---
 chardev/char-fd.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 743d3989b4..c11b1037f9 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -39,6 +39,10 @@ static int fd_chr_write(Chardev *chr, const uint8_t *buf, 
int len)
 {
 FDChardev *s = FD_CHARDEV(chr);
 
+if (!s->ioc_out) {
+return -1;
+}
+
 return io_channel_send(s->ioc_out, buf, len);
 }
 
@@ -209,15 +213,19 @@ void qemu_chr_open_fd(Chardev *chr,
 FDChardev *s = FD_CHARDEV(chr);
 char *name;
 
-s->ioc_in = QIO_CHANNEL(qio_channel_file_new_fd(fd_in));
-name = g_strdup_printf("chardev-file-in-%s", chr->label);
-qio_channel_set_name(QIO_CHANNEL(s->ioc_in), name);
-g_free(name);
-s->ioc_out = QIO_CHANNEL(qio_channel_file_new_fd(fd_out));
-name = g_strdup_printf("chardev-file-out-%s", chr->label);
-qio_channel_set_name(QIO_CHANNEL(s->ioc_out), name);
-g_free(name);
-qemu_set_nonblock(fd_out);
+if (fd_in >= 0) {
+s->ioc_in = QIO_CHANNEL(qio_channel_file_new_fd(fd_in));
+name = g_strdup_printf("chardev-file-in-%s", chr->label);
+qio_channel_set_name(QIO_CHANNEL(s->ioc_in), name);
+g_free(name);
+}
+if (fd_out >= 0) {
+s->ioc_out = QIO_CHANNEL(qio_channel_file_new_fd(fd_out));
+name = g_strdup_printf("chardev-file-out-%s", chr->label);
+qio_channel_set_name(QIO_CHANNEL(s->ioc_out), name);
+g_free(name);
+qemu_set_nonblock(fd_out);
+}
 }
 
 static void char_fd_class_init(ObjectClass *oc, void *data)
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 11/11] chardev: reuse qmp_chardev_new()

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

qemu_chardev_new() and qmp_chardev_add() are similar, let's factorize
the code a bit.

Signed-off-by: Marc-André Lureau 
---
 chardev/char.c | 27 ++-
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index 4595a8d430..3def40c914 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -1005,7 +1005,7 @@ Chardev *qemu_chardev_new(const char *id, const char 
*typename,
   GMainContext *gcontext,
   Error **errp)
 {
-Chardev *chr;
+g_autoptr(Chardev) chr = NULL;
 g_autofree char *genid = NULL;
 
 if (!id) {
@@ -1013,6 +1013,11 @@ Chardev *qemu_chardev_new(const char *id, const char 
*typename,
 id = genid;
 }
 
+if (qemu_chr_find(id)) {
+error_setg(errp, "Chardev with id '%s' already exists", id);
+return NULL;
+}
+
 chr = chardev_new(id, typename, backend, gcontext, false, errp);
 if (!chr) {
 return NULL;
@@ -1020,12 +1025,10 @@ Chardev *qemu_chardev_new(const char *id, const char 
*typename,
 
 if (!object_property_try_add_child(get_chardevs_root(), id, OBJECT(chr),
errp)) {
-object_unref(OBJECT(chr));
 return NULL;
 }
-object_unref(OBJECT(chr));
 
-return chr;
+return chr; /* returns a shared/unowned reference */
 }
 
 ChardevReturn *qmp_chardev_add(const char *id, ChardevBackend *backend,
@@ -1034,29 +1037,19 @@ ChardevReturn *qmp_chardev_add(const char *id, 
ChardevBackend *backend,
 ERRP_GUARD();
 const ChardevClass *cc;
 ChardevReturn *ret;
-g_autoptr(Chardev) chr = NULL;
-
-if (qemu_chr_find(id)) {
-error_setg(errp, "Chardev with id '%s' already exists", id);
-return NULL;
-}
+Chardev *chr = NULL;
 
 cc = char_get_class(ChardevBackendKind_str(backend->type), errp);
 if (!cc) {
 goto err;
 }
 
-chr = chardev_new(id, object_class_get_name(OBJECT_CLASS(cc)),
-  backend, NULL, false, errp);
+chr = qemu_chardev_new(id, object_class_get_name(OBJECT_CLASS(cc)),
+   backend, NULL, errp);
 if (!chr) {
 goto err;
 }
 
-if (!object_property_try_add_child(get_chardevs_root(), id, OBJECT(chr),
-   errp)) {
-goto err;
-}
-
 ret = g_new0(ChardevReturn, 1);
 if (CHARDEV_IS_PTY(chr)) {
 ret->pty = g_strdup(chr->filename + 4);
-- 
2.32.0.264.g75ae10bc75

Re: [PATCH v7 29/33] iotests.py: hmp_qemu_io: support qdev

2021-08-04 Thread John Snow

On Wed, Aug 4, 2021 at 5:39 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/iotests.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 025e288ddd..9d0031a0e8 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -655,9 +655,10 @@ def resume_drive(self, drive: str) -> None:
>  self.hmp(f'qemu-io {drive} "remove_break bp_{drive}"')
>
>  def hmp_qemu_io(self, drive: str, cmd: str,
> -use_log: bool = False) -> QMPMessage:
> +use_log: bool = False, qdev: bool = False) ->
> QMPMessage:
>  """Write to a given drive using an HMP command"""
> -return self.hmp(f'qemu-io {drive} "{cmd}"', use_log=use_log)
> +d = '-d ' if qdev else ''
> +return self.hmp(f'qemu-io {d}{drive} "{cmd}"', use_log=use_log)
>
>  def flatten_qmp_object(self, obj, output=None, basestr=''):
>  if output is None:
> --
> 2.29.2
>
>
Guess that's really the only flag that this HMP command supports. I was
gonna suggest abstracting to {args} ... but, uh, that's the only one, so...
sure!

Reviewed-by: John Snow

Re: [PATCH V5 02/25] cpr: reboot mode

2021-08-04 Thread Eric Blake

On Wed, Jul 07, 2021 at 10:20:11AM -0700, Steve Sistare wrote:
> Provide the cprsave and cprload functions for live update.  These save and
> restore VM state, with minimal guest pause time, so that qemu may be updated
> to a new version in between.
> 

> +++ b/migration/cpr.c
> @@ -0,0 +1,149 @@

> +
> +QEMUFile *qf_file_open(const char *path, int flags, int mode,
> +  const char *name, Error **errp)

Indentation is off.

> +{
> +QIOChannelFile *fioc;
> +QIOChannel *ioc;
> +QEMUFile *f;
> +
> +if (flags & O_RDWR) {
> +error_setg(errp, "qf_file_open %s: O_RDWR not supported", path);
> +return 0;
> +}
> +
> +fioc = qio_channel_file_new_path(path, flags, mode, errp);

Good, you aren't using bare open(), but reusing existing wrappers,
which means you should be able to accept magic filenames like
"/dev/fdset/1" to open an fd passed in previously.  (I had to come
back to this patch to make sure after starting on patch 3)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

[PATCH v2 06/11] chardev: fix fd_chr_add_watch() when in != out

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Create child sources for the different streams, and dispatch on the
parent source with the synthesized conditions.

Signed-off-by: Marc-André Lureau 
---
 chardev/char-fd.c | 78 ++-
 1 file changed, 77 insertions(+), 1 deletion(-)

diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index 1cd62f2779..743d3989b4 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -28,6 +28,7 @@
 #include "qemu/sockets.h"
 #include "qapi/error.h"
 #include "chardev/char.h"
+#include "chardev/char-fe.h"
 #include "io/channel-file.h"
 
 #include "chardev/char-fd.h"
@@ -80,10 +81,85 @@ static int fd_chr_read_poll(void *opaque)
 return s->max_size;
 }
 
+typedef struct FDSource {
+GSource parent;
+
+GIOCondition cond;
+} FDSource;
+
+static gboolean
+fd_source_prepare(GSource *source,
+  gint *timeout_)
+{
+FDSource *src = (FDSource *)source;
+
+return src->cond != 0;
+}
+
+static gboolean
+fd_source_check(GSource *source)
+{
+FDSource *src = (FDSource *)source;
+
+return src->cond != 0;
+}
+
+static gboolean
+fd_source_dispatch(GSource *source, GSourceFunc callback,
+   gpointer user_data)
+{
+FDSource *src = (FDSource *)source;
+FEWatchFunc func = (FEWatchFunc)callback;
+gboolean ret = G_SOURCE_CONTINUE;
+
+if (src->cond) {
+ret = func(NULL, src->cond, user_data);
+src->cond = 0;
+}
+
+return ret;
+}
+
+static GSourceFuncs fd_source_funcs = {
+  fd_source_prepare,
+  fd_source_check,
+  fd_source_dispatch,
+  NULL, NULL, NULL
+};
+
+static GSource *fd_source_new(FDChardev *chr)
+{
+return g_source_new(&fd_source_funcs, sizeof(FDSource));
+}
+
+static gboolean child_func(GIOChannel *source,
+   GIOCondition condition,
+   gpointer data)
+{
+FDSource *parent = data;
+
+parent->cond |= condition;
+
+return G_SOURCE_CONTINUE;
+}
+
 static GSource *fd_chr_add_watch(Chardev *chr, GIOCondition cond)
 {
 FDChardev *s = FD_CHARDEV(chr);
-return qio_channel_create_watch(s->ioc_out, cond);
+g_autoptr(GSource) source = fd_source_new(s);
+
+if (s->ioc_out) {
+g_autoptr(GSource) child = qio_channel_create_watch(s->ioc_out, cond & 
~G_IO_IN);
+g_source_set_callback(child, (GSourceFunc)child_func, source, NULL);
+g_source_add_child_source(source, child);
+}
+if (s->ioc_in) {
+g_autoptr(GSource) child = qio_channel_create_watch(s->ioc_in, cond & 
~G_IO_OUT);
+g_source_set_callback(child, (GSourceFunc)child_func, source, NULL);
+g_source_add_child_source(source, child);
+}
+
+return g_steal_pointer(&source);
 }
 
 static void fd_chr_update_read_handler(Chardev *chr)
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 05/11] chardev: mark explicitly first argument as poisoned

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Since commit 9894dc0cdcc397ee5b26370bc53da6d360a363c2 "char: convert
from GIOChannel to QIOChannel", the first argument to the watch callback
can actually be a QIOChannel, which is not a GIOChannel (but a QEMU
Object).

Even though we never used that pointer, change the callback type to warn
the users. Possibly a better fix later, we may want to store the
callback and call it from intermediary functions.

Signed-off-by: Marc-André Lureau 
---
 include/chardev/char-fe.h | 8 +++-
 chardev/char-fe.c | 2 +-
 hw/char/cadence_uart.c| 2 +-
 hw/char/cmsdk-apb-uart.c  | 2 +-
 hw/char/ibex_uart.c   | 2 +-
 hw/char/nrf51_uart.c  | 2 +-
 hw/char/serial.c  | 2 +-
 hw/char/virtio-console.c  | 2 +-
 hw/usb/redirect.c | 2 +-
 hw/virtio/vhost-user.c| 2 +-
 monitor/monitor.c | 2 +-
 net/vhost-user.c  | 4 ++--
 12 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/chardev/char-fe.h b/include/chardev/char-fe.h
index a553843364..867ef1b3b2 100644
--- a/include/chardev/char-fe.h
+++ b/include/chardev/char-fe.h
@@ -174,6 +174,9 @@ void qemu_chr_fe_set_open(CharBackend *be, int fe_open);
 void qemu_chr_fe_printf(CharBackend *be, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 
+
+typedef gboolean (*FEWatchFunc)(void *do_not_use, GIOCondition condition, void 
*data);
+
 /**
  * qemu_chr_fe_add_watch:
  * @cond: the condition to poll for
@@ -188,10 +191,13 @@ void qemu_chr_fe_printf(CharBackend *be, const char *fmt, 
...)
  * Note that you are responsible to update the front-end sources if
  * you are switching the main context with qemu_chr_fe_set_handlers().
  *
+ * Warning: DO NOT use the first callback argument (it may be either
+ * a GIOChannel or a QIOChannel, depending on the underlying chardev)
+ *
  * Returns: the source tag
  */
 guint qemu_chr_fe_add_watch(CharBackend *be, GIOCondition cond,
-GIOFunc func, void *user_data);
+FEWatchFunc func, void *user_data);
 
 /**
  * qemu_chr_fe_write:
diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index 474715c5a9..7789f7be9c 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -354,7 +354,7 @@ void qemu_chr_fe_set_open(CharBackend *be, int fe_open)
 }
 
 guint qemu_chr_fe_add_watch(CharBackend *be, GIOCondition cond,
-GIOFunc func, void *user_data)
+FEWatchFunc func, void *user_data)
 {
 Chardev *s = be->chr;
 GSource *src;
diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
index ceb677bc5a..8ee6f74b8c 100644
--- a/hw/char/cadence_uart.c
+++ b/hw/char/cadence_uart.c
@@ -288,7 +288,7 @@ static void uart_write_rx_fifo(void *opaque, const uint8_t 
*buf, int size)
 uart_update_status(s);
 }
 
-static gboolean cadence_uart_xmit(GIOChannel *chan, GIOCondition cond,
+static gboolean cadence_uart_xmit(GIOChannel *do_not_use, GIOCondition cond,
   void *opaque)
 {
 CadenceUARTState *s = opaque;
diff --git a/hw/char/cmsdk-apb-uart.c b/hw/char/cmsdk-apb-uart.c
index ba2cbbee3d..b07a9dee4f 100644
--- a/hw/char/cmsdk-apb-uart.c
+++ b/hw/char/cmsdk-apb-uart.c
@@ -191,7 +191,7 @@ static uint64_t uart_read(void *opaque, hwaddr offset, 
unsigned size)
 /* Try to send tx data, and arrange to be called back later if
  * we can't (ie the char backend is busy/blocking).
  */
-static gboolean uart_transmit(GIOChannel *chan, GIOCondition cond, void 
*opaque)
+static gboolean uart_transmit(GIOChannel *do_not_use, GIOCondition cond, void 
*opaque)
 {
 CMSDKAPBUART *s = CMSDK_APB_UART(opaque);
 int ret;
diff --git a/hw/char/ibex_uart.c b/hw/char/ibex_uart.c
index 6b0c9330bf..e493ea08c0 100644
--- a/hw/char/ibex_uart.c
+++ b/hw/char/ibex_uart.c
@@ -135,7 +135,7 @@ static void ibex_uart_receive(void *opaque, const uint8_t 
*buf, int size)
 ibex_uart_update_irqs(s);
 }
 
-static gboolean ibex_uart_xmit(GIOChannel *chan, GIOCondition cond,
+static gboolean ibex_uart_xmit(GIOChannel *do_not_use, GIOCondition cond,
void *opaque)
 {
 IbexUartState *s = opaque;
diff --git a/hw/char/nrf51_uart.c b/hw/char/nrf51_uart.c
index 045ca5fa40..0b89b0eae4 100644
--- a/hw/char/nrf51_uart.c
+++ b/hw/char/nrf51_uart.c
@@ -75,7 +75,7 @@ static uint64_t uart_read(void *opaque, hwaddr addr, unsigned 
int size)
 return r;
 }
 
-static gboolean uart_transmit(GIOChannel *chan, GIOCondition cond, void 
*opaque)
+static gboolean uart_transmit(GIOChannel *do_not_use, GIOCondition cond, void 
*opaque)
 {
 NRF51UARTState *s = NRF51_UART(opaque);
 int r;
diff --git a/hw/char/serial.c b/hw/char/serial.c
index bc2e322970..7061aacbce 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -220,7 +220,7 @@ static void serial_update_msl(SerialState *s)
 }
 }
 
-static gboolean serial_watch_cb(GIOChannel *chan, GIOCondition cond,
+static gboolean serial_watch_cb(void *do_not_use, GIOCondition cond,

Re: [PATCH V5 03/25] cpr: QMP interfaces for reboot

2021-08-04 Thread Eric Blake

On Wed, Jul 07, 2021 at 10:20:12AM -0700, Steve Sistare wrote:
> cprsave calls cprsave().  Syntax:
>   { 'enum': 'CprMode', 'data': [ 'reboot' ] }
>   { 'command': 'cprsave', 'data': { 'file': 'str', 'mode': 'CprMode' } }
> 
> cprload calls cprload().  Syntax:
>   { 'command': 'cprload', 'data': { 'file': 'str' } }

Does this also allow the magic "/dev/fdset/NNN" syntax for opening an
fd already passed in previously?
/me goes back to patch 2 to check
Yes, it looks like it should.

> 
> cprinfo returns a list of supported modes.  Syntax:
>   { 'struct': 'CprInfo', 'data': { 'modes': [ 'CprMode' ] } }
>   { 'command': 'cprinfo', 'returns': 'CprInfo' }

As pointed out elsewhere, relying on introspection seems nicer than
adding this command.

> 
> Signed-off-by: Mark Kanda 
> Signed-off-by: Steve Sistare 
> ---

> +++ b/qapi/cpr.json
> @@ -0,0 +1,74 @@
> +# -*- Mode: Python -*-
> +#
> +# Copyright (c) 2021 Oracle and/or its affiliates.
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2.
> +# See the COPYING file in the top-level directory.
> +
> +##
> +# = CPR

Might be worth expanding what this acronym stands for here.

> +##
> +
> +{ 'include': 'common.json' }
> +
> +##
> +# @CprMode:
> +#
> +# @reboot: checkpoint can be cprload'ed after a host kexec reboot.
> +#
> +# Since: 6.1

As this missed 6.1, you'll need to (eventually) rebase the series to
mention 6.2 everywhere.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

[PATCH v2 10/11] chardev: report a simpler error about duplicated id

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Report:
  "Chardev with id 'char2' already exists"
Rather than:
  "Failed to add chardev 'char2': duplicate yank instance"

Signed-off-by: Marc-André Lureau 
---
 chardev/char.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/chardev/char.c b/chardev/char.c
index f59a61774b..4595a8d430 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -1036,6 +1036,11 @@ ChardevReturn *qmp_chardev_add(const char *id, 
ChardevBackend *backend,
 ChardevReturn *ret;
 g_autoptr(Chardev) chr = NULL;
 
+if (qemu_chr_find(id)) {
+error_setg(errp, "Chardev with id '%s' already exists", id);
+return NULL;
+}
+
 cc = char_get_class(ChardevBackendKind_str(backend->type), errp);
 if (!cc) {
 goto err;
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 08/11] chardev: fix qemu_chr_open_fd() with fd_in==fd_out

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

The "serial" chardev calls qemu_chr_open_fd() with the same fd. This
may lead to double-close as each QIOChannel owns the fd.

Instead, share the reference to the same QIOChannel.

Signed-off-by: Marc-André Lureau 
---
 chardev/char-fd.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/chardev/char-fd.c b/chardev/char-fd.c
index c11b1037f9..93c56913b4 100644
--- a/chardev/char-fd.c
+++ b/chardev/char-fd.c
@@ -211,20 +211,31 @@ void qemu_chr_open_fd(Chardev *chr,
   int fd_in, int fd_out)
 {
 FDChardev *s = FD_CHARDEV(chr);
-char *name;
+g_autofree char *name = NULL;
+
+if (fd_out >= 0) {
+qemu_set_nonblock(fd_out);
+}
+
+if (fd_out == fd_in && fd_in >= 0) {
+s->ioc_in = QIO_CHANNEL(qio_channel_file_new_fd(fd_in));
+name = g_strdup_printf("chardev-file-%s", chr->label);
+qio_channel_set_name(QIO_CHANNEL(s->ioc_in), name);
+s->ioc_out = QIO_CHANNEL(object_ref(s->ioc_in));
+return;
+}
 
 if (fd_in >= 0) {
 s->ioc_in = QIO_CHANNEL(qio_channel_file_new_fd(fd_in));
 name = g_strdup_printf("chardev-file-in-%s", chr->label);
 qio_channel_set_name(QIO_CHANNEL(s->ioc_in), name);
-g_free(name);
 }
+
 if (fd_out >= 0) {
 s->ioc_out = QIO_CHANNEL(qio_channel_file_new_fd(fd_out));
+g_free(name);
 name = g_strdup_printf("chardev-file-out-%s", chr->label);
 qio_channel_set_name(QIO_CHANNEL(s->ioc_out), name);
-g_free(name);
-qemu_set_nonblock(fd_out);
 }
 }
 
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 03/11] chardev: remove needless class method

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

"chr_option_parsed" is only implemented by the "mux" chardev, we can
specialize the code there to avoid the needless generic class method.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/chardev/char.h | 1 -
 chardev/char-mux.c | 6 ++
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/include/chardev/char.h b/include/chardev/char.h
index 7c0444f90d..589e7fe46d 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -273,7 +273,6 @@ struct ChardevClass {
 void (*chr_set_echo)(Chardev *chr, bool echo);
 void (*chr_set_fe_open)(Chardev *chr, int fe_open);
 void (*chr_be_event)(Chardev *s, QEMUChrEvent event);
-void (*chr_options_parsed)(Chardev *chr);
 };
 
 Chardev *qemu_chardev_new(const char *id, const char *typename,
diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index 5baf419010..ada0c6866f 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -386,10 +386,9 @@ void suspend_mux_open(void)
 static int chardev_options_parsed_cb(Object *child, void *opaque)
 {
 Chardev *chr = (Chardev *)child;
-ChardevClass *class = CHARDEV_GET_CLASS(chr);
 
-if (!chr->be_open && class->chr_options_parsed) {
-class->chr_options_parsed(chr);
+if (!chr->be_open && CHARDEV_IS_MUX(chr)) {
+open_muxes(chr);
 }
 
 return 0;
@@ -412,7 +411,6 @@ static void char_mux_class_init(ObjectClass *oc, void *data)
 cc->chr_accept_input = mux_chr_accept_input;
 cc->chr_add_watch = mux_chr_add_watch;
 cc->chr_be_event = mux_chr_be_event;
-cc->chr_options_parsed = open_muxes;
 cc->chr_update_read_handler = mux_chr_update_read_handlers;
 }
 
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 02/11] chardev/socket: print a more correct command-line address

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Better reflect the command line version of the socket address arguments,
following the now recommended long-form opt=on syntax.

Complement/fixes commit 9d902d51 "chardev: do not use short form boolean
options in non-QemuOpts character device descriptions".

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-socket.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index d0fb545963..c43668cc15 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -468,9 +468,9 @@ static char *qemu_chr_socket_address(SocketChardev *s, 
const char *prefix)
 
 #ifdef CONFIG_LINUX
 if (sa->has_abstract && sa->abstract) {
-abstract = ",abstract";
+abstract = ",abstract=on";
 if (sa->has_tight && sa->tight) {
-tight = ",tight";
+tight = ",tight=on";
 }
 }
 #endif
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 04/11] chardev: add some comments about the class methods

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
---
 include/chardev/char.h | 33 +
 1 file changed, 33 insertions(+)

diff --git a/include/chardev/char.h b/include/chardev/char.h
index 589e7fe46d..a319b5fdff 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -254,24 +254,57 @@ struct ChardevClass {
 
 bool internal; /* TODO: eventually use TYPE_USER_CREATABLE */
 bool supports_yank;
+
+/* parse command line options and populate QAPI @backend */
 void (*parse)(QemuOpts *opts, ChardevBackend *backend, Error **errp);
 
+/* called after construction, open/starts the backend */
 void (*open)(Chardev *chr, ChardevBackend *backend,
  bool *be_opened, Error **errp);
 
+/* write buf to the backend */
 int (*chr_write)(Chardev *s, const uint8_t *buf, int len);
+
+/*
+ * Read from the backend (blocking). A typical front-end will instead rely
+ * on chr_can_read/chr_read being called when polling/looping.
+ */
 int (*chr_sync_read)(Chardev *s, const uint8_t *buf, int len);
+
+/* create a watch on the backend */
 GSource *(*chr_add_watch)(Chardev *s, GIOCondition cond);
+
+/* update the backend internal sources */
 void (*chr_update_read_handler)(Chardev *s);
+
+/* send an ioctl to the backend */
 int (*chr_ioctl)(Chardev *s, int cmd, void *arg);
+
+/* get ancillary-received fds during last read */
 int (*get_msgfds)(Chardev *s, int* fds, int num);
+
+/* set ancillary fds to be sent with next write */
 int (*set_msgfds)(Chardev *s, int *fds, int num);
+
+/* accept the given fd */
 int (*chr_add_client)(Chardev *chr, int fd);
+
+/* wait for a connection */
 int (*chr_wait_connected)(Chardev *chr, Error **errp);
+
+/* disconnect a connection */
 void (*chr_disconnect)(Chardev *chr);
+
+/* called by frontend when it can read */
 void (*chr_accept_input)(Chardev *chr);
+
+/* set terminal echo */
 void (*chr_set_echo)(Chardev *chr, bool echo);
+
+/* notify the backend of frontend open state */
 void (*chr_set_fe_open)(Chardev *chr, int fe_open);
+
+/* handle various events */
 void (*chr_be_event)(Chardev *s, QEMUChrEvent event);
 };
 
-- 
2.32.0.264.g75ae10bc75

[PATCH v2 09/11] chardev: give some context on chardev-add error

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Description from Daniel P. Berrangé:
> The original code reported:
>
>  "attempt to add duplicate property 'char2' to object (type 'container')"
>
> Since adding yank support, the current code reports
>
>  "duplicate yank instance"
>
> With this patch applied it now reports:
>
>  "Failed to add chardev 'char2': duplicate yank instance"
>
> This is marginally better, but still not great, not that the original
> error was great either.
>
> It would be nice if we could report
>
>   "chardev with id 'char2' already exists"

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1984721

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
---
 chardev/char.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index d959eec522..f59a61774b 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -1031,27 +1031,26 @@ Chardev *qemu_chardev_new(const char *id, const char 
*typename,
 ChardevReturn *qmp_chardev_add(const char *id, ChardevBackend *backend,
Error **errp)
 {
+ERRP_GUARD();
 const ChardevClass *cc;
 ChardevReturn *ret;
-Chardev *chr;
+g_autoptr(Chardev) chr = NULL;
 
 cc = char_get_class(ChardevBackendKind_str(backend->type), errp);
 if (!cc) {
-return NULL;
+goto err;
 }
 
 chr = chardev_new(id, object_class_get_name(OBJECT_CLASS(cc)),
   backend, NULL, false, errp);
 if (!chr) {
-return NULL;
+goto err;
 }
 
 if (!object_property_try_add_child(get_chardevs_root(), id, OBJECT(chr),
errp)) {
-object_unref(OBJECT(chr));
-return NULL;
+goto err;
 }
-object_unref(OBJECT(chr));
 
 ret = g_new0(ChardevReturn, 1);
 if (CHARDEV_IS_PTY(chr)) {
@@ -1060,6 +1059,10 @@ ChardevReturn *qmp_chardev_add(const char *id, 
ChardevBackend *backend,
 }
 
 return ret;
+
+err:
+error_prepend(errp, "Failed to add chardev '%s': ", id);
+return NULL;
 }
 
 ChardevReturn *qmp_chardev_change(const char *id, ChardevBackend *backend,
-- 
2.32.0.264.g75ae10bc75

Re: [PATCH] chardev: give some context on chardev-add error

2021-08-04 Thread Marc-André Lureau

Hi

On Tue, Aug 3, 2021 at 6:39 PM Daniel P. Berrangé 
wrote:

> On Tue, Aug 03, 2021 at 04:02:29PM +0400, marcandre.lur...@redhat.com
> wrote:
> > From: Marc-André Lureau 
> >
> > Related to:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1984721
>
> It is preferrable to describe the problem & approach in the comit
> message, rather than leaving people to read through countless bug
> comments to understand it.
>
>
I'll update the commit message with your comment.

The original code reported:
>
>  "attempt to add duplicate property 'char2' to object (type 'container')"
>
>
> Since adding yank support, the current code reports
>
>  "duplicate yank instance"
>
> With this patch applied it now reports:
>
>  "Failed to add chardev 'char2': duplicate yank instance"
>
> This is marginally better, but still not great, not that the original
> error was great either.
>
> It would be nice if we could report
>
>   "chardev with id 'char2' already exists"
>

I can add a patch for that.


> > Signed-off-by: Marc-André Lureau 
>

thanks

> ---
> >  chardev/char.c | 15 +--
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/chardev/char.c b/chardev/char.c
> > index d959eec522..f59a61774b 100644
> > --- a/chardev/char.c
> > +++ b/chardev/char.c
> > @@ -1031,27 +1031,26 @@ Chardev *qemu_chardev_new(const char *id, const
> char *typename,
> >  ChardevReturn *qmp_chardev_add(const char *id, ChardevBackend *backend,
> > Error **errp)
> >  {
> > +ERRP_GUARD();
> >  const ChardevClass *cc;
> >  ChardevReturn *ret;
> > -Chardev *chr;
> > +g_autoptr(Chardev) chr = NULL;
> >
> >  cc = char_get_class(ChardevBackendKind_str(backend->type), errp);
> >  if (!cc) {
> > -return NULL;
> > +goto err;
> >  }
> >
> >  chr = chardev_new(id, object_class_get_name(OBJECT_CLASS(cc)),
> >backend, NULL, false, errp);
> >  if (!chr) {
> > -return NULL;
> > +goto err;
> >  }
> >
> >  if (!object_property_try_add_child(get_chardevs_root(), id,
> OBJECT(chr),
> > errp)) {
> > -object_unref(OBJECT(chr));
> > -return NULL;
> > +goto err;
> >  }
> > -object_unref(OBJECT(chr));
> >
> >  ret = g_new0(ChardevReturn, 1);
> >  if (CHARDEV_IS_PTY(chr)) {
> > @@ -1060,6 +1059,10 @@ ChardevReturn *qmp_chardev_add(const char *id,
> ChardevBackend *backend,
> >  }
> >
> >  return ret;
> > +
> > +err:
> > +error_prepend(errp, "Failed to add chardev '%s': ", id);
>
> Lowercase 'f' perhaps ?
>

We generally start error messages with an uppercase in this unit, but it's
not very consistent.


> > +return NULL;
> >  }
>
> A weak
>
>  Reviewed-by: Daniel P. Berrangé 
>
> because it is not quite so bad as current code
>
>
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-
> https://www.instagram.com/dberrange :|
>
>
>

-- 
Marc-André Lureau

[PATCH v2 00/11] chardev related fixes

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Hi,

Those patches have been sent earlier. I compiled them here, as I intend to send
a pull request for 6.1. We may want to drop the last 2/3 patches since we are
in freeze though.

Marc-André Lureau (11):
  util: fix abstract socket path copy
  chardev/socket: print a more correct command-line address
  chardev: remove needless class method
  chardev: add some comments about the class methods
  chardev: mark explicitly first argument as poisoned
  chardev: fix fd_chr_add_watch() when in != out
  chardev: fix qemu_chr_open_fd() being called with fd=-1
  chardev: fix qemu_chr_open_fd() with fd_in==fd_out
  chardev: give some context on chardev-add error
  chardev: report a simpler error about duplicated id
  chardev: reuse qmp_chardev_new()

 include/chardev/char-fe.h |   8 ++-
 include/chardev/char.h|  34 ++-
 chardev/char-fd.c | 119 ++
 chardev/char-fe.c |   2 +-
 chardev/char-mux.c|   6 +-
 chardev/char-socket.c |   4 +-
 chardev/char.c|  33 ++-
 hw/char/cadence_uart.c|   2 +-
 hw/char/cmsdk-apb-uart.c  |   2 +-
 hw/char/ibex_uart.c   |   2 +-
 hw/char/nrf51_uart.c  |   2 +-
 hw/char/serial.c  |   2 +-
 hw/char/virtio-console.c  |   2 +-
 hw/usb/redirect.c |   2 +-
 hw/virtio/vhost-user.c|   2 +-
 monitor/monitor.c |   2 +-
 net/vhost-user.c  |   4 +-
 util/qemu-sockets.c   |   5 +-
 18 files changed, 184 insertions(+), 49 deletions(-)

-- 
2.32.0.264.g75ae10bc75

Re: [PATCH v4 00/13] new plugin argument passing scheme

2021-08-04 Thread Alex Bennée



Mahmoud Mandour  writes:

> Hello,
>
> This series removes passing arguments to plugins through "arg=" since
> it's redundant and reduces readability especially when the argument
> itself is composed of a name and a value.


Queued to plugins/next, thanks.

-- 
Alex Bennée

[PATCH v2 01/11] util: fix abstract socket path copy

2021-08-04 Thread marcandre . lureau

From: Marc-André Lureau 

Commit 776b97d360 "qemu-sockets: add abstract UNIX domain socket
support" neglected to update socket_sockaddr_to_address_unix() and
copied the whole sun_path without taking "salen" into account.

Later, commit 3b14b4ec49 "sockets: Fix socket_sockaddr_to_address_unix()
for abstract sockets" handled the abstract UNIX path, by stripping the
leading \0 character and fixing address details, but didn't use salen
either.

Not taking "salen" into account may result in incorrect "path" being
returned in monitors commands, as we read past the address which is not
necessarily \0-terminated.

Fixes: 776b97d3605ed0fc94443048fdf988c7725e38a9
Fixes: 3b14b4ec49a801067da19d6b8469eb1c1911c020
Signed-off-by: Marc-André Lureau 
Reviewed-by: xiaoqiang zhao 
Reviewed-by: Daniel P. Berrangé 
---
 util/qemu-sockets.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 080a240b74..f2f3676d1f 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -1345,13 +1345,16 @@ socket_sockaddr_to_address_unix(struct sockaddr_storage 
*sa,
 SocketAddress *addr;
 struct sockaddr_un *su = (struct sockaddr_un *)sa;
 
+assert(salen >= sizeof(su->sun_family) + 1 &&
+   salen <= sizeof(struct sockaddr_un));
+
 addr = g_new0(SocketAddress, 1);
 addr->type = SOCKET_ADDRESS_TYPE_UNIX;
 #ifdef CONFIG_LINUX
 if (!su->sun_path[0]) {
 /* Linux abstract socket */
 addr->u.q_unix.path = g_strndup(su->sun_path + 1,
-sizeof(su->sun_path) - 1);
+salen - sizeof(su->sun_family) - 1);
 addr->u.q_unix.has_abstract = true;
 addr->u.q_unix.abstract = true;
 addr->u.q_unix.has_tight = true;
-- 
2.32.0.264.g75ae10bc75

Re: [PATCH] hw/char/virtio-serial-bus: fix: Unpop throttled VirtQueueElement to queue before discard vq data

2021-08-04 Thread Stefan Hajnoczi

On Mon, Aug 02, 2021 at 05:01:14PM +0200, Laurent Vivier wrote:
> On 30/07/2021 03:58, AIERPATIJIANG1 [艾尔帕提江·阿布都赛买提] wrote:
> > Ports enter a "throttled" state when writing to the chardev would block.
> > The current output VirtQueueElement is kept around until the chardev
> > becomes writable again.
> > 
> > Because closing the virtio serial device does not reset the queue, we cannot
> > directly discard this element,  otherwise the control variables of the front
> > and back ends of the queue are inconsistent such as used_index. We should 
> > unpop the
> > VirtQueueElement to queue, let discard_vq_data process it.
> > 
> > The test environment:
> > kernel: linux-5.12
> > Qemu command:
> > Qemu-system-x86 -machine pc,accel=kvm \
> > -cpu host,host-phys-bits \
> > -smp 4 \
> > -m 4G \
> > -kernel ./kernel \
> > -display none \
> > -nodefaults \
> > -serial mon:stdio \
> > -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered 
> > rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
> > -drive id=os,file=./disk,if=none \
> > -device virtio-blk-pci,drive=os \
> > -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
> > -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
> >   -device 
> > virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> > 
> > full up virtio queue after VM started:
> > Cat /large-file > /dev/vport1p1
> > 
> > Host side:
> > Open and close character device sockets repeatedly
> > 
> > After awhile we can’t write any request to /dev/vport1p1 at VM side, VM 
> > kernel soft lockup at drivers/char/virtio_console.c: __send_to_port
> > 
> > 
> > Signed-off-by: Arafatms 
> > 
> > diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > index dd6bc27b3b..36236defdf 100644
> > --- a/hw/char/virtio-serial-bus.c
> > +++ b/hw/char/virtio-serial-bus.c
> > @@ -150,8 +150,12 @@ static void discard_vq_data(VirtQueue *vq, 
> > VirtIODevice *vdev)
> > 
> > static void discard_throttle_data(VirtIOSerialPort *port)
> > {
> > +if (!virtio_queue_ready(port->ovq)) {
> > +return;
> > +}
> 
> Why do we need to check virtio_queue_ready() now?

I'm not sure why this is necessary either. We don't need the vring to be
functional, we're just trying to clean up our internal state.

> > +
> >  if (port->elem) {
> > -virtqueue_detach_element(port->ovq, port->elem, 0);
> > +virtqueue_unpop(port->ovq, port->elem, 0);
> >  g_free(port->elem);
> >  port->elem = NULL;
> >  }
> > 
> 
> It seems the problem you report can only happen when the port is closed from 
> the host side
> (because from the guest side I guess the cleanup is done by the guest driver).
> 
> Stefan, you have introduced the function discard_throttle_data() in:
> 
>   d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() call")
> 
> do you remember why you prefered to use virtqueue_detach_element() rather than
> virtqueue_unpop() (or virtqueue_discard() at this time, see 27e57efe32f5 
> ("virtio: rename
> virtqueue_discard to virtqueue_unpop"))

I don't remember. virtqueue_unpop() cannot handle out-of-order element
processing, but it seems safe to use here since port->elem is the last
element we pop off ovq.

It's probably just a bug in my commit d4c19cdeeb2f. I may have thought
the virtqueue is never touched again after reset (true), close (false),
and remove port (false?).

Please add:

  Fixes: d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() 
call")

Stefan


signature.asc
Description: PGP signature

Re: [PATCH 1/2] fuzz: use ITIMER_REAL for timeouts

2021-08-04 Thread Darren Kenny

On Wednesday, 2021-08-04 at 09:56:20 -04, Alexander Bulekov wrote:
> Using ITIMER_VIRTUAL is a bad idea, if the fuzzer hits a blocking
> syscall - e.g. ppoll with a NULL timespec. This causes timeout issues
> while fuzzing some block-device code. Fix that by using wall-clock time.
> This might cause inputs to timeout sometimes due to scheduling
> effects/ambient load, but it is better than bringing the entire fuzzing
> process to a halt.
>
> Based-on: <20210713150037.9297-1-alx...@bu.edu>
> Signed-off-by: Alexander Bulekov 

Reviewed-by: Darren Kenny 

> ---
>  tests/qtest/fuzz/generic_fuzz.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c
> index 3e8ce29227..de427a3727 100644
> --- a/tests/qtest/fuzz/generic_fuzz.c
> +++ b/tests/qtest/fuzz/generic_fuzz.c
> @@ -695,7 +695,7 @@ static void generic_fuzz(QTestState *s, const unsigned 
> char *Data, size_t Size)
>  while (cmd && Size) {
>  /* Reset the timeout, each time we run a new command */
>  if (timeout) {
> -setitimer(ITIMER_VIRTUAL, &timer, NULL);
> +setitimer(ITIMER_REAL, &timer, NULL);
>  }
>  
>  /* Get the length until the next command or end of input */
> -- 
> 2.30.2

Re: [PATCH v5 0/2] plugins/cache: multicore cache modelling

2021-08-04 Thread Alex Bennée



Mahmoud Mandour  writes:

> On Tue, Aug 3, 2021 at 11:10 PM Alex Bennée  wrote:
>
>  Mahmoud Mandour  writes:
>
>  > Hello,
>  >
>  > This series introduce multicore cache modelling in contrib/plugins/cache.c
>  >
>  > Multi-core cache modelling is handled such that for full-system
>  > emulation, a private L1 cache is maintained to each core available to
>  > the system. For multi-threaded userspace emulation, a static number of
>  > cores is maintained for the overall system, and every memory access go
>  > through one of these, even if the number of fired threads is more than
>  > that number.
>
>  Queued to plugins/next, thanks.
>
> From what I can see from your fork, qemu/cache.c at plugins/next · 
> stsquad/qemu · GitHub, 
> here, I think you enqueued v4 of the patches

No I just haven't re-pushed the branch yet.

>  
>  -- 
>  Alex Bennée


-- 
Alex Bennée

Re: [PATCH 2/2] fuzz: unblock SIGALRM so the timeout works

2021-08-04 Thread Darren Kenny

On Wednesday, 2021-08-04 at 09:56:21 -04, Alexander Bulekov wrote:
> The timeout mechanism wont work if SIGALRM is blocked. This changes

NIT: s/wont/won't/
 s/changes/change/

> unmasks SIGALRM when the timer is installed. This doesn't completely
> solve the problem, as the fuzzer could trigger some device activity that
> re-masks SIGALRM. However, there are currently no inputs on OSS-Fuzz
> that re-mask SIGALRM and timeout. If that turns out to be a real issue,
> we could try to hook sigmask-type calls, or use a separate timer thread.
>
> Based-on: <20210713150037.9297-1-alx...@bu.edu>
> Signed-off-by: Alexander Bulekov 

Reviewed-by: Darren Kenny 

> ---
>  tests/qtest/fuzz/generic_fuzz.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c
> index de427a3727..dd7e25851c 100644
> --- a/tests/qtest/fuzz/generic_fuzz.c
> +++ b/tests/qtest/fuzz/generic_fuzz.c
> @@ -670,6 +670,7 @@ static void generic_fuzz(QTestState *s, const unsigned 
> char *Data, size_t Size)
>  if (fork() == 0) {
>  struct sigaction sact;
>  struct itimerval timer;
> +sigset_t set;
>  /*
>   * Sometimes the fuzzer will find inputs that take quite a long time 
> to
>   * process. Often times, these inputs do not result in new coverage.
> @@ -684,6 +685,10 @@ static void generic_fuzz(QTestState *s, const unsigned 
> char *Data, size_t Size)
>  sact.sa_handler = handle_timeout;
>  sigaction(SIGALRM, &sact, NULL);
>  
> +sigemptyset(&set);
> +sigaddset(&set, SIGALRM);
> +pthread_sigmask(SIG_UNBLOCK, &set, NULL);
> +
>  memset(&timer, 0, sizeof(timer));
>  timer.it_value.tv_sec = timeout / USEC_IN_SEC;
>  timer.it_value.tv_usec = timeout % USEC_IN_SEC;
> -- 
> 2.30.2

Re: [PATCH 0/7] floppy: build as modules.

2021-08-04 Thread Philippe Mathieu-Daudé

+Mark

On 8/4/21 4:27 PM, Gerd Hoffmann wrote:
> Some code shuffling needed beforehand due to floppy being part of
> several platforms.  While being at it also make floppy optional
> in pc machine type.

>   floppy: move fdctrl_init_sysbus
>   floppy: move sun4m_fdctrl_init

https://www.mail-archive.com/qemu-block@nongnu.org/msg84008.html

Mark suggested:

  You may be able to simplify this further by removing the
  global legacy init functions fdctrl_init_sysbus() and
  sun4m_fdctrl_init(): from what I can see fdctrl_init_sysbus()
  is only used in hw/mips/jazz.c and sun4m_fdctrl_init() is only
  used in hw/sparc/sun4m.c so you might as well inline them or
  move the functions to the relevant files.

I did it and plan to send during 6.2. Sounds simpler than module.
You could easily rebase your series on top (or I can include your
patches while sending).

Re: [PATCH] linux-user/elfload: byteswap i386 registers when dumping core

2021-08-04 Thread Philippe Mathieu-Daudé

On 8/4/21 4:12 PM, Laurent Vivier wrote:
> Le 03/08/2021 à 20:34, Peter Maydell a écrit :
>> On Tue, 3 Aug 2021 at 18:21, Ilya Leoshkevich  wrote:
>>>
>>> Core dumps from emulating x86_64 on big-endian hosts contain incorrect
>>> register values.
>>>
>>> Signed-off-by: Ilya Leoshkevich 
>>
>>
>> Looks like these two were the only two guest arch versions of this
>> function that were missing the tswapreg calls...
>>
>> Reviewed-by: Peter Maydell 
> 
> Do we want this in 6.1?

Safe enough, I'd rather see it in, since there are recent interest in
emulating x86 binaries on s390x... My 2 cents ;)

Re: [PATCH 0/3] docs: convert qapi-code-gen.txt to qapi-code-gen.rst

2021-08-04 Thread Peter Maydell

On Wed, 4 Aug 2021 at 15:49, John Snow  wrote:
>
>
>
> On Wed, Aug 4, 2021 at 4:48 AM Markus Armbruster  wrote:
>>
>> For 6.1, or "don't rock the boat now"?
>
>
> No preference. In response to the same question on the other doc conversion 
> patch, "No preference."

I have no strong preference either. Docs changes are about
as safe as it gets so I don't mind them in rc3.

-- PMM

improving logging of hanging tests in CI

2021-08-04 Thread Peter Maydell

Some of our tests in 'make check' hang intermittently. At the moment
we have a path to diagnosing these when they happen via my ad-hoc CI
scripts: I can log into the relevant machine and manually look at
what has hung, attach gdb, etc. However, for the gitlab CI this
doesn't work.

Is it possible to stick something into the 'make check' framework
that does something like this:
 * for each test run, if it hasn't exited after 5 minutes,
   assume it has hung
 * in that case, print "ERROR: test $WHATEVER hung" to stdout
 * run something like the below script to capture backtraces
   (which is just something I threw together this afternoon and
   could probably be improved)
 * kill the offending subtree of processes
 * make sure 'make' exits with an error

We'd need to make sure that the CI stuff had 'gdb' installed
(and that the CI machine config lets gdb attach to processes
by PID, which we can for our own runners even if the gitlab
stock setup forbids it.)

The idea is to at least get a backtrace of a hung test into the
logs, so we have some idea of what happened.

===backtrace-process-tree===
#!/bin/bash -e
# backtrace-process-tree: print a thread backtrace of specified
# process and all its descendants.
# Copyright 2021 Linaro
# License GPL-v2-or-later

if [ $# != 1 ]; then
echo "Usage: backtrace-process-id PID"
exit 1
fi

TOPPID="$1"

if [ ! -e "/proc/$TOPPID" ]; then
echo "$TOPPID not a PID of a running process?"
exit 1
fi

bt_me_and_children() {
  ME="$1"
  echo "==="
  echo "PROCESS: $ME"
  ps -ww -f -p "$ME" | tail -1
  gdb --nx --batch -ex 'thread apply all bt' /proc/"$ME"/exe "$ME"
  echo

  for child in $(pgrep -P "$ME"); do
  bt_me_and_children $child
  done
}

echo "Process tree:"
pstree -pT "$TOPPID"

bt_me_and_children "$TOPPID"
===endit===

-- PMM

Re: [PATCH v2] block/io_uring: resubmit when result is -EAGAIN

2021-08-04 Thread Stefano Garzarella


On Mon, Aug 02, 2021 at 02:40:36PM +0200, Kevin Wolf wrote:

Am 29.07.2021 um 11:10 hat Fabian Ebner geschrieben:

Linux SCSI can throw spurious -EAGAIN in some corner cases in its
completion path, which will end up being the result in the completed
io_uring request.

Resubmitting such requests should allow block jobs to complete, even
if such spurious errors are encountered.

Co-authored-by: Stefan Hajnoczi 
Reviewed-by: Stefano Garzarella 
Signed-off-by: Fabian Ebner 
---

Changes from v1:
* Focus on what's relevant for the patch itself in the commit
  message.
* Add Stefan's comment.
* Add Stefano's R-b tag (I hope that's fine, since there was no
  change code-wise).

 block/io_uring.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 00a3ee9fb8..dfa475cc87 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -165,7 +165,21 @@ static void luring_process_completions(LuringState *s)
 total_bytes = ret + luringcb->total_read;

 if (ret < 0) {
-if (ret == -EINTR) {
+/*
+ * Only writev/readv/fsync requests on regular files or host block
+ * devices are submitted. Therefore -EAGAIN is not expected but 
it's
+ * known to happen sometimes with Linux SCSI. Submit again and hope
+ * the request completes successfully.
+ *
+ * For more information, see:
+ * 
https://lore.kernel.org/io-uring/20210727165811.284510-3-ax...@kernel.dk/T/#u
+ *
+ * If the code is changed to submit other types of requests in the
+ * future, then this workaround may need to be extended to deal 
with
+ * genuine -EAGAIN results that should not be resubmitted
+ * immediately.
+ */
+if (ret == -EINTR || ret == -EAGAIN) {
 luring_resubmit(s, luringcb);
 continue;
 }


Reviewed-by: Kevin Wolf 

Question about the preexisting code, though: luring_resubmit() requires
that the caller calls ioq_submit() later so that the request doesn't
just sit in a queue without getting any attention, but actually gets
submitted to the kernel.

In the call chain ioq_submit() -> luring_process_completions() ->
luring_resubmit(), who takes care of that?


Mmm, good point.
There should be the same problem with ioq_submit() -> 
luring_process_completions() -> luring_resubmit_short_read() -> 
luring_resubmit().


Should we schedule a BH for example in luring_resubmit() to make sure 
that ioq_submit() is invoked after a resubmission?


Thanks,
Stefano

Re: [PATCH 0/3] docs: convert qapi-code-gen.txt to qapi-code-gen.rst

2021-08-04 Thread John Snow

On Wed, Aug 4, 2021 at 4:48 AM Markus Armbruster  wrote:

> Peter Maydell  writes:
>
> > On Wed, 21 Jul 2021 at 00:58, John Snow  wrote:
> >>
> >> Patch 1 does (roughly) the bare minimum, patch 2 adds some formatting,
> >> and patch 3 adds cross-references.
> >>
> >> John Snow (3):
> >>   docs: convert qapi-code-gen.txt to ReST
> >>   docs/qapi-code-gen: Beautify formatting
> >>   docs/qapi-code-gen: add cross-references
> >
> > Reviewed-by: Peter Maydell 
>
> Let's queue this after my "[PATCH] docs/devel/qapi-code-gen: Update
> examples to match current code".
>
>
Sure. Do you need a rebase? (I don't see that patch in origin/master right
now.)


> For 6.1, or "don't rock the boat now"?
>

No preference. In response to the same question on the other doc conversion
patch, "No preference."

[RFC PATCH] vhost-vdpa: Do not send empty IOTLB update batches

2021-08-04 Thread Eugenio Pérez

With the introduction of the batch hinting, meaningless batches can be
created with no IOTLB updates if the memory region was skipped by
vhost_vdpa_listener_skipped_section. This is the case of host notifiers
memory regions, but others could fall on this category. This caused the
vdpa device to receive dma mapping settings with no changes, a possibly
expensive operation for nothing.

To avoid that, VHOST_IOTLB_BATCH_BEGIN hint is delayed until we have a
meaningful (not skipped section) mapping or unmapping operation, and
VHOST_IOTLB_BATCH_END is not written unless at least one of _UPDATE /
_INVALIDATE has been issued.

Signed-off-by: Eugenio Pérez 
---
 include/hw/virtio/vhost-vdpa.h |  1 +
 hw/virtio/vhost-vdpa.c | 38 +++---
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index e98e327f12..0a7edbe4eb 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -23,6 +23,7 @@ typedef struct vhost_vdpa {
 int device_fd;
 int index;
 uint32_t msg_type;
+size_t n_iotlb_sent_batch;
 MemoryListener listener;
 struct vhost_dev *dev;
 VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 6ce94a1f4d..2517fc6103 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -89,19 +89,13 @@ static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, 
hwaddr iova,
 return ret;
 }
 
-static void vhost_vdpa_listener_begin(MemoryListener *listener)
+static void vhost_vdpa_listener_begin_batch(struct vhost_vdpa *v)
 {
-struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
-struct vhost_dev *dev = v->dev;
-struct vhost_msg_v2 msg = {};
 int fd = v->device_fd;
-
-if (!(dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH))) {
-return;
-}
-
-msg.type = v->msg_type;
-msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
+struct vhost_msg_v2 msg = {
+.type = v->msg_type,
+.iotlb.type = VHOST_IOTLB_BATCH_BEGIN,
+};
 
 if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
 error_report("failed to write, fd=%d, errno=%d (%s)",
@@ -120,6 +114,11 @@ static void vhost_vdpa_listener_commit(MemoryListener 
*listener)
 return;
 }
 
+if (v->n_iotlb_sent_batch == 0) {
+return;
+}
+
+v->n_iotlb_sent_batch = 0;
 msg.type = v->msg_type;
 msg.iotlb.type = VHOST_IOTLB_BATCH_END;
 
@@ -170,6 +169,14 @@ static void vhost_vdpa_listener_region_add(MemoryListener 
*listener,
 
 llsize = int128_sub(llend, int128_make64(iova));
 
+if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
+if (v->n_iotlb_sent_batch == 0) {
+vhost_vdpa_listener_begin_batch(v);
+}
+
+v->n_iotlb_sent_batch++;
+}
+
 ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
  vaddr, section->readonly);
 if (ret) {
@@ -221,6 +228,14 @@ static void vhost_vdpa_listener_region_del(MemoryListener 
*listener,
 
 llsize = int128_sub(llend, int128_make64(iova));
 
+if (v->dev->backend_cap & (0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH)) {
+if (v->n_iotlb_sent_batch == 0) {
+vhost_vdpa_listener_begin_batch(v);
+}
+
+v->n_iotlb_sent_batch++;
+}
+
 ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
 if (ret) {
 error_report("vhost_vdpa dma unmap error!");
@@ -234,7 +249,6 @@ static void vhost_vdpa_listener_region_del(MemoryListener 
*listener,
  * depends on the addnop().
  */
 static const MemoryListener vhost_vdpa_memory_listener = {
-.begin = vhost_vdpa_listener_begin,
 .commit = vhost_vdpa_listener_commit,
 .region_add = vhost_vdpa_listener_region_add,
 .region_del = vhost_vdpa_listener_region_del,
-- 
2.27.0

[PATCH 6/7] tcg/module: Add tlb_flush_page to TCGModuleOps

2021-08-04 Thread Gerd Hoffmann

Move stub from exec-all.h to tcg-module.c.

Signed-off-by: Gerd Hoffmann 
---
 include/exec/exec-all.h  | 3 ---
 include/tcg/tcg-module.h | 3 +++
 accel/tcg/cputlb.c   | 1 +
 accel/tcg/tcg-module.c   | 5 +
 softmmu/physmem.c| 4 ++--
 target/arm/helper.c  | 4 ++--
 6 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ddb1ab797978..43d89699e989 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -328,9 +328,6 @@ static inline void tlb_init(CPUState *cpu)
 static inline void tlb_destroy(CPUState *cpu)
 {
 }
-static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
-{
-}
 static inline void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
 {
 }
diff --git a/include/tcg/tcg-module.h b/include/tcg/tcg-module.h
index b94bfdd362ed..a903e3ee62c0 100644
--- a/include/tcg/tcg-module.h
+++ b/include/tcg/tcg-module.h
@@ -1,8 +1,11 @@
 #ifndef TCG_MODULE_H
 #define TCG_MODULE_H
 
+#include "exec/exec-all.h"
+
 struct TCGModuleOps {
 void (*tlb_flush)(CPUState *cpu);
+void (*tlb_flush_page)(CPUState *cpu, target_ulong addr);
 };
 extern struct TCGModuleOps tcg;
 
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 40c3d1b65ac5..1fcdb71a10a0 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2771,6 +2771,7 @@ uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
 static void tcg_module_ops_tlb(void)
 {
 tcg.tlb_flush = tlb_flush;
+tcg.tlb_flush_page = tlb_flush_page;
 }
 
 type_init(tcg_module_ops_tlb);
diff --git a/accel/tcg/tcg-module.c b/accel/tcg/tcg-module.c
index a1e5728c8c1b..4d62160628bd 100644
--- a/accel/tcg/tcg-module.c
+++ b/accel/tcg/tcg-module.c
@@ -5,6 +5,11 @@ static void update_cpu_stub(CPUState *cpu)
 {
 }
 
+static void tlb_flush_page_stub(CPUState *cpu, target_ulong addr)
+{
+}
+
 struct TCGModuleOps tcg = {
 .tlb_flush = update_cpu_stub,
+.tlb_flush_page = tlb_flush_page_stub,
 };
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index d99b4ce55d8f..845b9e99e819 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -794,7 +794,7 @@ int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr 
len,
 
 in_page = -(addr | TARGET_PAGE_MASK);
 if (len <= in_page) {
-tlb_flush_page(cpu, addr);
+tcg.tlb_flush_page(cpu, addr);
 } else {
 tcg.tlb_flush(cpu);
 }
@@ -825,7 +825,7 @@ void cpu_watchpoint_remove_by_ref(CPUState *cpu, 
CPUWatchpoint *watchpoint)
 {
 QTAILQ_REMOVE(&cpu->watchpoints, watchpoint, entry);
 
-tlb_flush_page(cpu, watchpoint->vaddr);
+tcg.tlb_flush_page(cpu, watchpoint->vaddr);
 
 g_free(watchpoint);
 }
diff --git a/target/arm/helper.c b/target/arm/helper.c
index e0848f9bcea8..047d4360b65f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -772,7 +772,7 @@ static void tlbimva_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 if (tlb_force_broadcast(env)) {
 tlb_flush_page_all_cpus_synced(cs, value);
 } else {
-tlb_flush_page(cs, value);
+tcg.tlb_flush_page(cs, value);
 }
 }
 
@@ -799,7 +799,7 @@ static void tlbimvaa_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 if (tlb_force_broadcast(env)) {
 tlb_flush_page_all_cpus_synced(cs, value);
 } else {
-tlb_flush_page(cs, value);
+tcg.tlb_flush_page(cs, value);
 }
 }
 
-- 
2.31.1

[PATCH 2/7] tcg/module: move hmp.c to tcg module

2021-08-04 Thread Gerd Hoffmann

One little step in moving more code to the tcg modules.

Signed-off-by: Gerd Hoffmann 
---
 accel/tcg/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/meson.build b/accel/tcg/meson.build
index 137a1a44cc0a..d4df7681a811 100644
--- a/accel/tcg/meson.build
+++ b/accel/tcg/meson.build
@@ -15,7 +15,6 @@ specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
 
 specific_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: files(
   'cputlb.c',
-  'hmp.c',
 ))
 
 tcg_module_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], if_true: files(
@@ -23,4 +22,5 @@ tcg_module_ss.add(when: ['CONFIG_SOFTMMU', 'CONFIG_TCG'], 
if_true: files(
   'tcg-accel-ops-mttcg.c',
   'tcg-accel-ops-icount.c',
   'tcg-accel-ops-rr.c',
+  'hmp.c',
 ))
-- 
2.31.1

[PATCH 4/7] tcg/module: add some infrastructure for modular tcg.

2021-08-04 Thread Gerd Hoffmann

Create tcg-module.[ch] files, with struct TCGModuleOps, empty for now.

Followup patches will add function pointers to the struct and stub
functions to tcg-module.c.  That will effectively will switch stubs from
compile-time to runtime, which is needed to build tcg as module.

Signed-off-by: Gerd Hoffmann 
---
 include/tcg/tcg-module.h | 8 
 accel/tcg/tcg-module.c   | 5 +
 accel/tcg/meson.build| 4 
 3 files changed, 17 insertions(+)
 create mode 100644 include/tcg/tcg-module.h
 create mode 100644 accel/tcg/tcg-module.c

diff --git a/include/tcg/tcg-module.h b/include/tcg/tcg-module.h
new file mode 100644
index ..7e87aecb2357
--- /dev/null
+++ b/include/tcg/tcg-module.h
@@ -0,0 +1,8 @@
+#ifndef TCG_MODULE_H
+#define TCG_MODULE_H
+
+struct TCGModuleOps {
+};
+extern struct TCGModuleOps tcg;
+
+#endif /* TCG_MODULE_H */
diff --git a/accel/tcg/tcg-module.c b/accel/tcg/tcg-module.c
new file mode 100644
index ..e864fb20c141
--- /dev/null
+++ b/accel/tcg/tcg-module.c
@@ -0,0 +1,5 @@
+#include "qemu/osdep.h"
+#include "tcg/tcg-module.h"
+
+struct TCGModuleOps tcg = {
+};
diff --git a/accel/tcg/meson.build b/accel/tcg/meson.build
index c1ee9dcaed1f..eaaf7168ffb6 100644
--- a/accel/tcg/meson.build
+++ b/accel/tcg/meson.build
@@ -1,3 +1,7 @@
+specific_ss.add(files(
+  'tcg-module.c',
+))
+
 specific_ss.add(when: 'CONFIG_TCG', if_true: files(
   'cpu-exec-common.c',
 ))
-- 
2.31.1

1 2 3 >

1 - 100 of 243 matches

Mail list logo