Re: [PATCH 3/3] target/ppc: Set 601v exception model id

2021-12-08 Thread Cédric Le Goater

On 12/8/21 13:30, Fabiano Rosas wrote:

The exception model id for 601v has been removed without mention
why. I assume it was inadvertent and restore it here.

Fixes: b632a148b6 ("target-ppc: Use QOM method dispatch for MMU fault handling")
Signed-off-by: Fabiano Rosas 


Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  target/ppc/cpu_init.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 8100b89033..0e1682ddd9 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -4607,6 +4607,7 @@ POWERPC_FAMILY(601v)(ObjectClass *oc, void *data)
  (1ull << MSR_IR) |
  (1ull << MSR_DR);
  pcc->mmu_model = POWERPC_MMU_601;
+pcc->excp_model = POWERPC_EXCP_601;
  pcc->bus_model = PPC_FLAGS_INPUT_6xx;
  pcc->bfd_mach = bfd_mach_ppc_601;
  pcc->flags = POWERPC_FLAG_SE | POWERPC_FLAG_RTC_CLK | 
POWERPC_FLAG_HID0_LE;






Re: [PATCH 1/3] target/ppc: Fix MPCxxx FPU interrupt address

2021-12-08 Thread Cédric Le Goater

On 12/8/21 13:30, Fabiano Rosas wrote:

The Floating-point Unavailable and Decrementer interrupts are being
registered at the same 0x900 address. The FPU should be at 0x800
instead.

Verified on MPC555, MPC860 and MPC885 user manuals.

Reported-by: BALATON Zoltan 
Signed-off-by: Fabiano Rosas 


Reviewed-by: Cédric Le Goater 

Thanks,

C.


---
  target/ppc/cpu_init.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 6695985e9b..55af48769a 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -2180,7 +2180,7 @@ static void init_excp_MPC5xx(CPUPPCState *env)
  env->excp_vectors[POWERPC_EXCP_EXTERNAL] = 0x0500;
  env->excp_vectors[POWERPC_EXCP_ALIGN]= 0x0600;
  env->excp_vectors[POWERPC_EXCP_PROGRAM]  = 0x0700;
-env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0900;
+env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0800;
  env->excp_vectors[POWERPC_EXCP_DECR] = 0x0900;
  env->excp_vectors[POWERPC_EXCP_SYSCALL]  = 0x0C00;
  env->excp_vectors[POWERPC_EXCP_TRACE]= 0x0D00;
@@ -2207,7 +2207,7 @@ static void init_excp_MPC8xx(CPUPPCState *env)
  env->excp_vectors[POWERPC_EXCP_EXTERNAL] = 0x0500;
  env->excp_vectors[POWERPC_EXCP_ALIGN]= 0x0600;
  env->excp_vectors[POWERPC_EXCP_PROGRAM]  = 0x0700;
-env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0900;
+env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0800;
  env->excp_vectors[POWERPC_EXCP_DECR] = 0x0900;
  env->excp_vectors[POWERPC_EXCP_SYSCALL]  = 0x0C00;
  env->excp_vectors[POWERPC_EXCP_TRACE]= 0x0D00;






Re: [PATCH 1/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread Cédric Le Goater

-When the guest runs in "real mode" (in powerpc lingua this means
-with MMU disabled, ie guest effective == guest physical), it only
-has access to a subset of memory and no IOs.
+H_LOGICAL_MEMOP (0xf001)
+
-PAPR provides a set of hypervisor calls to perform cacheable or
+When the guest runs in "real mode" (in powerpc lingua this means with MMU


What's up with 'lingua'? As you already know "lingua" is Brazilian portuguese for 
"tongue"
and it's so weird to be used in this context.

The one English word that I can think of that is closer to "lingua" and would 
make sense here
is 'lingo'. But that is a slang for 'jargon' and not appropriate for a 
technical document
either. "langua" as a short form of "language" seems weird as well. I really 
believe 'jargon'
is a more suitable word here.


As a native speaker: "lingo" would make sense here, though its tone is
a little informal.  "jargon" could also work, but "terminology" would
probably better match the tone of the document.


I changed it to terminology. No need to resend !

Thanks,

C.



Re: [PATCH 3/6] target/arm: Honor TCR_ELx.{I}PS

2021-12-08 Thread Philippe Mathieu-Daudé
Hi Richard,

On 12/9/21 00:11, Richard Henderson wrote:
> This field controls the output (intermediate) physical address size
> of the translation process.  V8 requires to raise an AddressSize
> fault if the page tables are programmed incorrectly, such that any
> intermediate descriptor address, or the final translated address,
> is out of range.
> 

I'd split this patch as:

> Add an outputsize field to ARMVAParameters,
^ 1

> and fill it in during
> aa64_va_parameters.
^2

> Pass the value to check_s2_mmu_setup to use
> instead of the raw PAMax value.
^1

> Test the descaddr as extracted
> from TTBR and from page table entries.
^2

> Restrict descaddrmask so that we won't raise the fault for v7.
^ could be in 1 (simpler) or 2 if you think it makes sense.

This way #1 is a preliminary refactor/cleanup,
and #2 is only the ps field and V8 addition.

> Signed-off-by: Richard Henderson 
> ---
>  target/arm/internals.h |  1 +
>  target/arm/helper.c| 92 +-
>  2 files changed, 65 insertions(+), 28 deletions(-)



Re: [PATCH 2/6] target/arm: Move arm_pamax out of line

2021-12-08 Thread Philippe Mathieu-Daudé
On 12/9/21 00:11, Richard Henderson wrote:
> Signed-off-by: Richard Henderson 
> ---
>  target/arm/internals.h | 19 +--
>  target/arm/helper.c| 22 ++
>  2 files changed, 23 insertions(+), 18 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v4 11/22] target/riscv: Implement AIA hvictl and hviprioX CSRs

2021-12-08 Thread Anup Patel
On Thu, Nov 4, 2021 at 10:19 AM Alistair Francis  wrote:
>
> On Tue, Oct 26, 2021 at 5:39 PM Anup Patel  wrote:
> >
> > The AIA hvictl and hviprioX CSRs allow hypervisor to control
> > interrupts visible at VS-level. This patch implements AIA hvictl
> > and hviprioX CSRs.
> >
> > Signed-off-by: Anup Patel 
> > ---
> >  target/riscv/cpu.h |   1 +
> >  target/riscv/csr.c | 126 +
> >  target/riscv/machine.c |   2 +
> >  3 files changed, 129 insertions(+)
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index 718a95e864..21d9c536ef 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -196,6 +196,7 @@ struct CPURISCVState {
> >  uint64_t htimedelta;
> >
> >  /* Hypervisor controlled virtual interrupt priorities */
> > +target_ulong hvictl;
> >  uint8_t hviprio[64];
> >
> >  /* Virtual CSRs */
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 3a7d89ac34..46d0cabbde 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -219,6 +219,15 @@ static RISCVException hmode32(CPURISCVState *env, int 
> > csrno)
> >
> >  }
> >
> > +static int aia_hmode(CPURISCVState *env, int csrno)
> > +{
> > +if (!riscv_feature(env, RISCV_FEATURE_AIA)) {
> > +return RISCV_EXCP_ILLEGAL_INST;
> > + }
> > +
> > + return hmode(env, csrno);
> > +}
> > +
> >  static int aia_hmode32(CPURISCVState *env, int csrno)
> >  {
> >  if (!riscv_feature(env, RISCV_FEATURE_AIA)) {
> > @@ -1031,6 +1040,9 @@ static RISCVException rmw_sie64(CPURISCVState *env, 
> > int csrno,
> >  uint64_t mask = env->mideleg & S_MODE_INTERRUPTS;
> >
> >  if (riscv_cpu_virt_enabled(env)) {
> > +if (env->hvictl & HVICTL_VTI) {
> > +return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > +}
> >  ret = rmw_vsie64(env, CSR_VSIE, ret_val, new_val, wr_mask);
> >  } else {
> >  ret = rmw_mie64(env, csrno, ret_val, new_val, wr_mask & mask);
> > @@ -1229,6 +1241,9 @@ static RISCVException rmw_sip64(CPURISCVState *env, 
> > int csrno,
> >  uint64_t mask = env->mideleg & sip_writable_mask;
> >
> >  if (riscv_cpu_virt_enabled(env)) {
> > +if (env->hvictl & HVICTL_VTI) {
> > +return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > +}
> >  ret = rmw_vsip64(env, CSR_VSIE, ret_val, new_val, wr_mask);
> >  } else {
> >  ret = rmw_mip64(env, csrno, ret_val, new_val, wr_mask & mask);
> > @@ -1615,6 +1630,110 @@ static RISCVException 
> > write_htimedeltah(CPURISCVState *env, int csrno,
> >  return RISCV_EXCP_NONE;
> >  }
> >
> > +static int read_hvictl(CPURISCVState *env, int csrno, target_ulong *val)
> > +{
> > +*val = env->hvictl;
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +static int write_hvictl(CPURISCVState *env, int csrno, target_ulong val)
> > +{
> > +env->hvictl = val & HVICTL_VALID_MASK;
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +static int read_hvipriox(CPURISCVState *env, int first_index,
> > + uint8_t *iprio, target_ulong *val)
> > +{
> > +int i, irq, rdzero, num_irqs = 4 * (TARGET_LONG_BITS / 32);
>
> You should be checking the CPUs xlen instead of using the hardcoded
> TARGET_LONG_BITS here.

Sure, I will update and use CPU xlen.

Regards,
Anup

>
> Alistair
>
> > +
> > +/* First index has to be multiple of numbe of irqs per register */
> > +if (first_index % num_irqs) {
> > +return (riscv_cpu_virt_enabled(env)) ?
> > +   RISCV_EXCP_VIRT_INSTRUCTION_FAULT : RISCV_EXCP_ILLEGAL_INST;
> > +}
> > +
> > +/* Fill-up return value */
> > +*val = 0;
> > +for (i = 0; i < num_irqs; i++) {
> > +if (riscv_cpu_hviprio_index2irq(first_index + i, , )) {
> > +continue;
> > +}
> > +if (rdzero) {
> > +continue;
> > +}
> > +*val |= ((target_ulong)iprio[irq]) << (i * 8);
> > +}
> > +
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +static int write_hvipriox(CPURISCVState *env, int first_index,
> > +  uint8_t *iprio, target_ulong val)
> > +{
> > +int i, irq, rdzero, num_irqs = 4 * (TARGET_LONG_BITS / 32);
> > +
> > +/* First index has to be multiple of numbe of irqs per register */
> > +if (first_index % num_irqs) {
> > +return (riscv_cpu_virt_enabled(env)) ?
> > +   RISCV_EXCP_VIRT_INSTRUCTION_FAULT : RISCV_EXCP_ILLEGAL_INST;
> > +}
> > +
> > +/* Fill-up priority arrary */
> > +for (i = 0; i < num_irqs; i++) {
> > +if (riscv_cpu_hviprio_index2irq(first_index + i, , )) {
> > +continue;
> > +}
> > +if (rdzero) {
> > +iprio[irq] = 0;
> > +} else {
> > +iprio[irq] = (val >> (i * 8)) & 0xff;
> > +}
> > +}
> > +
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +static int read_hviprio1(CPURISCVState *env, int csrno, target_ulong *val)
> > +{
> 

Re: [PATCH v9 05/10] ACPI ERST: support for ACPI ERST feature

2021-12-08 Thread Ani Sinha
On Wed, Dec 8, 2021 at 10:08 PM Eric DeVolder  wrote:
>
>
>
> On 12/6/21 02:14, Ani Sinha wrote:
> > On Fri, Dec 3, 2021 at 12:39 AM Eric DeVolder  
> > wrote:
> >>
> >> This implements a PCI device for ACPI ERST. This implements the
> >> non-NVRAM "mode" of operation for ERST as it is supported by
> >> Linux and Windows.
> >
> > OK sent some more comments. It will take another pass for me to fully
> > review this.
> >
>
> Hi Ani, thank you for reviewing. I have incorporated your feedback thus far.
> I have v10 ready to go but not sure if your review of v9 is completed yet?

I completed scanning this patch. Don't hold your breath. I review
things when I find gaps in other work and can't promise timely
reviews.
You can send a v10 once you have addressed my last set of comments.



Re: [PATCH v9 05/10] ACPI ERST: support for ACPI ERST feature

2021-12-08 Thread Ani Sinha
On Fri, Dec 3, 2021 at 12:39 AM Eric DeVolder  wrote:
>
> This implements a PCI device for ACPI ERST. This implements the
> non-NVRAM "mode" of operation for ERST as it is supported by
> Linux and Windows.

Few more comments on this patch ...

>
> Signed-off-by: Eric DeVolder 
> ---
>  hw/acpi/Kconfig  |   6 +
>  hw/acpi/erst.c   | 836 
> +++
>  hw/acpi/meson.build  |   1 +
>  hw/acpi/trace-events |  15 +
>  4 files changed, 858 insertions(+)
>  create mode 100644 hw/acpi/erst.c
>
> diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
> index 622b0b5..19caebd 100644
> --- a/hw/acpi/Kconfig
> +++ b/hw/acpi/Kconfig
> @@ -10,6 +10,7 @@ config ACPI_X86
>  select ACPI_HMAT
>  select ACPI_PIIX4
>  select ACPI_PCIHP
> +select ACPI_ERST
>
>  config ACPI_X86_ICH
>  bool
> @@ -60,3 +61,8 @@ config ACPI_HW_REDUCED
>  select ACPI
>  select ACPI_MEMORY_HOTPLUG
>  select ACPI_NVDIMM
> +
> +config ACPI_ERST
> +bool
> +default y
> +depends on ACPI && PCI
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> new file mode 100644
> index 000..4304f55
> --- /dev/null
> +++ b/hw/acpi/erst.c
> @@ -0,0 +1,836 @@
> +/*
> + * ACPI Error Record Serialization Table, ERST, Implementation
> + *
> + * ACPI ERST introduced in ACPI 4.0, June 16, 2009.
> + * ACPI Platform Error Interfaces : Error Serialization
> + *
> + * Copyright (c) 2021 Oracle and/or its affiliates.
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/qdev-core.h"
> +#include "exec/memory.h"
> +#include "qom/object.h"
> +#include "hw/pci/pci.h"
> +#include "qom/object_interfaces.h"
> +#include "qemu/error-report.h"
> +#include "migration/vmstate.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/acpi-defs.h"
> +#include "hw/acpi/aml-build.h"
> +#include "hw/acpi/bios-linker-loader.h"
> +#include "exec/address-spaces.h"
> +#include "sysemu/hostmem.h"
> +#include "hw/acpi/erst.h"
> +#include "trace.h"
> +
> +/* ACPI 4.0: Table 17-16 Serialization Actions */
> +#define ACTION_BEGIN_WRITE_OPERATION 0x0
> +#define ACTION_BEGIN_READ_OPERATION  0x1
> +#define ACTION_BEGIN_CLEAR_OPERATION 0x2
> +#define ACTION_END_OPERATION 0x3
> +#define ACTION_SET_RECORD_OFFSET 0x4
> +#define ACTION_EXECUTE_OPERATION 0x5
> +#define ACTION_CHECK_BUSY_STATUS 0x6
> +#define ACTION_GET_COMMAND_STATUS0x7
> +#define ACTION_GET_RECORD_IDENTIFIER 0x8
> +#define ACTION_SET_RECORD_IDENTIFIER 0x9
> +#define ACTION_GET_RECORD_COUNT  0xA
> +#define ACTION_BEGIN_DUMMY_WRITE_OPERATION   0xB
> +#define ACTION_RESERVED  0xC
> +#define ACTION_GET_ERROR_LOG_ADDRESS_RANGE   0xD
> +#define ACTION_GET_ERROR_LOG_ADDRESS_LENGTH  0xE
> +#define ACTION_GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES 0xF
> +#define ACTION_GET_EXECUTE_OPERATION_TIMINGS 0x10
> +
> +/* ACPI 4.0: Table 17-17 Command Status Definitions */
> +#define STATUS_SUCCESS0x00
> +#define STATUS_NOT_ENOUGH_SPACE   0x01
> +#define STATUS_HARDWARE_NOT_AVAILABLE 0x02
> +#define STATUS_FAILED 0x03
> +#define STATUS_RECORD_STORE_EMPTY 0x04
> +#define STATUS_RECORD_NOT_FOUND   0x05
> +
> +
> +/* UEFI 2.1: Appendix N Common Platform Error Record */
> +#define UEFI_CPER_RECORD_MIN_SIZE 128U
> +#define UEFI_CPER_RECORD_LENGTH_OFFSET 20U
> +#define UEFI_CPER_RECORD_ID_OFFSET 96U
> +#define IS_UEFI_CPER_RECORD(ptr) \
> +(((ptr)[0] == 'C') && \
> + ((ptr)[1] == 'P') && \
> + ((ptr)[2] == 'E') && \
> + ((ptr)[3] == 'R'))
> +
> +/*
> + * NOTE that when accessing CPER fields within a record, memcpy()
> + * is utilized to avoid a possible misaligned access on the host.
> + */
> +
> +/*
> + * This implementation is an ACTION (cmd) and VALUE (data)
> + * interface consisting of just two 64-bit registers.
> + */
> +#define ERST_REG_SIZE (16UL)
> +#define ERST_ACTION_OFFSET (0UL) /* action (cmd) */
> +#define ERST_VALUE_OFFSET  (8UL) /* argument/value (data) */
> +
> +/*
> + * ERST_RECORD_SIZE is the buffer size for exchanging ERST
> + * record contents. Thus, it defines the maximum record size.
> + * As this is mapped through a PCI BAR, it must be a power of
> + * two and larger than UEFI_CPER_RECORD_MIN_SIZE.
> + * The backing storage is divided into fixed size "slots",
> + * each ERST_RECORD_SIZE in length, and each "slot"
> + * storing a single record. No attempt at optimizing storage
> + * through compression, compaction, etc is attempted.
> + * NOTE that slot 0 is reserved for the backing storage header.
> + * Depending upon the size of the backing storage, additional
> + * slots will be part of the slot 0 header in order to account
> + * for a record_id for each available remaining slot.
> + */
> +/* 8KiB records, not too 

Re: [PATCH v4 10/22] target/riscv: Implement AIA CSRs for 64 local interrupts on RV32

2021-12-08 Thread Anup Patel
On Thu, Nov 4, 2021 at 10:13 AM Alistair Francis  wrote:
>
> On Tue, Oct 26, 2021 at 5:10 PM Anup Patel  wrote:
> >
> > The AIA specification adds new CSRs for RV32 so that RISC-V hart can
> > support 64 local interrupts on both RV32 and RV64.
> >
> > Signed-off-by: Anup Patel 
> > ---
> >  target/riscv/cpu.h|  14 +-
> >  target/riscv/cpu_helper.c |  10 +-
> >  target/riscv/csr.c| 560 +++---
> >  target/riscv/machine.c|  10 +-
> >  4 files changed, 474 insertions(+), 120 deletions(-)
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index c47e57efc8..718a95e864 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -159,12 +159,12 @@ struct CPURISCVState {
> >   */
> >  uint64_t mstatus;
> >
> > -target_ulong mip;
> > +uint64_t mip;
> >
> > -uint32_t miclaim;
> > +uint64_t miclaim;
> >
> > -target_ulong mie;
> > -target_ulong mideleg;
> > +uint64_t mie;
> > +uint64_t mideleg;
> >
> >  target_ulong satp;   /* since: priv-1.10.0 */
> >  target_ulong stval;
> > @@ -186,7 +186,7 @@ struct CPURISCVState {
> >  /* Hypervisor CSRs */
> >  target_ulong hstatus;
> >  target_ulong hedeleg;
> > -target_ulong hideleg;
> > +uint64_t hideleg;
> >  target_ulong hcounteren;
> >  target_ulong htval;
> >  target_ulong htinst;
> > @@ -399,8 +399,8 @@ void riscv_cpu_list(void);
> >  #ifndef CONFIG_USER_ONLY
> >  bool riscv_cpu_exec_interrupt(CPUState *cs, int interrupt_request);
> >  void riscv_cpu_swap_hypervisor_regs(CPURISCVState *env);
> > -int riscv_cpu_claim_interrupts(RISCVCPU *cpu, uint32_t interrupts);
> > -uint32_t riscv_cpu_update_mip(RISCVCPU *cpu, uint32_t mask, uint32_t 
> > value);
> > +int riscv_cpu_claim_interrupts(RISCVCPU *cpu, uint64_t interrupts);
> > +uint64_t riscv_cpu_update_mip(RISCVCPU *cpu, uint64_t mask, uint64_t 
> > value);
> >  #define BOOL_TO_MASK(x) (-!!(x)) /* helper for riscv_cpu_update_mip value 
> > */
> >  void riscv_cpu_set_rdtime_fn(CPURISCVState *env, uint64_t (*fn)(uint32_t),
> >   uint32_t arg);
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 73ebce1efd..416b11f85c 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -506,7 +506,7 @@ bool riscv_cpu_two_stage_lookup(int mmu_idx)
> >  return mmu_idx & TB_FLAGS_PRIV_HYP_ACCESS_MASK;
> >  }
> >
> > -int riscv_cpu_claim_interrupts(RISCVCPU *cpu, uint32_t interrupts)
> > +int riscv_cpu_claim_interrupts(RISCVCPU *cpu, uint64_t interrupts)
> >  {
> >  CPURISCVState *env = >env;
> >  if (env->miclaim & interrupts) {
> > @@ -517,11 +517,11 @@ int riscv_cpu_claim_interrupts(RISCVCPU *cpu, 
> > uint32_t interrupts)
> >  }
> >  }
> >
> > -uint32_t riscv_cpu_update_mip(RISCVCPU *cpu, uint32_t mask, uint32_t value)
> > +uint64_t riscv_cpu_update_mip(RISCVCPU *cpu, uint64_t mask, uint64_t value)
> >  {
> >  CPURISCVState *env = >env;
> >  CPUState *cs = CPU(cpu);
> > -uint32_t gein, vsgein = 0, old = env->mip;
> > +uint64_t gein, vsgein = 0, old = env->mip;
> >  bool locked = false;
> >
> >  if (riscv_cpu_virt_enabled(env)) {
> > @@ -1244,7 +1244,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> >   */
> >  bool async = !!(cs->exception_index & RISCV_EXCP_INT_FLAG);
> >  target_ulong cause = cs->exception_index & RISCV_EXCP_INT_MASK;
> > -target_ulong deleg = async ? env->mideleg : env->medeleg;
> > +uint64_t deleg = async ? env->mideleg : env->medeleg;
> >  bool write_tval = false;
> >  target_ulong tval = 0;
> >  target_ulong htval = 0;
> > @@ -1311,7 +1311,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> >  cause < TARGET_LONG_BITS && ((deleg >> cause) & 1)) {
> >  /* handle the trap in S-mode */
> >  if (riscv_has_ext(env, RVH)) {
> > -target_ulong hdeleg = async ? env->hideleg : env->hedeleg;
> > +uint64_t hdeleg = async ? env->hideleg : env->hedeleg;
> >
> >  if (env->two_stage_lookup && write_tval) {
> >  /*
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 7ff285282b..3a7d89ac34 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -153,6 +153,15 @@ static RISCVException any32(CPURISCVState *env, int 
> > csrno)
> >
> >  }
> >
> > +static int aia_any32(CPURISCVState *env, int csrno)
> > +{
> > +if (!riscv_feature(env, RISCV_FEATURE_AIA)) {
> > +return RISCV_EXCP_ILLEGAL_INST;
> > +}
> > +
> > +return any32(env, csrno);
> > +}
> > +
> >  static RISCVException smode(CPURISCVState *env, int csrno)
> >  {
> >  if (riscv_has_ext(env, RVS)) {
> > @@ -162,6 +171,24 @@ static RISCVException smode(CPURISCVState *env, int 
> > csrno)
> >  return RISCV_EXCP_ILLEGAL_INST;
> >  }
> >
> > +static int smode32(CPURISCVState *env, int csrno)
> > +{
> > +if (riscv_cpu_mxl(env) != MXL_RV32) {

Re: [PATCH 1/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread Warner Losh
On Wed, Dec 8, 2021 at 6:51 PM David Gibson 
wrote:

> On Wed, Dec 08, 2021 at 03:54:40PM -0300, Daniel Henrique Barboza wrote:
> >
> >
> > On 12/8/21 13:59, lagar...@linux.ibm.com wrote:
> > > From: Leonardo Garcia 
> > >
> > > Signed-off-by: Leonardo Garcia 
> > > ---
> > >   docs/specs/ppc-spapr-hcalls.txt | 92
> -
> > >   1 file changed, 57 insertions(+), 35 deletions(-)
> > >
> > > diff --git a/docs/specs/ppc-spapr-hcalls.txt
> b/docs/specs/ppc-spapr-hcalls.txt
> > > index 93fe3da91b..c69dae535b 100644
> > > --- a/docs/specs/ppc-spapr-hcalls.txt
> > > +++ b/docs/specs/ppc-spapr-hcalls.txt
> > > @@ -1,9 +1,15 @@
> > > -When used with the "pseries" machine type, QEMU-system-ppc64
> implements
> > > -a set of hypervisor calls using a subset of the server "PAPR"
> specification
> > > -(IBM internal at this point), which is also what IBM's proprietary
> hypervisor
> > > -adheres too.
> > > +sPAPR hypervisor calls
> > > +--
> > > -The subset is selected based on the requirements of Linux as a guest.
> > > +When used with the ``pseries`` machine type, ``qemu-system-ppc64``
> implements
> > > +a set of hypervisor calls (a.k.a. hcalls) defined in the `Linux on
> Power
> > > +Architecture Reference document (LoPAR)
> > > +<
> https://cdn.openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf
> >`_.
> > > +This document is a subset of the Power Architecture Platform
> Reference (PAPR+)
> > > +specification (IBM internal only), which is what PowerVM, the IBM
> proprietary
> > > +hypervisor, adheres to.
> > > +
> > > +The subset in LoPAR is selected based on the requirements of Linux as
> a guest.
> > >   In addition to those calls, we have added our own private hypervisor
> > >   calls which are mostly used as a private interface between the
> firmware
> > > @@ -12,13 +18,14 @@ running in the guest and QEMU.
> > >   All those hypercalls start at hcall number 0xf000 which correspond
> > >   to an implementation specific range in PAPR.
> > > -- H_RTAS (0xf000)
> > > +H_RTAS (0xf000)
> > > +^^^
> > > -RTAS is a set of runtime services generally provided by the firmware
> > > -inside the guest to the operating system. It predates the existence
> > > -of hypervisors (it was originally an extension to Open Firmware) and
> > > -is still used by PAPR to provide various services that aren't
> performance
> > > -sensitive.
> > > +RTAS stands for Run-Time Abstraction Sercies and is a set of runtime
> services
> > > +generally provided by the firmware inside the guest to the operating
> system. It
> > > +predates the existence of hypervisors (it was originally an extension
> to Open
> > > +Firmware) and is still used by PAPR and LoPAR to provide various
> services that
> > > +are not performance sensitive.
> > >   We currently implement the RTAS services in QEMU itself. The actual
> RTAS
> > >   "firmware" blob in the guest is a small stub of a few instructions
> which
> > > @@ -26,22 +33,25 @@ calls our private H_RTAS hypervisor call to pass
> the RTAS calls to QEMU.
> > >   Arguments:
> > > -  r3 : H_RTAS (0xf000)
> > > -  r4 : Guest physical address of RTAS parameter block
> > > +  ``r3``: ``H_RTAS (0xf000)``
> > > +
> > > +  ``r4``: Guest physical address of RTAS parameter block.
> > >   Returns:
> > > -  H_SUCCESS   : Successfully called the RTAS function (RTAS result
> > > -will have been stored in the parameter block)
> > > -  H_PARAMETER : Unknown token
> > > +  ``H_SUCCESS``: Successfully called the RTAS function (RTAS result
> will have
> > > +  been stored in the parameter block).
> > > -- H_LOGICAL_MEMOP (0xf001)
> > > +  ``H_PARAMETER``: Unknown token.
> > > -When the guest runs in "real mode" (in powerpc lingua this means
> > > -with MMU disabled, ie guest effective == guest physical), it only
> > > -has access to a subset of memory and no IOs.
> > > +H_LOGICAL_MEMOP (0xf001)
> > > +
> > > -PAPR provides a set of hypervisor calls to perform cacheable or
> > > +When the guest runs in "real mode" (in powerpc lingua this means with
> MMU
> >
> > What's up with 'lingua'? As you already know "lingua" is Brazilian
> portuguese for "tongue"
> > and it's so weird to be used in this context.
> >
> > The one English word that I can think of that is closer to "lingua" and
> would make sense here
> > is 'lingo'. But that is a slang for 'jargon' and not appropriate for a
> technical document
> > either. "langua" as a short form of "language" seems weird as well. I
> really believe 'jargon'
> > is a more suitable word here.
>
> As a native speaker: "lingo" would make sense here, though its tone is
> a little informal.  "jargon" could also work, but "terminology" would
> probably better match the tone of the document.
>
> I expect this hasn't been noticed before, because I think most English
> speakers would read "lingua" as a typo for "lingo", maybe only barely
> registering that it was not the standard 

Re: [RFC PATCH v2 07/44] i386/kvm: Squash getting/putting guest state for TDX VMs

2021-12-08 Thread Xiaoyao Li

On 8/26/2021 6:24 PM, Gerd Hoffmann wrote:

On Wed, Jul 07, 2021 at 05:54:37PM -0700, isaku.yamah...@gmail.com wrote:

From: Sean Christopherson 

Ignore get/put state of TDX VMs as accessing/mutating guest state of
producation TDs is not supported.


Why silently ignore instead of returning an error?


The error is returned to upper caller in QEMU, right? There deems to be 
somewhere in QEMU to not call the IOCTLs to get guest states of TD VM.


Let's reword it to "Don't". Is it OK?





Re: [PATCH v9 10/10] target/ppc/excp_helper.c: EBB handling adjustments

2021-12-08 Thread David Gibson
On Wed, Dec 01, 2021 at 12:17:34PM -0300, Daniel Henrique Barboza wrote:
> The current logic is only considering event-based exceptions triggered
> by the performance monitor. This is true now, but we might want to add
> support for external event-based exceptions in the future.
> 
> Let's make it a bit easier to do so by adding the bit logic that would
> happen in case we were dealing with an external event-based exception.
> 
> While we're at it, add a few comments explaining why we're setting and
> clearing BESCR bits.
> 
> Reviewed-by: David Gibson 

Still looks fine, but I'm not seeing a particularly strong reason to
keep this split from the previous patch.

> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/ppc/excp_helper.c | 45 ++--
>  1 file changed, 39 insertions(+), 6 deletions(-)
> 
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index a26d266fe6..42e2fee9c8 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -801,14 +801,47 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
> excp_model, int excp)
>  break;
>  case POWERPC_EXCP_EBB:   /* Event-based branch exception 
> */
>  if ((env->spr[SPR_FSCR] & (1ull << FSCR_EBB)) &&
> -(env->spr[SPR_BESCR] & BESCR_GE) &&
> -(env->spr[SPR_BESCR] & BESCR_PME)) {
> +(env->spr[SPR_BESCR] & BESCR_GE)) {
>  target_ulong nip;
>  
> -env->spr[SPR_BESCR] &= ~BESCR_GE;   /* Clear GE */
> -env->spr[SPR_BESCR] |= BESCR_PMEO;  /* Set PMEO */
> -env->spr[SPR_EBBRR] = env->nip; /* Save NIP for rfebb insn */
> -nip = env->spr[SPR_EBBHR];  /* EBB handler */
> +/*
> + * If we have Performance Monitor Event-Based exception
> + * enabled (BESCR_PME) and a Performance Monitor alert
> + * occurred (MMCR0_PMAO), clear BESCR_PME and set BESCR_PMEO
> + * (Performance Monitor Event-Based Exception Occurred).
> + *
> + * Software is responsible for clearing both BESCR_PMEO and
> + * MMCR0_PMAO after the event has been handled.
> + */
> +if ((env->spr[SPR_BESCR] & BESCR_PME) &&
> +(env->spr[SPR_POWER_MMCR0] & MMCR0_PMAO)) {
> +env->spr[SPR_BESCR] &= ~BESCR_PME;
> +env->spr[SPR_BESCR] |= BESCR_PMEO;
> +}
> +
> +/*
> + * In the case of External Event-Based exceptions, do a
> + * similar logic with BESCR_EE and BESCR_EEO. BESCR_EEO must
> + * also be cleared by software.
> + *
> + * PowerISA 3.1 considers that we'll not have BESCR_PMEO and
> + * BESCR_EEO set at the same time. We can check for BESCR_PMEO
> + * being not set in step above to see if this exception was
> + * trigged by an external event.
> + */
> +if (env->spr[SPR_BESCR] & BESCR_EE &&
> +!(env->spr[SPR_BESCR] & BESCR_PMEO)) {
> +env->spr[SPR_BESCR] &= ~BESCR_EE;
> +env->spr[SPR_BESCR] |= BESCR_EEO;
> +}
> +
> +/*
> + * Clear BESCR_GE, save NIP for 'rfebb' and point the
> + * execution to the event handler (SPR_EBBHR) address.
> + */
> +env->spr[SPR_BESCR] &= ~BESCR_GE;
> +env->spr[SPR_EBBRR] = env->nip;
> +nip = env->spr[SPR_EBBHR];
>  powerpc_set_excp_state(cpu, nip, env->msr);
>  }
>  /*

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH v9 09/10] target/ppc: PMU Event-Based exception support

2021-12-08 Thread David Gibson
On Wed, Dec 01, 2021 at 12:17:33PM -0300, Daniel Henrique Barboza wrote:
> From: Gustavo Romero 
> 
> Following up the rfebb implementation, this patch adds the EBB exception
> support that are triggered by Performance Monitor alerts. This exception
> occurs when an enabled PMU condition or event happens and both MMCR0_EBE
> and BESCR_PME are set.
> 
> The supported PM alerts will consist of counter negative conditions of
> the PMU counters. This will be achieved by a timer mechanism that will
> predict when a counter becomes negative. The PMU timer callback will set
> the appropriate bits in MMCR0 and fire a PMC interrupt. The EBB
> exception code will then set the appropriate BESCR bits, set the next
> instruction pointer to the address pointed by the return register
> (SPR_EBBRR), and redirect execution to the handler (pointed by
> SPR_EBBHR).
> 
> CC: Gustavo Romero 
> Signed-off-by: Gustavo Romero 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/ppc/cpu.h |  5 -
>  target/ppc/excp_helper.c | 29 +
>  target/ppc/power8-pmu.c  | 40 ++--
>  3 files changed, 71 insertions(+), 3 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 741b8baf4c..8e0e6319ee 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -129,8 +129,10 @@ enum {
>  /* ISA 3.00 additions */
>  POWERPC_EXCP_HVIRT= 101,
>  POWERPC_EXCP_SYSCALL_VECTORED = 102, /* scv exception
>  */
> +POWERPC_EXCP_EBB = 103, /* Event-based branch exception  
> */
> +
>  /* EOL   
> */
> -POWERPC_EXCP_NB   = 103,
> +POWERPC_EXCP_NB   = 104,
>  /* QEMU exceptions: special cases we want to stop translation
> */
>  POWERPC_EXCP_SYSCALL_USER = 0x203, /* System call in user mode only  
> */
>  };
> @@ -2452,6 +2454,7 @@ enum {
>  PPC_INTERRUPT_HMI,/* Hypervisor Maintenance interrupt*/
>  PPC_INTERRUPT_HDOORBELL,  /* Hypervisor Doorbell interrupt*/
>  PPC_INTERRUPT_HVIRT,  /* Hypervisor virtualization interrupt  */
> +PPC_INTERRUPT_PMC,/* PMU interrupt  */
>  };

All this low-level exception stuff is very clunky, but addressing
that is not reasonably in scope for this series.  So,

Reviewed-by: David Gibson 

>  
>  /* Processor Compatibility mask (PCR) */
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 7ead32279c..a26d266fe6 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -799,6 +799,23 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
> excp_model, int excp)
>  cpu_abort(cs, "Non maskable external exception "
>"is not implemented yet !\n");
>  break;
> +case POWERPC_EXCP_EBB:   /* Event-based branch exception 
> */
> +if ((env->spr[SPR_FSCR] & (1ull << FSCR_EBB)) &&
> +(env->spr[SPR_BESCR] & BESCR_GE) &&
> +(env->spr[SPR_BESCR] & BESCR_PME)) {
> +target_ulong nip;
> +
> +env->spr[SPR_BESCR] &= ~BESCR_GE;   /* Clear GE */
> +env->spr[SPR_BESCR] |= BESCR_PMEO;  /* Set PMEO */
> +env->spr[SPR_EBBRR] = env->nip; /* Save NIP for rfebb insn */
> +nip = env->spr[SPR_EBBHR];  /* EBB handler */
> +powerpc_set_excp_state(cpu, nip, env->msr);
> +}
> +/*
> + * This interrupt is handled by userspace. No need
> + * to proceed.
> + */
> +return;
>  default:
>  excp_invalid:
>  cpu_abort(cs, "Invalid PowerPC exception %d. Aborting\n", excp);
> @@ -1046,6 +1063,18 @@ static void ppc_hw_interrupt(CPUPPCState *env)
>  powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_THERM);
>  return;
>  }
> +/* PMC -> Event-based branch exception */
> +if (env->pending_interrupts & (1 << PPC_INTERRUPT_PMC)) {
> +/*
> + * Performance Monitor event-based exception can only
> + * occur in problem state.
> + */
> +if (msr_pr == 1) {
> +env->pending_interrupts &= ~(1 << PPC_INTERRUPT_PMC);
> +powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_EBB);
> +return;
> +}
> +}
>  }
>  
>  if (env->resume_as_sreset) {
> diff --git a/target/ppc/power8-pmu.c b/target/ppc/power8-pmu.c
> index 08d1902cd5..279b824c3f 100644
> --- a/target/ppc/power8-pmu.c
> +++ b/target/ppc/power8-pmu.c
> @@ -297,6 +297,20 @@ void helper_store_pmc(CPUPPCState *env, uint32_t sprn, 
> uint64_t value)
>  pmc_update_overflow_timer(env, sprn);
>  }
>  
> +static void pmu_delete_timers(CPUPPCState *env)
> +{
> +QEMUTimer *pmc_overflow_timer;
> +int sprn;
> +
> +for (sprn = SPR_POWER_PMC1; sprn <= SPR_POWER_PMC6; 

Re: [PATCH 7/7] migration: Finer grained tracepoints for POSTCOPY_LISTEN

2021-12-08 Thread Peter Xu
On Wed, Dec 08, 2021 at 07:46:20PM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > The enablement of postcopy listening has a few steps, add a few tracepoints 
> > to
> > be there ready for some basic measurements for them.
> > 
> > Signed-off-by: Peter Xu 
> > ---
> >  migration/savevm.c | 5 -
> >  migration/trace-events | 2 +-
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 17b8e25e00..5b3f31eab2 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -1946,7 +1946,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >  static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> >  {
> >  PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
> > -trace_loadvm_postcopy_handle_listen();
> > +trace_loadvm_postcopy_handle_listen(1);
> 
> I think we tend just to split this into separate traces in many places;
> or if we're using the same one then we should use a string
> 
> I'd make this:
>   trace_loadvm_postcopy_handle_listen_entry();
> 
> for example.
> 
> >  Error *local_err = NULL;
> >  
> >  if (ps != POSTCOPY_INCOMING_ADVISE && ps != POSTCOPY_INCOMING_DISCARD) 
> > {
> > @@ -1962,6 +1962,7 @@ static int 
> > loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> >  postcopy_ram_prepare_discard(mis);
> >  }
> >  }
> > +trace_loadvm_postcopy_handle_listen(2);
> >  
> >  /*
> >   * Sensitise RAM - can now generate requests for blocks that don't 
> > exist
> > @@ -1974,6 +1975,7 @@ static int 
> > loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> >  return -1;
> >  }
> >  }
> > +trace_loadvm_postcopy_handle_listen(3);
> >  
> >  if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, _err)) {
> >  error_report_err(local_err);
> > @@ -1988,6 +1990,7 @@ static int 
> > loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> > QEMU_THREAD_DETACHED);
> >  qemu_sem_wait(>listen_thread_sem);
> >  qemu_sem_destroy(>listen_thread_sem);
> > +trace_loadvm_postcopy_handle_listen(4);
> 
>   trace_loadvm_postcopy_handle_listen_entry_end();

I see, I can use it.  It's just that I want to trap more than entry/exit.

For the "4 steps" here I split it into four procedures, the 2 steps inside are
majorly to trap either uffd registration time or external uffd handling of
other processes.

One example:

We may want to know how slow is postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN)
when there're some external process attached.  I wanted to be prepared for that
so when there's need to evaluate slowness of this procedure with vhost-user
enabled we have some tracepoints without replacing the binaries.

It's easy to use strings too if that's better looking than numbers.  How about:

  trace_loadvm_postcopy_handle_listen("entry")
  trace_loadvm_postcopy_handle_listen("uffd-reg")
  trace_loadvm_postcopy_handle_listen("external")
  trace_loadvm_postcopy_handle_listen("exit")

?

Thanks,

-- 
Peter Xu




Re: [PATCH 1/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread David Gibson
On Wed, Dec 08, 2021 at 03:54:40PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 12/8/21 13:59, lagar...@linux.ibm.com wrote:
> > From: Leonardo Garcia 
> > 
> > Signed-off-by: Leonardo Garcia 
> > ---
> >   docs/specs/ppc-spapr-hcalls.txt | 92 -
> >   1 file changed, 57 insertions(+), 35 deletions(-)
> > 
> > diff --git a/docs/specs/ppc-spapr-hcalls.txt 
> > b/docs/specs/ppc-spapr-hcalls.txt
> > index 93fe3da91b..c69dae535b 100644
> > --- a/docs/specs/ppc-spapr-hcalls.txt
> > +++ b/docs/specs/ppc-spapr-hcalls.txt
> > @@ -1,9 +1,15 @@
> > -When used with the "pseries" machine type, QEMU-system-ppc64 implements
> > -a set of hypervisor calls using a subset of the server "PAPR" specification
> > -(IBM internal at this point), which is also what IBM's proprietary 
> > hypervisor
> > -adheres too.
> > +sPAPR hypervisor calls
> > +--
> > -The subset is selected based on the requirements of Linux as a guest.
> > +When used with the ``pseries`` machine type, ``qemu-system-ppc64`` 
> > implements
> > +a set of hypervisor calls (a.k.a. hcalls) defined in the `Linux on Power
> > +Architecture Reference document (LoPAR)
> > +`_.
> > +This document is a subset of the Power Architecture Platform Reference 
> > (PAPR+)
> > +specification (IBM internal only), which is what PowerVM, the IBM 
> > proprietary
> > +hypervisor, adheres to.
> > +
> > +The subset in LoPAR is selected based on the requirements of Linux as a 
> > guest.
> >   In addition to those calls, we have added our own private hypervisor
> >   calls which are mostly used as a private interface between the firmware
> > @@ -12,13 +18,14 @@ running in the guest and QEMU.
> >   All those hypercalls start at hcall number 0xf000 which correspond
> >   to an implementation specific range in PAPR.
> > -- H_RTAS (0xf000)
> > +H_RTAS (0xf000)
> > +^^^
> > -RTAS is a set of runtime services generally provided by the firmware
> > -inside the guest to the operating system. It predates the existence
> > -of hypervisors (it was originally an extension to Open Firmware) and
> > -is still used by PAPR to provide various services that aren't performance
> > -sensitive.
> > +RTAS stands for Run-Time Abstraction Sercies and is a set of runtime 
> > services
> > +generally provided by the firmware inside the guest to the operating 
> > system. It
> > +predates the existence of hypervisors (it was originally an extension to 
> > Open
> > +Firmware) and is still used by PAPR and LoPAR to provide various services 
> > that
> > +are not performance sensitive.
> >   We currently implement the RTAS services in QEMU itself. The actual RTAS
> >   "firmware" blob in the guest is a small stub of a few instructions which
> > @@ -26,22 +33,25 @@ calls our private H_RTAS hypervisor call to pass the 
> > RTAS calls to QEMU.
> >   Arguments:
> > -  r3 : H_RTAS (0xf000)
> > -  r4 : Guest physical address of RTAS parameter block
> > +  ``r3``: ``H_RTAS (0xf000)``
> > +
> > +  ``r4``: Guest physical address of RTAS parameter block.
> >   Returns:
> > -  H_SUCCESS   : Successfully called the RTAS function (RTAS result
> > -will have been stored in the parameter block)
> > -  H_PARAMETER : Unknown token
> > +  ``H_SUCCESS``: Successfully called the RTAS function (RTAS result will 
> > have
> > +  been stored in the parameter block).
> > -- H_LOGICAL_MEMOP (0xf001)
> > +  ``H_PARAMETER``: Unknown token.
> > -When the guest runs in "real mode" (in powerpc lingua this means
> > -with MMU disabled, ie guest effective == guest physical), it only
> > -has access to a subset of memory and no IOs.
> > +H_LOGICAL_MEMOP (0xf001)
> > +
> > -PAPR provides a set of hypervisor calls to perform cacheable or
> > +When the guest runs in "real mode" (in powerpc lingua this means with MMU
> 
> What's up with 'lingua'? As you already know "lingua" is Brazilian portuguese 
> for "tongue"
> and it's so weird to be used in this context.
> 
> The one English word that I can think of that is closer to "lingua" and would 
> make sense here
> is 'lingo'. But that is a slang for 'jargon' and not appropriate for a 
> technical document
> either. "langua" as a short form of "language" seems weird as well. I really 
> believe 'jargon'
> is a more suitable word here.

As a native speaker: "lingo" would make sense here, though its tone is
a little informal.  "jargon" could also work, but "terminology" would
probably better match the tone of the document.

I expect this hasn't been noticed before, because I think most English
speakers would read "lingua" as a typo for "lingo", maybe only barely
registering that it was not the standard spelling.  ("lingo" is, of
course, cognate with lingua and similar words from romance langauges).

> This was added by c73e3771ea79ab and I really believe this is an unintended 
> typo/mistake. If
> you're 

Re: [PATCH 6/7] migration: Dump sub-cmd name in loadvm_process_command tp

2021-12-08 Thread Peter Xu
On Wed, Dec 08, 2021 at 06:41:22PM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > It'll be easier to read the name rather than index of sub-cmd when 
> > debugging.
> > 
> > Signed-off-by: Peter Xu 
> > ---
> >  migration/savevm.c | 2 +-
> >  migration/trace-events | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index d59e976d50..17b8e25e00 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2271,7 +2271,7 @@ static int loadvm_process_command(QEMUFile *f)
> >  return qemu_file_get_error(f);
> >  }
> >  
> > -trace_loadvm_process_command(cmd, len);
> > +trace_loadvm_process_command(mig_cmd_args[cmd].name, len);
> >  if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
> 
> No! you can't do that name lookup before the limit check.

Heh, right. :)

I guess it shouldn't matter in reality as we don't worry too much on untrusted
or uncompatible src qemu, but it's very reasonable to fix it.

Thanks!

-- 
Peter Xu




Re: [PATCH v6 07/18] target/riscv: setup everything for rv64 to support rv128 execution

2021-12-08 Thread Alistair Francis
On Mon, Nov 29, 2021 at 12:08 AM Frédéric Pétrot
 wrote:
>
> This patch adds the support of the '-cpu rv128' option to
> qemu-system-riscv64 so that we can indicate that we want to run rv128
> executables.
> Still, there is no support for 128-bit insns at that stage so qemu fails
> miserably (as expected) if launched with this option.
>
> Signed-off-by: Frédéric Pétrot 
> Co-authored-by: Fabien Portas 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  include/disas/dis-asm.h |  1 +
>  target/riscv/cpu.h  |  1 +
>  disas/riscv.c   |  5 +
>  target/riscv/cpu.c  | 20 
>  target/riscv/gdbstub.c  |  5 +
>  5 files changed, 32 insertions(+)
>
> diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
> index 08e1beec85..102a1e7f50 100644
> --- a/include/disas/dis-asm.h
> +++ b/include/disas/dis-asm.h
> @@ -459,6 +459,7 @@ int print_insn_nios2(bfd_vma, disassemble_info*);
>  int print_insn_xtensa   (bfd_vma, disassemble_info*);
>  int print_insn_riscv32  (bfd_vma, disassemble_info*);
>  int print_insn_riscv64  (bfd_vma, disassemble_info*);
> +int print_insn_riscv128 (bfd_vma, disassemble_info*);
>  int print_insn_rx(bfd_vma, disassemble_info *);
>  int print_insn_hexagon(bfd_vma, disassemble_info *);
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 53a295efb7..cbd4daa6d9 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -38,6 +38,7 @@
>  #define TYPE_RISCV_CPU_ANY  RISCV_CPU_TYPE_NAME("any")
>  #define TYPE_RISCV_CPU_BASE32   RISCV_CPU_TYPE_NAME("rv32")
>  #define TYPE_RISCV_CPU_BASE64   RISCV_CPU_TYPE_NAME("rv64")
> +#define TYPE_RISCV_CPU_BASE128  RISCV_CPU_TYPE_NAME("rv128")
>  #define TYPE_RISCV_CPU_IBEX RISCV_CPU_TYPE_NAME("lowrisc-ibex")
>  #define TYPE_RISCV_CPU_SHAKTI_C RISCV_CPU_TYPE_NAME("shakti-c")
>  #define TYPE_RISCV_CPU_SIFIVE_E31   RISCV_CPU_TYPE_NAME("sifive-e31")
> diff --git a/disas/riscv.c b/disas/riscv.c
> index 793ad14c27..03c8dc9961 100644
> --- a/disas/riscv.c
> +++ b/disas/riscv.c
> @@ -3090,3 +3090,8 @@ int print_insn_riscv64(bfd_vma memaddr, struct 
> disassemble_info *info)
>  {
>  return print_insn_riscv(memaddr, info, rv64);
>  }
> +
> +int print_insn_riscv128(bfd_vma memaddr, struct disassemble_info *info)
> +{
> +return print_insn_riscv(memaddr, info, rv128);
> +}
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 364140f5ff..7f5370f2b2 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -178,6 +178,19 @@ static void rv64_sifive_e_cpu_init(Object *obj)
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  qdev_prop_set_bit(DEVICE(obj), "mmu", false);
>  }
> +
> +static void rv128_base_cpu_init(Object *obj)
> +{
> +if (qemu_tcg_mttcg_enabled()) {
> +/* Missing 128-bit aligned atomics */
> +error_report("128-bit RISC-V currently does not work with Multi "
> + "Threaded TCG. Please use: -accel tcg,thread=single");
> +exit(EXIT_FAILURE);
> +}
> +CPURISCVState *env = _CPU(obj)->env;
> +/* We set this in the realise function */
> +set_misa(env, MXL_RV128, 0);
> +}
>  #else
>  static void rv32_base_cpu_init(Object *obj)
>  {
> @@ -402,6 +415,9 @@ static void riscv_cpu_disas_set_info(CPUState *s, 
> disassemble_info *info)
>  case MXL_RV64:
>  info->print_insn = print_insn_riscv64;
>  break;
> +case MXL_RV128:
> +info->print_insn = print_insn_riscv128;
> +break;
>  default:
>  g_assert_not_reached();
>  }
> @@ -464,6 +480,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> **errp)
>  #ifdef TARGET_RISCV64
>  case MXL_RV64:
>  break;
> +case MXL_RV128:
> +break;
>  #endif
>  case MXL_RV32:
>  break;
> @@ -670,6 +688,7 @@ static gchar *riscv_gdb_arch_name(CPUState *cs)
>  case MXL_RV32:
>  return g_strdup("riscv:rv32");
>  case MXL_RV64:
> +case MXL_RV128:
>  return g_strdup("riscv:rv64");
>  default:
>  g_assert_not_reached();
> @@ -822,6 +841,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
>  DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E51,   rv64_sifive_e_cpu_init),
>  DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U54,   rv64_sifive_u_cpu_init),
>  DEFINE_CPU(TYPE_RISCV_CPU_SHAKTI_C, rv64_sifive_u_cpu_init),
> +DEFINE_CPU(TYPE_RISCV_CPU_BASE128,  rv128_base_cpu_init),
>  #endif
>  };
>
> diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
> index 23429179e2..2fbdcc5879 100644
> --- a/target/riscv/gdbstub.c
> +++ b/target/riscv/gdbstub.c
> @@ -164,6 +164,11 @@ static int riscv_gen_dynamic_csr_xml(CPUState *cs, int 
> base_reg)
>  int bitsize = 16 << env->misa_mxl_max;
>  int i;
>
> +/* Until gdb knows about 128-bit registers */
> +if (bitsize > 64) {
> +bitsize = 64;
> +}
> +
>  g_string_printf(s, "");
>  

Re: [PATCH v6 01/18] exec/memop: Adding signedness to quad definitions

2021-12-08 Thread Alistair Francis
On Sun, Nov 28, 2021 at 11:59 PM Frédéric Pétrot
 wrote:
>
> Renaming defines for quad in their various forms so that their signedness is
> now explicit.
> Done using git grep as suggested by Philippe, with a bit of hand edition to
> keep assignments aligned.
>
> Signed-off-by: Frédéric Pétrot 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Richard Henderson 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  include/exec/memop.h   |  8 +--
>  include/tcg/tcg-op.h   |  4 +-
>  target/arm/translate-a32.h |  4 +-
>  accel/tcg/cputlb.c | 30 +--
>  accel/tcg/user-exec.c  |  8 +--
>  target/alpha/translate.c   | 32 ++--
>  target/arm/helper-a64.c|  8 +--
>  target/arm/translate-a64.c |  8 +--
>  target/arm/translate-neon.c|  6 +--
>  target/arm/translate-sve.c | 10 ++--
>  target/arm/translate-vfp.c |  8 +--
>  target/arm/translate.c |  2 +-
>  target/cris/translate.c|  2 +-
>  target/hppa/translate.c|  4 +-
>  target/i386/tcg/mem_helper.c   |  2 +-
>  target/i386/tcg/translate.c| 36 +++---
>  target/m68k/op_helper.c|  2 +-
>  target/mips/tcg/translate.c| 58 +++---
>  target/mips/tcg/tx79_translate.c   |  8 +--
>  target/ppc/translate.c | 32 ++--
>  target/s390x/tcg/mem_helper.c  |  8 +--
>  target/s390x/tcg/translate.c   |  8 +--
>  target/sh4/translate.c | 12 ++---
>  target/sparc/translate.c   | 36 +++---
>  target/tricore/translate.c |  4 +-
>  target/xtensa/translate.c  |  4 +-
>  tcg/tcg.c  |  4 +-
>  tcg/tci.c  | 16 +++---
>  accel/tcg/ldst_common.c.inc|  8 +--
>  target/mips/tcg/micromips_translate.c.inc  | 10 ++--
>  target/ppc/translate/fixedpoint-impl.c.inc | 22 
>  target/ppc/translate/fp-impl.c.inc |  4 +-
>  target/ppc/translate/vsx-impl.c.inc| 42 
>  target/riscv/insn_trans/trans_rva.c.inc| 22 
>  target/riscv/insn_trans/trans_rvd.c.inc|  4 +-
>  target/riscv/insn_trans/trans_rvh.c.inc|  4 +-
>  target/riscv/insn_trans/trans_rvi.c.inc|  4 +-
>  target/s390x/tcg/translate_vx.c.inc| 18 +++
>  tcg/aarch64/tcg-target.c.inc   |  2 +-
>  tcg/arm/tcg-target.c.inc   | 10 ++--
>  tcg/i386/tcg-target.c.inc  | 12 ++---
>  tcg/mips/tcg-target.c.inc  | 12 ++---
>  tcg/ppc/tcg-target.c.inc   | 16 +++---
>  tcg/riscv/tcg-target.c.inc |  6 +--
>  tcg/s390x/tcg-target.c.inc | 18 +++
>  tcg/sparc/tcg-target.c.inc | 16 +++---
>  target/s390x/tcg/insn-data.def | 28 +--
>  47 files changed, 311 insertions(+), 311 deletions(-)
>
> diff --git a/include/exec/memop.h b/include/exec/memop.h
> index 04264ffd6b..72c2f0ff3d 100644
> --- a/include/exec/memop.h
> +++ b/include/exec/memop.h
> @@ -85,29 +85,29 @@ typedef enum MemOp {
>  MO_UB= MO_8,
>  MO_UW= MO_16,
>  MO_UL= MO_32,
> +MO_UQ= MO_64,
>  MO_SB= MO_SIGN | MO_8,
>  MO_SW= MO_SIGN | MO_16,
>  MO_SL= MO_SIGN | MO_32,
> -MO_Q = MO_64,
>
>  MO_LEUW  = MO_LE | MO_UW,
>  MO_LEUL  = MO_LE | MO_UL,
> +MO_LEUQ  = MO_LE | MO_UQ,
>  MO_LESW  = MO_LE | MO_SW,
>  MO_LESL  = MO_LE | MO_SL,
> -MO_LEQ   = MO_LE | MO_Q,
>
>  MO_BEUW  = MO_BE | MO_UW,
>  MO_BEUL  = MO_BE | MO_UL,
> +MO_BEUQ  = MO_BE | MO_UQ,
>  MO_BESW  = MO_BE | MO_SW,
>  MO_BESL  = MO_BE | MO_SL,
> -MO_BEQ   = MO_BE | MO_Q,
>
>  #ifdef NEED_CPU_H
>  MO_TEUW  = MO_TE | MO_UW,
>  MO_TEUL  = MO_TE | MO_UL,
> +MO_TEUQ  = MO_TE | MO_UQ,
>  MO_TESW  = MO_TE | MO_SW,
>  MO_TESL  = MO_TE | MO_SL,
> -MO_TEQ   = MO_TE | MO_Q,
>  #endif
>
>  MO_SSIZE = MO_SIZE | MO_SIGN,
> diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
> index 0545a6224c..caa0a63612 100644
> --- a/include/tcg/tcg-op.h
> +++ b/include/tcg/tcg-op.h
> @@ -894,7 +894,7 @@ static inline void tcg_gen_qemu_ld32s(TCGv ret, TCGv 
> addr, int mem_index)
>
>  static inline void tcg_gen_qemu_ld64(TCGv_i64 ret, TCGv addr, int mem_index)
>  {
> -tcg_gen_qemu_ld_i64(ret, addr, mem_index, MO_TEQ);
> +tcg_gen_qemu_ld_i64(ret, addr, mem_index, MO_TEUQ);
>  }
>
>  static inline void tcg_gen_qemu_st8(TCGv arg, TCGv addr, int mem_index)
> @@ -914,7 +914,7 @@ static inline void tcg_gen_qemu_st32(TCGv arg, TCGv addr, 
> int mem_index)
>
>  static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
>  {
> -

[PATCH 6/6] target/arm: Implement FEAT_LPA2

2021-12-08 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   | 12 +++
 target/arm/internals.h |  2 ++
 target/arm/cpu64.c |  2 ++
 target/arm/helper.c| 80 +++---
 4 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 314904..379585352b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4283,6 +4283,18 @@ static inline bool isar_feature_aa64_i8mm(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, I8MM) != 0;
 }
 
+static inline bool isar_feature_aa64_tgran4_lpa2(const ARMISARegisters *id)
+{
+return sextract64(id->id_aa64mmfr0,
+  R_ID_AA64MMFR0_TGRAN4_SHIFT,
+  R_ID_AA64MMFR0_TGRAN4_LENGTH) >= 1;
+}
+
+static inline bool isar_feature_aa64_tgran16_lpa2(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN16) >= 2;
+}
+
 static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 3e801833b4..868cae2a55 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1033,12 +1033,14 @@ static inline uint32_t aarch64_pstate_valid_mask(const 
ARMISARegisters *id)
 typedef struct ARMVAParameters {
 unsigned tsz: 8;
 unsigned ps : 3;
+unsigned sh : 2;
 unsigned select : 1;
 bool tbi: 1;
 bool epd: 1;
 bool hpd: 1;
 bool using16k   : 1;
 bool using64k   : 1;
+bool ds : 1;
 } ARMVAParameters;
 
 ARMVAParameters aa64_va_parameters(CPUARMState *env, uint64_t va,
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 3bb79ca744..5a1940aa94 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -740,6 +740,8 @@ static void aarch64_max_initfn(Object *obj)
 
 t = cpu->isar.id_aa64mmfr0;
 t = FIELD_DP64(t, ID_AA64MMFR0, PARANGE, 6); /* FEAT_LPA: 52 bits */
+t = FIELD_DP64(t, ID_AA64MMFR0, TGRAN16, 2); /* FEAT_LPA2: 52 bits */
+t = FIELD_DP64(t, ID_AA64MMFR0, TGRAN4, 1);  /* FEAT_LPA2: 52 bits */
 cpu->isar.id_aa64mmfr0 = t;
 
 t = cpu->isar.id_aa64mmfr1;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index e39c1f5b3a..f4a8b37f98 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11008,8 +11008,13 @@ static bool check_s2_mmu_setup(ARMCPU *cpu, bool 
is_aa64, int level,
 const int grainsize = stride + 3;
 int startsizecheck;
 
-/* Negative levels are never allowed.  */
-if (level < 0) {
+/*
+ * Negative levels are usually not allowed...
+ * Except for FEAT_LPA2, 4k page table, 52-bit address space, which
+ * begins with level -1.  Note that previous feature tests will have
+ * eliminated this combination if it is not enabled.
+ */
+if (level < (inputsize == 52 && stride == 9 ? -1 : 0)) {
 return false;
 }
 
@@ -11150,8 +11155,8 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
ARMMMUIdx mmu_idx, bool data)
 {
 uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
-bool epd, hpd, using16k, using64k;
-int select, tsz, tbi, ps;
+bool epd, hpd, using16k, using64k, ds;
+int select, tsz, tbi, ps, sh;
 
 if (!regime_has_2_ranges(mmu_idx)) {
 select = 0;
@@ -11165,7 +11170,9 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 hpd = extract32(tcr, 24, 1);
 }
 epd = false;
+sh = extract64(tcr, 12, 2);
 ps = extract64(tcr, 16, 3);
+ds = extract64(tcr, 32, 1);
 } else {
 /*
  * Bit 55 is always between the two regions, and is canonical for
@@ -11175,6 +11182,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 if (!select) {
 tsz = extract32(tcr, 0, 6);
 epd = extract32(tcr, 7, 1);
+sh = extract32(tcr, 12, 2);
 using64k = extract32(tcr, 14, 1);
 using16k = extract32(tcr, 15, 1);
 hpd = extract64(tcr, 41, 1);
@@ -11184,9 +11192,11 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 using64k = tg == 3;
 tsz = extract32(tcr, 16, 6);
 epd = extract32(tcr, 23, 1);
+sh = extract32(tcr, 28, 2);
 hpd = extract64(tcr, 42, 1);
 }
 ps = extract64(tcr, 32, 3);
+ds = extract64(tcr, 59, 1);
 }
 
 /* Present TBI as a composite with TBID.  */
@@ -11199,12 +11209,14 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 return (ARMVAParameters) {
 .tsz = tsz,
 .ps = ps,
+.sh = sh,
 .select = select,
 .tbi = tbi,
 .epd = epd,
 .hpd = hpd,
 .using16k = using16k,
 .using64k = using64k,
+.ds = 

[PATCH 3/6] target/arm: Honor TCR_ELx.{I}PS

2021-12-08 Thread Richard Henderson
This field controls the output (intermediate) physical address size
of the translation process.  V8 requires to raise an AddressSize
fault if the page tables are programmed incorrectly, such that any
intermediate descriptor address, or the final translated address,
is out of range.

Add an outputsize field to ARMVAParameters, and fill it in during
aa64_va_parameters.  Pass the value to check_s2_mmu_setup to use
instead of the raw PAMax value.  Test the descaddr as extracted
from TTBR and from page table entries.

Restrict descaddrmask so that we won't raise the fault for v7.

Signed-off-by: Richard Henderson 
---
 target/arm/internals.h |  1 +
 target/arm/helper.c| 92 +-
 2 files changed, 65 insertions(+), 28 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 27d2fcd26c..3e801833b4 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1032,6 +1032,7 @@ static inline uint32_t aarch64_pstate_valid_mask(const 
ARMISARegisters *id)
  */
 typedef struct ARMVAParameters {
 unsigned tsz: 8;
+unsigned ps : 3;
 unsigned select : 1;
 bool tbi: 1;
 bool epd: 1;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index fab9ee70d8..568914bd42 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11003,7 +11003,7 @@ do_fault:
  * false otherwise.
  */
 static bool check_s2_mmu_setup(ARMCPU *cpu, bool is_aa64, int level,
-   int inputsize, int stride)
+   int inputsize, int stride, int outputsize)
 {
 const int grainsize = stride + 3;
 int startsizecheck;
@@ -11019,22 +11019,19 @@ static bool check_s2_mmu_setup(ARMCPU *cpu, bool 
is_aa64, int level,
 }
 
 if (is_aa64) {
-CPUARMState *env = >env;
-unsigned int pamax = arm_pamax(cpu);
-
 switch (stride) {
 case 13: /* 64KB Pages.  */
-if (level == 0 || (level == 1 && pamax <= 42)) {
+if (level == 0 || (level == 1 && outputsize <= 42)) {
 return false;
 }
 break;
 case 11: /* 16KB Pages.  */
-if (level == 0 || (level == 1 && pamax <= 40)) {
+if (level == 0 || (level == 1 && outputsize <= 40)) {
 return false;
 }
 break;
 case 9: /* 4KB Pages.  */
-if (level == 0 && pamax <= 42) {
+if (level == 0 && outputsize <= 42) {
 return false;
 }
 break;
@@ -11043,8 +11040,8 @@ static bool check_s2_mmu_setup(ARMCPU *cpu, bool 
is_aa64, int level,
 }
 
 /* Inputsize checks.  */
-if (inputsize > pamax &&
-(arm_el_is_aa64(env, 1) || inputsize > 40)) {
+if (inputsize > outputsize &&
+(arm_el_is_aa64(>env, 1) || inputsize > 40)) {
 /* This is CONSTRAINED UNPREDICTABLE and we choose to fault.  */
 return false;
 }
@@ -11090,17 +11087,19 @@ static uint8_t convert_stage2_attrs(CPUARMState *env, 
uint8_t s2attrs)
 }
 #endif /* !CONFIG_USER_ONLY */
 
+/* This mapping is common between ID_AA64MMFR0.PARANGE and TCR_ELx.{I}PS. */
+static const uint8_t pamax_map[] = {
+[0] = 32,
+[1] = 36,
+[2] = 40,
+[3] = 42,
+[4] = 44,
+[5] = 48,
+};
+
 /* The cpu-specific constant value of PAMax; also used by hw/arm/virt. */
 unsigned int arm_pamax(ARMCPU *cpu)
 {
-static const unsigned int pamax_map[] = {
-[0] = 32,
-[1] = 36,
-[2] = 40,
-[3] = 42,
-[4] = 44,
-[5] = 48,
-};
 unsigned int parange =
 FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
 
@@ -11151,7 +11150,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 {
 uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
 bool epd, hpd, using16k, using64k;
-int select, tsz, tbi;
+int select, tsz, tbi, ps;
 
 if (!regime_has_2_ranges(mmu_idx)) {
 select = 0;
@@ -11165,6 +11164,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 hpd = extract32(tcr, 24, 1);
 }
 epd = false;
+ps = extract64(tcr, 16, 3);
 } else {
 /*
  * Bit 55 is always between the two regions, and is canonical for
@@ -11185,6 +11185,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 epd = extract32(tcr, 23, 1);
 hpd = extract64(tcr, 42, 1);
 }
+ps = extract64(tcr, 32, 3);
 }
 
 /* Present TBI as a composite with TBID.  */
@@ -11196,6 +11197,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 
 return (ARMVAParameters) {
 .tsz = tsz,
+.ps = ps,
 .select = select,
 .tbi = tbi,
 .epd = epd,
@@ -11312,7 +11314,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 target_ulong 

[PATCH 1/6] target/arm: Fault on invalid TCR_ELx.TxSZ

2021-12-08 Thread Richard Henderson
Without FEAT_LVA, the behaviour of programming an invalid value
is IMPLEMENTATION DEFINED.  With FEAT_LVA, programming an invalid
minimum value requires a Translation fault.

It is most self-consistent to choose to generate the fault always.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 9b317899a6..575723d62c 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11129,7 +11129,7 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 {
 uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
 bool epd, hpd, using16k, using64k;
-int select, tsz, tbi, max_tsz;
+int select, tsz, tbi;
 
 if (!regime_has_2_ranges(mmu_idx)) {
 select = 0;
@@ -11165,15 +11165,6 @@ ARMVAParameters aa64_va_parameters(CPUARMState *env, 
uint64_t va,
 }
 }
 
-if (cpu_isar_feature(aa64_st, env_archcpu(env))) {
-max_tsz = 48 - using64k;
-} else {
-max_tsz = 39;
-}
-
-tsz = MIN(tsz, max_tsz);
-tsz = MAX(tsz, 16);  /* TODO: ARMv8.2-LVA  */
-
 /* Present TBI as a composite with TBID.  */
 tbi = aa64_va_parameter_tbi(tcr, mmu_idx);
 if (!data) {
@@ -11309,9 +11300,30 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 
 /* TODO: This code does not support shareability levels. */
 if (aarch64) {
+int min_tsz = 16, max_tsz = 39;  /* TODO: ARMv8.2-LVA  */
+
 param = aa64_va_parameters(env, address, mmu_idx,
access_type != MMU_INST_FETCH);
 level = 0;
+
+if (cpu_isar_feature(aa64_st, env_archcpu(env))) {
+max_tsz = 48 - param.using64k;
+}
+
+/*
+ * If TxSZ is programmed to a value larger than the maximum,
+ * or smaller than the effective minimum, it is IMPLEMENTATION
+ * DEFINED whether we behave as if the field were programmed
+ * within bounds, or if a level 0 Translation fault is generated.
+ *
+ * With FEAT_LVA, fault on less than minimum becomes required,
+ * so our choice is to always raise the fault.
+ */
+if (param.tsz < min_tsz || param.tsz > max_tsz) {
+fault_type = ARMFault_Translation;
+goto do_fault;
+}
+
 addrsize = 64 - 8 * param.tbi;
 inputsize = 64 - param.tsz;
 } else {
-- 
2.25.1




[PATCH 4/6] target/arm: Implement FEAT_LVA

2021-12-08 Thread Richard Henderson
This feature is relatively small, as it applies only to
64k pages and thus requires no additional changes to the
table descriptor walking algorithm, only a change to the
minimum TSZ (which is the inverse of the maximum virtual
address space size).

Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h | 2 +-
 target/arm/cpu.h   | 5 +
 target/arm/cpu64.c | 1 +
 target/arm/helper.c| 8 +++-
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 7f38d33b8e..5f9c288b1a 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -11,7 +11,7 @@
 #ifdef TARGET_AARCH64
 # define TARGET_LONG_BITS 64
 # define TARGET_PHYS_ADDR_SPACE_BITS  48
-# define TARGET_VIRT_ADDR_SPACE_BITS  48
+# define TARGET_VIRT_ADDR_SPACE_BITS  52
 #else
 # define TARGET_LONG_BITS 32
 # define TARGET_PHYS_ADDR_SPACE_BITS  40
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e33f37b70a..314904 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4288,6 +4288,11 @@ static inline bool isar_feature_aa64_ccidx(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa64_lva(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, VARANGE) != 0;
+}
+
 static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 15245a60a8..f44ee643ef 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -755,6 +755,7 @@ static void aarch64_max_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64MMFR2, UAO, 1);
 t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1); /* TTCNP */
 t = FIELD_DP64(t, ID_AA64MMFR2, ST, 1); /* TTST */
+t = FIELD_DP64(t, ID_AA64MMFR2, VARANGE, 1); /* FEAT_LPA */
 cpu->isar.id_aa64mmfr2 = t;
 
 t = cpu->isar.id_aa64zfr0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 568914bd42..6a59975028 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11324,7 +11324,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 
 /* TODO: This code does not support shareability levels. */
 if (aarch64) {
-int min_tsz = 16, max_tsz = 39;  /* TODO: ARMv8.2-LVA  */
+int min_tsz = 16, max_tsz = 39;
 int parange;
 
 param = aa64_va_parameters(env, address, mmu_idx,
@@ -11334,6 +11334,12 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 if (cpu_isar_feature(aa64_st, env_archcpu(env))) {
 max_tsz = 48 - param.using64k;
 }
+if (param.using64k) {
+if (cpu_isar_feature(aa64_lva, env_archcpu(env))) {
+min_tsz = 12;
+}
+}
+/* TODO: FEAT_LPA2 */
 
 /*
  * If TxSZ is programmed to a value larger than the maximum,
-- 
2.25.1




[PATCH 5/6] target/arm: Implement FEAT_LPA

2021-12-08 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h |  2 +-
 target/arm/cpu64.c |  2 +-
 target/arm/helper.c| 19 ---
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 5f9c288b1a..b59d505761 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -10,7 +10,7 @@
 
 #ifdef TARGET_AARCH64
 # define TARGET_LONG_BITS 64
-# define TARGET_PHYS_ADDR_SPACE_BITS  48
+# define TARGET_PHYS_ADDR_SPACE_BITS  52
 # define TARGET_VIRT_ADDR_SPACE_BITS  52
 #else
 # define TARGET_LONG_BITS 32
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index f44ee643ef..3bb79ca744 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -739,7 +739,7 @@ static void aarch64_max_initfn(Object *obj)
 cpu->isar.id_aa64pfr1 = t;
 
 t = cpu->isar.id_aa64mmfr0;
-t = FIELD_DP64(t, ID_AA64MMFR0, PARANGE, 5); /* PARange: 48 bits */
+t = FIELD_DP64(t, ID_AA64MMFR0, PARANGE, 6); /* FEAT_LPA: 52 bits */
 cpu->isar.id_aa64mmfr0 = t;
 
 t = cpu->isar.id_aa64mmfr1;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 6a59975028..e39c1f5b3a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11095,6 +11095,7 @@ static const uint8_t pamax_map[] = {
 [3] = 42,
 [4] = 44,
 [5] = 48,
+[6] = 52,
 };
 
 /* The cpu-specific constant value of PAMax; also used by hw/arm/virt. */
@@ -11472,11 +11473,15 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 descaddr = extract64(ttbr, 0, 48);
 
 /*
- * If the base address is out of range, raise AddressSizeFault.
+ * For FEAT_LPA and PS=6, bits [51:48] of descaddr are in [5:2] of TTBR.
+ *
+ * Otherwise, if the base address is out of range, raise AddressSizeFault.
  * In the pseudocode, this is !IsZero(baseregister<47:outputsize>),
  * but we've just cleared the bits above 47, so simplify the test.
  */
-if (descaddr >> outputsize) {
+if (outputsize > 48) {
+descaddr |= extract64(ttbr, 2, 4) << 48;
+} else if (descaddr >> outputsize) {
 level = 0;
 fault_type = ARMFault_AddressSize;
 goto do_fault;
@@ -11526,7 +11531,15 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
uint64_t address,
 }
 
 descaddr = descriptor & descaddrmask;
-if (descaddr >> outputsize) {
+
+/*
+ * For FEAT_LPA and PS=6, bits [51:48] of descaddr are in [15:12]
+ * of descriptor.  Otherwise, if descaddr is out of range, raise
+ * AddressSizeFault.
+ */
+if (outputsize > 48) {
+descaddr |= extract64(descriptor, 12, 4) << 48;
+} else if (descaddr >> outputsize) {
 fault_type = ARMFault_AddressSize;
 goto do_fault;
 }
-- 
2.25.1




[PATCH for-7.0 0/6] target/arm: Implement LVA, LPA, LPA2 features

2021-12-08 Thread Richard Henderson
These features are all related and relatively small.

Testing so far has been limited to booting a kernel
with 64k pages and VA and PA set to 52 bits, which
excercises LVA and LPA.

There is not yet upstream support for LPA2, probably
because it's an ARMv8.7 addition.


r~


Richard Henderson (6):
  target/arm: Fault on invalid TCR_ELx.TxSZ
  target/arm: Move arm_pamax out of line
  target/arm: Honor TCR_ELx.{I}PS
  target/arm: Implement FEAT_LVA
  target/arm: Implement FEAT_LPA
  target/arm: Implement FEAT_LPA2

 target/arm/cpu-param.h |   4 +-
 target/arm/cpu.h   |  17 
 target/arm/internals.h |  22 +
 target/arm/cpu64.c |   5 +-
 target/arm/helper.c| 211 ++---
 5 files changed, 204 insertions(+), 55 deletions(-)

-- 
2.25.1




[PATCH 2/6] target/arm: Move arm_pamax out of line

2021-12-08 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/internals.h | 19 +--
 target/arm/helper.c| 22 ++
 2 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 89f7610ebc..27d2fcd26c 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -243,24 +243,7 @@ static inline void update_spsel(CPUARMState *env, uint32_t 
imm)
  * Returns the implementation defined bit-width of physical addresses.
  * The ARMv8 reference manuals refer to this as PAMax().
  */
-static inline unsigned int arm_pamax(ARMCPU *cpu)
-{
-static const unsigned int pamax_map[] = {
-[0] = 32,
-[1] = 36,
-[2] = 40,
-[3] = 42,
-[4] = 44,
-[5] = 48,
-};
-unsigned int parange =
-FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
-
-/* id_aa64mmfr0 is a read-only register so values outside of the
- * supported mappings can be considered an implementation error.  */
-assert(parange < ARRAY_SIZE(pamax_map));
-return pamax_map[parange];
-}
+unsigned int arm_pamax(ARMCPU *cpu);
 
 /* Return true if extended addresses are enabled.
  * This is always the case if our translation regime is 64 bit,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 575723d62c..fab9ee70d8 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11090,6 +11090,28 @@ static uint8_t convert_stage2_attrs(CPUARMState *env, 
uint8_t s2attrs)
 }
 #endif /* !CONFIG_USER_ONLY */
 
+/* The cpu-specific constant value of PAMax; also used by hw/arm/virt. */
+unsigned int arm_pamax(ARMCPU *cpu)
+{
+static const unsigned int pamax_map[] = {
+[0] = 32,
+[1] = 36,
+[2] = 40,
+[3] = 42,
+[4] = 44,
+[5] = 48,
+};
+unsigned int parange =
+FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARANGE);
+
+/*
+ * id_aa64mmfr0 is a read-only register so values outside of the
+ * supported mappings can be considered an implementation error.
+ */
+assert(parange < ARRAY_SIZE(pamax_map));
+return pamax_map[parange];
+}
+
 static int aa64_va_parameter_tbi(uint64_t tcr, ARMMMUIdx mmu_idx)
 {
 if (regime_has_2_ranges(mmu_idx)) {
-- 
2.25.1




[PATCH] target/ppc: powerpc_excp: Guard ALIGNMENT interrupt with CONFIG_TCG

2021-12-08 Thread Fabiano Rosas
We cannot have TCG code in powerpc_excp because the function is called
from kvm-only code via ppc_cpu_do_interrupt:

 ../target/ppc/excp_helper.c:463:29: error: implicit declaration of
 function ‘cpu_ldl_code’ [-Werror=implicit-function-declaration]

Fortunately, the Alignment interrupt is not among the ones dispatched
from kvm-only code, so we can keep it out of the disable-tcg build for
now.

Fixes: 336e91f853 ("target/ppc: Move SPR_DSISR setting to powerpc_excp")
Signed-off-by: Fabiano Rosas 

---

Perhaps we could make powerpc_excp TCG only and have a separate
function that only knows the two interrupts that we use with KVM
(Program, Machine check). But for now this fix will do, I think.
---
 target/ppc/excp_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 17607adbe4..dcf22440cc 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -453,6 +453,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 }
 break;
 }
+#ifdef CONFIG_TCG
 case POWERPC_EXCP_ALIGN: /* Alignment exception  */
 /*
  * Get rS/rD and rA from faulting opcode.
@@ -464,6 +465,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 env->spr[SPR_DSISR] |= (insn & 0x03FF) >> 16;
 }
 break;
+#endif
 case POWERPC_EXCP_PROGRAM:   /* Program exception*/
 switch (env->error_code & ~0xF) {
 case POWERPC_EXCP_FP:
-- 
2.33.1




Re: [PATCH 7/7] migration: Finer grained tracepoints for POSTCOPY_LISTEN

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> The enablement of postcopy listening has a few steps, add a few tracepoints to
> be there ready for some basic measurements for them.
> 
> Signed-off-by: Peter Xu 
> ---
>  migration/savevm.c | 5 -
>  migration/trace-events | 2 +-
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 17b8e25e00..5b3f31eab2 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1946,7 +1946,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
>  static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>  {
>  PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
> -trace_loadvm_postcopy_handle_listen();
> +trace_loadvm_postcopy_handle_listen(1);

I think we tend just to split this into separate traces in many places;
or if we're using the same one then we should use a string

I'd make this:
  trace_loadvm_postcopy_handle_listen_entry();

for example.

>  Error *local_err = NULL;
>  
>  if (ps != POSTCOPY_INCOMING_ADVISE && ps != POSTCOPY_INCOMING_DISCARD) {
> @@ -1962,6 +1962,7 @@ static int 
> loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>  postcopy_ram_prepare_discard(mis);
>  }
>  }
> +trace_loadvm_postcopy_handle_listen(2);
>  
>  /*
>   * Sensitise RAM - can now generate requests for blocks that don't exist
> @@ -1974,6 +1975,7 @@ static int 
> loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>  return -1;
>  }
>  }
> +trace_loadvm_postcopy_handle_listen(3);
>  
>  if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, _err)) {
>  error_report_err(local_err);
> @@ -1988,6 +1990,7 @@ static int 
> loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> QEMU_THREAD_DETACHED);
>  qemu_sem_wait(>listen_thread_sem);
>  qemu_sem_destroy(>listen_thread_sem);
> +trace_loadvm_postcopy_handle_listen(4);

  trace_loadvm_postcopy_handle_listen_entry_end();
>  
>  return 0;
>  }
> diff --git a/migration/trace-events b/migration/trace-events
> index d63a5915f5..1aa6937dc1 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -14,7 +14,7 @@ loadvm_handle_cmd_packaged_main(int ret) "%d"
>  loadvm_handle_cmd_packaged_received(int ret) "%d"
>  loadvm_handle_recv_bitmap(char *s) "%s"
>  loadvm_postcopy_handle_advise(void) ""
> -loadvm_postcopy_handle_listen(void) ""
> +loadvm_postcopy_handle_listen(int i) "%d"
>  loadvm_postcopy_handle_run(void) ""
>  loadvm_postcopy_handle_run_cpu_sync(void) ""
>  loadvm_postcopy_handle_run_vmstart(void) ""
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 09/12] s390x/pci: enable adapter event notification for interpreted devices

2021-12-08 Thread Matthew Rosato

On 12/8/21 6:29 AM, Thomas Huth wrote:

On 07/12/2021 22.04, Matthew Rosato wrote:
Use the associated vfio feature ioctl to enable adapter event 
notification

and forwarding for devices when requested.  This feature will be set up
with or without firmware assist based upon the 'intassist' setting.

Signed-off-by: Matthew Rosato 
---
  hw/s390x/s390-pci-bus.c  | 24 +++--
  hw/s390x/s390-pci-inst.c | 54 +++-
  hw/s390x/s390-pci-vfio.c | 88 
  include/hw/s390x/s390-pci-bus.h  |  1 +
  include/hw/s390x/s390-pci-vfio.h | 20 
  5 files changed, 182 insertions(+), 5 deletions(-)

[...]

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 78093aaac7..6f9271df87 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -152,6 +152,94 @@ int 
s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)

  return 0;
  }
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, 
pdev);


Should this use VFIO_PCI() instead of container_of ?



Yes, VFIO_PCI(pbdev->pdev) should work.


+    struct vfio_device_feature feat = {
+    .argsz = sizeof(struct vfio_device_feature),
+    .flags = VFIO_DEVICE_FEATURE_PROBE + 
VFIO_DEVICE_FEATURE_ZPCI_AIF

+    };
+
+    assert(vdev);


... then you could likely also drop the assert(), I think?


If I've understood qom.h correctly then yes you're right, if we use the 
instance checker VFIO_PCI() we should trigger an assert through 
object_dynamic_cast_assert() already if the pdev isn't a vfio-pci device 
-- I just verified that by trying to call VFIO_PCI() with something else 
and we indeed get an assert e.g. 'VFIO_PCI: Object 0x... is not an 
instance of type vfio-pci'


So I'll change these and get rid of the extra asserts, thanks.




Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-08 Thread Michael S. Tsirkin
On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> From: Longpeng 
> 
> Hi guys,
> 
> This patch introduces vhost-vdpa-net device, which is inspired
> by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> 
> I've tested this patch on Huawei's offload card:
> ./x86_64-softmmu/qemu-system-x86_64 \
> -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> 
> For virtio hardware offloading, the most important requirement for us
> is to support live migration between offloading cards from different
> vendors, the combination of netdev and virtio-net seems too heavy, we
> prefer a lightweight way.

Did not look at the patch in depth yet.
Is this already supported with this patch? Or is that just the plan?

> Maybe we could support both in the future ? Such as:
> 
> * Lightweight
>  Net: vhost-vdpa-net
>  Storage: vhost-vdpa-blk
> 
> * Heavy but more powerful
>  Net: netdev + virtio-net + vhost-vdpa
>  Storage: bdrv + virtio-blk + vhost-vdpa

I'd like to better understand what is in and out of scope for
this device. Which features would be "more powerful" and belong
in virtio-net, and which in vhost-vdpa-net?

> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> 
> Signed-off-by: Longpeng(Mike) 
> ---
>  hw/net/meson.build |   1 +
>  hw/net/vhost-vdpa-net.c| 338 
> +
>  hw/virtio/Kconfig  |   5 +
>  hw/virtio/meson.build  |   1 +
>  hw/virtio/vhost-vdpa-net-pci.c | 118 +
>  include/hw/virtio/vhost-vdpa-net.h |  31 
>  include/net/vhost-vdpa.h   |   2 +
>  net/vhost-vdpa.c   |   2 +-
>  8 files changed, 497 insertions(+), 1 deletion(-)
>  create mode 100644 hw/net/vhost-vdpa-net.c
>  create mode 100644 hw/virtio/vhost-vdpa-net-pci.c
>  create mode 100644 include/hw/virtio/vhost-vdpa-net.h
> 
> diff --git a/hw/net/meson.build b/hw/net/meson.build
> index bdf71f1..139ebc4 100644
> --- a/hw/net/meson.build
> +++ b/hw/net/meson.build
> @@ -44,6 +44,7 @@ specific_ss.add(when: 'CONFIG_XILINX_ETHLITE', if_true: 
> files('xilinx_ethlite.c'
>  
>  softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('net_rx_pkt.c'))
>  specific_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('virtio-net.c'))
> +specific_ss.add(when: 'CONFIG_VHOST_VDPA_NET', if_true: 
> files('vhost-vdpa-net.c'))
>  
>  softmmu_ss.add(when: ['CONFIG_VIRTIO_NET', 'CONFIG_VHOST_NET'], if_true: 
> files('vhost_net.c'), if_false: files('vhost_net-stub.c'))
>  softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost_net-stub.c'))
> diff --git a/hw/net/vhost-vdpa-net.c b/hw/net/vhost-vdpa-net.c
> new file mode 100644
> index 000..48b99f9
> --- /dev/null
> +++ b/hw/net/vhost-vdpa-net.c
> @@ -0,0 +1,338 @@
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/error-report.h"
> +#include "qemu/cutils.h"
> +#include "hw/qdev-core.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/qdev-properties-system.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-vdpa-net.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-access.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/runstate.h"
> +#include "net/vhost-vdpa.h"
> +
> +static void vhost_vdpa_net_get_config(VirtIODevice *vdev, uint8_t *config)
> +{
> +VHostVdpaNet *s = VHOST_VDPA_NET(vdev);
> +
> +memcpy(config, >netcfg, sizeof(struct virtio_net_config));
> +}
> +
> +static void vhost_vdpa_net_set_config(VirtIODevice *vdev, const uint8_t 
> *config)
> +{
> +VHostVdpaNet *s = VHOST_VDPA_NET(vdev);
> +struct virtio_net_config *netcfg = (struct virtio_net_config *)config;
> +int ret;
> +
> +ret = vhost_dev_set_config(>dev, (uint8_t *)netcfg, 0, 
> sizeof(*netcfg),
> +   VHOST_SET_CONFIG_TYPE_MASTER);
> +if (ret) {
> +error_report("set device config space failed");
> +return;
> +}
> +}
> +
> +static uint64_t vhost_vdpa_net_get_features(VirtIODevice *vdev,
> +uint64_t features,
> +Error **errp)
> +{
> +VHostVdpaNet *s = VHOST_VDPA_NET(vdev);
> +
> +virtio_add_feature(, VIRTIO_NET_F_CSUM);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_CSUM);
> +virtio_add_feature(, VIRTIO_NET_F_MAC);
> +virtio_add_feature(, VIRTIO_NET_F_GSO);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_TSO4);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_TSO6);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_ECN);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_UFO);
> +virtio_add_feature(, VIRTIO_NET_F_GUEST_ANNOUNCE);
> +virtio_add_feature(, VIRTIO_NET_F_HOST_TSO4);
> +virtio_add_feature(, VIRTIO_NET_F_HOST_TSO6);
> +virtio_add_feature(, VIRTIO_NET_F_HOST_ECN);
> +virtio_add_feature(, VIRTIO_NET_F_HOST_UFO);
> +virtio_add_feature(, VIRTIO_NET_F_MRG_RXBUF);
> +

Re: [PATCH 3/3] Link new ppc-spapr-hcalls.rst file to pseries.rst.

2021-12-08 Thread Daniel Henrique Barboza




On 12/8/21 13:59, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---


Reviewed-by: Daniel Henrique Barboza 


  docs/system/ppc/pseries.rst | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst
index e46f09d4c8..56f5942e13 100644
--- a/docs/system/ppc/pseries.rst
+++ b/docs/system/ppc/pseries.rst
@@ -113,12 +113,12 @@ can  also be found in QEMU documentation:
  .. toctree::
 :maxdepth: 1
  
+   ../../specs/ppc-spapr-hcalls.rst

 ../../specs/ppc-spapr-numa.rst
 ../../specs/ppc-spapr-xive.rst
  
  Other documentation available in QEMU docs directory:
  
-* Hypervisor calls (a.k.a. hcalls) (``docs/specs/ppc-spapr-hcalls.txt``).

  * Hot plug (``/docs/specs/ppc-spapr-hotplug.txt``).
  * Hypervisor calls needed by the Ultravisor
(``/docs/specs/ppc-spapr-uv-hcalls.txt``).





Re: [PATCH 2/3] docs: Rename ppc-spapr-hcalls.txt to ppc-spapr-hcalls.rst.

2021-12-08 Thread Daniel Henrique Barboza




On 12/8/21 13:59, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---


Reviewed-by: Daniel Henrique Barboza 



  docs/specs/{ppc-spapr-hcalls.txt => ppc-spapr-hcalls.rst} | 0
  1 file changed, 0 insertions(+), 0 deletions(-)
  rename docs/specs/{ppc-spapr-hcalls.txt => ppc-spapr-hcalls.rst} (100%)

diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.rst
similarity index 100%
rename from docs/specs/ppc-spapr-hcalls.txt
rename to docs/specs/ppc-spapr-hcalls.rst





Re: [RFC PATCH v2 0/5] virtio: early detect 'modern' virtio

2021-12-08 Thread Michael S. Tsirkin
On Fri, Nov 12, 2021 at 03:57:44PM +0100, Halil Pasic wrote:
> This is an early RFC for a transport specific early detecton of
> modern virtio, which is most relevant for transitional devices on big
> endian platforms, when drivers access the config space before
> FEATURES_OK is set.
> 
> The most important part that is missing here is fixing all the problems
> that arise in the situation described in the previous paragraph, when
> the config is managed by a vhost device (and thus outside QEMU. This
> series tackles this problem only for virtio_net+vhost as an example. If
> this approach is deemed good, we need to do something very similar for
> every single affected device.
> 
> This series was only lightly tested. The vhost stuff is entirely
> untested, unfortunately I don't have a working setup where this
> handling would be needed (because the config space is handled in the
> device). DPDK is not supported on s390x so at the moment I can't test
> DPDK based setups. 

So this looks sane to me. Cornelia requested some name tweaks and we
need to add vhost-user things and more devices, but otherwise we are
good.

> v1 -> v2:
> 
> * add callback
> * tweak feature manipulation
> * add generic handling for vhost that needs to be called by devices
> * add handling for virtio
> 
> Halil Pasic (5):
>   virtio: introduce virtio_force_modern()
>   virtio-ccw: use virtio_force_modern()
>   virtio-pci: use virtio_force_modern()
>   vhost: push features to backend on force_modern
>   virtio-net: handle force_modern for vhost
> 
>  hw/net/virtio-net.c| 20 
>  hw/s390x/virtio-ccw.c  |  3 +++
>  hw/virtio/vhost.c  | 17 +
>  hw/virtio/virtio-pci.c |  1 +
>  hw/virtio/virtio.c | 13 +
>  include/hw/virtio/vhost.h  |  2 ++
>  include/hw/virtio/virtio.h |  2 ++
>  7 files changed, 58 insertions(+)
> 
> 
> base-commit: 2c3e83f92d93fbab071b8a96b8ab769b01902475
> -- 
> 2.25.1




Re: [PATCH 1/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread Daniel Henrique Barboza




On 12/8/21 13:59, lagar...@linux.ibm.com wrote:

From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---
  docs/specs/ppc-spapr-hcalls.txt | 92 -
  1 file changed, 57 insertions(+), 35 deletions(-)

diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.txt
index 93fe3da91b..c69dae535b 100644
--- a/docs/specs/ppc-spapr-hcalls.txt
+++ b/docs/specs/ppc-spapr-hcalls.txt
@@ -1,9 +1,15 @@
-When used with the "pseries" machine type, QEMU-system-ppc64 implements
-a set of hypervisor calls using a subset of the server "PAPR" specification
-(IBM internal at this point), which is also what IBM's proprietary hypervisor
-adheres too.
+sPAPR hypervisor calls
+--
  
-The subset is selected based on the requirements of Linux as a guest.

+When used with the ``pseries`` machine type, ``qemu-system-ppc64`` implements
+a set of hypervisor calls (a.k.a. hcalls) defined in the `Linux on Power
+Architecture Reference document (LoPAR)
+`_.
+This document is a subset of the Power Architecture Platform Reference (PAPR+)
+specification (IBM internal only), which is what PowerVM, the IBM proprietary
+hypervisor, adheres to.
+
+The subset in LoPAR is selected based on the requirements of Linux as a guest.
  
  In addition to those calls, we have added our own private hypervisor

  calls which are mostly used as a private interface between the firmware
@@ -12,13 +18,14 @@ running in the guest and QEMU.
  All those hypercalls start at hcall number 0xf000 which correspond
  to an implementation specific range in PAPR.
  
-- H_RTAS (0xf000)

+H_RTAS (0xf000)
+^^^
  
-RTAS is a set of runtime services generally provided by the firmware

-inside the guest to the operating system. It predates the existence
-of hypervisors (it was originally an extension to Open Firmware) and
-is still used by PAPR to provide various services that aren't performance
-sensitive.
+RTAS stands for Run-Time Abstraction Sercies and is a set of runtime services
+generally provided by the firmware inside the guest to the operating system. It
+predates the existence of hypervisors (it was originally an extension to Open
+Firmware) and is still used by PAPR and LoPAR to provide various services that
+are not performance sensitive.
  
  We currently implement the RTAS services in QEMU itself. The actual RTAS

  "firmware" blob in the guest is a small stub of a few instructions which
@@ -26,22 +33,25 @@ calls our private H_RTAS hypervisor call to pass the RTAS 
calls to QEMU.
  
  Arguments:
  
-  r3 : H_RTAS (0xf000)

-  r4 : Guest physical address of RTAS parameter block
+  ``r3``: ``H_RTAS (0xf000)``
+
+  ``r4``: Guest physical address of RTAS parameter block.
  
  Returns:
  
-  H_SUCCESS   : Successfully called the RTAS function (RTAS result

-will have been stored in the parameter block)
-  H_PARAMETER : Unknown token
+  ``H_SUCCESS``: Successfully called the RTAS function (RTAS result will have
+  been stored in the parameter block).
  
-- H_LOGICAL_MEMOP (0xf001)

+  ``H_PARAMETER``: Unknown token.
  
-When the guest runs in "real mode" (in powerpc lingua this means

-with MMU disabled, ie guest effective == guest physical), it only
-has access to a subset of memory and no IOs.
+H_LOGICAL_MEMOP (0xf001)
+
  
-PAPR provides a set of hypervisor calls to perform cacheable or

+When the guest runs in "real mode" (in powerpc lingua this means with MMU


What's up with 'lingua'? As you already know "lingua" is Brazilian portuguese for 
"tongue"
and it's so weird to be used in this context.

The one English word that I can think of that is closer to "lingua" and would 
make sense here
is 'lingo'. But that is a slang for 'jargon' and not appropriate for a 
technical document
either. "langua" as a short form of "language" seems weird as well. I really 
believe 'jargon'
is a more suitable word here.

This was added by c73e3771ea79ab and I really believe this is an unintended 
typo/mistake. If
you're feeling generous feel free to send an extra patch (you can send an 
independent patch,
or another patch on top of this series, your call) changing this 'lingua' 
instance to something
more appropriate.


Since this is not this patch fault:

Reviewed-by: Daniel Henrique Barboza 





+disabled, i.e. guest effective address equals to guest physical address), it
+only has access to a subset of memory and no I/Os.
+
+PAPR and LoPAR provides a set of hypervisor calls to perform cacheable or
  non-cacheable accesses to any guest physical addresses that the
  guest can use in order to access IO devices while in real mode.
  
@@ -58,21 +68,33 @@ is used by our SLOF firmware to invert the screen.
  
  Arguments:
  
-  r3: H_LOGICAL_MEMOP (0xf001)

-  r4: Guest physical address of destination
-  r5: Guest physical address of source
-  r6: Individual element size
-0 

Re: [PATCH 6/7] migration: Dump sub-cmd name in loadvm_process_command tp

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> It'll be easier to read the name rather than index of sub-cmd when debugging.
> 
> Signed-off-by: Peter Xu 
> ---
>  migration/savevm.c | 2 +-
>  migration/trace-events | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index d59e976d50..17b8e25e00 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2271,7 +2271,7 @@ static int loadvm_process_command(QEMUFile *f)
>  return qemu_file_get_error(f);
>  }
>  
> -trace_loadvm_process_command(cmd, len);
> +trace_loadvm_process_command(mig_cmd_args[cmd].name, len);
>  if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {

No! you can't do that name lookup before the limit check.

Dave

>  error_report("MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
>  return -EINVAL;
> diff --git a/migration/trace-events b/migration/trace-events
> index b48d873b8a..d63a5915f5 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -22,7 +22,7 @@ loadvm_postcopy_handle_resume(void) ""
>  loadvm_postcopy_ram_handle_discard(void) ""
>  loadvm_postcopy_ram_handle_discard_end(void) ""
>  loadvm_postcopy_ram_handle_discard_header(const char *ramid, uint16_t len) 
> "%s: %ud"
> -loadvm_process_command(uint16_t com, uint16_t len) "com=0x%x len=%d"
> +loadvm_process_command(const char *s, uint16_t len) "com=%s len=%d"
>  loadvm_process_command_ping(uint32_t val) "0x%x"
>  postcopy_ram_listen_thread_exit(void) ""
>  postcopy_ram_listen_thread_start(void) ""
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 5/7] migration: Drop return code for disgard ram process

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> It will just never fail.  Drop those return values where they're constantly
> zeros.
> 
> A tiny touch-up on the tracepoint so trace_ram_postcopy_send_discard_bitmap()
> is called after the logic itself (which sounds more reasonable).
> 
> Signed-off-by: Peter Xu 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/migration.c |  5 +
>  migration/ram.c   | 20 +---
>  migration/ram.h   |  2 +-
>  3 files changed, 7 insertions(+), 20 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index abaf6f9e3d..c2e5539721 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2983,10 +2983,7 @@ static int postcopy_start(MigrationState *ms)
>   * that are dirty
>   */
>  if (migrate_postcopy_ram()) {
> -if (ram_postcopy_send_discard_bitmap(ms)) {
> -error_report("postcopy send discard bitmap failed");
> -goto fail;
> -}
> +ram_postcopy_send_discard_bitmap(ms);
>  }
>  
>  /*
> diff --git a/migration/ram.c b/migration/ram.c
> index ecc744d54d..28f1ace0f7 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2478,8 +2478,6 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block);
>  /**
>   * postcopy_each_ram_send_discard: discard all RAMBlocks
>   *
> - * Returns 0 for success or negative for error
> - *
>   * Utility for the outgoing postcopy code.
>   *   Calls postcopy_send_discard_bm_ram for each RAMBlock
>   *   passing it bitmap indexes and name.
> @@ -2488,10 +2486,9 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block);
>   *
>   * @ms: current migration state
>   */
> -static int postcopy_each_ram_send_discard(MigrationState *ms)
> +static void postcopy_each_ram_send_discard(MigrationState *ms)
>  {
>  struct RAMBlock *block;
> -int ret;
>  
>  RAMBLOCK_FOREACH_NOT_IGNORED(block) {
>  postcopy_discard_send_init(ms, block->idstr);
> @@ -2509,14 +2506,9 @@ static int 
> postcopy_each_ram_send_discard(MigrationState *ms)
>   * just needs indexes at this point, avoids it having
>   * target page specific code.
>   */
> -ret = postcopy_send_discard_bm_ram(ms, block);
> +postcopy_send_discard_bm_ram(ms, block);
>  postcopy_discard_send_finish(ms);
> -if (ret) {
> -return ret;
> -}
>  }
> -
> -return 0;
>  }
>  
>  /**
> @@ -2589,8 +2581,6 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
>  /**
>   * ram_postcopy_send_discard_bitmap: transmit the discard bitmap
>   *
> - * Returns zero on success
> - *
>   * Transmit the set of pages to be discarded after precopy to the target
>   * these are pages that:
>   * a) Have been previously transmitted but are now dirty again
> @@ -2601,7 +2591,7 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
>   *
>   * @ms: current migration state
>   */
> -int ram_postcopy_send_discard_bitmap(MigrationState *ms)
> +void ram_postcopy_send_discard_bitmap(MigrationState *ms)
>  {
>  RAMState *rs = ram_state;
>  
> @@ -2615,9 +2605,9 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>  rs->last_sent_block = NULL;
>  rs->last_page = 0;
>  
> -trace_ram_postcopy_send_discard_bitmap();
> +postcopy_each_ram_send_discard(ms);
>  
> -return postcopy_each_ram_send_discard(ms);
> +trace_ram_postcopy_send_discard_bitmap();
>  }
>  
>  /**
> diff --git a/migration/ram.h b/migration/ram.h
> index f543e25765..2c6dc3675d 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -57,7 +57,7 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t 
> start, ram_addr_t len);
>  void acct_update_position(QEMUFile *f, size_t size, bool zero);
>  void ram_postcopy_migrated_memory_release(MigrationState *ms);
>  /* For outgoing discard bitmap */
> -int ram_postcopy_send_discard_bitmap(MigrationState *ms);
> +void ram_postcopy_send_discard_bitmap(MigrationState *ms);
>  /* For incoming postcopy discard */
>  int ram_discard_range(const char *block_name, uint64_t start, size_t length);
>  int ram_postcopy_incoming_init(MigrationIncomingState *mis);
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 06/12] target/s390x: add zpci-interp to cpu models

2021-12-08 Thread Matthew Rosato

On 12/8/21 5:16 AM, Christian Borntraeger wrote:

Am 07.12.21 um 22:04 schrieb Matthew Rosato:

The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.

Signed-off-by: Matthew Rosato 
---
  target/s390x/cpu_features_def.h.inc | 1 +
  target/s390x/gen-features.c | 2 ++
  target/s390x/kvm/kvm.c  | 1 +
  3 files changed, 4 insertions(+)

diff --git a/target/s390x/cpu_features_def.h.inc 
b/target/s390x/cpu_features_def.h.inc

index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: 
Conditional-external-interception f

  DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
  DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management 
facility")

  DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
  /* Features exposed via the PLO instruction. */
  DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in 
general registers)")

diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 7cb1a6ec10..7005d22415 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
  S390_FEAT_HPMA2,
  S390_FEAT_SIE_KSS,
  S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
  };
  #define full_GEN14_GA2 EmptyFeat
@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
  S390_FEAT_GROUP_MSA_EXT_8,
  S390_FEAT_MULTIPLE_EPOCH,
  S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
  };


For the default model you need to be careful.
Is this in any way guest visible? then you definitely need to fence this
off for older QEMU versions so that when you migrate with older QEMUs
See the s390_cpudef_featoff_greater calls in  hw/s390x/s390-virtio-ccw.c

I know its more of a theoretical aspect, since PCI currently forbids 
migration

but we should try to have the cpu model consistent I guess.


Ah, good idea.  Thanks for the pointer.


  #define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 5b1fdb55c4..b13d78f988 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2290,6 +2290,7 @@ static int kvm_to_feat[][2] = {
  { KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
  { KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
  { KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+    { KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
  };
  static int query_cpu_feat(S390FeatBitmap features)






Re: [PATCH 4/7] migration: Do chunk page in postcopy_each_ram_send_discard()

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> Right now we loop ramblocks for twice, the 1st time chunk the dirty bits with
> huge page information; the 2nd time we send the discard ranges.  That's not
> necessary - we can do them in a single loop.
> 
> Signed-off-by: Peter Xu 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index e3876181ab..ecc744d54d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2473,6 +2473,8 @@ static int postcopy_send_discard_bm_ram(MigrationState 
> *ms, RAMBlock *block)
>  return 0;
>  }
>  
> +static void postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock 
> *block);
> +
>  /**
>   * postcopy_each_ram_send_discard: discard all RAMBlocks
>   *
> @@ -2494,6 +2496,14 @@ static int 
> postcopy_each_ram_send_discard(MigrationState *ms)
>  RAMBLOCK_FOREACH_NOT_IGNORED(block) {
>  postcopy_discard_send_init(ms, block->idstr);
>  
> +/*
> + * Deal with TPS != HPS and huge pages.  It discard any partially 
> sent
> + * host-page size chunks, mark any partially dirty host-page size
> + * chunks as all dirty.  In this case the host-page is the host-page
> + * for the particular RAMBlock, i.e. it might be a huge page.
> + */
> +postcopy_chunk_hostpages_pass(ms, block);
> +
>  /*
>   * Postcopy sends chunks of bitmap over the wire, but it
>   * just needs indexes at this point, avoids it having
> @@ -2594,7 +2604,6 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
>  int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>  {
>  RAMState *rs = ram_state;
> -RAMBlock *block;
>  
>  RCU_READ_LOCK_GUARD();
>  
> @@ -2606,15 +2615,6 @@ int ram_postcopy_send_discard_bitmap(MigrationState 
> *ms)
>  rs->last_sent_block = NULL;
>  rs->last_page = 0;
>  
> -RAMBLOCK_FOREACH_NOT_IGNORED(block) {
> -/*
> - * Deal with TPS != HPS and huge pages.  It discard any partially 
> sent
> - * host-page size chunks, mark any partially dirty host-page size
> - * chunks as all dirty.  In this case the host-page is the host-page
> - * for the particular RAMBlock, i.e. it might be a huge page.
> - */
> -postcopy_chunk_hostpages_pass(ms, block);
> -}
>  trace_ram_postcopy_send_discard_bitmap();
>  
>  return postcopy_each_ram_send_discard(ms);
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 0/3] target/ppc: Minor fixes to exception code

2021-12-08 Thread Richard Henderson

On 12/8/21 4:30 AM, Fabiano Rosas wrote:

These are just some minor fixes to the exception code that I collected
over the past few months.

Fabiano Rosas (3):
   target/ppc: Fix MPCxxx FPU interrupt address
   target/ppc: Remove 603e exception model
   target/ppc: Set 601v exception model id

  target/ppc/cpu-qom.h |  2 --
  target/ppc/cpu_init.c| 37 +
  target/ppc/excp_helper.c |  1 -
  3 files changed, 5 insertions(+), 35 deletions(-)


Reviewed-by: Richard Henderson 

r~




Re: [PATCH for-7.0] hw: Add compat machines for 7.0

2021-12-08 Thread Cédric Le Goater

On 12/8/21 18:02, Cornelia Huck wrote:

Add 7.0 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck 


Acked-by: Cédric Le Goater 

Thanks,

C.


---
  hw/arm/virt.c  |  9 -
  hw/core/machine.c  |  3 +++
  hw/i386/pc.c   |  3 +++
  hw/i386/pc_piix.c  | 14 +-
  hw/i386/pc_q35.c   | 13 -
  hw/ppc/spapr.c | 15 +--
  hw/s390x/s390-virtio-ccw.c | 14 +-
  include/hw/boards.h|  3 +++
  include/hw/i386/pc.h   |  3 +++
  9 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 30da05dfe040..ddcfab426436 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2857,10 +2857,17 @@ static void machvirt_machine_init(void)
  }
  type_init(machvirt_machine_init);
  
+static void virt_machine_7_0_options(MachineClass *mc)

+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(7, 0)
+
  static void virt_machine_6_2_options(MachineClass *mc)
  {
+virt_machine_7_0_options(mc);
+compat_props_add(mc->compat_props, hw_compat_6_2, hw_compat_6_2_len);
  }
-DEFINE_VIRT_MACHINE_AS_LATEST(6, 2)
+DEFINE_VIRT_MACHINE(6, 2)
  
  static void virt_machine_6_1_options(MachineClass *mc)

  {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 53a99abc5605..a9c15479fe1d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,6 +37,9 @@
  #include "hw/virtio/virtio.h"
  #include "hw/virtio/virtio-pci.h"
  
+GlobalProperty hw_compat_6_2[] = {};

+const size_t hw_compat_6_2_len = G_N_ELEMENTS(hw_compat_6_2);
+
  GlobalProperty hw_compat_6_1[] = {
  { "vhost-user-vsock-device", "seqpacket", "off" },
  { "nvme-ns", "shared", "off" },
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a2ef40ecbc24..fccde2ef39f6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -94,6 +94,9 @@
  #include "trace.h"
  #include CONFIG_DEVICES
  
+GlobalProperty pc_compat_6_2[] = {};

+const size_t pc_compat_6_2_len = G_N_ELEMENTS(pc_compat_6_2);
+
  GlobalProperty pc_compat_6_1[] = {
  { TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" },
  { TYPE_X86_CPU, "hv-version-id-major", "0x0006" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 223dd3e05d15..b03026bf0648 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -413,7 +413,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
  machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
  }
  
-static void pc_i440fx_6_2_machine_options(MachineClass *m)

+static void pc_i440fx_7_0_machine_options(MachineClass *m)
  {
  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
  pc_i440fx_machine_options(m);
@@ -422,6 +422,18 @@ static void pc_i440fx_6_2_machine_options(MachineClass *m)
  pcmc->default_cpu_version = 1;
  }
  
+DEFINE_I440FX_MACHINE(v7_0, "pc-i440fx-7.0", NULL,

+  pc_i440fx_7_0_machine_options);
+
+static void pc_i440fx_6_2_machine_options(MachineClass *m)
+{
+pc_i440fx_machine_options(m);
+m->alias = NULL;
+m->is_default = false;
+compat_props_add(m->compat_props, hw_compat_6_2, hw_compat_6_2_len);
+compat_props_add(m->compat_props, pc_compat_6_2, pc_compat_6_2_len);
+}
+
  DEFINE_I440FX_MACHINE(v6_2, "pc-i440fx-6.2", NULL,
pc_i440fx_6_2_machine_options);
  
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c

index e1e100316d93..6b66eb16bb64 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -360,7 +360,7 @@ static void pc_q35_machine_options(MachineClass *m)
  m->max_cpus = 288;
  }
  
-static void pc_q35_6_2_machine_options(MachineClass *m)

+static void pc_q35_7_0_machine_options(MachineClass *m)
  {
  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
  pc_q35_machine_options(m);
@@ -368,6 +368,17 @@ static void pc_q35_6_2_machine_options(MachineClass *m)
  pcmc->default_cpu_version = 1;
  }
  
+DEFINE_Q35_MACHINE(v7_0, "pc-q35-7.0", NULL,

+   pc_q35_7_0_machine_options);
+
+static void pc_q35_6_2_machine_options(MachineClass *m)
+{
+pc_q35_machine_options(m);
+m->alias = NULL;
+compat_props_add(m->compat_props, hw_compat_6_2, hw_compat_6_2_len);
+compat_props_add(m->compat_props, pc_compat_6_2, pc_compat_6_2_len);
+}
+
  DEFINE_Q35_MACHINE(v6_2, "pc-q35-6.2", NULL,
 pc_q35_6_2_machine_options);
  
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c

index 3b5fd749be89..837342932586 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4665,15 +4665,26 @@ static void 
spapr_machine_latest_class_options(MachineClass *mc)
  }\
  type_init(spapr_machine_register_##suffix)
  
+/*

+ * pseries-7.0
+ */
+static void spapr_machine_7_0_class_options(MachineClass *mc)
+{
+/* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(7_0, "7.0", true);
+
  /*
   * pseries-6.2
   */
  static void spapr_machine_6_2_class_options(MachineClass *mc)
  {
-/* 

Re: [PATCH 2/7] hw/intc: sifive_plic: Cleanup the write function

2021-12-08 Thread Richard Henderson

On 12/7/21 10:42 PM, Alistair Francis wrote:

From: Alistair Francis 

Signed-off-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
  hw/intc/sifive_plic.c | 82 +--
  1 file changed, 33 insertions(+), 49 deletions(-)

diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c
index 35f097799a..c1fa689868 100644
--- a/hw/intc/sifive_plic.c
+++ b/hw/intc/sifive_plic.c
@@ -33,6 +33,17 @@
  
  #define RISCV_DEBUG_PLIC 0
  
+static bool addr_between(uint32_t addr, uint32_t base, uint32_t num)

+{
+uint32_t end = base + num;
+
+if (addr >= base && addr < end) {
+return true;
+}
+
+return false;
+}


It may well not matter for your use case, but this will fail for addresses at the end of 
the range.  Better as


return addr >= base && addr - base < num;


r~



Re: dozens of qemu/kvm VMs getting into stuck states since kernel ~5.13

2021-12-08 Thread Chris Murphy
On Tue, Dec 7, 2021 at 5:25 PM Sean Christopherson  wrote:
>
> On Tue, Dec 07, 2021, Chris Murphy wrote:
> > cc: qemu-devel
> >
> > Hi,
> >
> > I'm trying to help progress a very troublesome and so far elusive bug
> > we're seeing in Fedora infrastructure. When running dozens of qemu-kvm
> > VMs simultaneously, eventually they become unresponsive, as well as
> > new processes as we try to extract information from the host about
> > what's gone wrong.
>
> Have you tried bisecting?  IIUC, the issues showed up between v5.11 and 
> v5.12.12,
> bisecting should be relatively straightforward.

We haven't tried bisecting. Due to limited access since it's a
production machine, and limited resources for those who have that
access, I think the chance of bisecting is low, but I've asked. We
could do something of a faux-bisect by running already built kernels
in Fedora infrastructure. We could start by running x.y.0 kernels to
see when it first appeared, then once hitting the problem, start
testing rc1, rc2, ... in that series. We also have approximately daily
git builds in between those rc's. That might be enough to deduce a
culprit, but I'm not sure. At the least this would get us a ~1-3 day
window within two rc's for bisecting.

>
> > Systems (Fedora openQA worker hosts) on kernel 5.12.12+ wind up in a
> > state where forking does not work correctly, breaking most things
> > https://bugzilla.redhat.com/show_bug.cgi?id=2009585
> >
> > In subsequent testing, we used newer kernels with lockdep and other
> > debug stuff enabled, and managed to capture a hung task with a bunch
> > of locks listed, including kvm and qemu processes. But I can't parse
> > it.
> >
> > 5.15-rc7
> > https://bugzilla-attachments.redhat.com/attachment.cgi?id=1840941
> > 5.15+
> > https://bugzilla-attachments.redhat.com/attachment.cgi?id=1840939
> >
> > If anyone can take a glance at those kernel messages, and/or give
> > hints how we can extract more information for debugging, it'd be
> > appreciated. Maybe all of that is normal and the actual problem isn't
> > in any of these traces.
>
> All the instances of
>
>   (>mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x77/0x720 [kvm]
>
> are uninteresting and expected, that's just each vCPU task taking its 
> associated
> vcpu->mutex, likely for KVM_RUN.
>
> At a glance, the XFS stuff looks far more interesting/suspect.

Thanks for the reply.

-- 
Chris Murphy



[PATCH for-7.0] hw: Add compat machines for 7.0

2021-12-08 Thread Cornelia Huck
Add 7.0 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck 
---
 hw/arm/virt.c  |  9 -
 hw/core/machine.c  |  3 +++
 hw/i386/pc.c   |  3 +++
 hw/i386/pc_piix.c  | 14 +-
 hw/i386/pc_q35.c   | 13 -
 hw/ppc/spapr.c | 15 +--
 hw/s390x/s390-virtio-ccw.c | 14 +-
 include/hw/boards.h|  3 +++
 include/hw/i386/pc.h   |  3 +++
 9 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 30da05dfe040..ddcfab426436 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2857,10 +2857,17 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
+static void virt_machine_7_0_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(7, 0)
+
 static void virt_machine_6_2_options(MachineClass *mc)
 {
+virt_machine_7_0_options(mc);
+compat_props_add(mc->compat_props, hw_compat_6_2, hw_compat_6_2_len);
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(6, 2)
+DEFINE_VIRT_MACHINE(6, 2)
 
 static void virt_machine_6_1_options(MachineClass *mc)
 {
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 53a99abc5605..a9c15479fe1d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,6 +37,9 @@
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-pci.h"
 
+GlobalProperty hw_compat_6_2[] = {};
+const size_t hw_compat_6_2_len = G_N_ELEMENTS(hw_compat_6_2);
+
 GlobalProperty hw_compat_6_1[] = {
 { "vhost-user-vsock-device", "seqpacket", "off" },
 { "nvme-ns", "shared", "off" },
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a2ef40ecbc24..fccde2ef39f6 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -94,6 +94,9 @@
 #include "trace.h"
 #include CONFIG_DEVICES
 
+GlobalProperty pc_compat_6_2[] = {};
+const size_t pc_compat_6_2_len = G_N_ELEMENTS(pc_compat_6_2);
+
 GlobalProperty pc_compat_6_1[] = {
 { TYPE_X86_CPU, "hv-version-id-build", "0x1bbc" },
 { TYPE_X86_CPU, "hv-version-id-major", "0x0006" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 223dd3e05d15..b03026bf0648 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -413,7 +413,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
 machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
 }
 
-static void pc_i440fx_6_2_machine_options(MachineClass *m)
+static void pc_i440fx_7_0_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_i440fx_machine_options(m);
@@ -422,6 +422,18 @@ static void pc_i440fx_6_2_machine_options(MachineClass *m)
 pcmc->default_cpu_version = 1;
 }
 
+DEFINE_I440FX_MACHINE(v7_0, "pc-i440fx-7.0", NULL,
+  pc_i440fx_7_0_machine_options);
+
+static void pc_i440fx_6_2_machine_options(MachineClass *m)
+{
+pc_i440fx_machine_options(m);
+m->alias = NULL;
+m->is_default = false;
+compat_props_add(m->compat_props, hw_compat_6_2, hw_compat_6_2_len);
+compat_props_add(m->compat_props, pc_compat_6_2, pc_compat_6_2_len);
+}
+
 DEFINE_I440FX_MACHINE(v6_2, "pc-i440fx-6.2", NULL,
   pc_i440fx_6_2_machine_options);
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index e1e100316d93..6b66eb16bb64 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -360,7 +360,7 @@ static void pc_q35_machine_options(MachineClass *m)
 m->max_cpus = 288;
 }
 
-static void pc_q35_6_2_machine_options(MachineClass *m)
+static void pc_q35_7_0_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_machine_options(m);
@@ -368,6 +368,17 @@ static void pc_q35_6_2_machine_options(MachineClass *m)
 pcmc->default_cpu_version = 1;
 }
 
+DEFINE_Q35_MACHINE(v7_0, "pc-q35-7.0", NULL,
+   pc_q35_7_0_machine_options);
+
+static void pc_q35_6_2_machine_options(MachineClass *m)
+{
+pc_q35_machine_options(m);
+m->alias = NULL;
+compat_props_add(m->compat_props, hw_compat_6_2, hw_compat_6_2_len);
+compat_props_add(m->compat_props, pc_compat_6_2, pc_compat_6_2_len);
+}
+
 DEFINE_Q35_MACHINE(v6_2, "pc-q35-6.2", NULL,
pc_q35_6_2_machine_options);
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3b5fd749be89..837342932586 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4665,15 +4665,26 @@ static void 
spapr_machine_latest_class_options(MachineClass *mc)
 }\
 type_init(spapr_machine_register_##suffix)
 
+/*
+ * pseries-7.0
+ */
+static void spapr_machine_7_0_class_options(MachineClass *mc)
+{
+/* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(7_0, "7.0", true);
+
 /*
  * pseries-6.2
  */
 static void spapr_machine_6_2_class_options(MachineClass *mc)
 {
-/* Defaults for the latest behaviour inherited from the base class */
+spapr_machine_7_0_class_options(mc);
+compat_props_add(mc->compat_props, 

[PATCH 0/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

Along with this change, hook the new rst file into the pseries
documentation.

Leonardo Garcia (3):
  docs: rSTify ppc-spapr-hcalls.txt
  docs: Rename ppc-spapr-hcalls.txt to ppc-spapr-hcalls.rst.
  Link new ppc-spapr-hcalls.rst file to pseries.rst.

 docs/specs/ppc-spapr-hcalls.rst | 100 
 docs/specs/ppc-spapr-hcalls.txt |  78 -
 docs/system/ppc/pseries.rst |   2 +-
 3 files changed, 101 insertions(+), 79 deletions(-)
 create mode 100644 docs/specs/ppc-spapr-hcalls.rst
 delete mode 100644 docs/specs/ppc-spapr-hcalls.txt

-- 
2.33.1




[PATCH 1/3] docs: rSTify ppc-spapr-hcalls.txt

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---
 docs/specs/ppc-spapr-hcalls.txt | 92 -
 1 file changed, 57 insertions(+), 35 deletions(-)

diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.txt
index 93fe3da91b..c69dae535b 100644
--- a/docs/specs/ppc-spapr-hcalls.txt
+++ b/docs/specs/ppc-spapr-hcalls.txt
@@ -1,9 +1,15 @@
-When used with the "pseries" machine type, QEMU-system-ppc64 implements
-a set of hypervisor calls using a subset of the server "PAPR" specification
-(IBM internal at this point), which is also what IBM's proprietary hypervisor
-adheres too.
+sPAPR hypervisor calls
+--
 
-The subset is selected based on the requirements of Linux as a guest.
+When used with the ``pseries`` machine type, ``qemu-system-ppc64`` implements
+a set of hypervisor calls (a.k.a. hcalls) defined in the `Linux on Power
+Architecture Reference document (LoPAR)
+`_.
+This document is a subset of the Power Architecture Platform Reference (PAPR+)
+specification (IBM internal only), which is what PowerVM, the IBM proprietary
+hypervisor, adheres to.
+
+The subset in LoPAR is selected based on the requirements of Linux as a guest.
 
 In addition to those calls, we have added our own private hypervisor
 calls which are mostly used as a private interface between the firmware
@@ -12,13 +18,14 @@ running in the guest and QEMU.
 All those hypercalls start at hcall number 0xf000 which correspond
 to an implementation specific range in PAPR.
 
-- H_RTAS (0xf000)
+H_RTAS (0xf000)
+^^^
 
-RTAS is a set of runtime services generally provided by the firmware
-inside the guest to the operating system. It predates the existence
-of hypervisors (it was originally an extension to Open Firmware) and
-is still used by PAPR to provide various services that aren't performance
-sensitive.
+RTAS stands for Run-Time Abstraction Sercies and is a set of runtime services
+generally provided by the firmware inside the guest to the operating system. It
+predates the existence of hypervisors (it was originally an extension to Open
+Firmware) and is still used by PAPR and LoPAR to provide various services that
+are not performance sensitive.
 
 We currently implement the RTAS services in QEMU itself. The actual RTAS
 "firmware" blob in the guest is a small stub of a few instructions which
@@ -26,22 +33,25 @@ calls our private H_RTAS hypervisor call to pass the RTAS 
calls to QEMU.
 
 Arguments:
 
-  r3 : H_RTAS (0xf000)
-  r4 : Guest physical address of RTAS parameter block
+  ``r3``: ``H_RTAS (0xf000)``
+
+  ``r4``: Guest physical address of RTAS parameter block.
 
 Returns:
 
-  H_SUCCESS   : Successfully called the RTAS function (RTAS result
-will have been stored in the parameter block)
-  H_PARAMETER : Unknown token
+  ``H_SUCCESS``: Successfully called the RTAS function (RTAS result will have
+  been stored in the parameter block).
 
-- H_LOGICAL_MEMOP (0xf001)
+  ``H_PARAMETER``: Unknown token.
 
-When the guest runs in "real mode" (in powerpc lingua this means
-with MMU disabled, ie guest effective == guest physical), it only
-has access to a subset of memory and no IOs.
+H_LOGICAL_MEMOP (0xf001)
+
 
-PAPR provides a set of hypervisor calls to perform cacheable or
+When the guest runs in "real mode" (in powerpc lingua this means with MMU
+disabled, i.e. guest effective address equals to guest physical address), it
+only has access to a subset of memory and no I/Os.
+
+PAPR and LoPAR provides a set of hypervisor calls to perform cacheable or
 non-cacheable accesses to any guest physical addresses that the
 guest can use in order to access IO devices while in real mode.
 
@@ -58,21 +68,33 @@ is used by our SLOF firmware to invert the screen.
 
 Arguments:
 
-  r3: H_LOGICAL_MEMOP (0xf001)
-  r4: Guest physical address of destination
-  r5: Guest physical address of source
-  r6: Individual element size
-0 = 1 byte
-1 = 2 bytes
-2 = 4 bytes
-3 = 8 bytes
-  r7: Number of elements
-  r8: Operation
-0 = copy
-1 = xor
+  ``r3 ``: ``H_LOGICAL_MEMOP (0xf001)``
+
+  ``r4``: Guest physical address of destination.
+
+  ``r5``: Guest physical address of source.
+
+  ``r6``: Individual element size, defined by the binary logarithm of the
+  desired size. Supported values are:
+
+``0`` = 1 byte
+
+``1`` = 2 bytes
+
+``2`` = 4 bytes
+
+``3`` = 8 bytes
+
+  ``r7``: Number of elements.
+
+  ``r8``: Operation. Supported values are:
+
+``0``: copy
+
+``1``: xor
 
 Returns:
 
-  H_SUCCESS   : Success
-  H_PARAMETER : Invalid argument
+  ``H_SUCCESS``: Success.
 
+  ``H_PARAMETER``: Invalid argument.
\ No newline at end of file
-- 
2.33.1




[PATCH 3/3] Link new ppc-spapr-hcalls.rst file to pseries.rst.

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---
 docs/system/ppc/pseries.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst
index e46f09d4c8..56f5942e13 100644
--- a/docs/system/ppc/pseries.rst
+++ b/docs/system/ppc/pseries.rst
@@ -113,12 +113,12 @@ can  also be found in QEMU documentation:
 .. toctree::
:maxdepth: 1
 
+   ../../specs/ppc-spapr-hcalls.rst
../../specs/ppc-spapr-numa.rst
../../specs/ppc-spapr-xive.rst
 
 Other documentation available in QEMU docs directory:
 
-* Hypervisor calls (a.k.a. hcalls) (``docs/specs/ppc-spapr-hcalls.txt``).
 * Hot plug (``/docs/specs/ppc-spapr-hotplug.txt``).
 * Hypervisor calls needed by the Ultravisor
   (``/docs/specs/ppc-spapr-uv-hcalls.txt``).
-- 
2.33.1




[PATCH 2/3] docs: Rename ppc-spapr-hcalls.txt to ppc-spapr-hcalls.rst.

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

Signed-off-by: Leonardo Garcia 
---
 docs/specs/{ppc-spapr-hcalls.txt => ppc-spapr-hcalls.rst} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/specs/{ppc-spapr-hcalls.txt => ppc-spapr-hcalls.rst} (100%)

diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.rst
similarity index 100%
rename from docs/specs/ppc-spapr-hcalls.txt
rename to docs/specs/ppc-spapr-hcalls.rst
-- 
2.33.1




Re: [PATCH 3/7] migration: Drop postcopy_chunk_hostpages()

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> This function calls three functions:
> 
>   - postcopy_discard_send_init(ms, block->idstr);
>   - postcopy_chunk_hostpages_pass(ms, block);
>   - postcopy_discard_send_finish(ms);
> 
> However only the 2nd function call is meaningful.  It's major role is to make
> sure dirty bits are applied in host-page-size granule, so there will be no
> partial dirty bits set for a whole host page if huge pages are used.
> 
> The 1st/3rd call are for latter when we want to send the disgard ranges.
> They're mostly no-op here besides some tracepoints (which are misleading!).
> 
> Drop them, then we can directly drop postcopy_chunk_hostpages() as a whole
> because we can call postcopy_chunk_hostpages_pass() directly.
> 
> There're still some nice comments above postcopy_chunk_hostpages() that 
> explain
> what it does.  Copy it over to the caller's site.
> 
> Signed-off-by: Peter Xu 

Yeh, I think originally the idea was to send some of the messages during
the chunking

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 33 +++--
>  1 file changed, 7 insertions(+), 26 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index fb8c1a887e..e3876181ab 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2576,30 +2576,6 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
>  }
>  }
>  
> -/**
> - * postcopy_chunk_hostpages: discard any partially sent host page
> - *
> - * Utility for the outgoing postcopy code.
> - *
> - * Discard any partially sent host-page size chunks, mark any partially
> - * dirty host-page size chunks as all dirty.  In this case the host-page
> - * is the host-page for the particular RAMBlock, i.e. it might be a huge page
> - *
> - * @ms: current migration state
> - * @block: block we want to work with
> - */
> -static void postcopy_chunk_hostpages(MigrationState *ms, RAMBlock *block)
> -{
> -postcopy_discard_send_init(ms, block->idstr);
> -
> -/*
> - * Ensure that all partially dirty host pages are made fully dirty.
> - */
> -postcopy_chunk_hostpages_pass(ms, block);
> -
> -postcopy_discard_send_finish(ms);
> -}
> -
>  /**
>   * ram_postcopy_send_discard_bitmap: transmit the discard bitmap
>   *
> @@ -2631,8 +2607,13 @@ int ram_postcopy_send_discard_bitmap(MigrationState 
> *ms)
>  rs->last_page = 0;
>  
>  RAMBLOCK_FOREACH_NOT_IGNORED(block) {
> -/* Deal with TPS != HPS and huge pages */
> -postcopy_chunk_hostpages(ms, block);
> +/*
> + * Deal with TPS != HPS and huge pages.  It discard any partially 
> sent
> + * host-page size chunks, mark any partially dirty host-page size
> + * chunks as all dirty.  In this case the host-page is the host-page
> + * for the particular RAMBlock, i.e. it might be a huge page.
> + */
> +postcopy_chunk_hostpages_pass(ms, block);
>  }
>  trace_ram_postcopy_send_discard_bitmap();
>  
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH] hw/ppc/ppc405_boards: Change kernel load address

2021-12-08 Thread Cédric Le Goater

On 12/8/21 14:19, Thomas Huth wrote:

On 08/12/2021 14.15, Cédric Le Goater wrote:

On 12/8/21 14:07, Thomas Huth wrote:

On 03/12/2021 13.25, Cédric Le Goater wrote:

On 12/3/21 11:40, Peter Maydell wrote:

On Fri, 3 Dec 2021 at 10:32, Thomas Huth  wrote:

I guess it's an accidential NULL pointer dereference somewhere in the u-boot
code ... which will be quite hard to track down when the first page of
memory is marked as writable... :-/


Attach a target-arch gdb to the QEMU gdbstub and put a watchpoint on
address zero ? (Or if you suspect something inside QEMU is doing it
then run QEMU under gdb and watchpoint the host memory location
corresponding to guest address 0, but that's more painful.) Nothing
in the pre-kernel part of the boot process will have set up paging,
so the watchpointing should be pretty reliable.


That's the guy:

https://gitlab.com/huth/u-boot/-/blob/taihu/arch/powerpc/cpu/ppc4xx/sdram.c#L199

There must be an error in how get_ram_size() restores the RAM values :

   https://gitlab.com/huth/u-boot/-/blob/taihu/common/memsize.c


There is definitely something wrong in that function. Seems like they tried to 
fix it once here:

  https://source.denx.de/u-boot/u-boot/-/commit/b8496cced856ff411f

but that patch got later reverted without a replacement later...



a fix restoring address 0, something like :

@@ -60,6 +60,9 @@ long get_ram_size(long *base, long maxsi
  return (0);
  }

+    addr = base;
+    *addr = save[i];
+
  for (cnt = 1; cnt < maxsize / sizeof(long); cnt <<= 1) {
  addr = base + cnt;    /* pointer arith! */
  val = *addr;

is not enough. trap_init() will also overwrite the kernel image.
And u-boot will complain about a wrong CRC.

The 405 series I sent improves support and latest kernel 5.16-rc4
can be loaded without uboot. It's a start to debug user space.


Yes, your series is certainly the better way forward. I'll stop messing 
with u-boot now ... since upstream u-boot already removed the 405 support 
a long time ago, my hack was a dead end anyway (but it helped to get at 
least a kernel running again, so it was not in vain)


This is not what I meant ! sorry ! What I wanted to express is that
now nothing is blocking us from fixing user space :)

The patchset I sent does not break booting from U-Boot. It still
very much works and we should keep it because it exercises more
our models. Even if the SDRAM still needs a U-Boot hack. I believe
that can be fixed in QEMU with some time. When done, I hope we
can merge the U-Boot binary in QEMU.

Regarding Linux loading in QEMU or U-Boot, the problem is the PPC
board information which is out of sync in all components ...

Today, the easiest is to boot directly from a Linux 405 hotfoot
kernel under QEMU because we are a bit lucky. The proc frequency
is set to some non zero value so the kernel boots. I pushed
experiments a bit further and network seems to be OK also.

My idea was to merge the first round of changes for direct linux
boot. See how that goes (user space) and then merge a custom
ppc405-hotfoot machine to work around the board information
problem.

Thanks,

C.



[PATCH v4] docs: Introducing pseries documentation.

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

The purpose of this document is to substitute the content currently
available in the QEMU wiki at [0]. This initial version does contain
some additional content as well. Whenever this documentation gets
upstream and is reflected in [1], the QEMU wiki will be edited to point
to this documentation, so that we only need to keep it updated in one
place.

0. https://wiki.qemu.org/Documentation/Platforms/POWER
1. https://qemu.readthedocs.io/en/latest/system/ppc/pseries.html

Signed-off-by: Leonardo Garcia 
Reviewed-by: David Gibson 
---

Changelog:
v3->v4:
- Fix build.

v2->v3:
- Updating LoPAR link.
- Minor clarifications in the text.

v1->v2:
- Addressing David Gibson and Cédric's comments. Thanks!

 docs/system/ppc/pseries.rst | 226 
 1 file changed, 226 insertions(+)

diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst
index 932d4dd17d..e46f09d4c8 100644
--- a/docs/system/ppc/pseries.rst
+++ b/docs/system/ppc/pseries.rst
@@ -1,12 +1,238 @@
 pSeries family boards (``pseries``)
 ===
 
+The Power machine para-virtualized environment described by the `Linux on Power
+Architecture Reference document (LoPAR)
+`_
+is called pSeries. This environment is also known as sPAPR, System p guests, or
+simply Power Linux guests (although it is capable of running other operating
+systems, such as AIX).
+
+Even though pSeries is designed to behave as a guest environment, it is also
+capable of acting as a hypervisor OS, providing, on that role, nested
+virtualization capabilities.
+
 Supported devices
 -
 
+ * Multi processor support for many Power processors generations: POWER7,
+   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
+   but its state is unknown.
+ * Interrupt Controller, XICS (POWER8) and XIVE (POWER9 and Power10)
+ * vPHB PCIe Host bridge.
+ * vscsi and vnet devices, compatible with the same devices available on a
+   PowerVM hypervisor with VIOS managing LPARs.
+ * Virtio based devices.
+ * PCIe device pass through.
+
 Missing devices
 ---
 
+ * SPICE support.
 
 Firmware
 
+
+`SLOF `_ (Slimline Open Firmware) is an
+implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
+Configuration) Firmware: Core Requirements and Practices
+`_.
+
+QEMU includes a prebuilt image of SLOF which is updated when a more recent
+version is required.
+
+Build directions
+
+
+.. code-block:: bash
+
+  ./configure --target-list=ppc64-softmmu && make
+
+Running instructions
+
+
+Someone can select the pSeries machine type by running QEMU with the following
+options:
+
+.. code-block:: bash
+
+  qemu-system-ppc64 -M pseries 
+
+sPAPR devices
+-
+
+The sPAPR specification defines a set of para-virtualized devices, which are
+also supported by the pSeries machine in QEMU and can be instantiated with the
+``-device`` option:
+
+* ``spapr-vlan`` : a virtual network interface.
+* ``spapr-vscsi`` : a virtual SCSI disk interface.
+* ``spapr-rng`` : a pseudo-device for passing random number generator data to 
the
+  guest (see the `H_RANDOM hypercall feature
+  `_ for details).
+* ``spapr-vty``: a virtual teletype.
+* ``spapr-pci-host-bridge``: a PCI host bridge.
+* ``tpm-spapr``: a Trusted Platform Module (TPM).
+* ``spapr-tpm-proxy``: a TPM proxy.
+
+These are compatible with the devices historically available for use when
+running the IBM PowerVM hypervisor with LPARs.
+
+However, since these devices have originally been specified with another
+hypervisor and non-Linux guests in mind, you should use the virtio counterparts
+(virtio-net, virtio-blk/scsi and virtio-rng for instance) if possible instead,
+since they will most probably give you better performance with Linux guests in 
a
+QEMU environment.
+
+The pSeries machine in QEMU is always instantiated with the following devices:
+
+* A NVRAM device (``spapr-nvram``).
+* A virtual teletype (``spapr-vty``).
+* A PCI host bridge (``spapr-pci-host-bridge``).
+
+Hence, it is not needed to add them manually, unless you use the 
``-nodefaults``
+command line option in QEMU.
+
+In the case of the default ``spapr-nvram`` device, if someone wants to make the
+contents of the NVRAM device persistent, they will need to specify a PFLASH
+device when starting QEMU, i.e. either use
+``-drive if=pflash,file=,format=raw`` to set the default PFLASH
+device, or specify one with an ID
+(``-drive if=none,file=,format=raw,id=pfid``) and pass that ID to the
+NVRAM device with ``-global spapr-nvram.drive=pfid``.
+
+sPAPR specification
+^^^
+
+The main source of documentation on the sPAPR standard is the `Linux on Power
+Architecture Reference 

Re: [PATCH v9 05/10] ACPI ERST: support for ACPI ERST feature

2021-12-08 Thread Eric DeVolder




On 12/6/21 02:14, Ani Sinha wrote:

On Fri, Dec 3, 2021 at 12:39 AM Eric DeVolder  wrote:


This implements a PCI device for ACPI ERST. This implements the
non-NVRAM "mode" of operation for ERST as it is supported by
Linux and Windows.


OK sent some more comments. It will take another pass for me to fully
review this.



Hi Ani, thank you for reviewing. I have incorporated your feedback thus far.
I have v10 ready to go but not sure if your review of v9 is completed yet?
Thanks!
eric



Re: [PATCH 2/7] migration: Don't return for postcopy_chunk_hostpages()

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> It always return zero, because it just can't go wrong so far.  Simplify the
> code with no functional change.
> 
> Signed-off-by: Peter Xu 

OK, I was wondering if the discard_send_finish could fail, but I chased
it another 3 or 4 levels and nothing returns an error flag either.


Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 756ac800a7..fb8c1a887e 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2585,12 +2585,10 @@ static void 
> postcopy_chunk_hostpages_pass(MigrationState *ms, RAMBlock *block)
>   * dirty host-page size chunks as all dirty.  In this case the host-page
>   * is the host-page for the particular RAMBlock, i.e. it might be a huge page
>   *
> - * Returns zero on success
> - *
>   * @ms: current migration state
>   * @block: block we want to work with
>   */
> -static int postcopy_chunk_hostpages(MigrationState *ms, RAMBlock *block)
> +static void postcopy_chunk_hostpages(MigrationState *ms, RAMBlock *block)
>  {
>  postcopy_discard_send_init(ms, block->idstr);
>  
> @@ -2600,7 +2598,6 @@ static int postcopy_chunk_hostpages(MigrationState *ms, 
> RAMBlock *block)
>  postcopy_chunk_hostpages_pass(ms, block);
>  
>  postcopy_discard_send_finish(ms);
> -return 0;
>  }
>  
>  /**
> @@ -2622,7 +2619,6 @@ int ram_postcopy_send_discard_bitmap(MigrationState *ms)
>  {
>  RAMState *rs = ram_state;
>  RAMBlock *block;
> -int ret;
>  
>  RCU_READ_LOCK_GUARD();
>  
> @@ -2636,10 +2632,7 @@ int ram_postcopy_send_discard_bitmap(MigrationState 
> *ms)
>  
>  RAMBLOCK_FOREACH_NOT_IGNORED(block) {
>  /* Deal with TPS != HPS and huge pages */
> -ret = postcopy_chunk_hostpages(ms, block);
> -if (ret) {
> -return ret;
> -}
> +postcopy_chunk_hostpages(ms, block);
>  }
>  trace_ram_postcopy_send_discard_bitmap();
>  
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH] block/nvme: fix infinite loop in nvme_free_req_queue_cb()

2021-12-08 Thread Philippe Mathieu-Daudé
On 12/8/21 16:22, Stefan Hajnoczi wrote:
> When the request free list is exhausted the coroutine waits on
> q->free_req_queue for the next free request. Whenever a request is
> completed a BH is scheduled to invoke nvme_free_req_queue_cb() and wake
> up waiting coroutines.
> 
> 1. nvme_get_free_req() waits for a free request:
> 
> while (q->free_req_head == -1) {
> ...
> trace_nvme_free_req_queue_wait(q->s, q->index);
> qemu_co_queue_wait(>free_req_queue, >lock);
> ...
> }
> 
> 2. nvme_free_req_queue_cb() wakes up the coroutine:
> 
> while (qemu_co_enter_next(>free_req_queue, >lock)) {
>^--- infinite loop when free_req_head == -1

Ouch.

> }
> 
> nvme_free_req_queue_cb() and the coroutine form an infinite loop when
> q->free_req_head == -1. Fix this by checking q->free_req_head in
> nvme_free_req_queue_cb(). If the free request list is exhausted, don't
> wake waiting coroutines. Eventually an in-flight request will complete
> and the BH will be scheduled again, guaranteeing forward progress.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/nvme.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/block/nvme.c b/block/nvme.c
> index e4f336d79c..fa360b9b3c 100644
> --- a/block/nvme.c
> +++ b/block/nvme.c
> @@ -206,8 +206,9 @@ static void nvme_free_req_queue_cb(void *opaque)
>  NVMeQueuePair *q = opaque;
>  
>  qemu_mutex_lock(>lock);
> -while (qemu_co_enter_next(>free_req_queue, >lock)) {
> -/* Retry all pending requests */
> +while (q->free_req_head != -1 &&
> +   qemu_co_enter_next(>free_req_queue, >lock)) {
> +/* Retry waiting requests */

Reviewed-by: Philippe Mathieu-Daudé 

>  }
>  qemu_mutex_unlock(>lock);
>  }
> 




Re: [PATCH 1/7] migration: Drop dead code of ram_debug_dump_bitmap()

2021-12-08 Thread Dr. David Alan Gilbert
* Peter Xu (pet...@redhat.com) wrote:
> I planned to add "#ifdef DEBUG_POSTCOPY" around the function too because
> otherwise it'll be compiled into qemu binary even if it'll never be used.  
> Then
> I found that maybe it's easier to just drop it for good..
> 
> Signed-off-by: Peter Xu 

Yeh, it was useful debugging the bitmap in the early days.


Reviewed-by: Dr. David Alan Gilbert 

Dave
> ---
>  migration/ram.c | 39 ---
>  migration/ram.h |  2 --
>  2 files changed, 41 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 863035d235..756ac800a7 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2413,40 +2413,6 @@ static void ram_state_reset(RAMState *rs)
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
>  
> -/*
> - * 'expected' is the value you expect the bitmap mostly to be full
> - * of; it won't bother printing lines that are all this value.
> - * If 'todump' is null the migration bitmap is dumped.
> - */
> -void ram_debug_dump_bitmap(unsigned long *todump, bool expected,
> -   unsigned long pages)
> -{
> -int64_t cur;
> -int64_t linelen = 128;
> -char linebuf[129];
> -
> -for (cur = 0; cur < pages; cur += linelen) {
> -int64_t curb;
> -bool found = false;
> -/*
> - * Last line; catch the case where the line length
> - * is longer than remaining ram
> - */
> -if (cur + linelen > pages) {
> -linelen = pages - cur;
> -}
> -for (curb = 0; curb < linelen; curb++) {
> -bool thisbit = test_bit(cur + curb, todump);
> -linebuf[curb] = thisbit ? '1' : '.';
> -found = found || (thisbit != expected);
> -}
> -if (found) {
> -linebuf[curb] = '\0';
> -fprintf(stderr,  "0x%08" PRIx64 " : %s\n", cur, linebuf);
> -}
> -}
> -}
> -
>  /*  functions for postcopy * */
>  
>  void ram_postcopy_migrated_memory_release(MigrationState *ms)
> @@ -2674,11 +2640,6 @@ int ram_postcopy_send_discard_bitmap(MigrationState 
> *ms)
>  if (ret) {
>  return ret;
>  }
> -
> -#ifdef DEBUG_POSTCOPY
> -ram_debug_dump_bitmap(block->bmap, true,
> -  block->used_length >> TARGET_PAGE_BITS);
> -#endif
>  }
>  trace_ram_postcopy_send_discard_bitmap();
>  
> diff --git a/migration/ram.h b/migration/ram.h
> index c515396a9a..f543e25765 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -55,8 +55,6 @@ void mig_throttle_counter_reset(void);
>  uint64_t ram_pagesize_summary(void);
>  int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t 
> len);
>  void acct_update_position(QEMUFile *f, size_t size, bool zero);
> -void ram_debug_dump_bitmap(unsigned long *todump, bool expected,
> -   unsigned long pages);
>  void ram_postcopy_migrated_memory_release(MigrationState *ms);
>  /* For outgoing discard bitmap */
>  int ram_postcopy_send_discard_bitmap(MigrationState *ms);
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v3] docs: Introducing pseries documentation.

2021-12-08 Thread Leonardo Augusto Guimarães Garcia
Argh! Disregard this patch. It fails to build. I'll send v4 fixing this.

On 12/8/21 12:47, lagar...@linux.ibm.com wrote:
> From: Leonardo Garcia 
>
> The purpose of this document is to substitute the content currently
> available in the QEMU wiki at [0]. This initial version does contain
> some additional content as well. Whenever this documentation gets
> upstream and is reflected in [1], the QEMU wiki will be edited to point
> to this documentation, so that we only need to keep it updated in one
> place.
>
> 0. https://wiki.qemu.org/Documentation/Platforms/POWER
> 1. https://qemu.readthedocs.io/en/latest/system/ppc/pseries.html
>
> Signed-off-by: Leonardo Garcia 
> Reviewed-by: David Gibson 
> ---
>
> Changelog:
> v2->v3:
>  - Updating LoPAR link.
>  - Minor clarifications in the text.
>
> v1->v2:
> - Addressing David Gibson and Cédric's comments. Thanks!
>
>  docs/system/ppc/pseries.rst | 226 
>  1 file changed, 226 insertions(+)
>
> diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst
> index 932d4dd17d..f072f4a935 100644
> --- a/docs/system/ppc/pseries.rst
> +++ b/docs/system/ppc/pseries.rst
> @@ -1,12 +1,238 @@
>  pSeries family boards (``pseries``)
>  ===
>
> +The Power machine para-virtualized environment described by the `Linux on 
> Power
> +Architecture Reference document (LoPAR)
> +`_
> +is called pSeries. This environment is also known as sPAPR, System p guests, 
> or
> +simply Power Linux guests (although it is capable of running other operating
> +systems, such as AIX).
> +
> +Even though pSeries is designed to behave as a guest environment, it is also
> +capable of acting as a hypervisor OS, providing, on that role, nested
> +virtualization capabilities.
> +
>  Supported devices
>  -
>
> + * Multi processor support for many Power processors generations: POWER7,
> +   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ 
> exists,
> +   but its state is unknown.
> + * Interrupt Controller, XICS (POWER8) and XIVE (POWER9 and Power10)
> + * vPHB PCIe Host bridge.
> + * vscsi and vnet devices, compatible with the same devices available on a
> +   PowerVM hypervisor with VIOS managing LPARs.
> + * Virtio based devices.
> + * PCIe device pass through.
> +
>  Missing devices
>  ---
>
> + * SPICE support.
>
>  Firmware
>  
> +
> +`SLOF `_ (Slimline Open Firmware) is an
> +implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
> +Configuration) Firmware: Core Requirements and Practices
> +`_.
> +
> +QEMU includes a prebuilt image of SLOF which is updated when a more recent
> +version is required.
> +
> +Build directions
> +
> +
> +.. code-block:: bash
> +
> +  ./configure --target-list=ppc64-softmmu && make
> +
> +Running instructions
> +
> +
> +Someone can select the pSeries machine type by running QEMU with the 
> following
> +options:
> +
> +.. code-block:: bash
> +
> +  qemu-system-ppc64 -M pseries 
> +
> +sPAPR devices
> +-
> +
> +The sPAPR specification defines a set of para-virtualized devices, which are
> +also supported by the pSeries machine in QEMU and can be instantiated with 
> the
> +``-device`` option:
> +
> +* ``spapr-vlan`` : a virtual network interface.
> +* ``spapr-vscsi`` : a virtual SCSI disk interface.
> +* ``spapr-rng`` : a pseudo-device for passing random number generator data 
> to the
> +  guest (see the `H_RANDOM hypercall feature
> +  `_ for details).
> +* ``spapr-vty``: a virtual teletype.
> +* ``spapr-pci-host-bridge``: a PCI host bridge.
> +* ``tpm-spapr``: a Trusted Platform Module (TPM).
> +* ``spapr-tpm-proxy``: a TPM proxy.
> +
> +These are compatible with the devices historically available for use when
> +running the IBM PowerVM hypervisor with LPARs.
> +
> +However, since these devices have originally been specified with another
> +hypervisor and non-Linux guests in mind, you should use the virtio 
> counterparts
> +(virtio-net, virtio-blk/scsi and virtio-rng for instance) if possible 
> instead,
> +since they will most probably give you better performance with Linux guests 
> in a
> +QEMU environment.
> +
> +The pSeries machine in QEMU is always instantiated with the following 
> devices:
> +
> +* A NVRAM device (``spapr-nvram``).
> +* A virtual teletype (``spapr-vty``).
> +* A PCI host bridge (``spapr-pci-host-bridge``).
> +
> +Hence, it is not needed to add them manually, unless you use the 
> ``-nodefaults``
> +command line option in QEMU.
> +
> +In the case of the default ``spapr-nvram`` device, if someone wants to make 
> the
> +contents of the NVRAM device persistent, they will need to specify a PFLASH
> +device when starting QEMU, i.e. either 

Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle

2021-12-08 Thread Hyman




在 2021/12/8 23:36, Hyman 写道:



在 2021/12/6 18:10, Peter Xu 写道:

On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote:

+static uint64_t dirtylimit_pct(unsigned int last_pct,
+   uint64_t quota,
+   uint64_t current)
+{
+    uint64_t limit_pct = 0;
+    RestrainPolicy policy;
+    bool mitigate = (quota > current) ? true : false;
+
+    if (mitigate && ((current == 0) ||
+    (last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) {
+    return 0;
+    }
+
+    policy = dirtylimit_policy(last_pct, quota, current);
+    switch (policy) {
+    case RESTRAIN_SLIGHT:
+    /* [90, 99] */
+    if (mitigate) {
+    limit_pct =
+    last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE;
+    } else {
+    limit_pct =
+    last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE;
+
+    limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX);
+    }
+   break;
+    case RESTRAIN_HEAVY:
+    /* [75, 90) */
+    if (mitigate) {
+    limit_pct =
+    last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE;
+    } else {
+    limit_pct =
+    last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE;
+
+    limit_pct = MIN(limit_pct,
+    DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK);
+    }
+   break;
+    case RESTRAIN_RATIO:
+    /* [0, 75) */
+    if (mitigate) {
+    if (last_pct <= (((quota - current) * 100 / quota))) {
+    limit_pct = 0;
+    } else {
+    limit_pct = last_pct -
+    ((quota - current) * 100 / quota);
+    limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN);
+    }
+    } else {
+    limit_pct = last_pct +
+    ((current - quota) * 100 / current);
+
+    limit_pct = MIN(limit_pct,
+    DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK);
+    }
+   break;
+    case RESTRAIN_KEEP:
+    default:
+   limit_pct = last_pct;
+   break;
+    }
+
+    return limit_pct;
+}
+
+static void *dirtylimit_thread(void *opaque)
+{
+    int cpu_index = *(int *)opaque;
+    uint64_t quota_dirtyrate, current_dirtyrate;
+    unsigned int last_pct = 0;
+    unsigned int pct = 0;
+
+    rcu_register_thread();
+
+    quota_dirtyrate = dirtylimit_quota(cpu_index);
+    current_dirtyrate = dirtylimit_current(cpu_index);
+
+    pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate);
+
+    do {
+    trace_dirtylimit_impose(cpu_index,
+    quota_dirtyrate, current_dirtyrate, pct);
+
+    last_pct = pct;
+    if (pct == 0) {
+    sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S);
+    } else {
+    dirtylimit_check(cpu_index, pct);
+    }
+
+    quota_dirtyrate = dirtylimit_quota(cpu_index);
+    current_dirtyrate = dirtylimit_current(cpu_index);
+
+    pct = dirtylimit_pct(last_pct, quota_dirtyrate, 
current_dirtyrate);


So what I had in mind is we can start with an extremely simple version of
negative feedback system.  Say, firstly each vcpu will have a simple 
number to
sleep for some interval (this is ugly code, but just show what I 
meant..):


===
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index eecd8031cf..c320fd190f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu)
  trace_kvm_dirty_ring_full(cpu->cpu_index);
  qemu_mutex_lock_iothread();
  kvm_dirty_ring_reap(kvm_state);
+    if (dirtylimit_enabled(cpu->cpu_index) && 
cpu->throttle_us_per_full)

+    usleep(cpu->throttle_us_per_full);
  qemu_mutex_unlock_iothread();
  ret = 0;
  break;
===

I think this will have finer granularity when throttle (for 4096 ring 
size,
that's per-16MB operation) than current way where we inject per-vcpu 
async task

to sleep, like auto-converge.

Then we have the "black box" to tune this value with below input/output:

   - Input: dirty rate information, same as current algo

   - Output: increase/decrease of per-vcpu throttle_us_per_full above, 
and

 that's all

We can do the sampling per-second, then we keep doing it: we can have 
1 thread
doing per-second task collecting dirty rate information for all the 
vcpus, then

tune that throttle_us_per_full for each of them.

The simplest linear algorithm would be as simple as (for each vcpu):

   if (quota < current)
 throttle_us_per_full += SOMETHING;
 if (throttle_us_per_full > MAX)
   throttle_us_per_full = MAX;
   else
 throttle_us_per_full -= SOMETHING;
 if (throttle_us_per_full < 0)
   throttle_us_per_full = 0;

I think your algorithm is fine, but thoroughly review every single bit 
of it in
one shot will be challenging, and it's also hard to prove every bit of 
the
algorithm is helpful, as there're a lot of 

[PATCH v3] docs: Introducing pseries documentation.

2021-12-08 Thread lagarcia
From: Leonardo Garcia 

The purpose of this document is to substitute the content currently
available in the QEMU wiki at [0]. This initial version does contain
some additional content as well. Whenever this documentation gets
upstream and is reflected in [1], the QEMU wiki will be edited to point
to this documentation, so that we only need to keep it updated in one
place.

0. https://wiki.qemu.org/Documentation/Platforms/POWER
1. https://qemu.readthedocs.io/en/latest/system/ppc/pseries.html

Signed-off-by: Leonardo Garcia 
Reviewed-by: David Gibson 
---

Changelog:
v2->v3:
 - Updating LoPAR link.
 - Minor clarifications in the text.

v1->v2:
- Addressing David Gibson and Cédric's comments. Thanks!

 docs/system/ppc/pseries.rst | 226 
 1 file changed, 226 insertions(+)

diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst
index 932d4dd17d..f072f4a935 100644
--- a/docs/system/ppc/pseries.rst
+++ b/docs/system/ppc/pseries.rst
@@ -1,12 +1,238 @@
 pSeries family boards (``pseries``)
 ===
 
+The Power machine para-virtualized environment described by the `Linux on Power
+Architecture Reference document (LoPAR)
+`_
+is called pSeries. This environment is also known as sPAPR, System p guests, or
+simply Power Linux guests (although it is capable of running other operating
+systems, such as AIX).
+
+Even though pSeries is designed to behave as a guest environment, it is also
+capable of acting as a hypervisor OS, providing, on that role, nested
+virtualization capabilities.
+
 Supported devices
 -
 
+ * Multi processor support for many Power processors generations: POWER7,
+   POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
+   but its state is unknown.
+ * Interrupt Controller, XICS (POWER8) and XIVE (POWER9 and Power10)
+ * vPHB PCIe Host bridge.
+ * vscsi and vnet devices, compatible with the same devices available on a
+   PowerVM hypervisor with VIOS managing LPARs.
+ * Virtio based devices.
+ * PCIe device pass through.
+
 Missing devices
 ---
 
+ * SPICE support.
 
 Firmware
 
+
+`SLOF `_ (Slimline Open Firmware) is an
+implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
+Configuration) Firmware: Core Requirements and Practices
+`_.
+
+QEMU includes a prebuilt image of SLOF which is updated when a more recent
+version is required.
+
+Build directions
+
+
+.. code-block:: bash
+
+  ./configure --target-list=ppc64-softmmu && make
+
+Running instructions
+
+
+Someone can select the pSeries machine type by running QEMU with the following
+options:
+
+.. code-block:: bash
+
+  qemu-system-ppc64 -M pseries 
+
+sPAPR devices
+-
+
+The sPAPR specification defines a set of para-virtualized devices, which are
+also supported by the pSeries machine in QEMU and can be instantiated with the
+``-device`` option:
+
+* ``spapr-vlan`` : a virtual network interface.
+* ``spapr-vscsi`` : a virtual SCSI disk interface.
+* ``spapr-rng`` : a pseudo-device for passing random number generator data to 
the
+  guest (see the `H_RANDOM hypercall feature
+  `_ for details).
+* ``spapr-vty``: a virtual teletype.
+* ``spapr-pci-host-bridge``: a PCI host bridge.
+* ``tpm-spapr``: a Trusted Platform Module (TPM).
+* ``spapr-tpm-proxy``: a TPM proxy.
+
+These are compatible with the devices historically available for use when
+running the IBM PowerVM hypervisor with LPARs.
+
+However, since these devices have originally been specified with another
+hypervisor and non-Linux guests in mind, you should use the virtio counterparts
+(virtio-net, virtio-blk/scsi and virtio-rng for instance) if possible instead,
+since they will most probably give you better performance with Linux guests in 
a
+QEMU environment.
+
+The pSeries machine in QEMU is always instantiated with the following devices:
+
+* A NVRAM device (``spapr-nvram``).
+* A virtual teletype (``spapr-vty``).
+* A PCI host bridge (``spapr-pci-host-bridge``).
+
+Hence, it is not needed to add them manually, unless you use the 
``-nodefaults``
+command line option in QEMU.
+
+In the case of the default ``spapr-nvram`` device, if someone wants to make the
+contents of the NVRAM device persistent, they will need to specify a PFLASH
+device when starting QEMU, i.e. either use
+``-drive if=pflash,file=,format=raw`` to set the default PFLASH
+device, or specify one with an ID
+(``-drive if=none,file=,format=raw,id=pfid``) and pass that ID to the
+NVRAM device with ``-global spapr-nvram.drive=pfid``.
+
+sPAPR specification
+^^^
+
+The main source of documentation on the sPAPR standard is the `Linux on Power
+Architecture Reference document (LoPAR)

Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle

2021-12-08 Thread Hyman




在 2021/12/6 18:10, Peter Xu 写道:

On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote:

+static uint64_t dirtylimit_pct(unsigned int last_pct,
+   uint64_t quota,
+   uint64_t current)
+{
+uint64_t limit_pct = 0;
+RestrainPolicy policy;
+bool mitigate = (quota > current) ? true : false;
+
+if (mitigate && ((current == 0) ||
+(last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) {
+return 0;
+}
+
+policy = dirtylimit_policy(last_pct, quota, current);
+switch (policy) {
+case RESTRAIN_SLIGHT:
+/* [90, 99] */
+if (mitigate) {
+limit_pct =
+last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE;
+} else {
+limit_pct =
+last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE;
+
+limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX);
+}
+   break;
+case RESTRAIN_HEAVY:
+/* [75, 90) */
+if (mitigate) {
+limit_pct =
+last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE;
+} else {
+limit_pct =
+last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE;
+
+limit_pct = MIN(limit_pct,
+DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK);
+}
+   break;
+case RESTRAIN_RATIO:
+/* [0, 75) */
+if (mitigate) {
+if (last_pct <= (((quota - current) * 100 / quota))) {
+limit_pct = 0;
+} else {
+limit_pct = last_pct -
+((quota - current) * 100 / quota);
+limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN);
+}
+} else {
+limit_pct = last_pct +
+((current - quota) * 100 / current);
+
+limit_pct = MIN(limit_pct,
+DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK);
+}
+   break;
+case RESTRAIN_KEEP:
+default:
+   limit_pct = last_pct;
+   break;
+}
+
+return limit_pct;
+}
+
+static void *dirtylimit_thread(void *opaque)
+{
+int cpu_index = *(int *)opaque;
+uint64_t quota_dirtyrate, current_dirtyrate;
+unsigned int last_pct = 0;
+unsigned int pct = 0;
+
+rcu_register_thread();
+
+quota_dirtyrate = dirtylimit_quota(cpu_index);
+current_dirtyrate = dirtylimit_current(cpu_index);
+
+pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate);
+
+do {
+trace_dirtylimit_impose(cpu_index,
+quota_dirtyrate, current_dirtyrate, pct);
+
+last_pct = pct;
+if (pct == 0) {
+sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S);
+} else {
+dirtylimit_check(cpu_index, pct);
+}
+
+quota_dirtyrate = dirtylimit_quota(cpu_index);
+current_dirtyrate = dirtylimit_current(cpu_index);
+
+pct = dirtylimit_pct(last_pct, quota_dirtyrate, current_dirtyrate);


So what I had in mind is we can start with an extremely simple version of
negative feedback system.  Say, firstly each vcpu will have a simple number to
sleep for some interval (this is ugly code, but just show what I meant..):

===
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index eecd8031cf..c320fd190f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu)
  trace_kvm_dirty_ring_full(cpu->cpu_index);
  qemu_mutex_lock_iothread();
  kvm_dirty_ring_reap(kvm_state);
+if (dirtylimit_enabled(cpu->cpu_index) && 
cpu->throttle_us_per_full)
+usleep(cpu->throttle_us_per_full);
  qemu_mutex_unlock_iothread();
  ret = 0;
  break;
===

I think this will have finer granularity when throttle (for 4096 ring size,
that's per-16MB operation) than current way where we inject per-vcpu async task
to sleep, like auto-converge.

Then we have the "black box" to tune this value with below input/output:

   - Input: dirty rate information, same as current algo

   - Output: increase/decrease of per-vcpu throttle_us_per_full above, and
 that's all

We can do the sampling per-second, then we keep doing it: we can have 1 thread
doing per-second task collecting dirty rate information for all the vcpus, then
tune that throttle_us_per_full for each of them.

The simplest linear algorithm would be as simple as (for each vcpu):

   if (quota < current)
 throttle_us_per_full += SOMETHING;
 if (throttle_us_per_full > MAX)
   throttle_us_per_full = MAX;
   else
 throttle_us_per_full -= SOMETHING;
 if (throttle_us_per_full < 0)
   throttle_us_per_full = 0;

I think your algorithm is fine, but thoroughly review every single bit of it in
one shot will be challenging, and it's also hard to prove every bit of the
algorithm is helpful, as there're a lot of hand-made macros and state changes.

I actually 

Re: [PATCH v2 2/2] ui/clipboard: Don't use g_autoptr just to free a variable

2021-12-08 Thread John Snow
On Wed, Dec 8, 2021, 4:11 AM Daniel P. Berrangé  wrote:

> On Tue, Dec 07, 2021 at 03:40:38PM -0500, John Snow wrote:
> > Clang doesn't recognize that the variable is being "used" and will emit
> > a warning:
> >
> >   ../ui/clipboard.c:47:34: error: variable 'old' set but not used
> [-Werror,-Wunused-but-set-variable]
> >   g_autoptr(QemuClipboardInfo) old = NULL;
> >  ^
> >   1 error generated.
> >
> > OK, fine. Just do things the old way.
> >
> > Signed-off-by: John Snow 
> > ---
> >  ui/clipboard.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/ui/clipboard.c b/ui/clipboard.c
> > index d7b008d62a..9ab65efefb 100644
> > --- a/ui/clipboard.c
> > +++ b/ui/clipboard.c
> > @@ -44,12 +44,11 @@ void qemu_clipboard_peer_release(QemuClipboardPeer
> *peer,
> >
> >  void qemu_clipboard_update(QemuClipboardInfo *info)
> >  {
> > -g_autoptr(QemuClipboardInfo) old = NULL;
> >  assert(info->selection < QEMU_CLIPBOARD_SELECTION__COUNT);
> >
> >  notifier_list_notify(_notifiers, info);
> >
> > -old = cbinfo[info->selection];
> > +g_free(cbinfo[info->selection]);
>
> This is a ref counted data type - it can't use g_free:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg04890.html
>
> >  cbinfo[info->selection] = qemu_clipboard_info_ref(info);
> >  }
>
> Regards,
> Daniel
>

Whoops, sorry. I saw free being used elsewhere and assumed it was safe.
That's what I get for assuming things...


[PATCH] block/nvme: fix infinite loop in nvme_free_req_queue_cb()

2021-12-08 Thread Stefan Hajnoczi
When the request free list is exhausted the coroutine waits on
q->free_req_queue for the next free request. Whenever a request is
completed a BH is scheduled to invoke nvme_free_req_queue_cb() and wake
up waiting coroutines.

1. nvme_get_free_req() waits for a free request:

while (q->free_req_head == -1) {
...
trace_nvme_free_req_queue_wait(q->s, q->index);
qemu_co_queue_wait(>free_req_queue, >lock);
...
}

2. nvme_free_req_queue_cb() wakes up the coroutine:

while (qemu_co_enter_next(>free_req_queue, >lock)) {
   ^--- infinite loop when free_req_head == -1
}

nvme_free_req_queue_cb() and the coroutine form an infinite loop when
q->free_req_head == -1. Fix this by checking q->free_req_head in
nvme_free_req_queue_cb(). If the free request list is exhausted, don't
wake waiting coroutines. Eventually an in-flight request will complete
and the BH will be scheduled again, guaranteeing forward progress.

Signed-off-by: Stefan Hajnoczi 
---
 block/nvme.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index e4f336d79c..fa360b9b3c 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -206,8 +206,9 @@ static void nvme_free_req_queue_cb(void *opaque)
 NVMeQueuePair *q = opaque;
 
 qemu_mutex_lock(>lock);
-while (qemu_co_enter_next(>free_req_queue, >lock)) {
-/* Retry all pending requests */
+while (q->free_req_head != -1 &&
+   qemu_co_enter_next(>free_req_queue, >lock)) {
+/* Retry waiting requests */
 }
 qemu_mutex_unlock(>lock);
 }
-- 
2.33.1




[ANNOUNCE] QEMU 6.2.0-rc4 is now available

2021-12-08 Thread Michael Roth
Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
fifth release candidate for the QEMU 6.2 release. This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-6.2.0-rc4.tar.xz
  http://download.qemu-project.org/qemu-6.2.0-rc4.tar.xz.sig

You can help improve the quality of the QEMU 6.2 release by testing this
release and reporting bugs using our GitLab issue tracker:

  https://gitlab.com/qemu-project/qemu/-/issues

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/6.2

Please add entries to the ChangeLog for the 6.2 release below:

  http://wiki.qemu.org/ChangeLog/6.2

Thank you to everyone involved!

Changes since rc3:

a3607def89: Update version for v6.2.0-rc4 release (Richard Henderson)
2958e5150d: gicv3: fix ICH_MISR's LRENP computation (Damien Hedde)
b9537d5904: tcg/arm: Reduce vector alignment requirement for NEON (Richard 
Henderson)
5b807181c2: virtio-blk: Fix clean up of host notifiers for single MR 
transaction. (Mark Mielke)
ac5837e330: Revert "vga: don't abort when adding a duplicate isa-vga device" 
(Alex Bennée)
d77c462bf2: hw/mips/boston: Fix load_elf() error detection (Jiaxun Yang)
24ade8c5de: hw/mips/bootloader: Fix write_ulong() (Jiaxun Yang)
3bc90ac567: seabios: update binaries to 1.15.0 (Gerd Hoffmann)
e7fa3377cc: seabios: update submodule to 1.15.0 (Gerd Hoffmann)
cc20926e9b: tests/qtest/fdc-test: Add a regression test for CVE-2021-20196 
(Philippe Mathieu-Daudé)
1ab95af033: hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196 
(Philippe Mathieu-Daudé)
b154791e7b: hw/block/fdc: Extract blk_create_empty_drive() (Philippe 
Mathieu-Daudé)



Re: [RFC PATCH v2] blog post: how to get your new feature up-streamed

2021-12-08 Thread Thomas Huth

On 06/12/2021 15.14, Alex Bennée wrote:

Experience has shown that getting new functionality up-streamed can be
a somewhat painful process. Lets see if we can collect some of our


s/Lets/Let's/


community knowledge into a blog post describing some best practices
for getting code accepted.

Signed-off-by: Alex Bennée 

---
v2
   - tweak the title
   - expand on requirements for series of patches
   - wrote a conclusion
---
  ...26-so-you-want-to-add-something-to-qemu.md | 150 ++
  1 file changed, 150 insertions(+)
  create mode 100644 _posts/2021-11-26-so-you-want-to-add-something-to-qemu.md

diff --git a/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md 
b/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md
new file mode 100644
index 000..6d855d9
--- /dev/null
+++ b/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md
@@ -0,0 +1,150 @@
+---
+layout: post
+title:  "So you want to add a new feature to QEMU?"
+date:   2021-11-26 19:43:45


Please refresh the date in the next version (also in the file name)


+author: Alex Bennée
+categories: [blog, process, development]
+---
+
+From time to time I hear of frustrations from potential new
+contributors who have tried to get new features up-streamed into the
+QEMU repository. After having read [our patch
+guidelines](https://qemu.readthedocs.io/en/latest/devel/submitting-a-patch.html)


Maybe better use https://www.qemu.org/contribute/submit-a-patch/ instead.


+they post them to [qemu-devel](https://lore.kernel.org/qemu-devel/).


Maybe better use https://lists.nongnu.org/mailman/listinfo/qemu-devel ?


+Often the patches sit there seemingly unread and unloved. The
+developer is left wandering if they missed out the secret hand shake


s/wandering/wondering/ ?


+required to move the process forward. My hope is that this blog post
+will help.
+
+New features != Fixing a bug


Want to use ≠ instead of != in case a non-developer reads this blog entry, too?


+Adding a new feature is not the same as fixing a bug. For an area of
+code that is supported for Odd Fixes or above there will be a


Please remove the "a" at the end of the line.


+someone listed in the
+[MAINTAINERS](https://gitlab.com/qemu-project/qemu/-/blob/master/MAINTAINERS)
+file. A properly configured `git-send-email` will even automatically
+add them to the patches as they are sent out.


add them to the CC: list when the patches are sent out ?


The maintainer will
+review the code and if no changes are requested they ensure the
+patch flows through the appropriate trees and eventually makes it into
+the master branch.
+
+This doesn't usually happen for new code unless your patches happen to
+touch a directory that is marked as maintained. Without a maintainer
+to look at and apply your patches how will it ever get merged?
+
+Adding new code to a project is not a free activity. Code that isn't
+actively maintained has a tendency to [bit
+rot](http://www.catb.org/jargon/html/B/bit-rot.html) and become a drag
+on the rest of the code base. The QEMU code base is quite large and
+none of the developers are knowledgeable about the all of it. If
+features aren't
+[documented](https://qemu.readthedocs.io/en/latest/devel/submitting-a-patch.html)


Use https://www.qemu.org/contribute/submit-a-patch/ again?


+they tend to remain unused as users struggle to enable them. If an
+unused feature becomes a drag on the rest of the code base by preventing
+re-factoring and other clean ups it is likely to be deprecated.
+Eventually deprecated code gets removed from the code base never to be
+seen again.


I think I'd either drop thatpart about deprecation, or add another sentence 
to conclude this paragraph ("Thus there is a natural hesitation to add new 
features where the usefulness is not fully clear, e.g. due to missing 
documentation" maybe?)



+Fortunately there is a way to avoid the ignominy of ignored new features
+and that is to become a maintainer of your own code!
+
+The maintainers path
+
+
+There is perhaps an unfortunate stereotype in the open source world of
+maintainers being grumpy old experts who spend their time dismissively
+rejecting the patches of new contributors. Having done their time in
+the metaphorical trenches of the project they must ingest the email
+archive to prove their encyclopedic mastery. Eventually they then
+ascend to the status of maintainer having completed the dark key
+signing ritual.
+
+In reality the process is much more prosaic - you simply need to send
+a patch to the MAINTAINERS file with your email address, the areas you
+are going to cover and the level of support you expect to give.


Well, there is a little bit more to this. We now have a Code of Conduct, 
too, for example. And you should be at least a little bit active on the 
mailing list first to show that you basically know what you're doing - I 
don't think we will accept a patch changing MAINTAINERS from somebody who 
never ever wrote a mail to 

Re: [PATCH v2 2/2] ui/clipboard: Don't use g_autoptr just to free a variable

2021-12-08 Thread Philippe Mathieu-Daudé
On 12/8/21 10:11, Daniel P. Berrangé wrote:
> On Tue, Dec 07, 2021 at 03:40:38PM -0500, John Snow wrote:
>> Clang doesn't recognize that the variable is being "used" and will emit
>> a warning:
>>
>>   ../ui/clipboard.c:47:34: error: variable 'old' set but not used 
>> [-Werror,-Wunused-but-set-variable]
>>   g_autoptr(QemuClipboardInfo) old = NULL;
>>  ^
>>   1 error generated.
>>
>> OK, fine. Just do things the old way.
>>
>> Signed-off-by: John Snow 
>> ---
>>  ui/clipboard.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/ui/clipboard.c b/ui/clipboard.c
>> index d7b008d62a..9ab65efefb 100644
>> --- a/ui/clipboard.c
>> +++ b/ui/clipboard.c
>> @@ -44,12 +44,11 @@ void qemu_clipboard_peer_release(QemuClipboardPeer *peer,
>>  
>>  void qemu_clipboard_update(QemuClipboardInfo *info)
>>  {
>> -g_autoptr(QemuClipboardInfo) old = NULL;
>>  assert(info->selection < QEMU_CLIPBOARD_SELECTION__COUNT);
>>  
>>  notifier_list_notify(_notifiers, info);
>>  
>> -old = cbinfo[info->selection];
>> +g_free(cbinfo[info->selection]);
> 
> This is a ref counted data type - it can't use g_free:
> 
>   https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg04890.html

Doh, I vaguely remembered having already reviewed this then only
found John's v1 and finally mis-reviewed it :/ Thanks for noticing.




Re: [PATCH v6 4/4] tests: qtest: Add virtio-iommu test

2021-12-08 Thread Thomas Huth

On 27/11/2021 08.29, Eric Auger wrote:

Add the framework to test the virtio-iommu-pci device
and tests exercising the attach/detach, map/unmap API.

Signed-off-by: Eric Auger 
Tested-by: Jean-Philippe Brucker 
Reviewed-by: Jean-Philippe Brucker 

---

v5 -> v6:
- changed the expected value for domain.end (32 -> MAX_UINT32)
---
  tests/qtest/libqos/meson.build|   1 +
  tests/qtest/libqos/virtio-iommu.c | 126 
  tests/qtest/libqos/virtio-iommu.h |  40 
  tests/qtest/meson.build   |   1 +
  tests/qtest/virtio-iommu-test.c   | 326 ++
  5 files changed, 494 insertions(+)
  create mode 100644 tests/qtest/libqos/virtio-iommu.c
  create mode 100644 tests/qtest/libqos/virtio-iommu.h
  create mode 100644 tests/qtest/virtio-iommu-test.c


Acked-by: Thomas Huth 




Re: [PATCH] hw/ppc/ppc405_boards: Change kernel load address

2021-12-08 Thread Thomas Huth

On 08/12/2021 14.15, Cédric Le Goater wrote:

On 12/8/21 14:07, Thomas Huth wrote:

On 03/12/2021 13.25, Cédric Le Goater wrote:

On 12/3/21 11:40, Peter Maydell wrote:

On Fri, 3 Dec 2021 at 10:32, Thomas Huth  wrote:
I guess it's an accidential NULL pointer dereference somewhere in the 
u-boot

code ... which will be quite hard to track down when the first page of
memory is marked as writable... :-/


Attach a target-arch gdb to the QEMU gdbstub and put a watchpoint on
address zero ? (Or if you suspect something inside QEMU is doing it
then run QEMU under gdb and watchpoint the host memory location
corresponding to guest address 0, but that's more painful.) Nothing
in the pre-kernel part of the boot process will have set up paging,
so the watchpointing should be pretty reliable.


That's the guy:

https://gitlab.com/huth/u-boot/-/blob/taihu/arch/powerpc/cpu/ppc4xx/sdram.c#L199 



There must be an error in how get_ram_size() restores the RAM values :

   https://gitlab.com/huth/u-boot/-/blob/taihu/common/memsize.c


There is definitely something wrong in that function. Seems like they 
tried to fix it once here:


  https://source.denx.de/u-boot/u-boot/-/commit/b8496cced856ff411f

but that patch got later reverted without a replacement later...



a fix restoring address 0, something like :

@@ -60,6 +60,9 @@ long get_ram_size(long *base, long maxsi
  return (0);
  }

+    addr = base;
+    *addr = save[i];
+
  for (cnt = 1; cnt < maxsize / sizeof(long); cnt <<= 1) {
  addr = base + cnt;    /* pointer arith! */
  val = *addr;

is not enough. trap_init() will also overwrite the kernel image.
And u-boot will complain about a wrong CRC.

The 405 series I sent improves support and latest kernel 5.16-rc4
can be loaded without uboot. It's a start to debug user space.


Yes, your series is certainly the better way forward. I'll stop messing with 
u-boot now ... since upstream u-boot already removed the 405 support a long 
time ago, my hack was a dead end anyway (but it helped to get at least a 
kernel running again, so it was not in vain)


 Thomas




Re: [PATCH] hw/ppc/ppc405_boards: Change kernel load address

2021-12-08 Thread Cédric Le Goater

On 12/8/21 14:07, Thomas Huth wrote:

On 03/12/2021 13.25, Cédric Le Goater wrote:

On 12/3/21 11:40, Peter Maydell wrote:

On Fri, 3 Dec 2021 at 10:32, Thomas Huth  wrote:

I guess it's an accidential NULL pointer dereference somewhere in the u-boot
code ... which will be quite hard to track down when the first page of
memory is marked as writable... :-/


Attach a target-arch gdb to the QEMU gdbstub and put a watchpoint on
address zero ? (Or if you suspect something inside QEMU is doing it
then run QEMU under gdb and watchpoint the host memory location
corresponding to guest address 0, but that's more painful.) Nothing
in the pre-kernel part of the boot process will have set up paging,
so the watchpointing should be pretty reliable.


That's the guy:

https://gitlab.com/huth/u-boot/-/blob/taihu/arch/powerpc/cpu/ppc4xx/sdram.c#L199

There must be an error in how get_ram_size() restores the RAM values :

   https://gitlab.com/huth/u-boot/-/blob/taihu/common/memsize.c


There is definitely something wrong in that function. Seems like they tried to 
fix it once here:

  https://source.denx.de/u-boot/u-boot/-/commit/b8496cced856ff411f

but that patch got later reverted without a replacement later...



a fix restoring address 0, something like :

@@ -60,6 +60,9 @@ long get_ram_size(long *base, long maxsi
return (0);
}
 
+	addr = base;

+   *addr = save[i];
+
for (cnt = 1; cnt < maxsize / sizeof(long); cnt <<= 1) {
addr = base + cnt;  /* pointer arith! */
val = *addr;

is not enough. trap_init() will also overwrite the kernel image.
And u-boot will complain about a wrong CRC.

The 405 series I sent improves support and latest kernel 5.16-rc4
can be loaded without uboot. It's a start to debug user space.

Thanks,

C.



Re: [PATCH] hw/ppc/ppc405_boards: Change kernel load address

2021-12-08 Thread Thomas Huth

On 03/12/2021 13.25, Cédric Le Goater wrote:

On 12/3/21 11:40, Peter Maydell wrote:

On Fri, 3 Dec 2021 at 10:32, Thomas Huth  wrote:

I guess it's an accidential NULL pointer dereference somewhere in the u-boot
code ... which will be quite hard to track down when the first page of
memory is marked as writable... :-/


Attach a target-arch gdb to the QEMU gdbstub and put a watchpoint on
address zero ? (Or if you suspect something inside QEMU is doing it
then run QEMU under gdb and watchpoint the host memory location
corresponding to guest address 0, but that's more painful.) Nothing
in the pre-kernel part of the boot process will have set up paging,
so the watchpointing should be pretty reliable.


That's the guy:

   
https://gitlab.com/huth/u-boot/-/blob/taihu/arch/powerpc/cpu/ppc4xx/sdram.c#L199 



There must be an error in how get_ram_size() restores the RAM values :

   https://gitlab.com/huth/u-boot/-/blob/taihu/common/memsize.c


There is definitely something wrong in that function. Seems like they tried 
to fix it once here:


 https://source.denx.de/u-boot/u-boot/-/commit/b8496cced856ff411f

but that patch got later reverted without a replacement later...

 Thomas





[PATCH v8 2/4] tests/qtest: add some tests for virtio-net failover

2021-12-08 Thread Laurent Vivier
Add test cases to test several error cases that must be
generated by invalid failover configuration.

Add a combination of coldplug and hotplug test cases to be
sure the primary is correctly managed according the
presence or not of the STANDBY feature.

Signed-off-by: Laurent Vivier 
---
 tests/qtest/meson.build   |   4 +
 tests/qtest/virtio-net-failover.c | 788 ++
 2 files changed, 792 insertions(+)
 create mode 100644 tests/qtest/virtio-net-failover.c

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index c9d8458062ff..975a0f2f5f25 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -68,6 +68,10 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) + 
 \
   (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? 
['fuzz-e1000e-test'] : []) +   \
   (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +
 \
+  (config_all_devices.has_key('CONFIG_VIRTIO_NET') and 
 \
+   config_all_devices.has_key('CONFIG_Q35') and
 \
+   config_all_devices.has_key('CONFIG_VIRTIO_PCI') and 
 \
+   slirp.found() ? ['virtio-net-failover'] : []) + 
 \
   (unpack_edk2_blobs ? ['bios-tables-test'] : []) +
 \
   qtests_pci + 
 \
   ['fdc-test',
diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c
new file mode 100644
index ..fd7821deaf48
--- /dev/null
+++ b/tests/qtest/virtio-net-failover.c
@@ -0,0 +1,788 @@
+/*
+ * QTest testcase for virtio-net failover
+ *
+ * See docs/system/virtio-net-failover.rst
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include "qemu/osdep.h"
+#include "libqos/libqtest.h"
+#include "libqos/pci.h"
+#include "libqos/pci-pc.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qjson.h"
+#include "libqos/malloc-pc.h"
+#include "libqos/virtio-pci.h"
+#include "hw/pci/pci.h"
+
+#define ACPI_PCIHP_ADDR_ICH90x0cc0
+#define PCI_EJ_BASE 0x0008
+
+#define BASE_MACHINE "-M q35 -nodefaults " \
+"-device pcie-root-port,id=root0,addr=0x1,bus=pcie.0,chassis=1 " \
+"-device pcie-root-port,id=root1,addr=0x2,bus=pcie.0,chassis=2 "
+
+#define MAC_PRIMARY0 "52:54:00:11:11:11"
+#define MAC_STANDBY0 "52:54:00:22:22:22"
+
+static QGuestAllocator guest_malloc;
+static QPCIBus *pcibus;
+
+static QTestState *machine_start(const char *args, int numbus)
+{
+QTestState *qts;
+QPCIDevice *dev;
+int bus;
+
+qts = qtest_init(args);
+
+pc_alloc_init(_malloc, qts, 0);
+pcibus = qpci_new_pc(qts, _malloc);
+g_assert(qpci_secondary_buses_init(pcibus) == numbus);
+
+for (bus = 1; bus <= numbus; bus++) {
+dev = qpci_device_find(pcibus, QPCI_DEVFN(bus, 0));
+g_assert_nonnull(dev);
+
+qpci_device_enable(dev);
+qpci_iomap(dev, 4, NULL);
+
+g_free(dev);
+}
+
+return qts;
+}
+
+static void machine_stop(QTestState *qts)
+{
+qpci_free_pc(pcibus);
+alloc_destroy(_malloc);
+qtest_quit(qts);
+}
+
+static void test_error_id(void)
+{
+QTestState *qts;
+QDict *resp;
+QDict *err;
+
+qts = machine_start(BASE_MACHINE
+"-device virtio-net,bus=root0,id=standby0,failover=on",
+2);
+
+resp = qtest_qmp(qts, "{'execute': 'device_add',"
+  "'arguments': {"
+  "'driver': 'virtio-net',"
+  "'bus': 'root1',"
+  "'failover_pair_id': 'standby0'"
+  "} }");
+g_assert(qdict_haskey(resp, "error"));
+
+err = qdict_get_qdict(resp, "error");
+g_assert(qdict_haskey(err, "desc"));
+
+g_assert_cmpstr(qdict_get_str(err, "desc"), ==,
+"Device with failover_pair_id needs to have id");
+
+qobject_unref(resp);
+
+machine_stop(qts);
+}
+
+static void test_error_pcie(void)
+{
+QTestState *qts;
+QDict *resp;
+QDict *err;
+
+qts = machine_start(BASE_MACHINE
+"-device virtio-net,bus=root0,id=standby0,failover=on",
+2);
+
+resp = qtest_qmp(qts, "{'execute': 'device_add',"
+  "'arguments': {"
+  "'driver': 'virtio-net',"
+  "'id': 'primary0',"
+  "'bus': 'pcie.0',"
+  "'failover_pair_id': 'standby0'"
+  "} }");
+g_assert(qdict_haskey(resp, "error"));
+
+err = qdict_get_qdict(resp, "error");
+g_assert(qdict_haskey(err, "desc"));
+
+g_assert_cmpstr(qdict_get_str(err, "desc"), ==,
+ 

[PATCH v8 0/4] tests/qtest: add some tests for virtio-net failover

2021-12-08 Thread Laurent Vivier
This series adds a qtest entry to test virtio-net failover feature.

We check following error cases:

- check missing id on device with failover_pair_id triggers an error
- check a primary device plugged on a bus that doesn't support hotplug
  triggers an error

We check the status of the machine before and after hotplugging cards and
feature negotiation:

- check we don't see the primary device at boot if failover is on
- check we see the primary device at boot if failover is off
- check we don't see the primary device if failover is on
  but failover_pair_id is not the one with on (I think this should be changed)
- check the primary device is plugged after the feature negotiation
- check the result if the primary device is plugged before standby device and
  vice-versa
- check the if the primary device is coldplugged and the standy device
  hotplugged and vice-versa
- check the migration triggers the unplug and the hotplug

There is one preliminary patch in the series:

- PATCH 1 introduces a function to enable PCI bridge.
  Failover needs to be plugged on a pcie-root-port and while
  the root port is not configured the cards behind it are not
  available

v8:
- fix checkpatch.pl error (space after "(")
- fix sanitizer errors:
  * migrate_status() qobject_unref() cleanup
  * release QVirtioPCIDevice with qos_object_destroy()
  * add a missing g_free() in qpci_secondary_buses_rec()
  * add qobject_unref() in get_bus() and find_device()
when an object is popped from a list.

v7:
- merge patch 3 and 4 as the fix for ACPI unplug has been merged
- address Thomas' comments
- add a dependency on slirp in meson.build
- check FAILOVER_NEGOCIATED device-id and MIGRATION status
  on destination, update UNPLUG_PRIMARY event checking
- fix an object_unref() in test_migrate_abort_active()
- fix typo s/whan/when/

v6:
- manage more than 2 root ports
- add a function to check if a card is available or not
- check migration state
- add cancelled migration test cases
- rename tests

v5:
- re-add the wait-unplug test that has been removed from v4 by mistake.

v4:
- rely on query-migrate status to know the migration state rather than
  to wait the STOP event.
- remove the patch to add time out to qtest_qmp_eventwait()

v3:
- fix a bug with ACPI unplug and add the related test

v2:
- remove PATCH 1 that introduced a function that can be replaced by
  qobject_to_json_pretty() (Markus)
- Add migration to a file and from the file to check the card is
  correctly unplugged on the source, and hotplugged on the dest
- Add an ACPI call to eject the card as the kernel would do

Laurent Vivier (4):
  qtest/libqos: add a function to initialize secondary PCI buses
  tests/qtest: add some tests for virtio-net failover
  test/libqtest: add some virtio-net failover migration cancelling tests
  tests/libqtest: add a migration test with two couples of failover
devices

 include/hw/pci/pci_bridge.h   |8 +
 tests/qtest/libqos/pci.c  |  119 +++
 tests/qtest/libqos/pci.h  |1 +
 tests/qtest/meson.build   |4 +
 tests/qtest/virtio-net-failover.c | 1352 +
 5 files changed, 1484 insertions(+)
 create mode 100644 tests/qtest/virtio-net-failover.c

-- 
2.33.1





[PATCH v8 3/4] test/libqtest: add some virtio-net failover migration cancelling tests

2021-12-08 Thread Laurent Vivier
Add some tests to check the state of the machine if the migration
is cancelled while we are using virtio-net failover.

Signed-off-by: Laurent Vivier 
Acked-by: Thomas Huth 
---
 tests/qtest/virtio-net-failover.c | 282 ++
 1 file changed, 282 insertions(+)

diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c
index fd7821deaf48..e998a546b031 100644
--- a/tests/qtest/virtio-net-failover.c
+++ b/tests/qtest/virtio-net-failover.c
@@ -752,6 +752,280 @@ static void test_migrate_in(gconstpointer opaque)
 machine_stop(qts);
 }
 
+static void test_migrate_abort_wait_unplug(gconstpointer opaque)
+{
+QTestState *qts;
+QDict *resp, *args, *ret;
+g_autofree gchar *uri = g_strdup_printf("exec: cat > %s", (gchar *)opaque);
+const gchar *status;
+QVirtioPCIDevice *vdev;
+
+qts = machine_start(BASE_MACHINE
+ "-netdev user,id=hs0 "
+ "-netdev user,id=hs1 ",
+ 2);
+
+check_one_card(qts, false, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+qtest_qmp_device_add(qts, "virtio-net", "standby0",
+ "{'bus': 'root0',"
+ "'failover': 'on',"
+ "'netdev': 'hs0',"
+ "'mac': '"MAC_STANDBY0"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+vdev = start_virtio_net(qts, 1, 0, "standby0");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+qtest_qmp_device_add(qts, "virtio-net", "primary0",
+ "{'bus': 'root1',"
+ "'failover_pair_id': 'standby0',"
+ "'netdev': 'hs1',"
+ "'rombar': 0,"
+ "'romfile': '',"
+ "'mac': '"MAC_PRIMARY0"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+
+args = qdict_from_jsonf_nofail("{}");
+g_assert_nonnull(args);
+qdict_put_str(args, "uri", uri);
+
+resp = qtest_qmp(qts, "{ 'execute': 'migrate', 'arguments': %p}", args);
+g_assert(qdict_haskey(resp, "return"));
+qobject_unref(resp);
+
+/* the event is sent when QEMU asks the OS to unplug the card */
+resp = get_unplug_primary_event(qts);
+g_assert_cmpstr(qdict_get_str(resp, "device-id"), ==, "primary0");
+qobject_unref(resp);
+
+resp = qtest_qmp(qts, "{ 'execute': 'migrate_cancel' }");
+g_assert(qdict_haskey(resp, "return"));
+qobject_unref(resp);
+
+/* migration has been cancelled while the unplug was in progress */
+
+/* while the card is not ejected, we must be in "cancelling" state */
+ret = migrate_status(qts);
+
+status = qdict_get_str(ret, "status");
+g_assert_cmpstr(status, ==, "cancelling");
+qobject_unref(ret);
+
+/* OS unplugs the cards, QEMU can move from wait-unplug state */
+qtest_outl(qts, ACPI_PCIHP_ADDR_ICH9 + PCI_EJ_BASE, 1);
+
+while (true) {
+ret = migrate_status(qts);
+
+status = qdict_get_str(ret, "status");
+if (strcmp(status, "cancelled") == 0) {
+qobject_unref(ret);
+break;
+}
+g_assert_cmpstr(status, !=, "failed");
+g_assert_cmpstr(status, !=, "active");
+qobject_unref(ret);
+}
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+
+qos_object_destroy((QOSGraphObject *)vdev);
+machine_stop(qts);
+}
+
+static void test_migrate_abort_active(gconstpointer opaque)
+{
+QTestState *qts;
+QDict *resp, *args, *ret;
+g_autofree gchar *uri = g_strdup_printf("exec: cat > %s", (gchar *)opaque);
+const gchar *status;
+QVirtioPCIDevice *vdev;
+
+qts = machine_start(BASE_MACHINE
+ "-netdev user,id=hs0 "
+ "-netdev user,id=hs1 ",
+ 2);
+
+check_one_card(qts, false, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+qtest_qmp_device_add(qts, "virtio-net", "standby0",
+ "{'bus': 'root0',"
+ "'failover': 'on',"
+ "'netdev': 'hs0',"
+ "'mac': '"MAC_STANDBY0"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+vdev = start_virtio_net(qts, 1, 0, "standby0");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+
+qtest_qmp_device_add(qts, "virtio-net", "primary0",
+ "{'bus': 'root1',"
+ "'failover_pair_id': 'standby0',"
+ 

[PATCH v8 4/4] tests/libqtest: add a migration test with two couples of failover devices

2021-12-08 Thread Laurent Vivier
Signed-off-by: Laurent Vivier 
Acked-by: Thomas Huth 
---
 tests/qtest/virtio-net-failover.c | 282 ++
 1 file changed, 282 insertions(+)

diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c
index e998a546b031..4b2ba8a106b1 100644
--- a/tests/qtest/virtio-net-failover.c
+++ b/tests/qtest/virtio-net-failover.c
@@ -20,6 +20,7 @@
 
 #define ACPI_PCIHP_ADDR_ICH90x0cc0
 #define PCI_EJ_BASE 0x0008
+#define PCI_SEL_BASE0x0010
 
 #define BASE_MACHINE "-M q35 -nodefaults " \
 "-device pcie-root-port,id=root0,addr=0x1,bus=pcie.0,chassis=1 " \
@@ -27,6 +28,8 @@
 
 #define MAC_PRIMARY0 "52:54:00:11:11:11"
 #define MAC_STANDBY0 "52:54:00:22:22:22"
+#define MAC_PRIMARY1 "52:54:00:33:33:33"
+#define MAC_STANDBY1 "52:54:00:44:44:44"
 
 static QGuestAllocator guest_malloc;
 static QPCIBus *pcibus;
@@ -1026,6 +1029,281 @@ static void test_migrate_abort_timeout(gconstpointer 
opaque)
 machine_stop(qts);
 }
 
+static void test_multi_out(gconstpointer opaque)
+{
+QTestState *qts;
+QDict *resp, *args, *ret;
+g_autofree gchar *uri = g_strdup_printf("exec: cat > %s", (gchar *)opaque);
+const gchar *status, *expected;
+QVirtioPCIDevice *vdev0, *vdev1;
+
+qts = machine_start(BASE_MACHINE
+"-device pcie-root-port,id=root2,addr=0x3,bus=pcie.0,chassis=3 
"
+"-device pcie-root-port,id=root3,addr=0x4,bus=pcie.0,chassis=4 
"
+"-netdev user,id=hs0 "
+"-netdev user,id=hs1 "
+"-netdev user,id=hs2 "
+"-netdev user,id=hs3 ",
+4);
+
+check_one_card(qts, false, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+check_one_card(qts, false, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+qtest_qmp_device_add(qts, "virtio-net", "standby0",
+ "{'bus': 'root0',"
+ "'failover': 'on',"
+ "'netdev': 'hs0',"
+ "'mac': '"MAC_STANDBY0"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+check_one_card(qts, false, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+qtest_qmp_device_add(qts, "virtio-net", "primary0",
+ "{'bus': 'root1',"
+ "'failover_pair_id': 'standby0',"
+ "'netdev': 'hs1',"
+ "'rombar': 0,"
+ "'romfile': '',"
+ "'mac': '"MAC_PRIMARY0"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, false, "primary0", MAC_PRIMARY0);
+check_one_card(qts, false, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+vdev0 = start_virtio_net(qts, 1, 0, "standby0");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+check_one_card(qts, false, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+qtest_qmp_device_add(qts, "virtio-net", "standby1",
+ "{'bus': 'root2',"
+ "'failover': 'on',"
+ "'netdev': 'hs2',"
+ "'mac': '"MAC_STANDBY1"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+check_one_card(qts, true, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+qtest_qmp_device_add(qts, "virtio-net", "primary1",
+ "{'bus': 'root3',"
+ "'failover_pair_id': 'standby1',"
+ "'netdev': 'hs3',"
+ "'rombar': 0,"
+ "'romfile': '',"
+ "'mac': '"MAC_PRIMARY1"'}");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+check_one_card(qts, true, "standby1", MAC_STANDBY1);
+check_one_card(qts, false, "primary1", MAC_PRIMARY1);
+
+vdev1 = start_virtio_net(qts, 3, 0, "standby1");
+
+check_one_card(qts, true, "standby0", MAC_STANDBY0);
+check_one_card(qts, true, "primary0", MAC_PRIMARY0);
+check_one_card(qts, true, "standby1", MAC_STANDBY1);
+check_one_card(qts, true, "primary1", MAC_PRIMARY1);
+
+args = qdict_from_jsonf_nofail("{}");
+g_assert_nonnull(args);
+qdict_put_str(args, "uri", uri);
+
+resp = qtest_qmp(qts, "{ 'execute': 'migrate', 'arguments': %p}", args);
+g_assert(qdict_haskey(resp, "return"));
+qobject_unref(resp);
+
+/* the event is sent when QEMU asks the OS to unplug the card */
+resp = get_unplug_primary_event(qts);
+if 

[PATCH v8 1/4] qtest/libqos: add a function to initialize secondary PCI buses

2021-12-08 Thread Laurent Vivier
Scan the PCI devices to find bridge and set PCI_SECONDARY_BUS and
PCI_SUBORDINATE_BUS (algorithm from seabios)

Signed-off-by: Laurent Vivier 
Acked-by: Thomas Huth 
---
 include/hw/pci/pci_bridge.h |   8 +++
 tests/qtest/libqos/pci.c| 119 
 tests/qtest/libqos/pci.h|   1 +
 3 files changed, 128 insertions(+)

diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index a94d350034bf..30691a6e5728 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -138,6 +138,7 @@ typedef struct PCIBridgeQemuCap {
 uint64_t mem_pref_64; /* Prefetchable memory to reserve (64-bit MMIO) */
 } PCIBridgeQemuCap;
 
+#define REDHAT_PCI_CAP_TYPE_OFFSET  3
 #define REDHAT_PCI_CAP_RESOURCE_RESERVE 1
 
 /*
@@ -152,6 +153,13 @@ typedef struct PCIResReserve {
 uint64_t mem_pref_64;
 } PCIResReserve;
 
+#define REDHAT_PCI_CAP_RES_RESERVE_BUS_RES 4
+#define REDHAT_PCI_CAP_RES_RESERVE_IO  8
+#define REDHAT_PCI_CAP_RES_RESERVE_MEM 16
+#define REDHAT_PCI_CAP_RES_RESERVE_PREF_MEM_32 20
+#define REDHAT_PCI_CAP_RES_RESERVE_PREF_MEM_64 24
+#define REDHAT_PCI_CAP_RES_RESERVE_CAP_SIZE32
+
 int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
PCIResReserve res_reserve, Error **errp);
 
diff --git a/tests/qtest/libqos/pci.c b/tests/qtest/libqos/pci.c
index e1e96189c821..3a9076ae580a 100644
--- a/tests/qtest/libqos/pci.c
+++ b/tests/qtest/libqos/pci.c
@@ -13,6 +13,8 @@
 #include "qemu/osdep.h"
 #include "pci.h"
 
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bridge.h"
 #include "hw/pci/pci_regs.h"
 #include "qemu/host-utils.h"
 #include "qgraph.h"
@@ -99,6 +101,123 @@ void qpci_device_init(QPCIDevice *dev, QPCIBus *bus, 
QPCIAddress *addr)
 g_assert(!addr->device_id || device_id == addr->device_id);
 }
 
+static uint8_t qpci_find_resource_reserve_capability(QPCIDevice *dev)
+{
+uint16_t device_id;
+uint8_t cap = 0;
+
+if (qpci_config_readw(dev, PCI_VENDOR_ID) != PCI_VENDOR_ID_REDHAT) {
+return 0;
+}
+
+device_id = qpci_config_readw(dev, PCI_DEVICE_ID);
+
+if (device_id != PCI_DEVICE_ID_REDHAT_PCIE_RP &&
+device_id != PCI_DEVICE_ID_REDHAT_BRIDGE) {
+return 0;
+}
+
+do {
+cap = qpci_find_capability(dev, PCI_CAP_ID_VNDR, cap);
+} while (cap &&
+ qpci_config_readb(dev, cap + REDHAT_PCI_CAP_TYPE_OFFSET) !=
+ REDHAT_PCI_CAP_RESOURCE_RESERVE);
+if (cap) {
+uint8_t cap_len = qpci_config_readb(dev, cap + PCI_CAP_FLAGS);
+if (cap_len < REDHAT_PCI_CAP_RES_RESERVE_CAP_SIZE) {
+return 0;
+}
+}
+return cap;
+}
+
+static void qpci_secondary_buses_rec(QPCIBus *qbus, int bus, int *pci_bus)
+{
+QPCIDevice *dev;
+uint16_t class;
+uint8_t pribus, secbus, subbus;
+int index;
+
+for (index = 0; index < 32; index++) {
+dev = qpci_device_find(qbus, QPCI_DEVFN(bus + index, 0));
+if (dev == NULL) {
+continue;
+}
+class = qpci_config_readw(dev, PCI_CLASS_DEVICE);
+if (class == PCI_CLASS_BRIDGE_PCI) {
+qpci_config_writeb(dev, PCI_SECONDARY_BUS, 255);
+qpci_config_writeb(dev, PCI_SUBORDINATE_BUS, 0);
+}
+g_free(dev);
+}
+
+for (index = 0; index < 32; index++) {
+dev = qpci_device_find(qbus, QPCI_DEVFN(bus + index, 0));
+if (dev == NULL) {
+continue;
+}
+class = qpci_config_readw(dev, PCI_CLASS_DEVICE);
+if (class != PCI_CLASS_BRIDGE_PCI) {
+g_free(dev);
+continue;
+}
+
+pribus = qpci_config_readb(dev, PCI_PRIMARY_BUS);
+if (pribus != bus) {
+qpci_config_writeb(dev, PCI_PRIMARY_BUS, bus);
+}
+
+secbus = qpci_config_readb(dev, PCI_SECONDARY_BUS);
+(*pci_bus)++;
+if (*pci_bus != secbus) {
+secbus = *pci_bus;
+qpci_config_writeb(dev, PCI_SECONDARY_BUS, secbus);
+}
+
+subbus = qpci_config_readb(dev, PCI_SUBORDINATE_BUS);
+qpci_config_writeb(dev, PCI_SUBORDINATE_BUS, 255);
+
+qpci_secondary_buses_rec(qbus, secbus << 5, pci_bus);
+
+if (subbus != *pci_bus) {
+uint8_t res_bus = *pci_bus;
+uint8_t cap = qpci_find_resource_reserve_capability(dev);
+
+if (cap) {
+uint32_t tmp_res_bus;
+
+tmp_res_bus = qpci_config_readl(dev, cap +
+
REDHAT_PCI_CAP_RES_RESERVE_BUS_RES);
+if (tmp_res_bus != (uint32_t)-1) {
+res_bus = tmp_res_bus & 0xFF;
+if ((uint8_t)(res_bus + secbus) < secbus ||
+(uint8_t)(res_bus + secbus) < res_bus) {
+res_bus = 0;
+}
+if (secbus + res_bus > *pci_bus) {
+res_bus = 

[PATCH 3/3] target/ppc: Set 601v exception model id

2021-12-08 Thread Fabiano Rosas
The exception model id for 601v has been removed without mention
why. I assume it was inadvertent and restore it here.

Fixes: b632a148b6 ("target-ppc: Use QOM method dispatch for MMU fault handling")
Signed-off-by: Fabiano Rosas 
---
 target/ppc/cpu_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 8100b89033..0e1682ddd9 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -4607,6 +4607,7 @@ POWERPC_FAMILY(601v)(ObjectClass *oc, void *data)
 (1ull << MSR_IR) |
 (1ull << MSR_DR);
 pcc->mmu_model = POWERPC_MMU_601;
+pcc->excp_model = POWERPC_EXCP_601;
 pcc->bus_model = PPC_FLAGS_INPUT_6xx;
 pcc->bfd_mach = bfd_mach_ppc_601;
 pcc->flags = POWERPC_FLAG_SE | POWERPC_FLAG_RTC_CLK | POWERPC_FLAG_HID0_LE;
-- 
2.33.1




[PATCH 1/3] target/ppc: Fix MPCxxx FPU interrupt address

2021-12-08 Thread Fabiano Rosas
The Floating-point Unavailable and Decrementer interrupts are being
registered at the same 0x900 address. The FPU should be at 0x800
instead.

Verified on MPC555, MPC860 and MPC885 user manuals.

Reported-by: BALATON Zoltan 
Signed-off-by: Fabiano Rosas 
---
 target/ppc/cpu_init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 6695985e9b..55af48769a 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -2180,7 +2180,7 @@ static void init_excp_MPC5xx(CPUPPCState *env)
 env->excp_vectors[POWERPC_EXCP_EXTERNAL] = 0x0500;
 env->excp_vectors[POWERPC_EXCP_ALIGN]= 0x0600;
 env->excp_vectors[POWERPC_EXCP_PROGRAM]  = 0x0700;
-env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0900;
+env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0800;
 env->excp_vectors[POWERPC_EXCP_DECR] = 0x0900;
 env->excp_vectors[POWERPC_EXCP_SYSCALL]  = 0x0C00;
 env->excp_vectors[POWERPC_EXCP_TRACE]= 0x0D00;
@@ -2207,7 +2207,7 @@ static void init_excp_MPC8xx(CPUPPCState *env)
 env->excp_vectors[POWERPC_EXCP_EXTERNAL] = 0x0500;
 env->excp_vectors[POWERPC_EXCP_ALIGN]= 0x0600;
 env->excp_vectors[POWERPC_EXCP_PROGRAM]  = 0x0700;
-env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0900;
+env->excp_vectors[POWERPC_EXCP_FPU]  = 0x0800;
 env->excp_vectors[POWERPC_EXCP_DECR] = 0x0900;
 env->excp_vectors[POWERPC_EXCP_SYSCALL]  = 0x0C00;
 env->excp_vectors[POWERPC_EXCP_TRACE]= 0x0D00;
-- 
2.33.1




[PATCH 2/3] target/ppc: Remove 603e exception model

2021-12-08 Thread Fabiano Rosas
The 603e uses the same exception code as 603 so we don't need a
dedicated entry for it.

This is only a removal of redundant code, no functional change.

Signed-off-by: Fabiano Rosas 
---
 target/ppc/cpu-qom.h |  2 --
 target/ppc/cpu_init.c| 32 ++--
 target/ppc/excp_helper.c |  1 -
 3 files changed, 2 insertions(+), 33 deletions(-)

diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index 5800fa324e..e585912571 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -94,8 +94,6 @@ enum powerpc_excp_t {
 POWERPC_EXCP_602,
 /* PowerPC 603 exception model  */
 POWERPC_EXCP_603,
-/* PowerPC 603e exception model */
-POWERPC_EXCP_603E,
 /* PowerPC G2 exception model   */
 POWERPC_EXCP_G2,
 /* PowerPC 604 exception model  */
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 55af48769a..8100b89033 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -4749,41 +4749,13 @@ POWERPC_FAMILY(603)(ObjectClass *oc, void *data)
  POWERPC_FLAG_BE | POWERPC_FLAG_BUS_CLK;
 }
 
-static void init_proc_603E(CPUPPCState *env)
-{
-register_ne_601_sprs(env);
-register_sdr1_sprs(env);
-register_603_sprs(env);
-/* Time base */
-register_tbl(env);
-/* hardware implementation registers */
-/* XXX : not implemented */
-spr_register(env, SPR_HID0, "HID0",
- SPR_NOACCESS, SPR_NOACCESS,
- _read_generic, _write_generic,
- 0x);
-/* XXX : not implemented */
-spr_register(env, SPR_HID1, "HID1",
- SPR_NOACCESS, SPR_NOACCESS,
- _read_generic, _write_generic,
- 0x);
-/* Memory management */
-register_low_BATs(env);
-register_6xx_7xx_soft_tlb(env, 64, 2);
-init_excp_603(env);
-env->dcache_line_size = 32;
-env->icache_line_size = 32;
-/* Allocate hardware IRQ controller */
-ppc6xx_irq_init(env_archcpu(env));
-}
-
 POWERPC_FAMILY(603E)(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
 
 dc->desc = "PowerPC 603e";
-pcc->init_proc = init_proc_603E;
+pcc->init_proc = init_proc_603;
 pcc->check_pow = check_pow_hid0;
 pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING | PPC_MFTB |
PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
@@ -4809,7 +4781,7 @@ POWERPC_FAMILY(603E)(ObjectClass *oc, void *data)
 (1ull << MSR_RI) |
 (1ull << MSR_LE);
 pcc->mmu_model = POWERPC_MMU_SOFT_6xx;
-pcc->excp_model = POWERPC_EXCP_603E;
+pcc->excp_model = POWERPC_EXCP_603;
 pcc->bus_model = PPC_FLAGS_INPUT_6xx;
 pcc->bfd_mach = bfd_mach_ppc_ec603e;
 pcc->flags = POWERPC_FLAG_TGPR | POWERPC_FLAG_SE |
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 17607adbe4..f15a859fe4 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -672,7 +672,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 switch (excp_model) {
 case POWERPC_EXCP_602:
 case POWERPC_EXCP_603:
-case POWERPC_EXCP_603E:
 case POWERPC_EXCP_G2:
 /* Swap temporary saved registers with GPRs */
 if (!(new_msr & ((target_ulong)1 << MSR_TGPR))) {
-- 
2.33.1




[PATCH 0/3] target/ppc: Minor fixes to exception code

2021-12-08 Thread Fabiano Rosas
These are just some minor fixes to the exception code that I collected
over the past few months.

Fabiano Rosas (3):
  target/ppc: Fix MPCxxx FPU interrupt address
  target/ppc: Remove 603e exception model
  target/ppc: Set 601v exception model id

 target/ppc/cpu-qom.h |  2 --
 target/ppc/cpu_init.c| 37 +
 target/ppc/excp_helper.c |  1 -
 3 files changed, 5 insertions(+), 35 deletions(-)

-- 
2.33.1




Re: [PATCH] tests/docker: add libfuse3 development headers

2021-12-08 Thread Beraldo Leal
On Tue, Dec 07, 2021 at 04:00:25PM +, Stefan Hajnoczi wrote:
> The FUSE exports feature is not built because most container images do
> not have libfuse3 development headers installed. Add the necessary
> packages to the Dockerfiles.
> 
> Cc: Hanna Reitz 
> Cc: Richard W.M. Jones 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tests/docker/dockerfiles/alpine.docker| 1 +
>  tests/docker/dockerfiles/centos8.docker   | 1 +
>  tests/docker/dockerfiles/fedora.docker| 1 +
>  tests/docker/dockerfiles/opensuse-leap.docker | 1 +
>  tests/docker/dockerfiles/ubuntu.docker| 1 +
>  tests/docker/dockerfiles/ubuntu2004.docker| 1 +
>  6 files changed, 6 insertions(+)
> 
> diff --git a/tests/docker/dockerfiles/alpine.docker 
> b/tests/docker/dockerfiles/alpine.docker
> index 7e6997e301..9ddb3c2ebc 100644
> --- a/tests/docker/dockerfiles/alpine.docker
> +++ b/tests/docker/dockerfiles/alpine.docker
> @@ -12,6 +12,7 @@ ENV PACKAGES \
>   ccache \
>   coreutils \
>   curl-dev \
> + fuse3-dev \
>   g++ \
>   gcc \
>   git \
> diff --git a/tests/docker/dockerfiles/centos8.docker 
> b/tests/docker/dockerfiles/centos8.docker
> index 7f135f8e8c..a2dae4be29 100644
> --- a/tests/docker/dockerfiles/centos8.docker
> +++ b/tests/docker/dockerfiles/centos8.docker
> @@ -19,6 +19,7 @@ ENV PACKAGES \
>  device-mapper-multipath-devel \
>  diffutils \
>  findutils \
> +fuse3-devel \
>  gcc \
>  gcc-c++ \
>  genisoimage \
> diff --git a/tests/docker/dockerfiles/fedora.docker 
> b/tests/docker/dockerfiles/fedora.docker
> index c6fd7e1113..a3a712c87b 100644
> --- a/tests/docker/dockerfiles/fedora.docker
> +++ b/tests/docker/dockerfiles/fedora.docker
> @@ -20,6 +20,7 @@ ENV PACKAGES \
>  device-mapper-multipath-devel \
>  diffutils \
>  findutils \
> +fuse3-devel \
>  gcc \
>  gcc-c++ \
>  gcovr \
> diff --git a/tests/docker/dockerfiles/opensuse-leap.docker 
> b/tests/docker/dockerfiles/opensuse-leap.docker
> index 3bbdb67f4f..2beb61bd7e 100644
> --- a/tests/docker/dockerfiles/opensuse-leap.docker
> +++ b/tests/docker/dockerfiles/opensuse-leap.docker
> @@ -15,6 +15,7 @@ ENV PACKAGES \
>  dbus-1 \
>  diffutils \
>  findutils \
> +fuse3-devel \
>  gcc \
>  gcc-c++ \
>  gcovr \
> diff --git a/tests/docker/dockerfiles/ubuntu.docker 
> b/tests/docker/dockerfiles/ubuntu.docker
> index f0e0180d21..0c694a2bf0 100644
> --- a/tests/docker/dockerfiles/ubuntu.docker
> +++ b/tests/docker/dockerfiles/ubuntu.docker
> @@ -29,6 +29,7 @@ ENV PACKAGES \
>  libepoxy-dev \
>  libfdt-dev \
>  libffi-dev \
> +libfuse3-dev \
>  libgbm-dev \
>  libgnutls28-dev \
>  libgtk-3-dev \
> diff --git a/tests/docker/dockerfiles/ubuntu2004.docker 
> b/tests/docker/dockerfiles/ubuntu2004.docker
> index 15a026be09..a46feaecdd 100644
> --- a/tests/docker/dockerfiles/ubuntu2004.docker
> +++ b/tests/docker/dockerfiles/ubuntu2004.docker
> @@ -34,6 +34,7 @@ ENV PACKAGES \
>  libepoxy-dev \
>  libfdt-dev \
>  libffi-dev \
> +libfuse3-dev \
>  libgbm-dev \
>  libgcrypt20-dev \
>  libglib2.0-dev \

Reviewed-by: Beraldo Leal 
Tested-by: Beraldo Leal 

Tested against package names on those repositories.

--
Beraldo




Re: [PATCH 1/7] hw/intc: sifive_plic: Add a reset function

2021-12-08 Thread Philippe Mathieu-Daudé
Hi Alistair,

On 12/8/21 07:42, Alistair Francis wrote:
> From: Alistair Francis 
> 
> Signed-off-by: Alistair Francis 
> ---
>  hw/intc/sifive_plic.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c
> index 877e76877c..35f097799a 100644
> --- a/hw/intc/sifive_plic.c
> +++ b/hw/intc/sifive_plic.c
> @@ -355,6 +355,17 @@ static const MemoryRegionOps sifive_plic_ops = {
>  }
>  };
>  
> +static void sifive_plic_reset(DeviceState *dev)
> +{
> +SiFivePLICState *s = SIFIVE_PLIC(dev);
> +
> +memset(s->source_priority, 0, sizeof(uint32_t) * s->num_sources);
> +memset(s->target_priority, 0, sizeof(uint32_t) * s->num_addrs);
> +memset(s->pending, 0, sizeof(uint32_t) * s->bitfield_words);
> +memset(s->claimed, 0, sizeof(uint32_t) * s->bitfield_words);
> +memset(s->enable, 0, sizeof(uint32_t) * s->num_enables);

Looking at sifive_plic_realize():

- Should we reset the external IRQs in a default state?
- Shouldn't riscv_cpu_claim_interrupts() be called at reset?

Note: parse_hart_config() name is slightly confusing since
beside parsing, it also allocates addr_config. Maybe consider
renaming?



Re: [PATCH 7/7] hw/riscv: Use error_fatal for SoC realisation

2021-12-08 Thread Philippe Mathieu-Daudé
On 12/8/21 07:42, Alistair Francis wrote:
> From: Alistair Francis 
> 
> When realising the SoC use error_fatal instead of error_abort as the
> process can fail and report useful information to the user.
> 
> Currently a user can see this:
> 
>$ ../qemu/bld/qemu-system-riscv64 -M sifive_u -S -monitor stdio -display 
> none -drive if=pflash
> QEMU 6.1.93 monitor - type 'help' for more information
> (qemu) Unexpected error in sifive_u_otp_realize() at 
> ../hw/misc/sifive_u_otp.c:229:
> qemu-system-riscv64: OTP drive size < 16K
> Aborted (core dumped)
> 
> Which this patch addresses
> 
> Signed-off-by: Alistair Francis 
> Reported-by: Markus Armbruster 
> ---
>  hw/riscv/microchip_pfsoc.c | 2 +-
>  hw/riscv/opentitan.c   | 2 +-
>  hw/riscv/sifive_e.c| 2 +-
>  hw/riscv/sifive_u.c| 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 09/12] s390x/pci: enable adapter event notification for interpreted devices

2021-12-08 Thread Thomas Huth

On 07/12/2021 22.04, Matthew Rosato wrote:

Use the associated vfio feature ioctl to enable adapter event notification
and forwarding for devices when requested.  This feature will be set up
with or without firmware assist based upon the 'intassist' setting.

Signed-off-by: Matthew Rosato 
---
  hw/s390x/s390-pci-bus.c  | 24 +++--
  hw/s390x/s390-pci-inst.c | 54 +++-
  hw/s390x/s390-pci-vfio.c | 88 
  include/hw/s390x/s390-pci-bus.h  |  1 +
  include/hw/s390x/s390-pci-vfio.h | 20 
  5 files changed, 182 insertions(+), 5 deletions(-)

[...]

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 78093aaac7..6f9271df87 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -152,6 +152,94 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
  return 0;
  }
  
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev)

+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);


Should this use VFIO_PCI() instead of container_of ?


+struct vfio_device_feature feat = {
+.argsz = sizeof(struct vfio_device_feature),
+.flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF
+};
+
+assert(vdev);


... then you could likely also drop the assert(), I think?


+return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, );
+}
+
+int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
+ bool assist)
+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+g_autofree struct vfio_device_feature *feat;
+struct vfio_device_zpci_aif *data;
+int size;
+
+assert(vdev);


dito


+size = sizeof(*feat) + sizeof(*data);
+feat = g_malloc0(size);
+feat->argsz = size;
+feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+data = (struct vfio_device_zpci_aif *)>data;
+if (enable) {
+data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
+if (!pbdev->intassist) {
+data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
+}
+/* Fill in the guest fib info */
+data->ibv = fib->aibv;
+data->sb = fib->aisb;
+data->noi = FIB_DATA_NOI(fib->data);
+data->isc = FIB_DATA_ISC(fib->data);
+data->sbo = FIB_DATA_AISBO(fib->data);
+} else {
+data->flags = 0;
+}
+
+return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+g_autofree struct vfio_device_feature *feat;
+struct vfio_device_zpci_aif *data;
+int size, rc;
+
+assert(vdev);


dito


+size = sizeof(*feat) + sizeof(*data);
+feat = g_malloc0(size);
+feat->argsz = size;
+feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+if (rc) {
+return rc;
+}
+
+/* Determine if current interrupt settings match the host */
+data = (struct vfio_device_zpci_aif *)>data;
+if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
+rc = -EINVAL;
+} else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
+rc = -EINVAL;
+}
+
+/*
+ * When enabled for interrupts, the assist and forced host-delivery are
+ * mututally exclusive
+ */
+if (enable && assist && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)) {
+rc = -EINVAL;
+} else if (enable && (!assist) && (!(data->flags &
+ VFIO_DEVICE_ZPCI_FLAG_AIF_HOST))) {
+rc = -EINVAL;
+}
+
+return rc;
+}


 Thomas




Re: [PATCH v5 3/4] failover: fix unplug pending detection

2021-12-08 Thread Michael S. Tsirkin
On Wed, Dec 08, 2021 at 08:50:34AM +0100, Thomas Huth wrote:
> On 08/12/2021 08.36, Michael S. Tsirkin wrote:
> > On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote:
> > > Failover needs to detect the end of the PCI unplug to start migration
> > > after the VFIO card has been unplugged.
> > > 
> > > To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset 
> > > in
> > > pcie_unplug_device().
> > > 
> > > But since
> > >  17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on 
> > > Q35")
> > > we have switched to ACPI unplug and these functions are not called anymore
> > > and the flag not set. So failover migration is not able to detect if card
> > > is really unplugged and acts as it's done as soon as it's started. So it
> > > doesn't wait the end of the unplug to start the migration. We don't see 
> > > any
> > > problem when we test that because ACPI unplug is faster than PCIe native
> > > hotplug and when the migration really starts the unplug operation is
> > > already done.
> > > 
> > > See c000a9bd06ea ("pci: mark device having guest unplug request pending")
> > >  a99c4da9fc2a ("pci: mark devices partially unplugged")
> > > 
> > > Signed-off-by: Laurent Vivier 
> > > Reviewed-by: Ani Sinha 
> > 
> > Hmm.  I think this one may be needed for this release actually.
> > Isolate from testing changes and repost?
> 
> You merged it already here:
> 
>  https://gitlab.com/qemu-project/qemu/-/commit/9323f892b39d133eb6

Oh, good. Forgot to drop the tag.

> so we should be fine :-)
> 
>  Thomas




Re: [PATCH 08/12] s390x/pci: don't fence interpreted devices without MSI-X

2021-12-08 Thread Thomas Huth

On 07/12/2021 22.04, Matthew Rosato wrote:

Lack of MSI-X support is not an issue for interpreted passthrough
devices, so let's let these in.  This will allow, for example, ISM
devices to be passed through -- but only when interpretation is
available and being used.

Signed-off-by: Matthew Rosato 
---
  hw/s390x/s390-pci-bus.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 451bd32d92..503326210a 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1096,7 +1096,7 @@ static void s390_pcihost_plug(HotplugHandler 
*hotplug_dev, DeviceState *dev,
  pbdev->interp = false;
  }
  
-if (s390_pci_msix_init(pbdev)) {

+if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
  error_setg(errp, "MSI-X support is mandatory "
 "in the S390 architecture");
  return;



Reviewed-by: Thomas Huth 




Re: [PATCH 07/12] s390x/pci: enable for load/store intepretation

2021-12-08 Thread Thomas Huth

On 07/12/2021 22.04, Matthew Rosato wrote:

Use the associated vfio feature ioctl to enable interpretation for devices
when requested.  As part of this process, we must use the host function
handle rather than a QEMU-generated one -- this is provided as part of the
ioctl payload.

Signed-off-by: Matthew Rosato 
---
  hw/s390x/s390-pci-bus.c  | 69 +++-
  hw/s390x/s390-pci-inst.c | 63 -
  hw/s390x/s390-pci-vfio.c | 55 +
  include/hw/s390x/s390-pci-bus.h  |  1 +
  include/hw/s390x/s390-pci-vfio.h | 15 +++
  5 files changed, 201 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 01b58ebc70..451bd32d92 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -971,12 +971,57 @@ static void s390_pci_update_subordinate(PCIDevice *dev, 
uint32_t nr)
  }
  }
  
+static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)

+{
+uint32_t idx;
+int rc;
+
+rc = s390_pci_probe_interp(pbdev);
+if (rc) {
+return rc;
+}
+
+rc = s390_pci_update_passthrough_fh(pbdev);
+if (rc) {
+return rc;
+}
+
+/*
+ * The host device is in an enabled state, but the device must
+ * begin as disabled for the guest so mask off the enable bit
+ * from the passthrough handle.
+ */
+pbdev->fh &= ~FH_MASK_ENABLE;
+
+/* Next, see if the idx is already in-use */
+idx = pbdev->fh & FH_MASK_INDEX;
+if (pbdev->idx != idx) {
+if (s390_pci_find_dev_by_idx(s, idx)) {
+return -EINVAL;
+}
+/*
+ * Update the idx entry with the passed through idx
+ * If the relinquised idx is lower than next_idx, use it


s/relinquised/relinquished/


+ * to replace next_idx
+ */
+g_hash_table_remove(s->zpci_table, >idx);
+if (idx < s->next_idx) {
+s->next_idx = idx;
+}
+pbdev->idx = idx;
+g_hash_table_insert(s->zpci_table, >idx, pbdev);
+}
+
+return 0;
+}

[...]

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 6f80a47e29..78093aaac7 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -97,6 +97,61 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount 
*cnt)
  }
  }
  
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev)

+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+struct vfio_device_feature feat = {
+.argsz = sizeof(struct vfio_device_feature),
+.flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_INTERP
+};
+
+return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, );
+}
+
+int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+g_autofree struct vfio_device_feature *feat;


IIRC there have been compiler versions that complain if a g_autofree 
variable is initialized at the point of declaration, so you might need to 
add the "= g_malloc0(size)" here already.



+struct vfio_device_zpci_interp *data;
+int size;
+
+size = sizeof(*feat) + sizeof(*data);
+feat = g_malloc0(size);
+feat->argsz = size;
+feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+data = (struct vfio_device_zpci_interp *)>data;
+if (enable) {
+data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
+} else {
+data->flags = 0;
+}
+
+return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
+{
+VFIOPCIDevice *vdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+g_autofree struct vfio_device_feature *feat;


dito


+struct vfio_device_zpci_interp *data;
+int size, rc;
+
+size = sizeof(*feat) + sizeof(*data);
+feat = g_malloc0(size);
+feat->argsz = size;
+feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+if (rc) {
+return rc;
+}
+
+data = (struct vfio_device_zpci_interp *)>data;
+pbdev->fh = data->fh;
+return 0;
+}


 Thomas




Re: [PATCH v5 3/4] failover: fix unplug pending detection

2021-12-08 Thread Ani Sinha
On Wed, Dec 8, 2021 at 1:20 PM Thomas Huth  wrote:
>
> On 08/12/2021 08.36, Michael S. Tsirkin wrote:
> > On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote:
> >> Failover needs to detect the end of the PCI unplug to start migration
> >> after the VFIO card has been unplugged.
> >>
> >> To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in
> >> pcie_unplug_device().
> >>
> >> But since
> >>  17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
> >> we have switched to ACPI unplug and these functions are not called anymore
> >> and the flag not set. So failover migration is not able to detect if card
> >> is really unplugged and acts as it's done as soon as it's started. So it
> >> doesn't wait the end of the unplug to start the migration. We don't see any
> >> problem when we test that because ACPI unplug is faster than PCIe native
> >> hotplug and when the migration really starts the unplug operation is
> >> already done.
> >>
> >> See c000a9bd06ea ("pci: mark device having guest unplug request pending")
> >>  a99c4da9fc2a ("pci: mark devices partially unplugged")
> >>
> >> Signed-off-by: Laurent Vivier 
> >> Reviewed-by: Ani Sinha 
> >
> > Hmm.  I think this one may be needed for this release actually.
> > Isolate from testing changes and repost?
>
> You merged it already here:
>
>   https://gitlab.com/qemu-project/qemu/-/commit/9323f892b39d133eb6

Yes I pointed this out in the other thread. Michael was not CC'd there
somehow ...

>
> so we should be fine :-)
>
>   Thomas
>



Re: [PATCH 02/12] s390x/pci: don't use hard-coded dma range in reg_ioat

2021-12-08 Thread Thomas Huth

On 07/12/2021 22.04, Matthew Rosato wrote:

Instead use the values from clp info, they will either be the hard-coded
values or what came from the host driver via vfio.

Fixes: 9670ee752727 ("s390x/pci: use a PCI Function structure")
Reviewed-by: Eric Farman 
Reviewed-by: Pierre Morel 
Signed-off-by: Matthew Rosato 
---
  hw/s390x/s390-pci-inst.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)



Reviewed-by: Thomas Huth 




Re: [PATCH 01/12] s390x/pci: use a reserved ID for the default PCI group

2021-12-08 Thread Thomas Huth

On 07/12/2021 22.04, Matthew Rosato wrote:

The current default PCI group being used can technically collide with a
real group ID passed from a hostdev.  Let's instead use a group ID that
comes from a special pool (0xF0-0xFF) that is architected to be reserved
for simulated devices.


Maybe mention that this is not a problem for migration since zPCI currently 
can't be migrated anyway (as mentioned in the discussion of an earlier 
version of this patch)


Acked-by: Thomas Huth 



Fixes: 28dc86a072 ("s390x/pci: use a PCI Group structure")
Reviewed-by: Eric Farman 
Reviewed-by: Pierre Morel 
Signed-off-by: Matthew Rosato 
---
  include/hw/s390x/s390-pci-bus.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index aa891c178d..2727e7bdef 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -313,7 +313,7 @@ typedef struct ZpciFmb {
  } ZpciFmb;
  QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
  
-#define ZPCI_DEFAULT_FN_GRP 0x20

+#define ZPCI_DEFAULT_FN_GRP 0xFF
  typedef struct S390PCIGroup {
  ClpRspQueryPciGrp zpci_group;
  int id;






Re: [PATCH 06/12] target/s390x: add zpci-interp to cpu models

2021-12-08 Thread Christian Borntraeger

Am 07.12.21 um 22:04 schrieb Matthew Rosato:

The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.

Signed-off-by: Matthew Rosato 
---
  target/s390x/cpu_features_def.h.inc | 1 +
  target/s390x/gen-features.c | 2 ++
  target/s390x/kvm/kvm.c  | 1 +
  3 files changed, 4 insertions(+)

diff --git a/target/s390x/cpu_features_def.h.inc 
b/target/s390x/cpu_features_def.h.inc
index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: 
Conditional-external-interception f
  DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
  DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
  DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
  
  /* Features exposed via the PLO instruction. */

  DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general 
registers)")
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 7cb1a6ec10..7005d22415 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
  S390_FEAT_HPMA2,
  S390_FEAT_SIE_KSS,
  S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+S390_FEAT_ZPCI_INTERP,
  };
  
  #define full_GEN14_GA2 EmptyFeat

@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
  S390_FEAT_GROUP_MSA_EXT_8,
  S390_FEAT_MULTIPLE_EPOCH,
  S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+S390_FEAT_ZPCI_INTERP,
  };
  


For the default model you need to be careful.
Is this in any way guest visible? then you definitely need to fence this
off for older QEMU versions so that when you migrate with older QEMUs
See the s390_cpudef_featoff_greater calls in  hw/s390x/s390-virtio-ccw.c

I know its more of a theoretical aspect, since PCI currently forbids migration
but we should try to have the cpu model consistent I guess.

  #define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 5b1fdb55c4..b13d78f988 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2290,6 +2290,7 @@ static int kvm_to_feat[][2] = {
  { KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
  { KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
  { KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+{ KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
  };
  
  static int query_cpu_feat(S390FeatBitmap features)






Re: [qemu-web PATCH v4] Add Sponsors page

2021-12-08 Thread Thomas Huth

On 24/11/2021 11.30, Philippe Mathieu-Daudé wrote:

Add a page listing QEMU sponsors.

For now, only mention Fosshost which requested to be listed:
https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg05381.html

Cc: Thomas Markey 
Resolves: https://gitlab.com/qemu-project/qemu-web/-/issues/2
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Philippe Mathieu-Daudé 
---


Thanks - after considering 
https://lore.kernel.org/qemu-devel/yz5nio3xklm1p...@redhat.com/ I've finally 
applied the patch now:


 https://www.qemu.org/sponsors/

Sorry for the delay.

 Thomas




Re:[PATCH] mirror: Avoid assertion failed in mirror_run

2021-12-08 Thread wang.yi59
>[CC-ing qemu-block, Vladimir, Kevin, and John – when sending patches,  
>please look into the MAINTAINERS file or use the  
>scripts/get_maintainer.pl script to find out who to CC on them.  It’s  
>very to overlook patches on qemu-devel :/]
>
>On 07.12.21 11:56, Yi Wang wrote:
>> From: Long YunJian  
>> 
>> when blockcommit from active leaf node, sometimes, we get assertion failed 
>> with
>> "mirror_run: Assertion `QLIST_EMPTY(>tracked_requests)' failed" messages.
>> According to the core file, we find bs->tracked_requests has IO request,
>> so assertion failed.
>> (gdb) bt
>> #0  0x7f410df707cf in raise () from /lib64/libc.so.6
>> #1  0x7f410df5ac05 in abort () from /lib64/libc.so.6
>> #2  0x7f410df5aad9 in __assert_fail_base.cold.0 () from /lib64/libc..so.6
>> #3  0x7f410df68db6 in __assert_fail () from /lib64/libc.so.6
>> #4  0x556915635371 in mirror_run (job=0x556916ff8600, errp=> out>) at block/mirror.c:1092
>> #5  0x5569155e6c53 in job_co_entry (opaque=0x556916ff8600) at job..c:904
>> #6  0x5569156d9483 in coroutine_trampoline (i0=, 
>> i1=) at util/coroutine-ucontext.c:115
>> (gdb) p s->mirror_top_bs->backing->bs->tracked_requests
>> $12 = {lh_first = 0x7f3f07bfb8b0}
>> (gdb) p s->mirror_top_bs->backing->bs->tracked_requests->lh_first
>> $13 = (struct BdrvTrackedRequest *) 0x7f3f07bfb8b0
>> 
>> Actually, before excuting assert(QLIST_EMPTY(>tracked_requests)),
>> it will excute mirror_flush(s). It may handle new I/O request and maybe
>> pending I/O during this flush. Just likes in bdrv_close fuction,
>> bdrv_drain(bs) followed by bdrv_flush(bs), we should add bdrv_drain fuction
>> to handle pending I/O after mirror_flush.
>
>Oh.  How is that happening, though?  I would have expected that flushing  
>the target BB (and associated BDS) only flushes requests to the OS and  
>lower layers, but the source node (which is `bs`) should (in the case of  
>commit) always be above the target, so I wouldn’t have expected it to  
>get any new requests due to this flush.
>
>Do you have a reproducer for this?

As i know, flush maybe will do some thring write, and then in qcow2_co_pwritev 
function,
if others aready hold "s->lock" lock, qemu_co_mutex_lock(>lock) will go to 
qemu_coroutine_yield,
and do some other things. Maybe, it will handle new I/O now. 

reproducer:
first, create leaf alpine-min_leaf1.qcow2, alpine-min_leaf1.qcow2 has backing 
file alpine-min.qcow2,
and create vm from leaf qcow2, xml likes this

  
  
  



and then make I/O request by read and write files all the time in guset os.
finally, run script blockcommit.sh until assertion failed. You should modify 
script according to actual situation.
script as follows: 

#!/bin/bash
QEMUIMG=/home/qemu-6.0.0-rc2/build/qemu-img
VIRSH=/home/libvirt-7.2.0/build/tools/virsh
DOMID=v6_host7
LEAF1=/home/alpine-min_leaf1.qcow2
LEAF2=/home/alpine-min_leaf2.qcow2
while [ 1 ]
do
[ -f quitfile ] && break
${QEMUIMG} create -f qcow2 -F qcow2 -b ${LEAF1} ${LEAF2}
sleep 2
${VIRSH} -k 0 snapshot-create-as ${DOMID} --disk-only --diskspec 
vda,file=${LEAF2} --reuse-external --no-metadata
[ $? -ne 0 ] && break
sleep 30
${VIRSH} blockcommit --domain ${DOMID} vda --base ${LEAF1} --top 
${LEAF2} --wait --verbose --active --pivot
[ $? -ne 0 ] && break
done


The probability of the problem is not high, it's difficult to reproduce the 
problem, blockcommit hundreds of times at least.

>> Signed-off-by: Long YunJian  
>> Signed-off-by: Yi Wang  
>> ---
>>   block/mirror.c | 2 ++
>>   1 file changed, 2 insertions(+)
>> 
>> diff --git a/block/mirror.c b/block/mirror.c
>> index efec2c7674..1eec356310 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -1079,6 +1079,8 @@ static int coroutine_fn mirror_run(Job *job, Error 
>> **errp)
>>   s->in_drain = false;
>>   continue;
>>   }
>> +/* in case flush left pending I/O */
>> +bdrv_drain(bs);
>
>I don’t think this works, because if we drain, we would also need to  
>flush the target again.  Essentially I believe we’d basically need  
>something like
>
>do {
> bdrv_drained_begin(bs);
> mirror_flush(s);
> if (!QLIST_EMPTY(>tracked_requests)) {
> bdrv_drained_end(bs);
> }
>} while (!QLIST_EMPTY(>tracked_requests));
>
>(Which I know is really ugly)
>
>Hanna

Yes, add bdrv_drain(bs) after mirror_flush(s) can't fix this problem. Thank you 
for your idea of a solution.
and I'd like to modify as follows, how do you like it?
@@ -1074,7 +1074,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 s->in_drain = true;
 bdrv_drained_begin(bs);
 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
-if (cnt > 0 || mirror_flush(s) < 0) {
+if (cnt > 0 || mirror_flush(s) < 0 || 
!QLIST_EMPTY(>tracked_requests)) {
 bdrv_drained_end(bs);
 s->in_drain = false;

Re: [PATCH v2 1/2] spice: Update QXLInterface for spice >= 0.15.0

2021-12-08 Thread Daniel P . Berrangé
On Tue, Dec 07, 2021 at 03:40:37PM -0500, John Snow wrote:
> spice updated the spelling (and arguments) of "attache_worker" in
> 0.15.0. Update QEMU to match, preventing -Wdeprecated-declarations
> compilations from reporting build errors.
> 
> See also:
> https://gitlab.freedesktop.org/spice/spice/-/commit/974692bda1e77af92b71ed43b022439448492cb9
> 
> Signed-off-by: John Snow 
> Acked-by: Gerd Hoffmann 
> ---
>  include/ui/qemu-spice.h |  6 ++
>  hw/display/qxl.c| 14 +-
>  ui/spice-display.c  | 11 +++
>  3 files changed, 30 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 2/2] ui/clipboard: Don't use g_autoptr just to free a variable

2021-12-08 Thread Daniel P . Berrangé
On Tue, Dec 07, 2021 at 03:40:38PM -0500, John Snow wrote:
> Clang doesn't recognize that the variable is being "used" and will emit
> a warning:
> 
>   ../ui/clipboard.c:47:34: error: variable 'old' set but not used 
> [-Werror,-Wunused-but-set-variable]
>   g_autoptr(QemuClipboardInfo) old = NULL;
>  ^
>   1 error generated.
> 
> OK, fine. Just do things the old way.
> 
> Signed-off-by: John Snow 
> ---
>  ui/clipboard.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/ui/clipboard.c b/ui/clipboard.c
> index d7b008d62a..9ab65efefb 100644
> --- a/ui/clipboard.c
> +++ b/ui/clipboard.c
> @@ -44,12 +44,11 @@ void qemu_clipboard_peer_release(QemuClipboardPeer *peer,
>  
>  void qemu_clipboard_update(QemuClipboardInfo *info)
>  {
> -g_autoptr(QemuClipboardInfo) old = NULL;
>  assert(info->selection < QEMU_CLIPBOARD_SELECTION__COUNT);
>  
>  notifier_list_notify(_notifiers, info);
>  
> -old = cbinfo[info->selection];
> +g_free(cbinfo[info->selection]);

This is a ref counted data type - it can't use g_free:

  https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg04890.html

>  cbinfo[info->selection] = qemu_clipboard_info_ref(info);
>  }

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] s390x/ipl: support extended kernel command line size

2021-12-08 Thread Thomas Huth

On 22/11/2021 12.29, Marc Hartmayer wrote:

In the past s390 used a fixed command line length of 896 bytes. This has changed
with the Linux commit 5ecb2da660ab ("s390: support command lines longer than 896
bytes"). There is now a parm area indicating the maximum command line size. This
parm area has always been initialized to zero, so with older kernels this field
would read zero and we must then assume that only 896 bytes are available.

Acked-by: Viktor Mihajlovski 
Signed-off-by: Marc Hartmayer 
---
  hw/s390x/ipl.c | 23 ---
  1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 7ddca0127fc2..092c66b3f9f1 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -37,8 +37,9 @@
  
  #define KERN_IMAGE_START0x01UL

  #define LINUX_MAGIC_ADDR0x010008UL
+#define KERN_PARM_AREA_SIZE_ADDR0x010430UL
  #define KERN_PARM_AREA  0x010480UL
-#define KERN_PARM_AREA_SIZE 0x000380UL
+#define LEGACY_KERN_PARM_AREA_SIZE  0x000380UL
  #define INITRD_START0x80UL
  #define INITRD_PARM_START   0x010408UL
  #define PARMFILE_START  0x001000UL
@@ -110,6 +111,21 @@ static uint64_t bios_translate_addr(void *opaque, uint64_t 
srcaddr)
  return srcaddr + dstaddr;
  }
  
+static uint64_t get_max_kernel_cmdline_size(void)

+{
+uint64_t *size_ptr = rom_ptr(KERN_PARM_AREA_SIZE_ADDR, sizeof(*size_ptr));
+
+if (size_ptr) {
+uint64_t size;
+
+size = be64_to_cpu(*size_ptr);
+if (size != 0) {
+return size;
+}
+}
+return LEGACY_KERN_PARM_AREA_SIZE;
+}
+
  static void s390_ipl_realize(DeviceState *dev, Error **errp)
  {
  MachineState *ms = MACHINE(qdev_get_machine());
@@ -197,10 +213,11 @@ static void s390_ipl_realize(DeviceState *dev, Error 
**errp)
  ipl->start_addr = KERN_IMAGE_START;
  /* Overwrite parameters in the kernel image, which are "rom" */
  if (parm_area) {
-if (cmdline_size > KERN_PARM_AREA_SIZE) {
+uint64_t max_cmdline_size = get_max_kernel_cmdline_size();
+if (cmdline_size > max_cmdline_size) {
  error_setg(errp,
 "kernel command line exceeds maximum size: %zu > 
%lu",


I think the %lu likely needs to be turned into a PRIu64 now ... could you 
please try compiling on a 32-bit host env (e.g. via gitlab-ci), fix the 
patch accordingly (also with the cosmetics mentioned by David) and send a 
v2? Thanks!


 Thomas


-   cmdline_size, KERN_PARM_AREA_SIZE);
+   cmdline_size, max_cmdline_size);
  return;
  }





Re: [PATCH v7 2/4] tests/qtest: add some tests for virtio-net failover

2021-12-08 Thread Thomas Huth

On 08/12/2021 09.52, Laurent Vivier wrote:

On 08/12/2021 09:39, Thomas Huth wrote:

On 07/12/2021 18.23, Laurent Vivier wrote:

Add test cases to test several error cases that must be
generated by invalid failover configuration.

Add a combination of coldplug and hotplug test cases to be
sure the primary is correctly managed according the
presence or not of the STANDBY feature.

Signed-off-by: Laurent Vivier 
---

[...]
diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c

new file mode 100644
index ..7444d30d2900
--- /dev/null
+++ b/tests/qtest/virtio-net-failover.c

[...]

+static char *get_mac(QTestState *qts, const char *name)
+{
+    QDict *resp;
+    char *mac;
+
+    resp = qtest_qmp(qts, "{ 'execute': 'qom-get', "
+ "'arguments': { "
+ "'path': %s, "
+ "'property': 'mac' } }", name);
+
+    g_assert(qdict_haskey(resp, "return"));
+
+    mac = g_strdup( qdict_get_str(resp, "return"));


FYI, check_patch.pl complains about the space after the "(" here.

I'll fix it up locally, no need to resend just because of this.



I don't know how I missed that as I have pre-commit git hook to run 
checkpatch.pl


... but now I ran into another problem: Your new test fails the build with 
the sanitizer enabled:


https://gitlab.com/thuth/qemu/-/jobs/1861286254

Could you please have a look and resend after fixing it?

 Thanks,
  Thomas




Re: [PATCH v7 2/4] tests/qtest: add some tests for virtio-net failover

2021-12-08 Thread Laurent Vivier

On 08/12/2021 09:39, Thomas Huth wrote:

On 07/12/2021 18.23, Laurent Vivier wrote:

Add test cases to test several error cases that must be
generated by invalid failover configuration.

Add a combination of coldplug and hotplug test cases to be
sure the primary is correctly managed according the
presence or not of the STANDBY feature.

Signed-off-by: Laurent Vivier 
---

[...]

diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c
new file mode 100644
index ..7444d30d2900
--- /dev/null
+++ b/tests/qtest/virtio-net-failover.c

[...]

+static char *get_mac(QTestState *qts, const char *name)
+{
+    QDict *resp;
+    char *mac;
+
+    resp = qtest_qmp(qts, "{ 'execute': 'qom-get', "
+ "'arguments': { "
+ "'path': %s, "
+ "'property': 'mac' } }", name);
+
+    g_assert(qdict_haskey(resp, "return"));
+
+    mac = g_strdup( qdict_get_str(resp, "return"));


FYI, check_patch.pl complains about the space after the "(" here.

I'll fix it up locally, no need to resend just because of this.



I don't know how I missed that as I have pre-commit git hook to run 
checkpatch.pl

Thank you.

Laurent




Re: [PATCH v2 2/2] ui/clipboard: Don't use g_autoptr just to free a variable

2021-12-08 Thread Philippe Mathieu-Daudé
On 12/7/21 21:40, John Snow wrote:
> Clang doesn't recognize that the variable is being "used" and will emit
> a warning:
> 
>   ../ui/clipboard.c:47:34: error: variable 'old' set but not used 
> [-Werror,-Wunused-but-set-variable]
>   g_autoptr(QemuClipboardInfo) old = NULL;
>  ^
>   1 error generated.
> 
> OK, fine. Just do things the old way.
> 
> Signed-off-by: John Snow 
> ---
>  ui/clipboard.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v7 2/4] tests/qtest: add some tests for virtio-net failover

2021-12-08 Thread Thomas Huth

On 07/12/2021 18.23, Laurent Vivier wrote:

Add test cases to test several error cases that must be
generated by invalid failover configuration.

Add a combination of coldplug and hotplug test cases to be
sure the primary is correctly managed according the
presence or not of the STANDBY feature.

Signed-off-by: Laurent Vivier 
---

[...]

diff --git a/tests/qtest/virtio-net-failover.c 
b/tests/qtest/virtio-net-failover.c
new file mode 100644
index ..7444d30d2900
--- /dev/null
+++ b/tests/qtest/virtio-net-failover.c

[...]

+static char *get_mac(QTestState *qts, const char *name)
+{
+QDict *resp;
+char *mac;
+
+resp = qtest_qmp(qts, "{ 'execute': 'qom-get', "
+ "'arguments': { "
+ "'path': %s, "
+ "'property': 'mac' } }", name);
+
+g_assert(qdict_haskey(resp, "return"));
+
+mac = g_strdup( qdict_get_str(resp, "return"));


FYI, check_patch.pl complains about the space after the "(" here.

I'll fix it up locally, no need to resend just because of this.


+qobject_unref(resp);
+
+return mac;
+}


 Thomas




Re: [PATCH v7 1/7] net/vmnet: add vmnet dependency and customizable option

2021-12-08 Thread Markus Armbruster
Vladislav Yaroshchuk  writes:

> If you meant patch series cover letter, it exists, see
> https://patchew.org/QEMU/20211207101828.22033-1-yaroshchuk2...@gmail.com/

Never arrived here.  Since the list archive has it, the problem must be
on my end.  Thanks!




Re: [PATCH v1 3/3] virtio-mem: Set "unplugged-inaccessible=auto" for the 6.2 machine on x86

2021-12-08 Thread Pankaj Gupta
> Set the new default to "auto", keeping it set to "on" for compat
> machines. This property is only available for x86 targets.
>
> TODO: once 6.2 was released and we have compat machines, target the next
>   QEMU release.
>
> Signed-off-by: David Hildenbrand 
> ---
>  hw/i386/pc.c   | 1 +
>  hw/virtio/virtio-mem.c | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index a2ef40ecbc..045ba05431 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -99,6 +99,7 @@ GlobalProperty pc_compat_6_1[] = {
>  { TYPE_X86_CPU, "hv-version-id-major", "0x0006" },
>  { TYPE_X86_CPU, "hv-version-id-minor", "0x0001" },
>  { "ICH9-LPC", "x-keep-pci-slot-hpc", "false" },
> +{ "virtio-mem", "unplugged-inaccessible", "off" },
>  };
>  const size_t pc_compat_6_1_len = G_N_ELEMENTS(pc_compat_6_1);
>
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index 1e57156e81..a5d26d414f 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -1169,7 +1169,7 @@ static Property virtio_mem_properties[] = {
>   TYPE_MEMORY_BACKEND, HostMemoryBackend *),
>  #if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
>  DEFINE_PROP_ON_OFF_AUTO(VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP, 
> VirtIOMEM,
> -unplugged_inaccessible, ON_OFF_AUTO_OFF),
> +unplugged_inaccessible, ON_OFF_AUTO_AUTO),
>  #endif
>  DEFINE_PROP_END_OF_LIST(),
>  };

With correction in commit message pointed by Michal.
 Reviewed-by: Pankaj Gupta