date:20220819

Re: [PATCH 5/6] target/arm: Advertise FEAT_ETS for '-cpu max'

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

The architectural feature FEAT_ETS (Enhanced Translation
Synchronization) is a set of tightened guarantees about memory
ordering involving translation table walks:

  * if memory access RW1 is ordered-before memory access RW2 then it
is also ordered-before any translation table walk generated by RW2
that generates a translation fault, address size fault or access
fault

  * TLB maintenance on non-exec-permission translations is guaranteed
complete after a DSB (ie it does not need the context
synchronization event that you have to have if you don’t have
FEAT_ETS)

For QEMU’s implementation we don’t reorder translation table walk
accesses, and we guarantee to finish the TLB maintenance as soon as
the TLB op is done (the tlb_flush functions will complete at the end
of the TLB, and TLB ops always end the TB because they’re sysreg


First TLB on this line should be TB.

Reviewed-by: Richard Henderson 


r~

Re: [PATCH 6/6] target/arm: Add missing space in comment

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

Fix a missing space before a comment terminator.

Signed-off-by: Peter Maydell 
---
  target/arm/cpu_tcg.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~

Re: [PATCH 4/6] target/arm: Implement ID_DFR1

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

In Armv8.6, a new AArch32 ID register ID_DFR1 is defined; implement
it. We don't have any CPUs with features that they need to advertise
here yet, but plumbing in the ID register gives it the right name
when debugging and will help in future when we do add a CPU that
has non-zero ID_DFR1 fields.

Signed-off-by: Peter Maydell
---
  target/arm/cpu.h| 1 +
  target/arm/helper.c | 4 ++--
  target/arm/kvm64.c  | 2 ++
  3 files changed, 5 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 


r~

[PATCH v3] target/riscv: Use official extension names for AIA CSRs

2022-08-19 Thread Anup Patel

The arch review of AIA spec is completed and we now have official
extension names for AIA: Smaia (M-mode AIA CSRs) and Ssaia (S-mode
AIA CSRs).

Refer, section 1.6 of the latest AIA v0.3.1 stable specification at
https://github.com/riscv/riscv-aia/releases/download/0.3.1-draft.32/riscv-interrupts-032.pdf)

Based on above, we update QEMU RISC-V to:
1) Have separate config options for Smaia and Ssaia extensions
   which replace RISCV_FEATURE_AIA in CPU features
2) Not generate AIA INTC compatible string in virt machine

Signed-off-by: Anup Patel 
Reviewed-by: Andrew Jones 
---
Changes since v2:
 - Use env_archcpu() to get RISCVCPU * from CPURISCVState *
Changes since v1:
 - Remove redundant "has_aia" parameter from riscv_cpu_pending_to_irq()
---
 hw/intc/riscv_imsic.c |  4 +++-
 hw/riscv/virt.c   | 13 ++---
 target/riscv/cpu.c|  9 -
 target/riscv/cpu.h|  4 ++--
 target/riscv/cpu_helper.c |  3 ++-
 target/riscv/csr.c| 24 ++--
 6 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/hw/intc/riscv_imsic.c b/hw/intc/riscv_imsic.c
index 8615e4cc1d..4d4d5b50ca 100644
--- a/hw/intc/riscv_imsic.c
+++ b/hw/intc/riscv_imsic.c
@@ -344,9 +344,11 @@ static void riscv_imsic_realize(DeviceState *dev, Error 
**errp)
 
 /* Force select AIA feature and setup CSR read-modify-write callback */
 if (env) {
-riscv_set_feature(env, RISCV_FEATURE_AIA);
 if (!imsic->mmode) {
+rcpu->cfg.ext_ssaia = true;
 riscv_cpu_set_geilen(env, imsic->num_pages - 1);
+} else {
+rcpu->cfg.ext_smaia = true;
 }
 riscv_cpu_set_aia_ireg_rmw_fn(env, (imsic->mmode) ? PRV_M : PRV_S,
   riscv_imsic_rmw, imsic);
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index e779d399ae..b041b33afc 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -261,17 +261,8 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int 
socket,
 qemu_fdt_add_subnode(mc->fdt, intc_name);
 qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
 intc_phandles[cpu]);
-if (riscv_feature(>soc[socket].harts[cpu].env,
-  RISCV_FEATURE_AIA)) {
-static const char * const compat[2] = {
-"riscv,cpu-intc-aia", "riscv,cpu-intc"
-};
-qemu_fdt_setprop_string_array(mc->fdt, intc_name, "compatible",
-  (char **), ARRAY_SIZE(compat));
-} else {
-qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
-"riscv,cpu-intc");
-}
+qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
+"riscv,cpu-intc");
 qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
 qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
 
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index d3fbaa..3cf0c86661 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -101,6 +101,8 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zve64f, true, PRIV_VERSION_1_12_0, ext_zve64f),
 ISA_EXT_DATA_ENTRY(zhinx, true, PRIV_VERSION_1_12_0, ext_zhinx),
 ISA_EXT_DATA_ENTRY(zhinxmin, true, PRIV_VERSION_1_12_0, ext_zhinxmin),
+ISA_EXT_DATA_ENTRY(smaia, true, PRIV_VERSION_1_12_0, ext_smaia),
+ISA_EXT_DATA_ENTRY(ssaia, true, PRIV_VERSION_1_12_0, ext_ssaia),
 ISA_EXT_DATA_ENTRY(sscofpmf, true, PRIV_VERSION_1_12_0, ext_sscofpmf),
 ISA_EXT_DATA_ENTRY(sstc, true, PRIV_VERSION_1_12_0, ext_sstc),
 ISA_EXT_DATA_ENTRY(svinval, true, PRIV_VERSION_1_12_0, ext_svinval),
@@ -669,10 +671,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
 }
 }
 
-if (cpu->cfg.aia) {
-riscv_set_feature(env, RISCV_FEATURE_AIA);
-}
-
 if (cpu->cfg.debug) {
 riscv_set_feature(env, RISCV_FEATURE_DEBUG);
 }
@@ -1058,7 +1056,8 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
 /* ePMP 0.9.3 */
 DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
-DEFINE_PROP_BOOL("x-aia", RISCVCPU, cfg.aia, false),
+DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false),
+DEFINE_PROP_BOOL("x-ssaia", RISCVCPU, cfg.ext_ssaia, false),
 
 DEFINE_PROP_END_OF_LIST(),
 };
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 42edfa4558..15cad73def 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -85,7 +85,6 @@ enum {
 RISCV_FEATURE_PMP,
 RISCV_FEATURE_EPMP,
 RISCV_FEATURE_MISA,
-RISCV_FEATURE_AIA,
 RISCV_FEATURE_DEBUG
 };
 
@@ -452,6 +451,8 @@ struct RISCVCPUConfig {
 bool ext_zve64f;
 bool ext_zmmul;
 bool ext_sscofpmf;
+bool ext_smaia;
+bool ext_ssaia;
 bool rvv_ta_all_1s;
 bool rvv_ma_all_1s;
 
@@ -472,7 +473,6 @@ struct RISCVCPUConfig {
 bool mmu;
 bool pmp;

Re: [PATCH v2] target/riscv: Use official extension names for AIA CSRs

2022-08-19 Thread Anup Patel

On Fri, Aug 19, 2022 at 8:40 PM Richard Henderson
 wrote:
>
> On 8/19/22 00:31, Anup Patel wrote:
> >   static int aia_hmode(CPURISCVState *env, int csrno)
> >   {
> > -if (!riscv_feature(env, RISCV_FEATURE_AIA)) {
> > +CPUState *cs = env_cpu(env);
> > +RISCVCPU *cpu = RISCV_CPU(cs);
>
> Better as
>
>  RISCVCPU *cpu = env_archcpu(env);

Okay, I will update.

Thanks,
Anup

>
>
> r~

Re: [PATCH 3/6] target/arm: Implement ID_MMFR5

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

In Armv8.6 a new AArch32 ID register ID_MMFR5 is defined.
Implement this; we want to be able to use it to report to
the guest that we implement FEAT_ETS.

Signed-off-by: Peter Maydell
---
  target/arm/cpu.h| 1 +
  target/arm/helper.c | 4 ++--
  target/arm/kvm64.c  | 2 ++
  3 files changed, 5 i


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 2/6] target/arm: Sort KVM reads of AArch32 ID registers into encoding order

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

The code that reads the AArch32 ID registers from KVM in
kvm_arm_get_host_cpu_features() does so almost but not quite in
encoding order.  Move the read of ID_PFR2 down so it's really in
encoding order.

Signed-off-by: Peter Maydell
---
  target/arm/kvm64.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 1/6] target/arm: Make cpregs 0, c0, c{3-15}, {0-7} correctly RAZ in v8

2022-08-19 Thread Richard Henderson


On 8/19/22 04:00, Peter Maydell wrote:

In the AArch32 ID register scheme, coprocessor registers with
encoding cp15, 0, c0, c{0-7}, {0-7} are all in the space covered by
what in v6 and v7 was called the "CPUID scheme", and are supposed to
RAZ if they're not allocated to a specific ID register.  For our
pre-v8 CPUs we get this right, because the regdefs in
id_pre_v8_midr_cp_reginfo[] cover these RAZ requirements.  However
for v8 we failed to put in the necessary patterns to cover this, so
we end up UNDEFing on everything we didn't have an ID register for.
This is a problem because in Armv8 some encodings in 0, c0, c3, {0-7}
are now being used for new ID registers, and guests might thus start
trying to read them.  (We already have one of these: ID_PFR2.)

For v8 CPUs, we already have regdefs for 0, c0, c{0-2}, {0-7} (that
is, the space is completely allocated with no reserved spaces).  Add
entries to v8_idregs[] covering 0, c0, c3, {0-7}:
  * c3, {0-2} is the reserved AArch32 space corresponding to the
AArch64 MVFR[012]_EL1
  * c3, {3,5,6,7} are reserved RAZ for both AArch32 and AArch64
(in fact some of these are given defined meanings in Armv8.6,
but we don't implement them yet)
  * c3, 4 is ID_PFR2 (already defined)

We then programmatically add RAZ patterns for AArch32 for
0, c0, c{4..15}, {0-7}:
  * c4-c7 are unused, and not shared with AArch64 (these
are the encodings corresponding to where the AArch64
specific ID registers live in the system register space)
  * c8-c15 weren't required to RAZ in v6/v7, but v8 extends
the AArch32 reserved-should-RAZ space to cover these;
the equivalent area of the AArch64 sysreg space is not
defined as must-RAZ

Note that the architecture allows some registers in this space
to return an UNKNOWN value; we always return 0.

Signed-off-by: Peter Maydell
---
  target/arm/helper.c | 65 +
  1 file changed, 60 insertions(+), 5 deletions(-)


This is the thing at the top of H.a page G7-8990, yeah?

Reviewed-by: Richard Henderson 


r~

Re: [PATCH v2] scsi-disk: support setting CD-ROM block size via device options

2022-08-19 Thread John Millikin

Gentle ping

On Thu, Aug 04, 2022 at 09:29:51PM +0900, John Millikin wrote:
> SunOS expects CD-ROM devices to have a block size of 512, and will
> fail to mount or install using QEMU's default block size of 2048.
> 
> When initializing the SCSI device, allow the `physical_block_size'
> block device option to override the default block size.
> 
> Signed-off-by: John Millikin 
> ---
>  hw/scsi/scsi-disk.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> This is the same diff as sent to qemu-devel@ about a week ago. That
> first email seems to have been eaten by a grue, but replying to it
> worked, so maybe the grue is gone now.
> 
> See https://gitlab.com/qemu-project/qemu/-/issues/1127 for some
> related discussion about SunOS CD-ROM compatibility.
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index f5cdb9ad4b..acdf8dc05c 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2533,6 +2533,7 @@ static void scsi_cd_realize(SCSIDevice *dev, Error 
> **errp)
>  SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
>  AioContext *ctx;
>  int ret;
> +uint32_t blocksize = 2048;
>  
>  if (!dev->conf.blk) {
>  /* Anonymous BlockBackend for an empty drive. As we put it into
> @@ -2542,9 +2543,13 @@ static void scsi_cd_realize(SCSIDevice *dev, Error 
> **errp)
>  assert(ret == 0);
>  }
>  
> +if (dev->conf.physical_block_size != 0) {
> +blocksize = dev->conf.physical_block_size;
> +}
> +
>  ctx = blk_get_aio_context(dev->conf.blk);
>  aio_context_acquire(ctx);
> -s->qdev.blocksize = 2048;
> +s->qdev.blocksize = blocksize;
>  s->qdev.type = TYPE_ROM;
>  s->features |= 1 << SCSI_DISK_F_REMOVABLE;
>  if (!s->product) {
> -- 
> 2.25.1
>

[PATCH v2 2/2] scsi: Reject commands if the CDB length exceeds buf_len

2022-08-19 Thread John Millikin

In scsi_req_parse_cdb(), if the CDB length implied by the command type
exceeds the initialized portion of the command buffer, reject the request.

Rejected requests are recorded by the `scsi_req_parse_bad` trace event.

On example of a bug detected by this check is SunOS's use of interleaved
DMA and non-DMA commands. This guest behavior currently causes QEMU to
parse uninitialized memory as a SCSI command, with unpredictable
outcomes.

With the new check in place:

  * QEMU consistently creates a trace event and rejects the request.

  * SunOS retries the request(s) and is able to successfully boot from
disk.

Signed-off-by: John Millikin 
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1127
---
 hw/scsi/scsi-bus.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index b35fde0a30..abe195b22a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -712,6 +712,8 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 SCSICommand cmd = { .len = 0 };
 int ret;
 
+assert(buf_len > 0);
+
 if ((d->unit_attention.key == UNIT_ATTENTION ||
  bus->unit_attention.key == UNIT_ATTENTION) &&
 (buf[0] != INQUIRY &&
@@ -1316,7 +1318,7 @@ int scsi_req_parse_cdb(SCSIDevice *dev, SCSICommand *cmd, 
uint8_t *buf,
 
 cmd->lba = -1;
 len = scsi_cdb_length(buf);
-if (len < 0) {
+if (len < 0 || len > buf_len) {
 return -1;
 }
 
-- 
2.25.1

[PATCH v2 1/2] scsi: Add buf_len parameter to scsi_req_new()

2022-08-19 Thread John Millikin

When a SCSI command is received from the guest, the CDB length implied
by the first byte might exceed the number of bytes the guest sent. In
this case scsi_req_new() will read uninitialized data, causing
unpredictable behavior.

Adds the buf_len parameter to scsi_req_new() and plumbs it through the
call stack.

Signed-off-by: John Millikin 
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1127
---
 hw/scsi/esp.c  |  2 +-
 hw/scsi/lsi53c895a.c   |  2 +-
 hw/scsi/megasas.c  | 12 +++-
 hw/scsi/mptsas.c   |  3 ++-
 hw/scsi/scsi-bus.c | 21 +
 hw/scsi/scsi-disk.c|  7 ---
 hw/scsi/scsi-generic.c |  5 +++--
 hw/scsi/spapr_vscsi.c  |  3 ++-
 hw/scsi/virtio-scsi.c  |  5 +++--
 hw/scsi/vmw_pvscsi.c   |  3 ++-
 hw/usb/dev-storage.c   |  3 ++-
 hw/usb/dev-uas.c   |  5 +++--
 include/hw/scsi/scsi.h | 11 ++-
 13 files changed, 49 insertions(+), 33 deletions(-)

Changes from v1:
* Applied Paolo's suggestions for proper CDB sizes instead of a TODO.
  https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg02928.html

v1: https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg02509.html

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 2d3c649567..19fafad2a3 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -292,7 +292,7 @@ static void do_command_phase(ESPState *s)
 esp_fifo_pop_buf(>cmdfifo, buf, cmdlen);
 
 current_lun = scsi_device_find(>bus, 0, s->current_dev->id, s->lun);
-s->current_req = scsi_req_new(current_lun, 0, s->lun, buf, s);
+s->current_req = scsi_req_new(current_lun, 0, s->lun, buf, cmdlen, s);
 datalen = scsi_req_enqueue(s->current_req);
 s->ti_size = datalen;
 fifo8_reset(>cmdfifo);
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index ad5f5e5f39..05a43ec807 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -864,7 +864,7 @@ static void lsi_do_command(LSIState *s)
 s->current = g_new0(lsi_request, 1);
 s->current->tag = s->select_tag;
 s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun, buf,
-   s->current);
+   s->dbc, s->current);
 
 n = scsi_req_enqueue(s->current->req);
 if (n) {
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index d5dfb412ba..04d48b9cb8 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1062,7 +1062,8 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, 
int lun,
 info->inquiry_data[0] = 0x7f; /* Force PQual 0x3, PType 0x1f */
 info->vpd_page83[0] = 0x7f;
 megasas_setup_inquiry(cmdbuf, 0, sizeof(info->inquiry_data));
-cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf,
+sizeof(cmdbuf), cmd);
 if (!cmd->req) {
 trace_megasas_dcmd_req_alloc_failed(cmd->index,
 "PD get info std inquiry");
@@ -1080,7 +1081,8 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, 
int lun,
 return MFI_STAT_INVALID_STATUS;
 } else if (info->inquiry_data[0] != 0x7f && info->vpd_page83[0] == 0x7f) {
 megasas_setup_inquiry(cmdbuf, 0x83, sizeof(info->vpd_page83));
-cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf,
+sizeof(cmdbuf), cmd);
 if (!cmd->req) {
 trace_megasas_dcmd_req_alloc_failed(cmd->index,
 "PD get info vpd inquiry");
@@ -1268,7 +1270,7 @@ static int megasas_ld_get_info_submit(SCSIDevice *sdev, 
int lun,
 cmd->iov_buf = g_malloc0(dcmd_size);
 info = cmd->iov_buf;
 megasas_setup_inquiry(cdb, 0x83, sizeof(info->vpd_page83));
-cmd->req = scsi_req_new(sdev, cmd->index, lun, cdb, cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun, cdb, sizeof(cdb), cmd);
 if (!cmd->req) {
 trace_megasas_dcmd_req_alloc_failed(cmd->index,
 "LD get info vpd inquiry");
@@ -1748,7 +1750,7 @@ static int megasas_handle_scsi(MegasasState *s, 
MegasasCmd *cmd,
 return MFI_STAT_SCSI_DONE_WITH_ERROR;
 }
 
-cmd->req = scsi_req_new(sdev, cmd->index, lun_id, cdb, cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun_id, cdb, cdb_len, cmd);
 if (!cmd->req) {
 trace_megasas_scsi_req_alloc_failed(
 mfi_frame_desc(frame_cmd), target_id, lun_id);
@@ -1823,7 +1825,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd 
*cmd, int frame_cmd)
 
 megasas_encode_lba(cdb, lba_start, lba_count, is_write);
 cmd->req = scsi_req_new(sdev, cmd->index,
-lun_id, cdb, cmd);
+lun_id, cdb, cdb_len, cmd);
 if (!cmd->req) {
 trace_megasas_scsi_req_alloc_failed(

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-19 Thread Kirill A. Shutemov

On Thu, Aug 18, 2022 at 08:00:41PM -0700, Hugh Dickins wrote:
> On Thu, 18 Aug 2022, Kirill A . Shutemov wrote:
> > On Wed, Aug 17, 2022 at 10:40:12PM -0700, Hugh Dickins wrote:
> > > 
> > > If your memory could be swapped, that would be enough of a good reason
> > > to make use of shmem.c: but it cannot be swapped; and although there
> > > are some references in the mailthreads to it perhaps being swappable
> > > in future, I get the impression that will not happen soon if ever.
> > > 
> > > If your memory could be migrated, that would be some reason to use
> > > filesystem page cache (because page migration happens to understand
> > > that type of memory): but it cannot be migrated.
> > 
> > Migration support is in pipeline. It is part of TDX 1.5 [1]. And swapping
> > theoretically possible, but I'm not aware of any plans as of now.
> > 
> > [1] 
> > https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
> 
> I always forget, migration means different things to different audiences.
> As an mm person, I was meaning page migration, whereas a virtualization
> person thinks VM live migration (which that reference appears to be about),
> a scheduler person task migration, an ornithologist bird migration, etc.
> 
> But you're an mm person too: you may have cited that reference in the
> knowledge that TDX 1.5 Live Migration will entail page migration of the
> kind I'm thinking of.  (Anyway, it's not important to clarify that here.)

TDX 1.5 brings both.

In TDX speak, mm migration called relocation. See TDH.MEM.PAGE.RELOCATE.

> > > Some of these impressions may come from earlier iterations of the
> > > patchset (v7 looks better in several ways than v5).  I am probably
> > > underestimating the extent to which you have taken on board other
> > > usages beyond TDX and SEV private memory, and rightly want to serve
> > > them all with similar interfaces: perhaps there is enough justification
> > > for shmem there, but I don't see it.  There was mention of userfaultfd
> > > in one link: does that provide the justification for using shmem?
> > > 
> > > I'm afraid of the special demands you may make of memory allocation
> > > later on - surprised that huge pages are not mentioned already;
> > > gigantic contiguous extents? secretmem removed from direct map?
> > 
> > The design allows for extension to hugetlbfs if needed. Combination of
> > MFD_INACCESSIBLE | MFD_HUGETLB should route this way. There should be zero
> > implications for shmem. It is going to be separate struct 
> > memfile_backing_store.
> 
> Last year's MFD_HUGEPAGE proposal would have allowed you to do it with
> memfd via tmpfs without needing to involve hugetlbfs; but you may prefer
> the determinism of hugetlbfs, relying on /proc/sys/vm/nr_hugepages etc.
> 
> But I've yet to see why you want to involve this or that filesystem
> (with all its filesystem-icity suppressed) at all.  The backing store
> is host memory, and tmpfs and hugetlbfs just impose their own
> idiosyncrasies on how that memory is allocated; but I think you would
> do better to choose your own idiosyncrasies in allocation directly -
> you don't need a different "backing store" to choose between 4k or 2M
> or 1G or whatever allocations.

These idiosyncrasies are well known: user who used hugetlbfs may want to
get direct replacement that would tap into the same hugetlb reserves and
get the same allocation guarantees. Admins know where to look if ENOMEM
happens.

For THP, admin may know how to tweak allocation/defrag policy for his
liking and how to track if they are allocated.

> tmpfs and hugetlbfs and page cache are designed around sharing memory:
> TDX is designed around absolutely not sharing memory; and the further
> uses which Sean foresees appear not to need it as page cache either.
> 
> Except perhaps for page migration reasons.  It's somewhat incidental,  
> but of course page migration knows how to migrate page cache, so
> masquerading as page cache will give a short cut to page migration,
> when page migration becomes at all possible.
> 
> > 
> > I'm not sure secretmem is a fit here as we want to extend MFD_INACCESSIBLE
> > to be movable if platform supports it and secretmem is not migratable by
> > design (without direct mapping fragmentations).
> > 
> > > Here's what I would prefer, and imagine much easier for you to maintain;
> > > but I'm no system designer, and may be misunderstanding throughout.
> > > 
> > > QEMU gets fd from opening /dev/kvm_something, uses ioctls (or perhaps
> > > the fallocate syscall interface itself) to allocate and free the memory,
> > > ioctl for initializing some of it too.  KVM in control of whether that
> > > fd can be read or written or mmap'ed or whatever, no need to prevent it
> > > in shmem.c, no need for flags, seals, notifications to and fro because
> > > KVM is already in control and knows the history.  If shmem actually has
> > > value, call into it underneath - somewhat like SysV

Re: [PATCH v3] xio3130_upstream: Add ACS (Access Control Services) capability

2022-08-19 Thread Paul Schlacter

ping

On Thu, Aug 18, 2022 at 10:25 PM Paul Schlacter  wrote:
>
> If it is a pcie device, check that all devices on the path from
> the device to the root complex have ACS enabled, and then the
> device will become an iommu_group.
>
> pci_acs_path_enabled, this function in the Linux kernel, it means
> that if the device is a PCIe device, check the path from the
> device to the root complex. If ACS is all enabled, the device will
> become an iommu_group.
>
> acs determine whether it is a separate iommu_group.
>
> Signed-off-by: wlfightup 
> —
> v3:
> - Suggested by Michael S. Tsirkin, use x-disable-acs, and set the
> default value to true, Compatible with previous defaults
>
> v2:
> - Allow ACS to be disabled.
> - Suggested by Michael S. Tsirkin, use disable-acs to set property.
>
> hw/pci-bridge/xio3130_upstream.c | 13 +
> 1 file changed, 13 insertions(+)
>
> diff --git a/hw/pci-bridge/xio3130_upstream.c 
> b/hw/pci-bridge/xio3130_upstream.c
> index 5ff46ef050..f918113d76 100644
> --- a/hw/pci-bridge/xio3130_upstream.c
> +++ b/hw/pci-bridge/xio3130_upstream.c
> @@ -24,6 +24,7 @@
> #include "hw/pci/msi.h"
> #include "hw/pci/pcie.h"
> #include "hw/pci/pcie_port.h"
> +#include "hw/qdev-properties.h"
> #include "migration/vmstate.h"
> #include "qemu/module.h"
>
> @@ -37,6 +38,8 @@
> #define XIO3130_SSVID_SSID  0
> #define XIO3130_EXP_OFFSET  0x90
> #define XIO3130_AER_OFFSET  0x100
> +#define XIO3130_ACS_OFFSET \
> +(XIO3130_AER_OFFSET + PCI_ERR_SIZEOF)
>
> static void xio3130_upstream_write_config(PCIDevice *d, uint32_t address,
> uint32_t val, int len)
> @@ -57,6 +60,7 @@ static void xio3130_upstream_reset(DeviceState *qdev)
> static void xio3130_upstream_realize(PCIDevice *d, Error **errp)
> {
>   PCIEPort *p = PCIE_PORT(d);
> +PCIESlot *s = PCIE_SLOT(d);
>   int rc;
>
>   pci_bridge_initfn(d, TYPE_PCIE_BUS);
> @@ -92,6 +96,9 @@ static void xio3130_upstream_realize(PCIDevice *d, Error 
> **errp)
>   goto err;
>   }
>
> +if (!s->disable_acs) {
> +pcie_acs_init(d, XIO3130_ACS_OFFSET);
> +}
>   return;
>
> err:
> @@ -110,6 +117,11 @@ static void xio3130_upstream_exitfn(PCIDevice *d)
>   pci_bridge_exitfn(d);
> }
>
> +static Property xio3130_upstream_props[] = {
> +DEFINE_PROP_BOOL("x-disable-acs", PCIESlot, disable_acs, true),
> +DEFINE_PROP_END_OF_LIST()
> +};
> +
> static const VMStateDescription vmstate_xio3130_upstream = {
>   .name = "xio3130-express-upstream-port",
>   .priority = MIG_PRI_PCI_BUS,
> @@ -139,6 +151,7 @@ static void xio3130_upstream_class_init(ObjectClass 
> *klass, void *data)
>   dc->desc = "TI X3130 Upstream Port of PCI Express Switch";
>   dc->reset = xio3130_upstream_reset;
>   dc->vmsd = _xio3130_upstream;
> +device_class_set_props(dc, xio3130_upstream_props);
> }
>
> static const TypeInfo xio3130_upstream_info = {
> --
> 2.24.3 (Apple Git-128)
>
>

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-19 Thread Sean Christopherson

On Thu, Aug 18, 2022, Hugh Dickins wrote:
> On Fri, 19 Aug 2022, Sean Christopherson wrote:
> > On Thu, Aug 18, 2022, Kirill A . Shutemov wrote:
> > > On Wed, Aug 17, 2022 at 10:40:12PM -0700, Hugh Dickins wrote:
> > > > If your memory could be migrated, that would be some reason to use
> > > > filesystem page cache (because page migration happens to understand
> > > > that type of memory): but it cannot be migrated.
> > > 
> > > Migration support is in pipeline. It is part of TDX 1.5 [1]. 
> > 
> > And this isn't intended for just TDX (or SNP, or pKVM).  We're not _that_ 
> > far off
> > from being able to use UPM for "regular" VMs as a way to provide 
> > defense-in-depth
> 
> UPM? That's an acronym from your side of the fence, I spy references to
> it in the mail threads, but haven't tracked down a definition.  I'll
> just take it to mean the fd-based memory we're discussing.

Ya, sorry, UPM is what we came up with as shorthand for "Unmapping guest Private
Memory".  Your assumption is spot on, it's just a fancy way of saying "guest is
backed with inaccessible fd-based memory".

> > without having to take on the overhead of confidential VMs.  At that point,
> > migration and probably even swap are on the table.
> 
> Good, the more "flexible" that memory is, the better for competing users
> of memory.  But an fd supplied by KVM gives you freedom to change to a
> better implementation of allocation underneath, whenever it suits you.
> Maybe shmem beneath is good from the start, maybe not.

The main flaw with KVM providing the fd is that it forces KVM to get into the
memory management business, which us KVM folks really, really do not want to do.
And based on the types of bugs KVM has had in the past related to memory 
management,
it's a safe bet to say the mm folks don't want us getting involved either :-)

The combination of gup()/follow_pte() and mmu_notifiers has worked very well.
KVM gets a set of (relatively) simple rules to follow and doesn't have to be 
taught
new things every time a new backing type comes along.  And from the other side, 
KVM
has very rarely had to go poke into other subsystems' code to support exposing a
new type of memory to guests.

What we're trying to do with UPM/fd-based memory is establish a similar contract
between mm and KVM, but without requiring mm to also map memory into host 
userspace.

The only way having KVM provide the fd works out in the long run is if KVM is 
the
only subsystem that ever wants to make use of memory that isn't accessible from
userspace and isn't tied to a specific backing type, _and_ if the set of backing
types that KVM ever supports is kept to an absolute minimum.

Re: [PATCH for-7.2 v3 20/20] hmp, device_tree.c: add 'info fdt ' support

2022-08-19 Thread Daniel Henrique Barboza





On 8/17/22 22:34, David Gibson wrote:

On Tue, Aug 16, 2022 at 02:34:28PM -0300, Daniel Henrique Barboza wrote:

'info fdt' is only able to print full nodes so far. It would be good to
be able to also print single properties, since ometimes we just want
to verify a single value from the FDT.

libfdt does not have support to find a property given its full path, but
it does have a way to return a fdt_property given a prop name and its
subnode.

Add a new optional 'propname' parameter to x-query-fdt to specify the
property of a given node. If it's present, we'll proceed to find the
node as usual but, instead of printing the node, we'll attempt to find
the property and print it standalone.

After this change, if an user wants to print just the value of 'cpu' inside
/cpu/cpu-map(...) from an ARM FDT, we can do it:

(qemu) info fdt /cpus/cpu-map/socket0/cluster0/core0 cpu
/cpus/cpu-map/socket0/cluster0/core0/cpu = <0x8001>

Or the 'ibm,my-dma-window' from the v-scsi device inside the pSeries
FDT:

(qemu) info fdt /vdevice/v-scsi@7103 ibm,my-dma-window
/vdevice/v-scsi@7103/ibm,my-dma-window = <0x7103 0x0 0x0 0x0 0x1000>


nit: dts property definitions also include a terminating ';'
prop = "foobar";



I forgot to update the commit msg. The code is already putting
the semi-colon at the end:

(qemu) info fdt /vdevice/v-scsi@7103 ibm,my-dma-window
/vdevice/v-scsi@7103/ibm,my-dma-window = <0x7103 0x0 0x0 0x0 
0x1000>;
(qemu)


I'll update it in v4.

Thanks,

Daniel




Cc: Dr. David Alan Gilbert 
Acked-by: Dr. David Alan Gilbert 
Signed-off-by: Daniel Henrique Barboza 
---
  hmp-commands-info.hx |  9 +
  include/sysemu/device_tree.h |  2 ++
  monitor/hmp-cmds.c   |  5 -
  monitor/qmp-cmds.c   |  8 +---
  qapi/machine.json|  4 +++-
  softmmu/device_tree.c| 29 -
  6 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 743b48865d..17d6ee4d30 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -924,13 +924,14 @@ ERST
  
  {

  .name   = "fdt",
-.args_type  = "nodepath:s",
-.params = "nodepath",
-.help   = "show firmware device tree node given its path",
+.args_type  = "nodepath:s,propname:s?",
+.params = "nodepath [propname]",
+.help   = "show firmware device tree node or property given its 
path",
  .cmd= hmp_info_fdt,
  },
  
  SRST

``info fdt``
-Show a firmware device tree node given its path. Requires libfdt.
+Show a firmware device tree node or property given its path.
+Requires libfdt.
  ERST
diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index 057d13e397..551a02dee2 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -140,6 +140,8 @@ int qemu_fdt_add_path(void *fdt, const char *path);
  void qemu_fdt_dumpdtb(void *fdt, int size);
  void qemu_fdt_qmp_dumpdtb(const char *filename, Error **errp);
  HumanReadableText *qemu_fdt_qmp_query_fdt(const char *nodepath,
+  bool has_propname,
+  const char *propname,
Error **errp);
  
  /**

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index accde90380..df8493adc5 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -2488,8 +2488,11 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
  void hmp_info_fdt(Monitor *mon, const QDict *qdict)
  {
  const char *nodepath = qdict_get_str(qdict, "nodepath");
+const char *propname = qdict_get_try_str(qdict, "propname");
  Error *err = NULL;
-g_autoptr(HumanReadableText) info = qmp_x_query_fdt(nodepath, );
+g_autoptr(HumanReadableText) info = NULL;
+
+info = qmp_x_query_fdt(nodepath, propname != NULL, propname, );
  
  if (hmp_handle_error(mon, err)) {

  return;
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index db2c6aa7da..ca2a96cdf7 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -604,9 +604,10 @@ void qmp_dumpdtb(const char *filename, Error **errp)
  return qemu_fdt_qmp_dumpdtb(filename, errp);
  }
  
-HumanReadableText *qmp_x_query_fdt(const char *nodepath, Error **errp)

+HumanReadableText *qmp_x_query_fdt(const char *nodepath, bool has_propname,
+   const char *propname, Error **errp)
  {
-return qemu_fdt_qmp_query_fdt(nodepath, errp);
+return qemu_fdt_qmp_query_fdt(nodepath, has_propname, propname, errp);
  }
  #else
  void qmp_dumpdtb(const char *filename, Error **errp)
@@ -614,7 +615,8 @@ void qmp_dumpdtb(const char *filename, Error **errp)
  error_setg(errp, "dumpdtb requires libfdt");
  }
  
-HumanReadableText *qmp_x_query_fdt(const char *nodepath, Error **errp)

+HumanReadableText *qmp_x_query_fdt(const char

Re: [PATCH] configure: improve error for ucontext coroutine backend

2022-08-19 Thread Richard Henderson


On 8/19/22 10:02, Paolo Bonzini wrote:

Instead of using feature_not_found(), which is not a good match because
there is no "remedy" to fix the lack of makecontext(), just print a
custom error.

This happens to remove the last use of feature_not_found(), so remove
the definition and the documentation.

Signed-off-by: Paolo Bonzini
---
  configure   | 11 +--
  docs/devel/build-system.rst |  5 -
  2 files changed, 1 insertion(+), 15 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v7 11/14] KVM: Register/unregister the guest private memory regions

2022-08-19 Thread Vishal Annapurve

> ...
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 230c8ff9659c..bb714c2a4b06 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -914,6 +914,35 @@ static int kvm_init_mmu_notifier(struct kvm *kvm)
>
>  #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */
>
> +#ifdef CONFIG_HAVE_KVM_PRIVATE_MEM
> +#define KVM_MEM_ATTR_PRIVATE   0x0001
> +static int kvm_vm_ioctl_set_encrypted_region(struct kvm *kvm, unsigned int 
> ioctl,
> +struct kvm_enc_region *region)
> +{
> +   unsigned long start, end;
> +   void *entry;
> +   int r;
> +
> +   if (region->size == 0 || region->addr + region->size < region->addr)
> +   return -EINVAL;
> +   if (region->addr & (PAGE_SIZE - 1) || region->size & (PAGE_SIZE - 1))
> +   return -EINVAL;
> +
> +   start = region->addr >> PAGE_SHIFT;
> +   end = (region->addr + region->size - 1) >> PAGE_SHIFT;
> +
> +   entry = ioctl == KVM_MEMORY_ENCRYPT_REG_REGION ?
> +   xa_mk_value(KVM_MEM_ATTR_PRIVATE) : NULL;
> +
> +   r = xa_err(xa_store_range(>mem_attr_array, start, end,
> +   entry, GFP_KERNEL_ACCOUNT));

xa_store_range seems to create multi-index entries by default.
Subsequent xa_store_range call changes all the entries stored
previously.
xa_store needs to be used here instead of xa_store_range to achieve
the intended behavior.

> +
> +   kvm_zap_gfn_range(kvm, start, end + 1);
> +
> +   return r;
> +}
> +#endif /* CONFIG_HAVE_KVM_PRIVATE_MEM */
> +
> ...

Re: [PULL for 7.1] Fix proberi instruction for qemu-hppa-user

2022-08-19 Thread Richard Henderson


On 8/19/22 07:19, Helge Deller wrote:

Peter & Richard,

would you please pull this one-line fix for target-hppa ?
It fixes a mistranslation of the "proberi" hppa assembler instruction
when run as linux-user. Without the fix many debian packages won't
build in a hppa-chroot.

There is no need to extend the release cycle of qemu-7.1 due to this
fix.

Thanks,
Helge

---

The following changes since commit c7208a6e0d049f9e8af15df908168a79b1f99685:

   Update version for v7.1.0-rc3 release (2022-08-16 20:45:19 -0500)

are available in the Git repository at:

   https://github.com/hdeller/qemu-hppa.git tags/for-7.1-hppa

for you to fetch changes up to 6fab0c182dabaca5b3d56e60a8de3122ce9afbea:

   target/hppa: Fix proberi instruction emulation for linux-user (2022-08-19 
15:59:14 +0200)


target/hppa: Fix proberi instruction emulation for linux-user


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/7.1 as 
appropriate.


r~





Helge Deller (1):
   target/hppa: Fix proberi instruction emulation for linux-user

  target/hppa/op_helper.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

[RFC PATCH] qemu-options: try and clarify preferred block semantics

2022-08-19 Thread Alex Bennée

Try to correct any confusion about QEMU's Byzantine disk options by
laying out the preferred "modern" options as-per:

 " (best:  -device + -blockdev,  2nd obsolete syntax: -device +
 -drive,  3rd obsolete syntax: -drive, 4th obsolete syntax: -hdNN)"

Signed-off-by: Alex Bennée 
Cc: qemu-bl...@nongnu.org
Cc: Kevin Wolf 
Cc: Hanna Reitz 
Cc: Daniel P. Berrange 
Cc: Thomas Huth 
---
 qemu-options.hx | 13 +
 1 file changed, 13 insertions(+)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3f23a42fa8..d57239d364 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1105,6 +1105,19 @@ DEFHEADING()
 
 DEFHEADING(Block device options:)
 
+SRST
+The QEMU block device handling options have a long history and
+have gone through several iterations as the feature set and complexity
+of the block layer have grown. Many online guides to QEMU often
+reference older and deprecated options which can lead to confusion.
+
+The recommended modern way to describe disks is to use combination of
+``-device`` to specify the hardware device and ``-blockdev`` to
+describe the backend. The device defines what the guest sees and the
+backend describes how QEMU handles the data.
+
+ERST
+
 DEF("fda", HAS_ARG, QEMU_OPTION_fda,
 "-fda/-fdb file  use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL)
 DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL)
-- 
2.30.2

Re: [PATCH 08/62] target/arm: Create GetPhysAddrResult

2022-08-19 Thread Richard Henderson


On 8/10/22 06:02, Alex Bennée wrote:


Richard Henderson  writes:


Combine 5 output pointer argument from get_phys_addr
into a single struct.  Adjust all callers.


This looks to be an improvement - I guess the real benefit is the
compiler isn't jamming so many closely aligned pointers on the stack
frame for all the return values?


Correct.  The number of parameters is also down to 6, which fits all in register arguments 
for most hosts, including x86_64.   And in turn, we need to copy fewer arguments down to 
the subroutines.



r~

[PATCH 5/5] vdpa: Allow MQ feture in SVQ

2022-08-19 Thread Eugenio Pérez

Finally enable SVQ with MQ feature.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index d5bbda37a1..ccf933de2f 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -92,6 +92,7 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |
 BIT_ULL(VIRTIO_NET_F_STATUS) |
 BIT_ULL(VIRTIO_NET_F_CTRL_VQ) |
+BIT_ULL(VIRTIO_NET_F_MQ) |
 BIT_ULL(VIRTIO_F_ANY_LAYOUT) |
 BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR) |
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
-- 
2.31.1

[PATCH 3/5] vdpa: validate MQ CVQ commands

2022-08-19 Thread Eugenio Pérez

So we are sure we can update the device model properly before sending to
the device.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 96fd3bc835..d5bbda37a1 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -484,6 +484,15 @@ static bool vhost_vdpa_net_cvq_validate_cmd(const void 
*out_buf, size_t len)
   __func__, ctrl.cmd);
 };
 break;
+case VIRTIO_NET_CTRL_MQ:
+switch (ctrl.cmd) {
+case VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET:
+return true;
+default:
+qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mq cmd %u\n",
+  __func__, ctrl.cmd);
+};
+break;
 default:
 qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid control class %u\n",
   __func__, ctrl.class);
-- 
2.31.1

[PATCH 2/5] vdpa: Add vhost_vdpa_net_load_mq

2022-08-19 Thread Eugenio Pérez

Same way as with the MAC, restore the expected number of queues at
device's start.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 1e0dbfcced..96fd3bc835 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -391,6 +391,35 @@ static int vhost_vdpa_net_load_mac(VhostVDPAState *s,
 return 0;
 }
 
+static int vhost_vdpa_net_load_mq(VhostVDPAState *s,
+  const VirtIONet *n)
+{
+uint64_t features = n->parent_obj.guest_features;
+ssize_t dev_written;
+void *cursor = s->cvq_cmd_out_buffer;
+if (!(features & BIT_ULL(VIRTIO_NET_F_MQ))) {
+return 0;
+}
+
+*(struct virtio_net_ctrl_hdr *)cursor = (struct virtio_net_ctrl_hdr) {
+.class = VIRTIO_NET_CTRL_MQ,
+.cmd = VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET,
+};
+cursor += sizeof(struct virtio_net_ctrl_hdr);
+*(struct virtio_net_ctrl_mq *)cursor = (struct virtio_net_ctrl_mq) {
+.virtqueue_pairs = cpu_to_le16(n->curr_queue_pairs),
+};
+cursor += sizeof(struct virtio_net_ctrl_mq);
+
+dev_written = vhost_vdpa_net_cvq_add(s, cursor - s->cvq_cmd_out_buffer,
+ sizeof(virtio_net_ctrl_ack));
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+
+return *((virtio_net_ctrl_ack *)s->cvq_cmd_in_buffer) != VIRTIO_NET_OK;
+}
+
 static int vhost_vdpa_net_load(NetClientState *nc)
 {
 VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -409,6 +438,10 @@ static int vhost_vdpa_net_load(NetClientState *nc)
 if (unlikely(r < 0)) {
 return r;
 }
+r = vhost_vdpa_net_load_mq(s, n);
+if (unlikely(r)) {
+return r;
+}
 
 return 0;
 }
-- 
2.31.1

Re: [PATCH v6 00/21] linux-user: Fix siginfo_t contents when jumping to non-readable pages

2022-08-19 Thread Vivian Wang

On 8/19/22 11:25, Richard Henderson wrote:
> Hi Ilya,
>
> After adding support for riscv (similar to s390x, in that we can
> find the total insn length from the first couple of bits, so, easy),
> I find that the test case doesn't work without all of the other
> changes for PROT_EXEC, including the translator_ld changes.
>
> Other changes from your v5:
>   - mprotect invalidates tbs.  The test case is riscv, with a
> 4-byte insn at offset 0xffe, which was chained to from the
> insn at offset 0xffa.  The fact that the 0xffe tb was not
> invalidated meant that we chained to it and re-executed
> without revalidating page protections.
>
>   - rewrote the test framework to be agnostic of page size, which
> reduces some of the repetition.  I ran into trouble with the
> riscv linker, which relaxed the segment such that .align+.org
> wasn't actually honored.  This new form doesn't require the
> test bytes to be aligned in the binary.
>
>
> r~
I've confirmed that this fixes #1155

Tested-by: Vivian Wang 

> Ilya Leoshkevich (4):
>   linux-user: Clear translations and tb_jmp_cache on mprotect()
>   accel/tcg: Introduce is_same_page()
>   target/s390x: Make translator stop before the end of a page
>   target/i386: Make translator stop before the end of a page
>
> Richard Henderson (17):
>   linux-user/arm: Mark the commpage executable
>   linux-user/hppa: Allocate page zero as a commpage
>   linux-user/x86_64: Allocate vsyscall page as a commpage
>   linux-user: Honor PT_GNU_STACK
>   tests/tcg/i386: Move smc_code2 to an executable section
>   accel/tcg: Properly implement get_page_addr_code for user-only
>   accel/tcg: Unlock mmap_lock after longjmp
>   accel/tcg: Make tb_htable_lookup static
>   accel/tcg: Move qemu_ram_addr_from_host_nofail to physmem.c
>   accel/tcg: Use probe_access_internal for softmmu
> get_page_addr_code_hostp
>   accel/tcg: Add nofault parameter to get_page_addr_code_hostp
>   accel/tcg: Raise PROT_EXEC exception early
>   accel/tcg: Remove translator_ldsw
>   accel/tcg: Add pc and host_pc params to gen_intermediate_code
>   accel/tcg: Add fast path for translator_ld*
>   target/riscv: Add MAX_INSN_LEN and insn_len
>   target/riscv: Make translator stop before the end of a page
>
>  include/elf.h |   1 +
>  include/exec/cpu-common.h |   1 +
>  include/exec/exec-all.h   |  87 ++
>  include/exec/translator.h |  96 +---
>  linux-user/arm/target_cpu.h   |   4 +-
>  linux-user/qemu.h |   1 +
>  accel/tcg/cpu-exec.c  | 134 ++--
>  accel/tcg/cputlb.c|  93 ++--
>  accel/tcg/plugin-gen.c|   4 +-
>  accel/tcg/translate-all.c |  29 +++---
>  accel/tcg/translator.c| 136 +---
>  accel/tcg/user-exec.c |  18 +++-
>  linux-user/elfload.c  |  82 +++--
>  linux-user/mmap.c |   8 ++
>  softmmu/physmem.c |  12 +++
>  target/alpha/translate.c  |   5 +-
>  target/arm/translate.c|   5 +-
>  target/avr/translate.c|   5 +-
>  target/cris/translate.c   |   5 +-
>  target/hexagon/translate.c|   6 +-
>  target/hppa/translate.c   |   5 +-
>  target/i386/tcg/translate.c   |  32 ++-
>  target/loongarch/translate.c  |   6 +-
>  target/m68k/translate.c   |   5 +-
>  target/microblaze/translate.c |   5 +-
>  target/mips/tcg/translate.c   |   5 +-
>  target/nios2/translate.c  |   5 +-
>  target/openrisc/translate.c   |   6 +-
>  target/ppc/translate.c|   5 +-
>  target/riscv/translate.c  |  32 +--
>  target/rx/translate.c |   5 +-
>  target/s390x/tcg/translate.c  |  20 +++--
>  target/sh4/translate.c|   5 +-
>  target/sparc/translate.c  |   5 +-
>  target/tricore/translate.c|   6 +-
>  target/xtensa/translate.c |   6 +-
>  tests/tcg/i386/test-i386.c|   2 +-
>  tests/tcg/riscv64/noexec.c|  79 +
>  tests/tcg/s390x/noexec.c  | 106 ++
>  tests/tcg/x86_64/noexec.c |  75 
>  tests/tcg/multiarch/noexec.c.inc  | 141 ++
>  tests/tcg/riscv64/Makefile.target |   1 +
>  tests/tcg/s390x/Makefile.target   |   1 +
>  tests/tcg/x86_64/Makefile.target  |   3 +-
>  44 files changed, 951 insertions(+), 342 deletions(-)
>  create mode 100644 tests/tcg/riscv64/noexec.c
>  create mode 100644 tests/tcg/s390x/noexec.c
>  create mode 100644 tests/tcg/x86_64/noexec.c
>  create mode 100644 tests/tcg/multiarch/noexec.c.inc
>

[PATCH 1/5] vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load

2022-08-19 Thread Eugenio Pérez

Since there may be many commands we need to issue to load the NIC
state, let's split them in individual functions

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 39 +--
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 97b658f412..1e0dbfcced 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -363,21 +363,10 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, 
size_t out_len,
 return vhost_svq_poll(svq);
 }
 
-static int vhost_vdpa_net_load(NetClientState *nc)
+static int vhost_vdpa_net_load_mac(VhostVDPAState *s,
+  const VirtIONet *n)
 {
-VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
-const struct vhost_vdpa *v = >vhost_vdpa;
-const VirtIONet *n;
-uint64_t features;
-
-assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
-
-if (!v->shadow_vqs_enabled) {
-return 0;
-}
-
-n = VIRTIO_NET(v->dev->vdev);
-features = n->parent_obj.guest_features;
+uint64_t features = n->parent_obj.guest_features;
 if (features & BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR)) {
 const struct virtio_net_ctrl_hdr ctrl = {
 .class = VIRTIO_NET_CTRL_MAC,
@@ -402,6 +391,28 @@ static int vhost_vdpa_net_load(NetClientState *nc)
 return 0;
 }
 
+static int vhost_vdpa_net_load(NetClientState *nc)
+{
+VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+struct vhost_vdpa *v = >vhost_vdpa;
+const VirtIONet *n;
+int r;
+
+assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+if (!v->shadow_vqs_enabled) {
+return 0;
+}
+
+n = VIRTIO_NET(v->dev->vdev);
+r = vhost_vdpa_net_load_mac(s, n);
+if (unlikely(r < 0)) {
+return r;
+}
+
+return 0;
+}
+
 static NetClientInfo net_vhost_vdpa_cvq_info = {
 .type = NET_CLIENT_DRIVER_VHOST_VDPA,
 .size = sizeof(VhostVDPAState),
-- 
2.31.1

[PATCH] configure: improve error for ucontext coroutine backend

2022-08-19 Thread Paolo Bonzini

Instead of using feature_not_found(), which is not a good match because
there is no "remedy" to fix the lack of makecontext(), just print a
custom error.

This happens to remove the last use of feature_not_found(), so remove
the definition and the documentation.

Signed-off-by: Paolo Bonzini 
---
 configure   | 11 +--
 docs/devel/build-system.rst |  5 -
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/configure b/configure
index 72ab03f11a..575dde1c1f 100755
--- a/configure
+++ b/configure
@@ -1468,15 +1468,6 @@ if test "$tcg" = "enabled"; then
 git_submodules="$git_submodules tests/fp/berkeley-softfloat-3"
 fi
 
-feature_not_found() {
-  feature=$1
-  remedy=$2
-
-  error_exit "User requested feature $feature" \
-  "configure was not able to find it." \
-  "$remedy"
-}
-
 # ---
 # big/little endian test
 cat > $TMPC << EOF
@@ -1639,7 +1630,7 @@ else
 ;;
   ucontext)
 if test "$ucontext_works" != "yes"; then
-  feature_not_found "ucontext"
+  error_exit "'ucontext' backend requested but makecontext not available"
 fi
 ;;
   sigaltstack)
diff --git a/docs/devel/build-system.rst b/docs/devel/build-system.rst
index 431caba7aa..1894721743 100644
--- a/docs/devel/build-system.rst
+++ b/docs/devel/build-system.rst
@@ -99,11 +99,6 @@ developers in checking for system features:
Write a minimal C program main() function to the temporary file
indicated by $TMPC
 
-``feature_not_found $NAME $REMEDY``
-   Print a message to stderr that the feature $NAME was not available
-   on the system, suggesting the user try $REMEDY to address the
-   problem.
-
 ``error_exit $MESSAGE $MORE...``
Print $MESSAGE to stderr, followed by $MORE... and then exit from the
configure script with non-zero status
-- 
2.37.1

Re: [PATCH for-7.2 v3 10/20] hw/ppc: set machine->fdt in spapr machine

2022-08-19 Thread Daniel Henrique Barboza





On 8/17/22 23:07, David Gibson wrote:

On Tue, Aug 16, 2022 at 02:34:18PM -0300, Daniel Henrique Barboza wrote:

The pSeries machine never bothered with the common machine->fdt
attribute. We do all the FDT related work using spapr->fdt_blob.

We're going to introduce HMP commands to read and save the FDT, which
will rely on setting machine->fdt properly to work across all machine
archs/types.

Let's set machine->fdt in the two places where we manipulate the FDT:
spapr_machine_reset() and CAS.


So, there's a third place where fdt_blob is updated, in h_update_dt();
that happens because SLOF can make some updates to the DT that qemu
needs to be aware of.  It's kinda ugly, and is a consequence of the
fact that qemu and SLOF kind of share the role of "platform firmware"
for spapr.

But.. it's worse than that.  Those are the only 3 places we actually
alter fdt_blob, but not the only places we logically update the device
tree.  Up until now there wasn't a way to introspect the DT, and so we
didn't bother keeping spapr->fdt_blob update.  Essentially, we
considered maintaining the DT image the job of the guest after CAS.

Specifically, every dynamic reconfiguration event (hotplug or unplug)
alters the device tree.  We generate an fdt fragment for the new
device then stream that as an update to the guest using the PAPR
specified interface (rtas_ibm_configure_connector).  As noted we
currently don't update qemu's global fdt image based on that.  On hot
unplug logically we need to revert those changes, which is actually
pretty tricky, but currently the guest's job.


Really, the trouble is that just dumping or viewing the dt is only
simple in an "embedded" style environment where the fdt is generate
then spit into the guest.  In an actual open firmware environment like
spapr, the DT is logically a dynamic thing maintained by firmware -
but because "firmware"'s responsibility is split between SLOF and
RTAS/qemu, keeping track of that is pretty nasty.  For an environment
like this, the flat tree format isn't really suited either - we'd want
a dynamic representation of the tree.  We get away with flat trees for
now (barely) only because we mostly delegate the responsibility for
managing the tree to SLOF and/or the OS kernel, both of which do use
non-flat representations of the tree.


Thanks for the explanation, but I'm not sure what to do with this patch now.
Should I amend the commit msg to reflect what you explained, mentioning that
we're missing a handful of places where the FDT is updated? We can then come
back at a later time and update ms->fdt in those places as well. This would
be a good follow-up to do after we get rid of spapr->fdt_blob and use ms->fdt
only.

Another alternative is to drop this patch and do spapr in separate later on.
I can live with that, but I'd rather have something for spapr even under the
disclaimer that "this might be not the most up to date FDT the guest is
using".


Thanks,


Daniel




spapr->fdt_blob is left untouched for now. To replace it with
machine->fdt, since we're migrating spapr->fdt_blob, we would need to
migrate machine->fdt as well. This is something that we would like to to
do keep our code simpler but it's a work we'll do another day.

Cc: Cédric Le Goater 
Cc: qemu-...@nongnu.org
Signed-off-by: Daniel Henrique Barboza 
---
  hw/ppc/spapr.c   | 6 ++
  hw/ppc/spapr_hcall.c | 8 
  2 files changed, 14 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc9ba6e6dc..7031cf964a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1713,6 +1713,12 @@ static void spapr_machine_reset(MachineState *machine)
  spapr->fdt_initial_size = spapr->fdt_size;
  spapr->fdt_blob = fdt;
  
+/*

+ * Set the common machine->fdt pointer to enable support
+ * for 'dumpdtb' and 'info fdt' QMP/HMP commands.
+ */
+machine->fdt = fdt;
+
  /* Set up the entry state */
  first_ppc_cpu->env.gpr[5] = 0;
  
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c

index a8d4a6bcf0..a53bfd76f4 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1256,6 +1256,14 @@ target_ulong do_client_architecture_support(PowerPCCPU 
*cpu,
  spapr->fdt_initial_size = spapr->fdt_size;
  spapr->fdt_blob = fdt;
  
+/*

+ * Set the machine->fdt pointer again since we just freed
+ * it above (by freeing spapr->fdt_blob). We set this
+ * pointer to enable support for 'dumpdtb' and 'info fdt'
+ * QMP/HMP commands.
+ */
+MACHINE(spapr)->fdt = fdt;
+
  return H_SUCCESS;
  }

[PATCH v9 12/12] vdpa: Delete CVQ migration blocker

2022-08-19 Thread Eugenio Pérez

We can restore the device state in the destination via CVQ now. Remove
the migration blocker.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 include/hw/virtio/vhost-vdpa.h |  1 -
 hw/virtio/vhost-vdpa.c | 15 ---
 net/vhost-vdpa.c   |  2 --
 3 files changed, 18 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index d10a89303e..d85643 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -35,7 +35,6 @@ typedef struct vhost_vdpa {
 bool shadow_vqs_enabled;
 /* IOVA mapping used by the Shadow Virtqueue */
 VhostIOVATree *iova_tree;
-Error *migration_blocker;
 GPtrArray *shadow_vqs;
 const VhostShadowVirtqueueOps *shadow_vq_ops;
 void *shadow_vq_ops_opaque;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 45d6e86b45..15cb87223e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1032,13 +1032,6 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
 return true;
 }
 
-if (v->migration_blocker) {
-int r = migrate_add_blocker(v->migration_blocker, );
-if (unlikely(r < 0)) {
-return false;
-}
-}
-
 for (i = 0; i < v->shadow_vqs->len; ++i) {
 VirtQueue *vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
 VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
@@ -1081,10 +1074,6 @@ err:
 vhost_svq_stop(svq);
 }
 
-if (v->migration_blocker) {
-migrate_del_blocker(v->migration_blocker);
-}
-
 return false;
 }
 
@@ -1100,10 +1089,6 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
 VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
 vhost_vdpa_svq_unmap_rings(dev, svq);
 }
-
-if (v->migration_blocker) {
-migrate_del_blocker(v->migration_blocker);
-}
 }
 
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 8fad31a5fd..97b658f412 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -555,8 +555,6 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 
 s->vhost_vdpa.shadow_vq_ops = _vdpa_net_svq_ops;
 s->vhost_vdpa.shadow_vq_ops_opaque = s;
-error_setg(>vhost_vdpa.migration_blocker,
-   "Migration disabled: vhost-vdpa uses CVQ.");
 }
 ret = vhost_vdpa_add(nc, (void *)>vhost_vdpa, queue_pair_index, nvqs);
 if (ret) {
-- 
2.31.1

[PATCH 4/5] virtio-net: Update virtio-net curr_queue_pairs in vdpa backends

2022-08-19 Thread Eugenio Pérez

It was returned as error before. Instead of it, simply update the
corresponding field so qemu can send it in the migration data.

Signed-off-by: Eugenio Pérez 
---
 hw/net/virtio-net.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index dd0d056fde..63a8332cd0 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1412,19 +1412,14 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t 
cmd,
 return VIRTIO_NET_ERR;
 }
 
-/* Avoid changing the number of queue_pairs for vdpa device in
- * userspace handler. A future fix is needed to handle the mq
- * change in userspace handler with vhost-vdpa. Let's disable
- * the mq handling from userspace for now and only allow get
- * done through the kernel. Ripples may be seen when falling
- * back to userspace, but without doing it qemu process would
- * crash on a recursive entry to virtio_net_set_status().
- */
+n->curr_queue_pairs = queue_pairs;
 if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
-return VIRTIO_NET_ERR;
+/*
+ * Avoid updating the backend for a vdpa device: We're only interested
+ * in updating the device model queues.
+ */
+return VIRTIO_NET_OK;
 }
-
-n->curr_queue_pairs = queue_pairs;
 /* stop the backend before changing the number of queue_pairs to avoid 
handling a
  * disabled queue */
 virtio_net_set_status(vdev, vdev->status);
-- 
2.31.1

[PATCH v9 07/12] vdpa: add net_vhost_vdpa_cvq_info NetClientInfo

2022-08-19 Thread Eugenio Pérez

Next patches will add a new info callback to restore NIC status through
CVQ. Since only the CVQ vhost device is needed, create it with a new
NetClientInfo.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
v5: Create a new NetClientInfo instead of reusing the dataplane one.
---
 net/vhost-vdpa.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 303447a68e..2cc1e64daf 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -334,6 +334,16 @@ static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
 return true;
 }
 
+static NetClientInfo net_vhost_vdpa_cvq_info = {
+.type = NET_CLIENT_DRIVER_VHOST_VDPA,
+.size = sizeof(VhostVDPAState),
+.receive = vhost_vdpa_receive,
+.cleanup = vhost_vdpa_cleanup,
+.has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
+.has_ufo = vhost_vdpa_has_ufo,
+.check_peer_type = vhost_vdpa_check_peer_type,
+};
+
 /**
  * Do not forward commands not supported by SVQ. Otherwise, the device could
  * accept it and qemu would not know how to update the device model.
@@ -475,7 +485,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 nc = qemu_new_net_client(_vhost_vdpa_info, peer, device,
  name);
 } else {
-nc = qemu_new_net_control_client(_vhost_vdpa_info, peer,
+nc = qemu_new_net_control_client(_vhost_vdpa_cvq_info, peer,
  device, name);
 }
 snprintf(nc->info_str, sizeof(nc->info_str), TYPE_VHOST_VDPA);
-- 
2.31.1

[PATCH v9 09/12] vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail

2022-08-19 Thread Eugenio Pérez

So we can reuse it to inject state messages.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
--
v7:
* Remove double free error

v6:
* Do not assume in buffer sent to the device is sizeof(virtio_net_ctrl_ack)

v5:
* Do not use an artificial !NULL VirtQueueElement
* Use only out size instead of iovec dev_buffers for these functions.
---
 net/vhost-vdpa.c | 59 +++-
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 9ca065dff6..ebf76d1034 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -331,6 +331,38 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
 }
 }
 
+static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
+  size_t in_len)
+{
+/* Buffers for the device */
+const struct iovec out = {
+.iov_base = s->cvq_cmd_out_buffer,
+.iov_len = out_len,
+};
+const struct iovec in = {
+.iov_base = s->cvq_cmd_in_buffer,
+.iov_len = sizeof(virtio_net_ctrl_ack),
+};
+VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 0);
+int r;
+
+r = vhost_svq_add(svq, , 1, , 1, NULL);
+if (unlikely(r != 0)) {
+if (unlikely(r == -ENOSPC)) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
+  __func__);
+}
+return r;
+}
+
+/*
+ * We can poll here since we've had BQL from the time we sent the
+ * descriptor. Also, we need to take the answer before SVQ pulls by itself,
+ * when BQL is released
+ */
+return vhost_svq_poll(svq);
+}
+
 static NetClientInfo net_vhost_vdpa_cvq_info = {
 .type = NET_CLIENT_DRIVER_VHOST_VDPA,
 .size = sizeof(VhostVDPAState),
@@ -387,23 +419,18 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 void *opaque)
 {
 VhostVDPAState *s = opaque;
-size_t in_len, dev_written;
+size_t in_len;
 virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
 /* Out buffer sent to both the vdpa device and the device model */
 struct iovec out = {
 .iov_base = s->cvq_cmd_out_buffer,
 };
-/* In buffer sent to the device */
-const struct iovec dev_in = {
-.iov_base = s->cvq_cmd_in_buffer,
-.iov_len = sizeof(virtio_net_ctrl_ack),
-};
 /* in buffer used for device model */
 const struct iovec in = {
 .iov_base = ,
 .iov_len = sizeof(status),
 };
-int r = -EINVAL;
+ssize_t dev_written = -EINVAL;
 bool ok;
 
 out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
@@ -414,21 +441,11 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 goto out;
 }
 
-r = vhost_svq_add(svq, , 1, _in, 1, elem);
-if (unlikely(r != 0)) {
-if (unlikely(r == -ENOSPC)) {
-qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
-  __func__);
-}
+dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
+if (unlikely(dev_written < 0)) {
 goto out;
 }
 
-/*
- * We can poll here since we've had BQL from the time we sent the
- * descriptor. Also, we need to take the answer before SVQ pulls by itself,
- * when BQL is released
- */
-dev_written = vhost_svq_poll(svq);
 if (unlikely(dev_written < sizeof(status))) {
 error_report("Insufficient written data (%zu)", dev_written);
 goto out;
@@ -436,7 +453,7 @@ static int 
vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 
 memcpy(, s->cvq_cmd_in_buffer, sizeof(status));
 if (status != VIRTIO_NET_OK) {
-goto out;
+return VIRTIO_NET_ERR;
 }
 
 status = VIRTIO_NET_ERR;
@@ -453,7 +470,7 @@ out:
 }
 vhost_svq_push_elem(svq, elem, MIN(in_len, sizeof(status)));
 g_free(elem);
-return r;
+return dev_written < 0 ? dev_written : 0;
 }
 
 static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
-- 
2.31.1

Re: [PATCH v2 06/11] hw/ppc/pnv: Avoid dynamic stack allocation

2022-08-19 Thread Daniel Henrique Barboza





On 8/19/22 12:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: David Gibson 
Signed-off-by: Peter Maydell 
Reviewed-by: Peter Maydell 


Reviewed-by: Daniel Henrique Barboza 


Feel free to take it via target-arm.next.



Thanks,


Daniel


---
  hw/ppc/pnv.c   | 4 ++--
  hw/ppc/spapr.c | 8 
  hw/ppc/spapr_pci_nvlink2.c | 2 +-
  3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index d3f77c83672..dd4101e5b65 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -137,7 +137,7 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void 
*fdt)
  int smt_threads = CPU_CORE(pc)->nr_threads;
  CPUPPCState *env = >env;
  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-uint32_t servers_prop[smt_threads];
+g_autofree uint32_t *servers_prop = g_new(uint32_t, smt_threads);
  int i;
  uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
 0x, 0x};
@@ -240,7 +240,7 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void 
*fdt)
  servers_prop[i] = cpu_to_be32(pc->pir + i);
  }
  _FDT((fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
-   servers_prop, sizeof(servers_prop;
+   servers_prop, sizeof(*servers_prop) * smt_threads)));
  }
  
  static void pnv_dt_icp(PnvChip *chip, void *fdt, uint32_t pir,

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc9ba6e6dcf..28626efd479 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -177,8 +177,8 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
PowerPCCPU *cpu,
int smt_threads)
  {
  int i, ret = 0;
-uint32_t servers_prop[smt_threads];
-uint32_t gservers_prop[smt_threads * 2];
+g_autofree uint32_t *servers_prop = g_new(uint32_t, smt_threads);
+g_autofree uint32_t *gservers_prop = g_new(uint32_t, smt_threads * 2);
  int index = spapr_get_vcpu_id(cpu);
  
  if (cpu->compat_pvr) {

@@ -196,12 +196,12 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
PowerPCCPU *cpu,
  gservers_prop[i*2 + 1] = 0;
  }
  ret = fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
-  servers_prop, sizeof(servers_prop));
+  servers_prop, sizeof(*servers_prop) * smt_threads);
  if (ret < 0) {
  return ret;
  }
  ret = fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
-  gservers_prop, sizeof(gservers_prop));
+  gservers_prop, sizeof(*gservers_prop) * smt_threads * 2);
  
  return ret;

  }
diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c
index 63b476c8f72..2a8a11be1d6 100644
--- a/hw/ppc/spapr_pci_nvlink2.c
+++ b/hw/ppc/spapr_pci_nvlink2.c
@@ -397,7 +397,7 @@ void spapr_phb_nvgpu_populate_pcidev_dt(PCIDevice *dev, 
void *fdt, int offset,
  continue;
  }
  if (dev == nvslot->gpdev) {
-uint32_t npus[nvslot->linknum];
+g_autofree uint32_t *npus = g_new(uint32_t, nvslot->linknum);
  
  for (j = 0; j < nvslot->linknum; ++j) {

  PCIDevice *npdev = nvslot->links[j].npdev;

[PATCH 20/20] ppc4xx_sdram: Convert DDR SDRAM controller to new bank handling

2022-08-19 Thread BALATON Zoltan

Use the generic bank handling introduced in previous patch in the DDR
SDRAM controller too. This also fixes previously broken region unmap
due to sdram_ddr_unmap_bcr() ignoring container region so it crashed
with an assert when the guest tried to disable the controller.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc4xx_sdram.c | 98 ---
 1 file changed, 37 insertions(+), 61 deletions(-)

diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c
index b0bdfa5bc1..60c1bfcf46 100644
--- a/hw/ppc/ppc4xx_sdram.c
+++ b/hw/ppc/ppc4xx_sdram.c
@@ -137,6 +137,8 @@ static void sdram_bank_set_bcr(Ppc4xxSdramBank *bank, 
uint32_t bcr,
 
 /*/
 /* DDR SDRAM controller */
+#define SDRAM_DDR_BCR_MASK 0xFFDEE001
+
 static uint32_t sdram_ddr_bcr(hwaddr ram_base, hwaddr ram_size)
 {
 uint32_t bcr;
@@ -195,58 +197,6 @@ static hwaddr sdram_ddr_size(uint32_t bcr)
 return size;
 }
 
-static void sdram_ddr_set_bcr(Ppc4xxSdramDdrState *sdram, int i,
-  uint32_t bcr, int enabled)
-{
-if (sdram->bank[i].bcr & 1) {
-/* Unmap RAM */
-trace_ppc4xx_sdram_unmap(sdram_ddr_base(sdram->bank[i].bcr),
- sdram_ddr_size(sdram->bank[i].bcr));
-memory_region_del_subregion(get_system_memory(),
->bank[i].container);
-memory_region_del_subregion(>bank[i].container,
->bank[i].ram);
-object_unparent(OBJECT(>bank[i].container));
-}
-sdram->bank[i].bcr = bcr & 0xFFDEE001;
-if (enabled && (bcr & 1)) {
-trace_ppc4xx_sdram_map(sdram_ddr_base(bcr), sdram_ddr_size(bcr));
-memory_region_init(>bank[i].container, NULL, "sdram-container",
-   sdram_ddr_size(bcr));
-memory_region_add_subregion(>bank[i].container, 0,
->bank[i].ram);
-memory_region_add_subregion(get_system_memory(),
-sdram_ddr_base(bcr),
->bank[i].container);
-}
-}
-
-static void sdram_ddr_map_bcr(Ppc4xxSdramDdrState *sdram)
-{
-int i;
-
-for (i = 0; i < sdram->nbanks; i++) {
-if (sdram->bank[i].size != 0) {
-sdram_ddr_set_bcr(sdram, i, sdram_ddr_bcr(sdram->bank[i].base,
-  sdram->bank[i].size), 1);
-} else {
-sdram_ddr_set_bcr(sdram, i, 0, 0);
-}
-}
-}
-
-static void sdram_ddr_unmap_bcr(Ppc4xxSdramDdrState *sdram)
-{
-int i;
-
-for (i = 0; i < sdram->nbanks; i++) {
-trace_ppc4xx_sdram_unmap(sdram_ddr_base(sdram->bank[i].bcr),
- sdram_ddr_size(sdram->bank[i].bcr));
-memory_region_del_subregion(get_system_memory(),
->bank[i].ram);
-}
-}
-
 static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn)
 {
 Ppc4xxSdramDdrState *s = opaque;
@@ -317,6 +267,7 @@ static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn)
 static void sdram_ddr_dcr_write(void *opaque, int dcrn, uint32_t val)
 {
 Ppc4xxSdramDdrState *s = opaque;
+int i;
 
 switch (dcrn) {
 case SDRAM0_CFGADDR:
@@ -338,12 +289,24 @@ static void sdram_ddr_dcr_write(void *opaque, int dcrn, 
uint32_t val)
 if (!(s->cfg & 0x8000) && (val & 0x8000)) {
 trace_ppc4xx_sdram_enable("enable");
 /* validate all RAM mappings */
-sdram_ddr_map_bcr(s);
+for (i = 0; i < s->nbanks; i++) {
+if (s->bank[i].size) {
+sdram_bank_set_bcr(>bank[i], s->bank[i].bcr,
+   s->bank[i].base, s->bank[i].size,
+   1);
+}
+}
 s->status &= ~0x8000;
 } else if ((s->cfg & 0x8000) && !(val & 0x8000)) {
 trace_ppc4xx_sdram_enable("disable");
 /* invalidate all RAM mappings */
-sdram_ddr_unmap_bcr(s);
+for (i = 0; i < s->nbanks; i++) {
+if (s->bank[i].size) {
+sdram_bank_set_bcr(>bank[i], s->bank[i].bcr,
+   s->bank[i].base, s->bank[i].size,
+   0);
+}
+}
 s->status |= 0x8000;
 }
 if (!(s->cfg & 0x4000) && (val & 0x4000)) {
@@ -363,16 +326,16 @@ static void sdram_ddr_dcr_write(void *opaque, int dcrn, 
uint32_t val)
 s->pmit = (val & 0xF800) | 0x07C0;
 break;
 case 0x40: /* SDRAM_B0CR */
-sdram_ddr_set_bcr(s, 0, val, s->cfg & 0x8000);
-break;
 case 0x44: /*

[PATCH 17/20] ppc4xx_sdram: Use hwaddr for memory bank size

2022-08-19 Thread BALATON Zoltan

This resolves the target_ulong dependency that's clearly wrong and was
also noted in a fixme comment.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc4xx_sdram.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c
index 8683207179..69b670737f 100644
--- a/hw/ppc/ppc4xx_sdram.c
+++ b/hw/ppc/ppc4xx_sdram.c
@@ -34,7 +34,6 @@
 #include "qapi/error.h"
 #include "qemu/log.h"
 #include "exec/address-spaces.h" /* get_system_memory() */
-#include "exec/cpu-defs.h" /* target_ulong */
 #include "hw/irq.h"
 #include "hw/qdev-properties.h"
 #include "qapi/error.h"
@@ -122,11 +121,6 @@ static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
 
 /*/
 /* DDR SDRAM controller */
-/*
- * XXX: TOFIX: some patches have made this code become inconsistent:
- *  there are type inconsistencies, mixing hwaddr, target_ulong
- *  and uint32_t
- */
 static uint32_t sdram_ddr_bcr(hwaddr ram_base, hwaddr ram_size)
 {
 uint32_t bcr;
@@ -170,9 +164,9 @@ static inline hwaddr sdram_ddr_base(uint32_t bcr)
 return bcr & 0xFF80;
 }
 
-static target_ulong sdram_ddr_size(uint32_t bcr)
+static hwaddr sdram_ddr_size(uint32_t bcr)
 {
-target_ulong size;
+hwaddr size;
 int sh;
 
 sh = (bcr >> 17) & 0x7;
@@ -513,9 +507,9 @@ static inline hwaddr sdram_ddr2_base(uint32_t bcr)
 return (bcr & 0xffe0) << 2;
 }
 
-static uint64_t sdram_ddr2_size(uint32_t bcr)
+static hwaddr sdram_ddr2_size(uint32_t bcr)
 {
-uint64_t size;
+hwaddr size;
 int sh;
 
 sh = 1024 - ((bcr >> 6) & 0x3ff);
-- 
2.30.4

[PATCH v9 05/12] vhost_net: Add NetClientInfo start callback

2022-08-19 Thread Eugenio Pérez

This is used by the backend to perform actions before the device is
started.

In particular, vdpa net use it to map CVQ buffers to the device, so it
can send control commands using them.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
v9: Rename also in patch message
v8: Rename NetClientInfo prepare callback to start, so it aligns with
future "stop"
---
 include/net/net.h  | 2 ++
 hw/net/vhost_net.c | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 523136c7ac..ad9e80083a 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -44,6 +44,7 @@ typedef struct NICConf {
 
 typedef void (NetPoll)(NetClientState *, bool enable);
 typedef bool (NetCanReceive)(NetClientState *);
+typedef int (NetStart)(NetClientState *);
 typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
 typedef void (NetCleanup) (NetClientState *);
@@ -71,6 +72,7 @@ typedef struct NetClientInfo {
 NetReceive *receive_raw;
 NetReceiveIOV *receive_iov;
 NetCanReceive *can_receive;
+NetStart *start;
 NetCleanup *cleanup;
 LinkStatusChanged *link_status_changed;
 QueryRxFilter *query_rx_filter;
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index ccac5b7a64..2e0baeba26 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -244,6 +244,13 @@ static int vhost_net_start_one(struct vhost_net *net,
 struct vhost_vring_file file = { };
 int r;
 
+if (net->nc->info->start) {
+r = net->nc->info->start(net->nc);
+if (r < 0) {
+return r;
+}
+}
+
 r = vhost_dev_enable_notifiers(>dev, dev);
 if (r < 0) {
 goto fail_notifiers;
-- 
2.31.1

[PATCH v9 08/12] vdpa: Move command buffers map to start of net device

2022-08-19 Thread Eugenio Pérez

As this series will reuse them to restore the device state at the end of
a migration (or a device start), let's allocate only once at the device
start so we don't duplicate their map and unmap.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 net/vhost-vdpa.c | 123 ++-
 1 file changed, 58 insertions(+), 65 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 2cc1e64daf..9ca065dff6 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -263,29 +263,20 @@ static size_t vhost_vdpa_net_cvq_cmd_page_len(void)
 return ROUND_UP(vhost_vdpa_net_cvq_cmd_len(), qemu_real_host_page_size());
 }
 
-/** Copy and map a guest buffer. */
-static bool vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v,
-   const struct iovec *out_data,
-   size_t out_num, size_t data_len, void *buf,
-   size_t *written, bool write)
+/** Map CVQ buffer. */
+static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, void *buf, size_t size,
+  bool write)
 {
 DMAMap map = {};
 int r;
 
-if (unlikely(!data_len)) {
-qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid legnth of %s buffer\n",
-  __func__, write ? "in" : "out");
-return false;
-}
-
-*written = iov_to_buf(out_data, out_num, 0, buf, data_len);
 map.translated_addr = (hwaddr)(uintptr_t)buf;
-map.size = vhost_vdpa_net_cvq_cmd_page_len() - 1;
+map.size = size - 1;
 map.perm = write ? IOMMU_RW : IOMMU_RO,
 r = vhost_iova_tree_map_alloc(v->iova_tree, );
 if (unlikely(r != IOVA_OK)) {
 error_report("Cannot map injected element");
-return false;
+return r;
 }
 
 r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
@@ -294,50 +285,58 @@ static bool vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v,
 goto dma_map_err;
 }
 
-return true;
+return 0;
 
 dma_map_err:
 vhost_iova_tree_remove(v->iova_tree, );
-return false;
+return r;
 }
 
-/**
- * Copy the guest element into a dedicated buffer suitable to be sent to NIC
- *
- * @iov: [0] is the out buffer, [1] is the in one
- */
-static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
-VirtQueueElement *elem,
-struct iovec *iov)
+static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
-size_t in_copied;
-bool ok;
+VhostVDPAState *s;
+int r;
 
-iov[0].iov_base = s->cvq_cmd_out_buffer;
-ok = vhost_vdpa_cvq_map_buf(>vhost_vdpa, elem->out_sg, elem->out_num,
-vhost_vdpa_net_cvq_cmd_len(), iov[0].iov_base,
-[0].iov_len, false);
-if (unlikely(!ok)) {
-return false;
+assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+s = DO_UPCAST(VhostVDPAState, nc, nc);
+if (!s->vhost_vdpa.shadow_vqs_enabled) {
+return 0;
 }
 
-iov[1].iov_base = s->cvq_cmd_in_buffer;
-ok = vhost_vdpa_cvq_map_buf(>vhost_vdpa, NULL, 0,
-sizeof(virtio_net_ctrl_ack), iov[1].iov_base,
-_copied, true);
-if (unlikely(!ok)) {
+r = vhost_vdpa_cvq_map_buf(>vhost_vdpa, s->cvq_cmd_out_buffer,
+   vhost_vdpa_net_cvq_cmd_page_len(), false);
+if (unlikely(r < 0)) {
+return r;
+}
+
+r = vhost_vdpa_cvq_map_buf(>vhost_vdpa, s->cvq_cmd_in_buffer,
+   vhost_vdpa_net_cvq_cmd_page_len(), true);
+if (unlikely(r < 0)) {
 vhost_vdpa_cvq_unmap_buf(>vhost_vdpa, s->cvq_cmd_out_buffer);
-return false;
 }
 
-iov[1].iov_len = sizeof(virtio_net_ctrl_ack);
-return true;
+return r;
+}
+
+static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
+{
+VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+
+assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+if (s->vhost_vdpa.shadow_vqs_enabled) {
+vhost_vdpa_cvq_unmap_buf(>vhost_vdpa, s->cvq_cmd_out_buffer);
+vhost_vdpa_cvq_unmap_buf(>vhost_vdpa, s->cvq_cmd_in_buffer);
+}
 }
 
 static NetClientInfo net_vhost_vdpa_cvq_info = {
 .type = NET_CLIENT_DRIVER_VHOST_VDPA,
 .size = sizeof(VhostVDPAState),
 .receive = vhost_vdpa_receive,
+.start = vhost_vdpa_net_cvq_start,
+.stop = vhost_vdpa_net_cvq_stop,
 .cleanup = vhost_vdpa_cleanup,
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
 .has_ufo = vhost_vdpa_has_ufo,
@@ -348,19 +347,17 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
  * Do not forward commands not supported by SVQ. Otherwise, the device could
  * accept it and qemu would not know how to update the device model.
  */
-static bool vhost_vdpa_net_cvq_validate_cmd(const struct iovec *out,
-size_t out_num)
+static bool

[PATCH 0/5] Vhost-vdpa Shadow Virtqueue multiqueue support.

2022-08-19 Thread Eugenio Pérez

This series enables shadowed CVQ to intercept multiqueue commands through
shadowed CVQ, update the virtio NIC device model so qemu send it in a
migration, and the restore of that MQ state in the destination.

It needs to be applied on top of [1].

[1] https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg02965.html

Eugenio Pérez (5):
  vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load
  vdpa: Add vhost_vdpa_net_load_mq
  vdpa: validate MQ CVQ commands
  virtio-net: Update virtio-net curr_queue_pairs in vdpa backends
  vdpa: Allow MQ feture in SVQ

 hw/net/virtio-net.c | 17 --
 net/vhost-vdpa.c| 82 +
 2 files changed, 74 insertions(+), 25 deletions(-)

-- 
2.31.1

[PATCH v9 04/12] vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush

2022-08-19 Thread Eugenio Pérez

Since QEMU will be able to inject new elements on CVQ to restore the
state, we need not to depend on a VirtQueueElement to know if a new
element has been used by the device or not. Instead of check that, check
if there are new elements only using used idx on vhost_svq_flush.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
v6: Change less from the previous function
---
 hw/virtio/vhost-shadow-virtqueue.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 8df5296f24..e8e5bbc368 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -499,17 +499,20 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
 size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
 {
 int64_t start_us = g_get_monotonic_time();
+uint32_t len;
+
 do {
-uint32_t len;
-VirtQueueElement *elem = vhost_svq_get_buf(svq, );
-if (elem) {
-return len;
+if (vhost_svq_more_used(svq)) {
+break;
 }
 
 if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
 return 0;
 }
 } while (true);
+
+vhost_svq_get_buf(svq, );
+return len;
 }
 
 /**
-- 
2.31.1

[PATCH v9 01/12] vhost: stop transfer elem ownership in vhost_handle_guest_kick

2022-08-19 Thread Eugenio Pérez

It was easier to allow vhost_svq_add to handle the memory. Now that we
will allow qemu to add elements to a SVQ without the guest's knowledge,
it's better to handle it in the caller.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 hw/virtio/vhost-shadow-virtqueue.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 82a784d250..a1261d4a0f 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -233,9 +233,6 @@ static void vhost_svq_kick(VhostShadowVirtqueue *svq)
 /**
  * Add an element to a SVQ.
  *
- * The caller must check that there is enough slots for the new element. It
- * takes ownership of the element: In case of failure not ENOSPC, it is free.
- *
  * Return -EINVAL if element is invalid, -ENOSPC if dev queue is full
  */
 int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
@@ -252,7 +249,6 @@ int vhost_svq_add(VhostShadowVirtqueue *svq, const struct 
iovec *out_sg,
 
 ok = vhost_svq_add_split(svq, out_sg, out_num, in_sg, in_num, _head);
 if (unlikely(!ok)) {
-g_free(elem);
 return -EINVAL;
 }
 
@@ -293,7 +289,7 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue 
*svq)
 virtio_queue_set_notification(svq->vq, false);
 
 while (true) {
-VirtQueueElement *elem;
+g_autofree VirtQueueElement *elem;
 int r;
 
 if (svq->next_guest_avail_elem) {
@@ -324,12 +320,14 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue 
*svq)
  * queue the current guest descriptor and ignore kicks
  * until some elements are used.
  */
-svq->next_guest_avail_elem = elem;
+svq->next_guest_avail_elem = g_steal_pointer();
 }
 
 /* VQ is full or broken, just return and ignore kicks */
 return;
 }
+/* elem belongs to SVQ or external caller now */
+elem = NULL;
 }
 
 virtio_queue_set_notification(svq->vq, true);
-- 
2.31.1

[PATCH v9 11/12] vdpa: Add virtio-net mac address via CVQ at start

2022-08-19 Thread Eugenio Pérez

This is needed so the destination vdpa device see the same state a the
guest set in the source.

Signed-off-by: Eugenio Pérez 
---
v9:
* Use guest acked features instead of device's.
* Constify vhost_vdpa and VirtIONet variables.
* Delete unneeded increment of cursor.

v8:
* Delete unneeded copy from device's in buffer.

v6:
* Map and unmap command buffers at the start and end of device usage.

v5:
* Rename s/start/load/
* Use independent NetClientInfo to only add load callback on cvq.
---
 net/vhost-vdpa.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index ebf76d1034..8fad31a5fd 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -363,11 +363,51 @@ static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, 
size_t out_len,
 return vhost_svq_poll(svq);
 }
 
+static int vhost_vdpa_net_load(NetClientState *nc)
+{
+VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+const struct vhost_vdpa *v = >vhost_vdpa;
+const VirtIONet *n;
+uint64_t features;
+
+assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+if (!v->shadow_vqs_enabled) {
+return 0;
+}
+
+n = VIRTIO_NET(v->dev->vdev);
+features = n->parent_obj.guest_features;
+if (features & BIT_ULL(VIRTIO_NET_F_CTRL_MAC_ADDR)) {
+const struct virtio_net_ctrl_hdr ctrl = {
+.class = VIRTIO_NET_CTRL_MAC,
+.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET,
+};
+char *cursor = s->cvq_cmd_out_buffer;
+ssize_t dev_written;
+
+memcpy(cursor, , sizeof(ctrl));
+cursor += sizeof(ctrl);
+memcpy(cursor, n->mac, sizeof(n->mac));
+
+dev_written = vhost_vdpa_net_cvq_add(s, sizeof(ctrl) + sizeof(n->mac),
+ sizeof(virtio_net_ctrl_ack));
+if (unlikely(dev_written < 0)) {
+return dev_written;
+}
+
+return *((virtio_net_ctrl_ack *)s->cvq_cmd_in_buffer) != VIRTIO_NET_OK;
+}
+
+return 0;
+}
+
 static NetClientInfo net_vhost_vdpa_cvq_info = {
 .type = NET_CLIENT_DRIVER_VHOST_VDPA,
 .size = sizeof(VhostVDPAState),
 .receive = vhost_vdpa_receive,
 .start = vhost_vdpa_net_cvq_start,
+.load = vhost_vdpa_net_load,
 .stop = vhost_vdpa_net_cvq_stop,
 .cleanup = vhost_vdpa_cleanup,
 .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
-- 
2.31.1

[PATCH 16/20] ppc4xx_sdram: Move ppc4xx DDR and DDR2 SDRAM controller models together

2022-08-19 Thread BALATON Zoltan

Move the PPC4xx DDR and DDR2 SDRAM contrller models into a new file
called ppc4xx_sdram to separate from other device models and put them
in one place allowing sharing some code between them.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/meson.build  |   3 +-
 hw/ppc/ppc440_uc.c  | 321 -
 hw/ppc/ppc4xx_devs.c| 403 -
 hw/ppc/ppc4xx_sdram.c   | 752 
 include/hw/ppc/ppc4xx.h |  34 +-
 5 files changed, 771 insertions(+), 742 deletions(-)
 create mode 100644 hw/ppc/ppc4xx_sdram.c

diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
index 62801923f3..74720dd1e1 100644
--- a/hw/ppc/meson.build
+++ b/hw/ppc/meson.build
@@ -59,8 +59,9 @@ ppc_ss.add(when: 'CONFIG_PPC440', if_true: files(
   'ppc440_bamboo.c',
   'ppc440_pcix.c', 'ppc440_uc.c'))
 ppc_ss.add(when: 'CONFIG_PPC4XX', if_true: files(
+  'ppc4xx_devs.c',
   'ppc4xx_pci.c',
-  'ppc4xx_devs.c'))
+  'ppc4xx_sdram.c'))
 ppc_ss.add(when: 'CONFIG_SAM460EX', if_true: files('sam460ex.c'))
 # PReP
 ppc_ss.add(when: 'CONFIG_PREP', if_true: files('prep.c'))
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index a75a1748bd..651263926e 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -10,20 +10,14 @@
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
-#include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu/log.h"
-#include "qemu/module.h"
 #include "hw/irq.h"
-#include "exec/memory.h"
 #include "hw/ppc/ppc4xx.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci.h"
-#include "sysemu/block-backend.h"
 #include "sysemu/reset.h"
 #include "ppc440.h"
-#include "qom/object.h"
-#include "trace.h"
 
 /*/
 /* L2 Cache as SRAM */
@@ -379,10 +373,6 @@ enum {
 PESDR1_RSTSTA = 0x365,
 };
 
-#define SDR0_DDR0_DDRM_ENCODE(n)  unsigned long)(n)) & 0x03) << 29)
-#define SDR0_DDR0_DDRM_DDR1   0x2000
-#define SDR0_DDR0_DDRM_DDR2   0x4000
-
 static uint32_t dcr_read_sdr(void *opaque, int dcrn)
 {
 ppc4xx_sdr_t *sdr = opaque;
@@ -481,317 +471,6 @@ void ppc4xx_sdr_init(CPUPPCState *env)
  sdr, _read_sdr, _write_sdr);
 }
 
-/*/
-/* SDRAM controller */
-enum {
-SDRAM_R0BAS = 0x40,
-SDRAM_R1BAS,
-SDRAM_R2BAS,
-SDRAM_R3BAS,
-SDRAM_CONF1HB = 0x45,
-SDRAM_PLBADDULL = 0x4a,
-SDRAM_CONF1LL = 0x4b,
-SDRAM_CONFPATHB = 0x4f,
-SDRAM_PLBADDUHB = 0x50,
-};
-
-static uint32_t sdram_ddr2_bcr(hwaddr ram_base, hwaddr ram_size)
-{
-uint32_t bcr;
-
-switch (ram_size) {
-case (8 * MiB):
-bcr = 0xffc0;
-break;
-case (16 * MiB):
-bcr = 0xff80;
-break;
-case (32 * MiB):
-bcr = 0xff00;
-break;
-case (64 * MiB):
-bcr = 0xfe00;
-break;
-case (128 * MiB):
-bcr = 0xfc00;
-break;
-case (256 * MiB):
-bcr = 0xf800;
-break;
-case (512 * MiB):
-bcr = 0xf000;
-break;
-case (1 * GiB):
-bcr = 0xe000;
-break;
-case (2 * GiB):
-bcr = 0xc000;
-break;
-case (4 * GiB):
-bcr = 0x8000;
-break;
-default:
-error_report("invalid RAM size " TARGET_FMT_plx, ram_size);
-return 0;
-}
-bcr |= ram_base >> 2 & 0xffe0;
-bcr |= 1;
-
-return bcr;
-}
-
-static inline hwaddr sdram_ddr2_base(uint32_t bcr)
-{
-return (bcr & 0xffe0) << 2;
-}
-
-static uint64_t sdram_ddr2_size(uint32_t bcr)
-{
-uint64_t size;
-int sh;
-
-sh = 1024 - ((bcr >> 6) & 0x3ff);
-size = 8 * MiB * sh;
-
-return size;
-}
-
-static void sdram_bank_map(Ppc4xxSdramBank *bank)
-{
-memory_region_init(>container, NULL, "sdram-container", bank->size);
-memory_region_add_subregion(>container, 0, >ram);
-memory_region_add_subregion(get_system_memory(), bank->base,
->container);
-}
-
-static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
-{
-memory_region_del_subregion(get_system_memory(), >container);
-memory_region_del_subregion(>container, >ram);
-object_unparent(OBJECT(>container));
-}
-
-static void sdram_ddr2_set_bcr(Ppc4xxSdramDdr2State *sdram, int i,
-   uint32_t bcr, int enabled)
-{
-if (sdram->bank[i].bcr & 1) {
-/* First unmap RAM if enabled */
-trace_ppc4xx_sdram_unmap(sdram_ddr2_base(sdram->bank[i].bcr),
- sdram_ddr2_size(sdram->bank[i].bcr));
-sdram_bank_unmap(>bank[i]);
-}
-sdram->bank[i].bcr = bcr & 0xffe0ffc1;
-sdram->bank[i].base = sdram_ddr2_base(bcr);
-sdram->bank[i].size = sdram_ddr2_size(bcr);
-if (enabled && (bcr & 1)) {
-trace_ppc4xx_sdram_map(sdram_ddr2_base(bcr), sdram_ddr2_size(bcr));
-sdram_bank_map(>bank[i]);
-}
-}
-
-static void

[PATCH v9 02/12] vhost: use SVQ element ndescs instead of opaque data for desc validation

2022-08-19 Thread Eugenio Pérez

Since we're going to allow SVQ to add elements without the guest's
knowledge and without its own VirtQueueElement, it's easier to check if
an element is a valid head checking a different thing than the
VirtQueueElement.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 hw/virtio/vhost-shadow-virtqueue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index a1261d4a0f..b35aeef4bd 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -414,7 +414,7 @@ static VirtQueueElement 
*vhost_svq_get_buf(VhostShadowVirtqueue *svq,
 return NULL;
 }
 
-if (unlikely(!svq->desc_state[used_elem.id].elem)) {
+if (unlikely(!svq->desc_state[used_elem.id].ndescs)) {
 qemu_log_mask(LOG_GUEST_ERROR,
 "Device %s says index %u is used, but it was not available",
 svq->vdev->name, used_elem.id);
@@ -422,6 +422,7 @@ static VirtQueueElement 
*vhost_svq_get_buf(VhostShadowVirtqueue *svq,
 }
 
 num = svq->desc_state[used_elem.id].ndescs;
+svq->desc_state[used_elem.id].ndescs = 0;
 last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
 svq->desc_next[last_used_chain] = svq->free_head;
 svq->free_head = used_elem.id;
-- 
2.31.1

[PATCH 18/20] ppc4xx_sdram: Rename local state variable for brevity

2022-08-19 Thread BALATON Zoltan

Rename the sdram local state variable to s in dcr read/write functions
and reset methods for better readability and to match realize methods.
Other places not converted will be changed or removed in subsequent
patches.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc4xx_sdram.c | 158 +-
 1 file changed, 79 insertions(+), 79 deletions(-)

diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c
index 69b670737f..a479d15fa6 100644
--- a/hw/ppc/ppc4xx_sdram.c
+++ b/hw/ppc/ppc4xx_sdram.c
@@ -233,56 +233,56 @@ static void sdram_ddr_unmap_bcr(Ppc4xxSdramDdrState 
*sdram)
 
 static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn)
 {
-Ppc4xxSdramDdrState *sdram = opaque;
+Ppc4xxSdramDdrState *s = opaque;
 uint32_t ret;
 
 switch (dcrn) {
 case SDRAM0_CFGADDR:
-ret = sdram->addr;
+ret = s->addr;
 break;
 case SDRAM0_CFGDATA:
-switch (sdram->addr) {
+switch (s->addr) {
 case 0x00: /* SDRAM_BESR0 */
-ret = sdram->besr0;
+ret = s->besr0;
 break;
 case 0x08: /* SDRAM_BESR1 */
-ret = sdram->besr1;
+ret = s->besr1;
 break;
 case 0x10: /* SDRAM_BEAR */
-ret = sdram->bear;
+ret = s->bear;
 break;
 case 0x20: /* SDRAM_CFG */
-ret = sdram->cfg;
+ret = s->cfg;
 break;
 case 0x24: /* SDRAM_STATUS */
-ret = sdram->status;
+ret = s->status;
 break;
 case 0x30: /* SDRAM_RTR */
-ret = sdram->rtr;
+ret = s->rtr;
 break;
 case 0x34: /* SDRAM_PMIT */
-ret = sdram->pmit;
+ret = s->pmit;
 break;
 case 0x40: /* SDRAM_B0CR */
-ret = sdram->bank[0].bcr;
+ret = s->bank[0].bcr;
 break;
 case 0x44: /* SDRAM_B1CR */
-ret = sdram->bank[1].bcr;
+ret = s->bank[1].bcr;
 break;
 case 0x48: /* SDRAM_B2CR */
-ret = sdram->bank[2].bcr;
+ret = s->bank[2].bcr;
 break;
 case 0x4C: /* SDRAM_B3CR */
-ret = sdram->bank[3].bcr;
+ret = s->bank[3].bcr;
 break;
 case 0x80: /* SDRAM_TR */
 ret = -1; /* ? */
 break;
 case 0x94: /* SDRAM_ECCCFG */
-ret = sdram->ecccfg;
+ret = s->ecccfg;
 break;
 case 0x98: /* SDRAM_ECCESR */
-ret = sdram->eccesr;
+ret = s->eccesr;
 break;
 default: /* Error */
 ret = -1;
@@ -300,78 +300,78 @@ static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn)
 
 static void sdram_ddr_dcr_write(void *opaque, int dcrn, uint32_t val)
 {
-Ppc4xxSdramDdrState *sdram = opaque;
+Ppc4xxSdramDdrState *s = opaque;
 
 switch (dcrn) {
 case SDRAM0_CFGADDR:
-sdram->addr = val;
+s->addr = val;
 break;
 case SDRAM0_CFGDATA:
-switch (sdram->addr) {
+switch (s->addr) {
 case 0x00: /* SDRAM_BESR0 */
-sdram->besr0 &= ~val;
+s->besr0 &= ~val;
 break;
 case 0x08: /* SDRAM_BESR1 */
-sdram->besr1 &= ~val;
+s->besr1 &= ~val;
 break;
 case 0x10: /* SDRAM_BEAR */
-sdram->bear = val;
+s->bear = val;
 break;
 case 0x20: /* SDRAM_CFG */
 val &= 0xFFE0;
-if (!(sdram->cfg & 0x8000) && (val & 0x8000)) {
+if (!(s->cfg & 0x8000) && (val & 0x8000)) {
 trace_ppc4xx_sdram_enable("enable");
 /* validate all RAM mappings */
-sdram_ddr_map_bcr(sdram);
-sdram->status &= ~0x8000;
-} else if ((sdram->cfg & 0x8000) && !(val & 0x8000)) {
+sdram_ddr_map_bcr(s);
+s->status &= ~0x8000;
+} else if ((s->cfg & 0x8000) && !(val & 0x8000)) {
 trace_ppc4xx_sdram_enable("disable");
 /* invalidate all RAM mappings */
-sdram_ddr_unmap_bcr(sdram);
-sdram->status |= 0x8000;
+sdram_ddr_unmap_bcr(s);
+s->status |= 0x8000;
 }
-if (!(sdram->cfg & 0x4000) && (val & 0x4000)) {
-sdram->status |= 0x4000;
-} else if ((sdram->cfg & 0x4000) && !(val & 0x4000)) {
-sdram->status &= ~0x4000;
+if (!(s->cfg & 0x4000) && (val & 0x4000)) {
+s->status |= 0x4000;
+} else if ((s->cfg & 0x4000) && !(val & 0x4000)) {
+s->status &= ~0x4000;
 }
-sdram->cfg = val;
+s->cfg = val;
 break;

[PATCH 19/20] ppc4xx_sdram: Generalise bank setup

2022-08-19 Thread BALATON Zoltan

Currently only base and size are set on initial bank creation and bcr
value is computed on mapping the region. Set bcr at init so the bcr
encoding method becomes local to the controller model and mapping and
unmapping can operate on the bank so it can be shared between
different controller models. This patch converts the DDR2 controller.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc4xx_sdram.c | 93 ++-
 hw/ppc/trace-events   |  1 +
 2 files changed, 48 insertions(+), 46 deletions(-)

diff --git a/hw/ppc/ppc4xx_sdram.c b/hw/ppc/ppc4xx_sdram.c
index a479d15fa6..b0bdfa5bc1 100644
--- a/hw/ppc/ppc4xx_sdram.c
+++ b/hw/ppc/ppc4xx_sdram.c
@@ -106,6 +106,7 @@ static void ppc4xx_sdram_banks(MemoryRegion *ram, int 
nr_banks,
 
 static void sdram_bank_map(Ppc4xxSdramBank *bank)
 {
+trace_ppc4xx_sdram_map(bank->base, bank->size);
 memory_region_init(>container, NULL, "sdram-container", bank->size);
 memory_region_add_subregion(>container, 0, >ram);
 memory_region_add_subregion(get_system_memory(), bank->base,
@@ -114,11 +115,26 @@ static void sdram_bank_map(Ppc4xxSdramBank *bank)
 
 static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
 {
+trace_ppc4xx_sdram_unmap(bank->base, bank->size);
 memory_region_del_subregion(get_system_memory(), >container);
 memory_region_del_subregion(>container, >ram);
 object_unparent(OBJECT(>container));
 }
 
+static void sdram_bank_set_bcr(Ppc4xxSdramBank *bank, uint32_t bcr,
+   hwaddr base, hwaddr size, int enabled)
+{
+if (memory_region_is_mapped(>container)) {
+sdram_bank_unmap(bank);
+}
+bank->bcr = bcr;
+bank->base = base;
+bank->size = size;
+if (enabled && (bcr & 1)) {
+sdram_bank_map(bank);
+}
+}
+
 /*/
 /* DDR SDRAM controller */
 static uint32_t sdram_ddr_bcr(hwaddr ram_base, hwaddr ram_size)
@@ -445,6 +461,8 @@ static void ppc4xx_sdram_ddr_class_init(ObjectClass *oc, 
void *data)
 
 /*/
 /* DDR2 SDRAM controller */
+#define SDRAM_DDR2_BCR_MASK 0xffe0ffc1
+
 enum {
 SDRAM_R0BAS = 0x40,
 SDRAM_R1BAS,
@@ -518,50 +536,6 @@ static hwaddr sdram_ddr2_size(uint32_t bcr)
 return size;
 }
 
-static void sdram_ddr2_set_bcr(Ppc4xxSdramDdr2State *sdram, int i,
-   uint32_t bcr, int enabled)
-{
-if (sdram->bank[i].bcr & 1) {
-/* First unmap RAM if enabled */
-trace_ppc4xx_sdram_unmap(sdram_ddr2_base(sdram->bank[i].bcr),
- sdram_ddr2_size(sdram->bank[i].bcr));
-sdram_bank_unmap(>bank[i]);
-}
-sdram->bank[i].bcr = bcr & 0xffe0ffc1;
-sdram->bank[i].base = sdram_ddr2_base(bcr);
-sdram->bank[i].size = sdram_ddr2_size(bcr);
-if (enabled && (bcr & 1)) {
-trace_ppc4xx_sdram_map(sdram_ddr2_base(bcr), sdram_ddr2_size(bcr));
-sdram_bank_map(>bank[i]);
-}
-}
-
-static void sdram_ddr2_map_bcr(Ppc4xxSdramDdr2State *sdram)
-{
-int i;
-
-for (i = 0; i < sdram->nbanks; i++) {
-if (sdram->bank[i].size) {
-sdram_ddr2_set_bcr(sdram, i,
-   sdram_ddr2_bcr(sdram->bank[i].base,
-  sdram->bank[i].size), 1);
-} else {
-sdram_ddr2_set_bcr(sdram, i, 0, 0);
-}
-}
-}
-
-static void sdram_ddr2_unmap_bcr(Ppc4xxSdramDdr2State *sdram)
-{
-int i;
-
-for (i = 0; i < sdram->nbanks; i++) {
-if (sdram->bank[i].size) {
-sdram_ddr2_set_bcr(sdram, i, sdram->bank[i].bcr & ~1, 0);
-}
-}
-}
-
 static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn)
 {
 Ppc4xxSdramDdr2State *s = opaque;
@@ -618,6 +592,7 @@ static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn)
 static void sdram_ddr2_dcr_write(void *opaque, int dcrn, uint32_t val)
 {
 Ppc4xxSdramDdr2State *s = opaque;
+int i;
 
 switch (dcrn) {
 case SDRAM_R0BAS:
@@ -641,12 +616,24 @@ static void sdram_ddr2_dcr_write(void *opaque, int dcrn, 
uint32_t val)
 if (!(s->mcopt2 & 0x0800) && (val & 0x0800)) {
 trace_ppc4xx_sdram_enable("enable");
 /* validate all RAM mappings */
-sdram_ddr2_map_bcr(s);
+for (i = 0; i < s->nbanks; i++) {
+if (s->bank[i].size) {
+sdram_bank_set_bcr(>bank[i], s->bank[i].bcr,
+   s->bank[i].base, s->bank[i].size,
+   1);
+}
+}
 s->mcopt2 |= 0x0800;
 } else if ((s->mcopt2 & 0x0800) && !(val & 0x0800)) {
 trace_ppc4xx_sdram_enable("disable");
 /* invalidate all RAM mappings */
-sdram_ddr2_unmap_bcr(s);
+

[PATCH v9 10/12] vhost_net: add NetClientState->load() callback

2022-08-19 Thread Eugenio Pérez

It allows per-net client operations right after device's successful
start. In particular, to load the device status.

Vhost-vdpa net will use it to add the CVQ buffers to restore the device
status.

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
v5: Rename start / load, naming it more specifically.
---
 include/net/net.h  | 2 ++
 hw/net/vhost_net.c | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 476ad45b9a..81d0b21def 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -45,6 +45,7 @@ typedef struct NICConf {
 typedef void (NetPoll)(NetClientState *, bool enable);
 typedef bool (NetCanReceive)(NetClientState *);
 typedef int (NetStart)(NetClientState *);
+typedef int (NetLoad)(NetClientState *);
 typedef void (NetStop)(NetClientState *);
 typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
@@ -74,6 +75,7 @@ typedef struct NetClientInfo {
 NetReceiveIOV *receive_iov;
 NetCanReceive *can_receive;
 NetStart *start;
+NetLoad *load;
 NetStop *stop;
 NetCleanup *cleanup;
 LinkStatusChanged *link_status_changed;
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 9d4b334453..d28f8b974b 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -281,6 +281,13 @@ static int vhost_net_start_one(struct vhost_net *net,
 }
 }
 }
+
+if (net->nc->info->load) {
+r = net->nc->info->load(net->nc);
+if (r < 0) {
+goto fail;
+}
+}
 return 0;
 fail:
 file.fd = -1;
-- 
2.31.1

[PATCH 14/20] ppc440_sdram: Move RAM size check to ppc440_sdram_init

2022-08-19 Thread BALATON Zoltan

Move the check for valid memory sizes from board to sdram contrller
init. Board now only checks for additinal restrictions imposed by
firmware then sdram init checks for valid sizes for SoC.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440.h|  4 ++--
 hw/ppc/ppc440_uc.c | 15 +++
 hw/ppc/sam460ex.c  | 32 +---
 3 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/hw/ppc/ppc440.h b/hw/ppc/ppc440.h
index 7bd5cca1ab..29f6f14ed7 100644
--- a/hw/ppc/ppc440.h
+++ b/hw/ppc/ppc440.h
@@ -11,13 +11,13 @@
 #ifndef PPC440_H
 #define PPC440_H
 
-#include "hw/ppc/ppc4xx.h"
+#include "hw/ppc/ppc.h"
 
 void ppc4xx_l2sram_init(CPUPPCState *env);
 void ppc4xx_cpr_init(CPUPPCState *env);
 void ppc4xx_sdr_init(CPUPPCState *env);
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   Ppc4xxSdramBank ram_banks[]);
+   MemoryRegion *ram);
 void ppc4xx_ahb_init(CPUPPCState *env);
 void ppc4xx_dma_init(CPUPPCState *env, int dcr_base);
 void ppc460ex_pcie_init(CPUPPCState *env);
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index b39c6dbbd2..e77d56225d 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -486,7 +486,7 @@ void ppc4xx_sdr_init(CPUPPCState *env)
 typedef struct ppc440_sdram_t {
 uint32_t addr;
 uint32_t mcopt2;
-int nbanks;
+int nbanks; /* Banks to use from the 4, e.g. when board has less slots */
 Ppc4xxSdramBank bank[4];
 } ppc440_sdram_t;
 
@@ -728,18 +728,17 @@ static void sdram_ddr2_reset(void *opaque)
 }
 
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   Ppc4xxSdramBank ram_banks[])
+   MemoryRegion *ram)
 {
 ppc440_sdram_t *s;
-int i;
+const ram_addr_t valid_bank_sizes[] = {
+4 * GiB, 2 * GiB, 1 * GiB, 512 * MiB, 256 * MiB, 128 * MiB, 64 * MiB,
+32 * MiB, 16 * MiB, 8 * MiB, 0
+};
 
 s = g_malloc0(sizeof(*s));
 s->nbanks = nbanks;
-for (i = 0; i < nbanks; i++) {
-s->bank[i].ram = ram_banks[i].ram;
-s->bank[i].base = ram_banks[i].base;
-s->bank[i].size = ram_banks[i].size;
-}
+ppc4xx_sdram_banks(ram, s->nbanks, s->bank, valid_bank_sizes);
 qemu_register_reset(_ddr2_reset, s);
 ppc_dcr_register(env, SDRAM0_CFGADDR,
  s, _ddr2_dcr_read, _ddr2_dcr_write);
diff --git a/hw/ppc/sam460ex.c b/hw/ppc/sam460ex.c
index dac329d482..9b850808a3 100644
--- a/hw/ppc/sam460ex.c
+++ b/hw/ppc/sam460ex.c
@@ -74,13 +74,6 @@
 #define EBC_FREQ 11500
 #define UART_FREQ 11059200
 
-/* The SoC could also handle 4 GiB but firmware does not work with that. */
-/* Maybe it overflows a signed 32 bit number somewhere? */
-static const ram_addr_t ppc460ex_sdram_bank_sizes[] = {
-2 * GiB, 1 * GiB, 512 * MiB, 256 * MiB, 128 * MiB, 64 * MiB,
-32 * MiB, 0
-};
-
 struct boot_info {
 uint32_t dt_base;
 uint32_t dt_size;
@@ -273,7 +266,6 @@ static void sam460ex_init(MachineState *machine)
 {
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *isa = g_new(MemoryRegion, 1);
-Ppc4xxSdramBank *ram_banks = g_new0(Ppc4xxSdramBank, 1);
 MemoryRegion *l2cache_ram = g_new(MemoryRegion, 1);
 DeviceState *uic[4];
 int i;
@@ -340,12 +332,22 @@ static void sam460ex_init(MachineState *machine)
 }
 
 /* SDRAM controller */
-/* put all RAM on first bank because board has one slot
- * and firmware only checks that */
-ppc4xx_sdram_banks(machine->ram, 1, ram_banks, ppc460ex_sdram_bank_sizes);
-
+/* The SoC could also handle 4 GiB but firmware does not work with that. */
+if (machine->ram_size > 2 * GiB) {
+error_report("Memory over 2 GiB is not supported");
+exit(1);
+}
+/* Firmware needs at least 64 MiB */
+if (machine->ram_size < 64 * MiB) {
+error_report("Memory below 64 MiB is not supported");
+exit(1);
+}
+/*
+ * Put all RAM on first bank because board has one slot
+ * and firmware only checks that
+ */
+ppc440_sdram_init(env, 1, machine->ram);
 /* FIXME: does 460EX have ECC interrupts? */
-ppc440_sdram_init(env, 1, ram_banks);
 /* Enable SDRAM memory regions as we may boot without firmware */
 if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x21) ||
 ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x0800)) {
@@ -358,8 +360,8 @@ static void sam460ex_init(MachineState *machine)
qdev_get_gpio_in(uic[0], 2));
 i2c = PPC4xx_I2C(dev)->bus;
 /* SPD EEPROM on RAM module */
-spd_data = spd_data_generate(ram_banks->size < 128 * MiB ? DDR : DDR2,
- ram_banks->size);
+spd_data = spd_data_generate(machine->ram_size < 128 * MiB ? DDR : DDR2,
+ machine->ram_size);
 spd_data[20] = 4; /* SO-DIMM module */
 smbus_eeprom_init_one(i2c, 0x50, spd_data);
 /* RTC */
-- 
2.30.4

[PATCH 12/20] ppc440_sdram: Rename local variable for readibility

2022-08-19 Thread BALATON Zoltan

Rename local sdram variable in ppc440_sdram_init to s for readibility.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index bd3d60f278..72eb75d3d2 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -729,40 +729,40 @@ static void sdram_reset(void *opaque)
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
Ppc4xxSdramBank ram_banks[])
 {
-ppc440_sdram_t *sdram;
+ppc440_sdram_t *s;
 int i;
 
-sdram = g_malloc0(sizeof(*sdram));
-sdram->nbanks = nbanks;
+s = g_malloc0(sizeof(*s));
+s->nbanks = nbanks;
 for (i = 0; i < nbanks; i++) {
-sdram->bank[i].ram = ram_banks[i].ram;
-sdram->bank[i].base = ram_banks[i].base;
-sdram->bank[i].size = ram_banks[i].size;
+s->bank[i].ram = ram_banks[i].ram;
+s->bank[i].base = ram_banks[i].base;
+s->bank[i].size = ram_banks[i].size;
 }
-qemu_register_reset(_reset, sdram);
+qemu_register_reset(_reset, s);
 ppc_dcr_register(env, SDRAM0_CFGADDR,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM0_CFGDATA,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 
 ppc_dcr_register(env, SDRAM_R0BAS,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_R1BAS,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_R2BAS,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_R3BAS,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_CONF1HB,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_PLBADDULL,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_CONF1LL,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_CONFPATHB,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM_PLBADDUHB,
- sdram, _read_sdram, _write_sdram);
+ s, _read_sdram, _write_sdram);
 }
 
 /*/
-- 
2.30.4

[PATCH 07/20] ppc4xx_sdram: QOM'ify

2022-08-19 Thread BALATON Zoltan

Change the ppc4xx_sdram model to a QOM class derived from the
PPC4xx-dcr-device and name it ppc4xx-sdram-ddr. This is mostly
modelling the DDR SDRAM controller found in the 440EP (used on the
bamboo board) but also backward compatible with the older DDR
controllers on some 405 SoCs so we also use it for those now. This
likely does not cause problems for guests we run as the new features
are just not accessed but to model 405 SoC accurately some features
may have to be disabled or the model split between 440 and older.

Newer SoCs (regardless of their PPC core, e.g. 405EX) may have an
updated DDR2 SDRAM controller implemented by the ppc440_sdram model
(only partially, enough for the 460EX on the sam460ex) that is not yet
QOM'ified in this patch. That is intended to become ppc4xx-sdram-ddr2
when QOM'ified later.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc405.h |  3 +-
 hw/ppc/ppc405_uc.c  | 22 +
 hw/ppc/ppc440_bamboo.c  | 10 +++--
 hw/ppc/ppc4xx_devs.c| 99 ++---
 include/hw/ppc/ppc4xx.h | 27 +--
 5 files changed, 98 insertions(+), 63 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index ad54dff542..9a4312691e 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -167,8 +167,6 @@ struct Ppc405SoCState {
 DeviceState parent_obj;
 
 /* Public */
-MemoryRegion *dram_mr;
-
 PowerPCCPU cpu;
 PPCUIC uic;
 Ppc405CpcState cpc;
@@ -182,6 +180,7 @@ struct Ppc405SoCState {
 Ppc405PobState pob;
 Ppc4xxPlbState plb;
 Ppc4xxMalState mal;
+Ppc4xxSdramDdrState sdram;
 };
 
 #endif /* PPC405_H */
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 4049fb98dc..9c266a21ad 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -1013,6 +1013,9 @@ static void ppc405_soc_instance_init(Object *obj)
 object_initialize_child(obj, "plb", >plb, TYPE_PPC4xx_PLB);
 
 object_initialize_child(obj, "mal", >mal, TYPE_PPC4xx_MAL);
+
+object_initialize_child(obj, "sdram", >sdram, TYPE_PPC4xx_SDRAM_DDR);
+object_property_add_alias(obj, "dram", OBJECT(>sdram), "dram");
 }
 
 static void ppc405_reset(void *opaque)
@@ -1070,9 +1073,17 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
qdev_get_gpio_in(DEVICE(>cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
+/*
+ * We use the 440 DDR SDRAM controller which has more regs and features
+ * but it's compatible enough for now
+ */
+object_property_set_int(OBJECT(>sdram), "nbanks", 2, _abort);
+if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(>sdram), >cpu, errp)) {
+return;
+}
 /* XXX 405EP has no ECC interrupt */
-ppc4xx_sdram_init(env, qdev_get_gpio_in(DEVICE(>uic), 17), 1,
-  s->dram_mr);
+sysbus_connect_irq(SYS_BUS_DEVICE(>sdram), 0,
+   qdev_get_gpio_in(DEVICE(>uic), 17));
 
 /* External bus controller */
 if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(>ebc), >cpu, errp)) {
@@ -1147,12 +1158,6 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
 /* Uses UIC IRQs 9, 15, 17 */
 }
 
-static Property ppc405_soc_properties[] = {
-DEFINE_PROP_LINK("dram", Ppc405SoCState, dram_mr, TYPE_MEMORY_REGION,
- MemoryRegion *),
-DEFINE_PROP_END_OF_LIST(),
-};
-
 static void ppc405_soc_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
@@ -1160,7 +1165,6 @@ static void ppc405_soc_class_init(ObjectClass *oc, void 
*data)
 dc->realize = ppc405_soc_realize;
 /* Reason: only works as part of a ppc405 board/machine */
 dc->user_creatable = false;
-device_class_set_props(dc, ppc405_soc_properties);
 }
 
 static const TypeInfo ppc405_types[] = {
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 9b456f1819..6052d3a2e0 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -48,8 +48,6 @@
 #define PPC440EP_PCI_IO 0xe800
 #define PPC440EP_PCI_IOLEN  0x0001
 
-#define PPC440EP_SDRAM_NR_BANKS 4
-
 static hwaddr entry;
 
 static int bamboo_load_device_tree(hwaddr addr,
@@ -198,9 +196,13 @@ static void bamboo_init(MachineState *machine)
qdev_get_gpio_in(DEVICE(cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
+dev = qdev_new(TYPE_PPC4xx_SDRAM_DDR);
+object_property_set_link(OBJECT(dev), "dram", OBJECT(machine->ram),
+ _abort);
+ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(dev), cpu, _fatal);
+object_unref(OBJECT(dev));
 /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. */
-ppc4xx_sdram_init(env, qdev_get_gpio_in(uicdev, 14),
-  PPC440EP_SDRAM_NR_BANKS, machine->ram);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(uicdev, 14));
 /* Enable SDRAM memory regions, this should be done by the firmware */
 if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x20) ||

[PATCH v9 06/12] vhost_net: Add NetClientInfo stop callback

2022-08-19 Thread Eugenio Pérez

Used by the backend to perform actions after the device is stopped.

In particular, vdpa net use it to unmap CVQ buffers to the device,
cleaning the actions performed in prepare().

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
v9: Replace performend by performed in patch message
---
 include/net/net.h  | 2 ++
 hw/net/vhost_net.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index ad9e80083a..476ad45b9a 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -45,6 +45,7 @@ typedef struct NICConf {
 typedef void (NetPoll)(NetClientState *, bool enable);
 typedef bool (NetCanReceive)(NetClientState *);
 typedef int (NetStart)(NetClientState *);
+typedef void (NetStop)(NetClientState *);
 typedef ssize_t (NetReceive)(NetClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(NetClientState *, const struct iovec *, int);
 typedef void (NetCleanup) (NetClientState *);
@@ -73,6 +74,7 @@ typedef struct NetClientInfo {
 NetReceiveIOV *receive_iov;
 NetCanReceive *can_receive;
 NetStart *start;
+NetStop *stop;
 NetCleanup *cleanup;
 LinkStatusChanged *link_status_changed;
 QueryRxFilter *query_rx_filter;
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 2e0baeba26..9d4b334453 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -320,6 +320,9 @@ static void vhost_net_stop_one(struct vhost_net *net,
 net->nc->info->poll(net->nc, true);
 }
 vhost_dev_stop(>dev, dev);
+if (net->nc->info->stop) {
+net->nc->info->stop(net->nc);
+}
 vhost_dev_disable_notifiers(>dev, dev);
 }
 
-- 
2.31.1

[PATCH 15/20] ppc440_sdram: QOM'ify

2022-08-19 Thread BALATON Zoltan

Change the ppc440_sdram model to a QOM class derived from the
PPC4xx-dcr-device and name it ppc4xx-sdram-ddr2. This is mostly
modelling the DDR2 SDRAM controller found in the 460EX (used on the
sam460ex board). Newer SoCs (regardless of their PPC core, e.g. 405EX)
may have this controller but we only emulate enough of it for the
sam460ex u-boot firmware.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440.h |   2 -
 hw/ppc/ppc440_uc.c  | 115 +---
 hw/ppc/sam460ex.c   |   7 ++-
 include/hw/ppc/ppc4xx.h |  14 +
 4 files changed, 91 insertions(+), 47 deletions(-)

diff --git a/hw/ppc/ppc440.h b/hw/ppc/ppc440.h
index 29f6f14ed7..7c24db8504 100644
--- a/hw/ppc/ppc440.h
+++ b/hw/ppc/ppc440.h
@@ -16,8 +16,6 @@
 void ppc4xx_l2sram_init(CPUPPCState *env);
 void ppc4xx_cpr_init(CPUPPCState *env);
 void ppc4xx_sdr_init(CPUPPCState *env);
-void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   MemoryRegion *ram);
 void ppc4xx_ahb_init(CPUPPCState *env);
 void ppc4xx_dma_init(CPUPPCState *env, int dcr_base);
 void ppc460ex_pcie_init(CPUPPCState *env);
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index e77d56225d..a75a1748bd 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -483,13 +483,6 @@ void ppc4xx_sdr_init(CPUPPCState *env)
 
 /*/
 /* SDRAM controller */
-typedef struct ppc440_sdram_t {
-uint32_t addr;
-uint32_t mcopt2;
-int nbanks; /* Banks to use from the 4, e.g. when board has less slots */
-Ppc4xxSdramBank bank[4];
-} ppc440_sdram_t;
-
 enum {
 SDRAM_R0BAS = 0x40,
 SDRAM_R1BAS,
@@ -578,7 +571,7 @@ static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
 object_unparent(OBJECT(>container));
 }
 
-static void sdram_ddr2_set_bcr(ppc440_sdram_t *sdram, int i,
+static void sdram_ddr2_set_bcr(Ppc4xxSdramDdr2State *sdram, int i,
uint32_t bcr, int enabled)
 {
 if (sdram->bank[i].bcr & 1) {
@@ -596,7 +589,7 @@ static void sdram_ddr2_set_bcr(ppc440_sdram_t *sdram, int i,
 }
 }
 
-static void sdram_ddr2_map_bcr(ppc440_sdram_t *sdram)
+static void sdram_ddr2_map_bcr(Ppc4xxSdramDdr2State *sdram)
 {
 int i;
 
@@ -611,7 +604,7 @@ static void sdram_ddr2_map_bcr(ppc440_sdram_t *sdram)
 }
 }
 
-static void sdram_ddr2_unmap_bcr(ppc440_sdram_t *sdram)
+static void sdram_ddr2_unmap_bcr(Ppc4xxSdramDdr2State *sdram)
 {
 int i;
 
@@ -624,7 +617,7 @@ static void sdram_ddr2_unmap_bcr(ppc440_sdram_t *sdram)
 
 static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn)
 {
-ppc440_sdram_t *sdram = opaque;
+Ppc4xxSdramDdr2State *sdram = opaque;
 uint32_t ret = 0;
 
 switch (dcrn) {
@@ -677,7 +670,7 @@ static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn)
 
 static void sdram_ddr2_dcr_write(void *opaque, int dcrn, uint32_t val)
 {
-ppc440_sdram_t *sdram = opaque;
+Ppc4xxSdramDdr2State *sdram = opaque;
 
 switch (dcrn) {
 case SDRAM_R0BAS:
@@ -719,52 +712,86 @@ static void sdram_ddr2_dcr_write(void *opaque, int dcrn, 
uint32_t val)
 }
 }
 
-static void sdram_ddr2_reset(void *opaque)
+static void ppc4xx_sdram_ddr2_reset(DeviceState *dev)
 {
-ppc440_sdram_t *sdram = opaque;
+Ppc4xxSdramDdr2State *sdram = PPC4xx_SDRAM_DDR2(dev);
 
 sdram->addr = 0;
 sdram->mcopt2 = 0;
 }
 
-void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   MemoryRegion *ram)
+static void ppc4xx_sdram_ddr2_realize(DeviceState *dev, Error **errp)
 {
-ppc440_sdram_t *s;
+Ppc4xxSdramDdr2State *s = PPC4xx_SDRAM_DDR2(dev);
+Ppc4xxDcrDeviceState *dcr = PPC4xx_DCR_DEVICE(dev);
 const ram_addr_t valid_bank_sizes[] = {
 4 * GiB, 2 * GiB, 1 * GiB, 512 * MiB, 256 * MiB, 128 * MiB, 64 * MiB,
 32 * MiB, 16 * MiB, 8 * MiB, 0
 };
 
-s = g_malloc0(sizeof(*s));
-s->nbanks = nbanks;
-ppc4xx_sdram_banks(ram, s->nbanks, s->bank, valid_bank_sizes);
-qemu_register_reset(_ddr2_reset, s);
-ppc_dcr_register(env, SDRAM0_CFGADDR,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM0_CFGDATA,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-
-ppc_dcr_register(env, SDRAM_R0BAS,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_R1BAS,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_R2BAS,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_R3BAS,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_CONF1HB,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_PLBADDULL,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_CONF1LL,
- s, _ddr2_dcr_read, _ddr2_dcr_write);
-ppc_dcr_register(env, SDRAM_CONFPATHB,
-

[PATCH v9 03/12] vhost: Delete useless read memory barrier

2022-08-19 Thread Eugenio Pérez

As discussed in previous series [1], this memory barrier is useless with
the atomic read of used idx at vhost_svq_more_used. Deleting it.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2022-07/msg02616.html

Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 hw/virtio/vhost-shadow-virtqueue.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index b35aeef4bd..8df5296f24 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -509,9 +509,6 @@ size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
 if (unlikely(g_get_monotonic_time() - start_us > 10e6)) {
 return 0;
 }
-
-/* Make sure we read new used_idx */
-smp_rmb();
 } while (true);
 }
 
-- 
2.31.1

[PATCH 09/20] ppc440_sdram: Split off map/unmap of sdram banks for later reuse

2022-08-19 Thread BALATON Zoltan

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 3507c35b63..c33f91e134 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -561,26 +561,33 @@ static uint64_t sdram_size(uint32_t bcr)
 return size;
 }
 
+static void sdram_bank_map(Ppc4xxSdramBank *bank)
+{
+memory_region_init(>container, NULL, "sdram-container", bank->size);
+memory_region_add_subregion(>container, 0, >ram);
+memory_region_add_subregion(get_system_memory(), bank->base,
+>container);
+}
+
+static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
+{
+memory_region_del_subregion(get_system_memory(), >container);
+memory_region_del_subregion(>container, >ram);
+object_unparent(OBJECT(>container));
+}
+
 static void sdram_set_bcr(ppc440_sdram_t *sdram, int i,
   uint32_t bcr, int enabled)
 {
 if (sdram->bank[i].bcr & 1) {
 /* First unmap RAM if enabled */
-memory_region_del_subregion(get_system_memory(),
->bank[i].container);
-memory_region_del_subregion(>bank[i].container,
->bank[i].ram);
-object_unparent(OBJECT(>bank[i].container));
+sdram_bank_unmap(>bank[i]);
 }
 sdram->bank[i].bcr = bcr & 0xffe0ffc1;
+sdram->bank[i].base = sdram_base(bcr);
+sdram->bank[i].size = sdram_size(bcr);
 if (enabled && (bcr & 1)) {
-memory_region_init(>bank[i].container, NULL, "sdram-container",
-   sdram_size(bcr));
-memory_region_add_subregion(>bank[i].container, 0,
->bank[i].ram);
-memory_region_add_subregion(get_system_memory(),
-sdram_base(bcr),
->bank[i].container);
+sdram_bank_map(>bank[i]);
 }
 }
 
-- 
2.30.4

[PATCH 13/20] ppc4xx_sdram: Rename functions to prevent name clashes

2022-08-19 Thread BALATON Zoltan

Rename functions to avoid name clashes when moving the DDR2 controller
model currently called ppc440_sdram to ppc4xx_devs. This also more
clearly shows which function belongs to which model.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c   | 69 ++--
 hw/ppc/ppc4xx_devs.c | 44 ++--
 2 files changed, 57 insertions(+), 56 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 72eb75d3d2..b39c6dbbd2 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -502,7 +502,7 @@ enum {
 SDRAM_PLBADDUHB = 0x50,
 };
 
-static uint32_t sdram_bcr(hwaddr ram_base, hwaddr ram_size)
+static uint32_t sdram_ddr2_bcr(hwaddr ram_base, hwaddr ram_size)
 {
 uint32_t bcr;
 
@@ -547,12 +547,12 @@ static uint32_t sdram_bcr(hwaddr ram_base, hwaddr 
ram_size)
 return bcr;
 }
 
-static inline hwaddr sdram_base(uint32_t bcr)
+static inline hwaddr sdram_ddr2_base(uint32_t bcr)
 {
 return (bcr & 0xffe0) << 2;
 }
 
-static uint64_t sdram_size(uint32_t bcr)
+static uint64_t sdram_ddr2_size(uint32_t bcr)
 {
 uint64_t size;
 int sh;
@@ -578,50 +578,51 @@ static void sdram_bank_unmap(Ppc4xxSdramBank *bank)
 object_unparent(OBJECT(>container));
 }
 
-static void sdram_set_bcr(ppc440_sdram_t *sdram, int i,
-  uint32_t bcr, int enabled)
+static void sdram_ddr2_set_bcr(ppc440_sdram_t *sdram, int i,
+   uint32_t bcr, int enabled)
 {
 if (sdram->bank[i].bcr & 1) {
 /* First unmap RAM if enabled */
-trace_ppc4xx_sdram_unmap(sdram_base(sdram->bank[i].bcr),
- sdram_size(sdram->bank[i].bcr));
+trace_ppc4xx_sdram_unmap(sdram_ddr2_base(sdram->bank[i].bcr),
+ sdram_ddr2_size(sdram->bank[i].bcr));
 sdram_bank_unmap(>bank[i]);
 }
 sdram->bank[i].bcr = bcr & 0xffe0ffc1;
-sdram->bank[i].base = sdram_base(bcr);
-sdram->bank[i].size = sdram_size(bcr);
+sdram->bank[i].base = sdram_ddr2_base(bcr);
+sdram->bank[i].size = sdram_ddr2_size(bcr);
 if (enabled && (bcr & 1)) {
-trace_ppc4xx_sdram_map(sdram_base(bcr), sdram_size(bcr));
+trace_ppc4xx_sdram_map(sdram_ddr2_base(bcr), sdram_ddr2_size(bcr));
 sdram_bank_map(>bank[i]);
 }
 }
 
-static void sdram_map_bcr(ppc440_sdram_t *sdram)
+static void sdram_ddr2_map_bcr(ppc440_sdram_t *sdram)
 {
 int i;
 
 for (i = 0; i < sdram->nbanks; i++) {
 if (sdram->bank[i].size) {
-sdram_set_bcr(sdram, i, sdram_bcr(sdram->bank[i].base,
+sdram_ddr2_set_bcr(sdram, i,
+   sdram_ddr2_bcr(sdram->bank[i].base,
   sdram->bank[i].size), 1);
 } else {
-sdram_set_bcr(sdram, i, 0, 0);
+sdram_ddr2_set_bcr(sdram, i, 0, 0);
 }
 }
 }
 
-static void sdram_unmap_bcr(ppc440_sdram_t *sdram)
+static void sdram_ddr2_unmap_bcr(ppc440_sdram_t *sdram)
 {
 int i;
 
 for (i = 0; i < sdram->nbanks; i++) {
 if (sdram->bank[i].size) {
-sdram_set_bcr(sdram, i, sdram->bank[i].bcr & ~1, 0);
+sdram_ddr2_set_bcr(sdram, i, sdram->bank[i].bcr & ~1, 0);
 }
 }
 }
 
-static uint32_t dcr_read_sdram(void *opaque, int dcrn)
+static uint32_t sdram_ddr2_dcr_read(void *opaque, int dcrn)
 {
 ppc440_sdram_t *sdram = opaque;
 uint32_t ret = 0;
@@ -632,8 +633,8 @@ static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 case SDRAM_R2BAS:
 case SDRAM_R3BAS:
 if (sdram->bank[dcrn - SDRAM_R0BAS].size) {
-ret = sdram_bcr(sdram->bank[dcrn - SDRAM_R0BAS].base,
-sdram->bank[dcrn - SDRAM_R0BAS].size);
+ret = sdram_ddr2_bcr(sdram->bank[dcrn - SDRAM_R0BAS].base,
+ sdram->bank[dcrn - SDRAM_R0BAS].size);
 }
 break;
 case SDRAM_CONF1HB:
@@ -674,7 +675,7 @@ static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 return ret;
 }
 
-static void dcr_write_sdram(void *opaque, int dcrn, uint32_t val)
+static void sdram_ddr2_dcr_write(void *opaque, int dcrn, uint32_t val)
 {
 ppc440_sdram_t *sdram = opaque;
 
@@ -700,12 +701,12 @@ static void dcr_write_sdram(void *opaque, int dcrn, 
uint32_t val)
 if (!(sdram->mcopt2 & 0x0800) && (val & 0x0800)) {
 trace_ppc4xx_sdram_enable("enable");
 /* validate all RAM mappings */
-sdram_map_bcr(sdram);
+sdram_ddr2_map_bcr(sdram);
 sdram->mcopt2 |= 0x0800;
 } else if ((sdram->mcopt2 & 0x0800) && !(val & 0x0800)) {
 trace_ppc4xx_sdram_enable("disable");
 /* invalidate all RAM mappings */
-sdram_unmap_bcr(sdram);
+sdram_ddr2_unmap_bcr(sdram);
 sdram->mcopt2 &= ~0x0800;
 }

[PATCH 08/20] ppc4xx_sdram: Drop extra zeros for readability

2022-08-19 Thread BALATON Zoltan

Constants that are written zero padded for no good reason are hard to
read, it's easier to see what is meant if it's just 0 or 1 instead.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc4xx_devs.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 375834a52b..bfe7b2d3a6 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -49,31 +49,31 @@ static uint32_t sdram_ddr_bcr(hwaddr ram_base, hwaddr 
ram_size)
 
 switch (ram_size) {
 case 4 * MiB:
-bcr = 0x;
+bcr = 0;
 break;
 case 8 * MiB:
-bcr = 0x0002;
+bcr = 0x2;
 break;
 case 16 * MiB:
-bcr = 0x0004;
+bcr = 0x4;
 break;
 case 32 * MiB:
-bcr = 0x0006;
+bcr = 0x6;
 break;
 case 64 * MiB:
-bcr = 0x0008;
+bcr = 0x8;
 break;
 case 128 * MiB:
-bcr = 0x000A;
+bcr = 0xA;
 break;
 case 256 * MiB:
-bcr = 0x000C;
+bcr = 0xC;
 break;
 default:
 qemu_log_mask(LOG_GUEST_ERROR,
   "%s: invalid RAM size 0x%" HWADDR_PRIx "\n", __func__,
   ram_size);
-return 0x;
+return 0;
 }
 bcr |= ram_base & 0xFF80;
 bcr |= 1;
@@ -104,7 +104,7 @@ static target_ulong sdram_size(uint32_t bcr)
 static void sdram_set_bcr(Ppc4xxSdramDdrState *sdram, int i,
   uint32_t bcr, int enabled)
 {
-if (sdram->bank[i].bcr & 0x0001) {
+if (sdram->bank[i].bcr & 1) {
 /* Unmap RAM */
 trace_ppc4xx_sdram_unmap(sdram_base(sdram->bank[i].bcr),
  sdram_size(sdram->bank[i].bcr));
@@ -115,7 +115,7 @@ static void sdram_set_bcr(Ppc4xxSdramDdrState *sdram, int i,
 object_unparent(OBJECT(>bank[i].container));
 }
 sdram->bank[i].bcr = bcr & 0xFFDEE001;
-if (enabled && (bcr & 0x0001)) {
+if (enabled && (bcr & 1)) {
 trace_ppc4xx_sdram_map(sdram_base(bcr), sdram_size(bcr));
 memory_region_init(>bank[i].container, NULL, "sdram-container",
sdram_size(bcr));
@@ -136,7 +136,7 @@ static void sdram_map_bcr(Ppc4xxSdramDdrState *sdram)
 sdram_set_bcr(sdram, i, sdram_ddr_bcr(sdram->bank[i].base,
   sdram->bank[i].size), 1);
 } else {
-sdram_set_bcr(sdram, i, 0x, 0);
+sdram_set_bcr(sdram, i, 0, 0);
 }
 }
 }
@@ -213,7 +213,7 @@ static uint32_t sdram_ddr_dcr_read(void *opaque, int dcrn)
 break;
 default:
 /* Avoid gcc warning */
-ret = 0x;
+ret = 0;
 break;
 }
 
@@ -306,18 +306,18 @@ static void ppc4xx_sdram_ddr_reset(DeviceState *dev)
 {
 Ppc4xxSdramDdrState *sdram = PPC4xx_SDRAM_DDR(dev);
 
-sdram->addr = 0x;
-sdram->bear = 0x;
-sdram->besr0 = 0x; /* No error */
-sdram->besr1 = 0x; /* No error */
-sdram->cfg = 0x;
-sdram->ecccfg = 0x; /* No ECC */
-sdram->eccesr = 0x; /* No error */
+sdram->addr = 0;
+sdram->bear = 0;
+sdram->besr0 = 0; /* No error */
+sdram->besr1 = 0; /* No error */
+sdram->cfg = 0;
+sdram->ecccfg = 0; /* No ECC */
+sdram->eccesr = 0; /* No error */
 sdram->pmit = 0x07C0;
 sdram->rtr = 0x05F0;
 sdram->tr = 0x00854009;
 /* We pre-initialize RAM banks */
-sdram->status = 0x;
+sdram->status = 0;
 sdram->cfg = 0x0080;
 }
 
-- 
2.30.4

[PATCH 10/20] ppc440_sdram: Implement enable bit in the DDR2 SDRAM controller

2022-08-19 Thread BALATON Zoltan

To allow removing the do_init hack we need to improve the DDR2 SDRAM
controller model to handle the enable/disable bit that it ignored so
far.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c | 34 --
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index c33f91e134..7c1513ff69 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -23,6 +23,7 @@
 #include "sysemu/reset.h"
 #include "ppc440.h"
 #include "qom/object.h"
+#include "trace.h"
 
 /*/
 /* L2 Cache as SRAM */
@@ -484,6 +485,7 @@ void ppc4xx_sdr_init(CPUPPCState *env)
 /* SDRAM controller */
 typedef struct ppc440_sdram_t {
 uint32_t addr;
+uint32_t mcopt2;
 int nbanks;
 Ppc4xxSdramBank bank[4];
 } ppc440_sdram_t;
@@ -581,12 +583,15 @@ static void sdram_set_bcr(ppc440_sdram_t *sdram, int i,
 {
 if (sdram->bank[i].bcr & 1) {
 /* First unmap RAM if enabled */
+trace_ppc4xx_sdram_unmap(sdram_base(sdram->bank[i].bcr),
+ sdram_size(sdram->bank[i].bcr));
 sdram_bank_unmap(>bank[i]);
 }
 sdram->bank[i].bcr = bcr & 0xffe0ffc1;
 sdram->bank[i].base = sdram_base(bcr);
 sdram->bank[i].size = sdram_size(bcr);
 if (enabled && (bcr & 1)) {
+trace_ppc4xx_sdram_map(sdram_base(bcr), sdram_size(bcr));
 sdram_bank_map(>bank[i]);
 }
 }
@@ -596,7 +601,7 @@ static void sdram_map_bcr(ppc440_sdram_t *sdram)
 int i;
 
 for (i = 0; i < sdram->nbanks; i++) {
-if (sdram->bank[i].size != 0) {
+if (sdram->bank[i].size) {
 sdram_set_bcr(sdram, i, sdram_bcr(sdram->bank[i].base,
   sdram->bank[i].size), 1);
 } else {
@@ -605,6 +610,17 @@ static void sdram_map_bcr(ppc440_sdram_t *sdram)
 }
 }
 
+static void sdram_unmap_bcr(ppc440_sdram_t *sdram)
+{
+int i;
+
+for (i = 0; i < sdram->nbanks; i++) {
+if (sdram->bank[i].size) {
+sdram_set_bcr(sdram, i, sdram->bank[i].bcr & ~1, 0);
+}
+}
+}
+
 static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 {
 ppc440_sdram_t *sdram = opaque;
@@ -636,7 +652,7 @@ static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 ret = 0x8000;
 break;
 case 0x21: /* SDRAM_MCOPT2 */
-ret = 0x0800;
+ret = sdram->mcopt2;
 break;
 case 0x40: /* SDRAM_MB0CF */
 ret = 0x8001;
@@ -680,6 +696,19 @@ static void dcr_write_sdram(void *opaque, int dcrn, 
uint32_t val)
 switch (sdram->addr) {
 case 0x00: /* B0CR */
 break;
+case 0x21: /* SDRAM_MCOPT2 */
+if (!(sdram->mcopt2 & 0x0800) && (val & 0x0800)) {
+trace_ppc4xx_sdram_enable("enable");
+/* validate all RAM mappings */
+sdram_map_bcr(sdram);
+sdram->mcopt2 |= 0x0800;
+} else if ((sdram->mcopt2 & 0x0800) && !(val & 0x0800)) {
+trace_ppc4xx_sdram_enable("disable");
+/* invalidate all RAM mappings */
+sdram_unmap_bcr(sdram);
+sdram->mcopt2 &= ~0x0800;
+}
+break;
 default:
 break;
 }
@@ -694,6 +723,7 @@ static void sdram_reset(void *opaque)
 ppc440_sdram_t *sdram = opaque;
 
 sdram->addr = 0;
+sdram->mcopt2 = 0x0800;
 }
 
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-- 
2.30.4

[PATCH v9 00/12] NIC vhost-vdpa state restore via Shadow CVQ

2022-08-19 Thread Eugenio Pérez

CVQ of net vhost-vdpa devices can be intercepted since the addition of x-svq.
The virtio-net device model is updated. The migration was blocked because
although the state can be megrated between VMM it was not possible to restore
on the destination NIC.

This series add support for SVQ to inject external messages without the guest's
knowledge, so before the guest is resumed all the guest visible state is
restored. It is done using standard CVQ messages, so the vhost-vdpa device does
not need to learn how to restore it: As long as they have the feature, they
know how to handle it.

Depending on the device, this series may need fixes with message id [1] to
achieve full live migration.

[1] <20220819165357.3591965-1-epere...@redhat.com>

v9:
- Use guest acked features instead of device's.
- Minors: fix typos and patch messages, constify vhost_vdpa and VirtIONet vars,
  delete unneeded increment of cursor.

v8:
- Rename NetClientInfo load to start, so is symmetrical with stop()
- Delete copy of device's in buffer at vhost_vdpa_net_load

v7:
- Remove accidental double free.

v6:
- Move map and unmap of the buffers to the start and stop of the device. This
  implies more callbacks on NetClientInfo, but simplifies the SVQ CVQ code.
- Not assume that in buffer is sizeof(virtio_net_ctrl_ack) in
  vhost_vdpa_net_cvq_add
- Reduce the number of changes from previous versions
- Delete unused memory barrier

v5:
- Rename s/start/load/
- Use independent NetClientInfo to only add load callback on cvq.
- Accept out sg instead of dev_buffers[] at vhost_vdpa_net_cvq_map_elem
- Use only out size instead of iovec dev_buffers to know if the descriptor is
  effectively available, allowing to delete artificial !NULL VirtQueueElement
  on vhost_svq_add call.

v4:
- Actually use NetClientInfo callback.

v3:
- Route vhost-vdpa start code through NetClientInfo callback.
- Delete extra vhost_net_stop_one() call.

v2:
- Fix SIGSEGV dereferencing SVQ when not in svq mode

v1 from RFC:
- Do not reorder DRIVER_OK & enable patches.
- Delete leftovers

Eugenio Pérez (12):
  vhost: stop transfer elem ownership in vhost_handle_guest_kick
  vhost: use SVQ element ndescs instead of opaque data for desc
validation
  vhost: Delete useless read memory barrier
  vhost: Do not depend on !NULL VirtQueueElement on vhost_svq_flush
  vhost_net: Add NetClientInfo start callback
  vhost_net: Add NetClientInfo stop callback
  vdpa: add net_vhost_vdpa_cvq_info NetClientInfo
  vdpa: Move command buffers map to start of net device
  vdpa: extract vhost_vdpa_net_cvq_add from
vhost_vdpa_net_handle_ctrl_avail
  vhost_net: add NetClientState->load() callback
  vdpa: Add virtio-net mac address via CVQ at start
  vdpa: Delete CVQ migration blocker

 include/hw/virtio/vhost-vdpa.h |   1 -
 include/net/net.h  |   6 +
 hw/net/vhost_net.c |  17 +++
 hw/virtio/vhost-shadow-virtqueue.c |  27 ++--
 hw/virtio/vhost-vdpa.c |  15 --
 net/vhost-vdpa.c   | 224 ++---
 6 files changed, 177 insertions(+), 113 deletions(-)

-- 
2.31.1

[PATCH 00/20] ppc4xx_sdram QOMify and clean ups

2022-08-19 Thread BALATON Zoltan

Hello,

This is based on gitlab.com/danielhb/qemu/tree/ppc-7.2

This is the end of the QOMify series started by Cédric. This series
handles the SDRAM controller models to clean them up, QOMify and unify
them and at least partially clean up the mess that has accumulated
around these in the past. This includes the not yet merged patches
from the last series and new ones that change the DDR2 version used by
sam460ex. This is all I have for now I don't intend to make any more
bigger changes for this now.

Regards,
BALATON Zoltan

BALATON Zoltan (20):
  ppc440_bamboo: Remove unnecessary memsets
  ppc4xx: Introduce Ppc4xxSdramBank struct
  ppc4xx_sdram: Get rid of the init RAM hack
  ppc4xx: Use Ppc4xxSdramBank in ppc4xx_sdram_banks()
  ppc440_bamboo: Add missing 4 MiB valid memory size
  ppc4xx_sdram: Move size check to ppc4xx_sdram_init()
  ppc4xx_sdram: QOM'ify
  ppc4xx_sdram: Drop extra zeros for readability
  ppc440_sdram: Split off map/unmap of sdram banks for later reuse
  ppc440_sdram: Implement enable bit in the DDR2 SDRAM controller
  ppc440_sdram: Get rid of the init RAM hack
  ppc440_sdram: Rename local variable for readibility
  ppc4xx_sdram: Rename functions to prevent name clashes
  ppc440_sdram: Move RAM size check to ppc440_sdram_init
  ppc440_sdram: QOM'ify
  ppc4xx_sdram: Move ppc4xx DDR and DDR2 SDRAM controller models
together
  ppc4xx_sdram: Use hwaddr for memory bank size
  ppc4xx_sdram: Rename local state variable for brevity
  ppc4xx_sdram: Generalise bank setup
  ppc4xx_sdram: Convert DDR SDRAM controller to new bank handling

 hw/ppc/meson.build  |   3 +-
 hw/ppc/ppc405.h |   8 +-
 hw/ppc/ppc405_boards.c  |  19 +-
 hw/ppc/ppc405_uc.c  |  33 +-
 hw/ppc/ppc440.h |   4 -
 hw/ppc/ppc440_bamboo.c  |  29 +-
 hw/ppc/ppc440_uc.c  | 267 +--
 hw/ppc/ppc4xx_devs.c| 413 ---
 hw/ppc/ppc4xx_sdram.c   | 723 
 hw/ppc/sam460ex.c   |  48 +--
 hw/ppc/trace-events |   1 +
 include/hw/ppc/ppc4xx.h |  66 +++-
 12 files changed, 844 insertions(+), 770 deletions(-)
 create mode 100644 hw/ppc/ppc4xx_sdram.c

-- 
2.30.4

[PATCH 01/20] ppc440_bamboo: Remove unnecessary memsets

2022-08-19 Thread BALATON Zoltan

In ppc4xx_sdram_init() the struct is allocated with g_new0() so no
need to clear its elements. In the bamboo machine init memset can be
replaced with array initialiser which is shorter.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_bamboo.c | 6 ++
 hw/ppc/ppc4xx_devs.c   | 8 ++--
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index ea945a1c99..5ec82fa8c2 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -169,8 +169,8 @@ static void bamboo_init(MachineState *machine)
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *isa = g_new(MemoryRegion, 1);
 MemoryRegion *ram_memories = g_new(MemoryRegion, PPC440EP_SDRAM_NR_BANKS);
-hwaddr ram_bases[PPC440EP_SDRAM_NR_BANKS];
-hwaddr ram_sizes[PPC440EP_SDRAM_NR_BANKS];
+hwaddr ram_bases[PPC440EP_SDRAM_NR_BANKS] = {0};
+hwaddr ram_sizes[PPC440EP_SDRAM_NR_BANKS] = {0};
 PCIBus *pcibus;
 PowerPCCPU *cpu;
 CPUPPCState *env;
@@ -205,8 +205,6 @@ static void bamboo_init(MachineState *machine)
qdev_get_gpio_in(DEVICE(cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
-memset(ram_bases, 0, sizeof(ram_bases));
-memset(ram_sizes, 0, sizeof(ram_sizes));
 ppc4xx_sdram_banks(machine->ram, PPC440EP_SDRAM_NR_BANKS, ram_memories,
ram_bases, ram_sizes, ppc440ep_sdram_bank_sizes);
 /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. */
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index ce38ae65e6..b4cd10f735 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -363,12 +363,8 @@ void ppc4xx_sdram_init(CPUPPCState *env, qemu_irq irq, int 
nbanks,
 sdram->irq = irq;
 sdram->nbanks = nbanks;
 sdram->ram_memories = ram_memories;
-memset(sdram->ram_bases, 0, 4 * sizeof(hwaddr));
-memcpy(sdram->ram_bases, ram_bases,
-   nbanks * sizeof(hwaddr));
-memset(sdram->ram_sizes, 0, 4 * sizeof(hwaddr));
-memcpy(sdram->ram_sizes, ram_sizes,
-   nbanks * sizeof(hwaddr));
+memcpy(sdram->ram_bases, ram_bases, nbanks * sizeof(hwaddr));
+memcpy(sdram->ram_sizes, ram_sizes, nbanks * sizeof(hwaddr));
 qemu_register_reset(_reset, sdram);
 ppc_dcr_register(env, SDRAM0_CFGADDR,
  sdram, _read_sdram, _write_sdram);
-- 
2.30.4

[PATCH 05/20] ppc440_bamboo: Add missing 4 MiB valid memory size

2022-08-19 Thread BALATON Zoltan

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_bamboo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 2aac8a3fe9..2bd5e41140 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -51,7 +51,7 @@
 #define PPC440EP_SDRAM_NR_BANKS 4
 
 static const ram_addr_t ppc440ep_sdram_bank_sizes[] = {
-256 * MiB, 128 * MiB, 64 * MiB, 32 * MiB, 16 * MiB, 8 * MiB, 0
+256 * MiB, 128 * MiB, 64 * MiB, 32 * MiB, 16 * MiB, 8 * MiB, 4 * MiB, 0
 };
 
 static hwaddr entry;
-- 
2.30.4

[PATCH 11/20] ppc440_sdram: Get rid of the init RAM hack

2022-08-19 Thread BALATON Zoltan

Remove the do_init parameter of ppc440_sdram_init and enable SDRAM
controller from the board via DCR access instead. Firmware does this
so it may not be needed when booting firmware only with -kernel but we
enable it unconditionally to preserve previous behaviour.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440.h| 3 +--
 hw/ppc/ppc440_uc.c | 8 ++--
 hw/ppc/sam460ex.c  | 8 +++-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/ppc440.h b/hw/ppc/ppc440.h
index 5eb2f9a6b3..7bd5cca1ab 100644
--- a/hw/ppc/ppc440.h
+++ b/hw/ppc/ppc440.h
@@ -17,8 +17,7 @@ void ppc4xx_l2sram_init(CPUPPCState *env);
 void ppc4xx_cpr_init(CPUPPCState *env);
 void ppc4xx_sdr_init(CPUPPCState *env);
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   Ppc4xxSdramBank ram_banks[],
-   int do_init);
+   Ppc4xxSdramBank ram_banks[]);
 void ppc4xx_ahb_init(CPUPPCState *env);
 void ppc4xx_dma_init(CPUPPCState *env, int dcr_base);
 void ppc460ex_pcie_init(CPUPPCState *env);
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 7c1513ff69..bd3d60f278 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -723,12 +723,11 @@ static void sdram_reset(void *opaque)
 ppc440_sdram_t *sdram = opaque;
 
 sdram->addr = 0;
-sdram->mcopt2 = 0x0800;
+sdram->mcopt2 = 0;
 }
 
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   Ppc4xxSdramBank ram_banks[],
-   int do_init)
+   Ppc4xxSdramBank ram_banks[])
 {
 ppc440_sdram_t *sdram;
 int i;
@@ -745,9 +744,6 @@ void ppc440_sdram_init(CPUPPCState *env, int nbanks,
  sdram, _read_sdram, _write_sdram);
 ppc_dcr_register(env, SDRAM0_CFGDATA,
  sdram, _read_sdram, _write_sdram);
-if (do_init) {
-sdram_map_bcr(sdram);
-}
 
 ppc_dcr_register(env, SDRAM_R0BAS,
  sdram, _read_sdram, _write_sdram);
diff --git a/hw/ppc/sam460ex.c b/hw/ppc/sam460ex.c
index f4c2a693fb..dac329d482 100644
--- a/hw/ppc/sam460ex.c
+++ b/hw/ppc/sam460ex.c
@@ -345,7 +345,13 @@ static void sam460ex_init(MachineState *machine)
 ppc4xx_sdram_banks(machine->ram, 1, ram_banks, ppc460ex_sdram_bank_sizes);
 
 /* FIXME: does 460EX have ECC interrupts? */
-ppc440_sdram_init(env, 1, ram_banks, 1);
+ppc440_sdram_init(env, 1, ram_banks);
+/* Enable SDRAM memory regions as we may boot without firmware */
+if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x21) ||
+ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x0800)) {
+error_report("Couldn't enable memory regions");
+exit(1);
+}
 
 /* IIC controllers and devices */
 dev = sysbus_create_simple(TYPE_PPC4xx_I2C, 0x4ef600700,
-- 
2.30.4

[PATCH 06/20] ppc4xx_sdram: Move size check to ppc4xx_sdram_init()

2022-08-19 Thread BALATON Zoltan

Instead of checking if memory size is valid in board code move this
check to ppc4xx_sdram_init() as this is a restriction imposed by the
SDRAM controller.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc405.h |  2 --
 hw/ppc/ppc405_boards.c  | 10 --
 hw/ppc/ppc405_uc.c  | 11 ++-
 hw/ppc/ppc440_bamboo.c  | 10 +-
 hw/ppc/ppc4xx_devs.c| 14 ++
 include/hw/ppc/ppc4xx.h |  2 +-
 6 files changed, 10 insertions(+), 39 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index ca0972b88b..ad54dff542 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -167,9 +167,7 @@ struct Ppc405SoCState {
 DeviceState parent_obj;
 
 /* Public */
-Ppc4xxSdramBank ram_banks[2];
 MemoryRegion *dram_mr;
-hwaddr ram_size;
 
 PowerPCCPU cpu;
 PPCUIC uic;
diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
index 0a29ad97c7..a82b6c5c83 100644
--- a/hw/ppc/ppc405_boards.c
+++ b/hw/ppc/ppc405_boards.c
@@ -278,21 +278,11 @@ static void boot_from_kernel(MachineState *machine, 
PowerPCCPU *cpu)
 static void ppc405_init(MachineState *machine)
 {
 Ppc405MachineState *ppc405 = PPC405_MACHINE(machine);
-MachineClass *mc = MACHINE_GET_CLASS(machine);
 const char *kernel_filename = machine->kernel_filename;
 MemoryRegion *sysmem = get_system_memory();
 
-if (machine->ram_size != mc->default_ram_size) {
-char *sz = size_to_str(mc->default_ram_size);
-error_report("Invalid RAM size, should be %s", sz);
-g_free(sz);
-exit(EXIT_FAILURE);
-}
-
 object_initialize_child(OBJECT(machine), "soc", >soc,
 TYPE_PPC405_SOC);
-object_property_set_uint(OBJECT(>soc), "ram-size",
- machine->ram_size, _fatal);
 object_property_set_link(OBJECT(>soc), "dram",
  OBJECT(machine->ram), _abort);
 object_property_set_uint(OBJECT(>soc), "sys-clk", ,
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 461d18c8a5..4049fb98dc 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -1070,15 +1070,9 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
qdev_get_gpio_in(DEVICE(>cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
-/* XXX 405EP has no ECC interrupt */
-s->ram_banks[0].base = 0;
-s->ram_banks[0].size = s->ram_size;
-memory_region_init_alias(>ram_banks[0].ram, OBJECT(s),
- "ppc405.sdram0", s->dram_mr,
- s->ram_banks[0].base, s->ram_banks[0].size);
-
+/* XXX 405EP has no ECC interrupt */
 ppc4xx_sdram_init(env, qdev_get_gpio_in(DEVICE(>uic), 17), 1,
-  s->ram_banks);
+  s->dram_mr);
 
 /* External bus controller */
 if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(>ebc), >cpu, errp)) {
@@ -1156,7 +1150,6 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
 static Property ppc405_soc_properties[] = {
 DEFINE_PROP_LINK("dram", Ppc405SoCState, dram_mr, TYPE_MEMORY_REGION,
  MemoryRegion *),
-DEFINE_PROP_UINT64("ram-size", Ppc405SoCState, ram_size, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 2bd5e41140..9b456f1819 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -50,10 +50,6 @@
 
 #define PPC440EP_SDRAM_NR_BANKS 4
 
-static const ram_addr_t ppc440ep_sdram_bank_sizes[] = {
-256 * MiB, 128 * MiB, 64 * MiB, 32 * MiB, 16 * MiB, 8 * MiB, 4 * MiB, 0
-};
-
 static hwaddr entry;
 
 static int bamboo_load_device_tree(hwaddr addr,
@@ -168,8 +164,6 @@ static void bamboo_init(MachineState *machine)
 unsigned int pci_irq_nrs[4] = { 28, 27, 26, 25 };
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *isa = g_new(MemoryRegion, 1);
-Ppc4xxSdramBank *ram_banks = g_new0(Ppc4xxSdramBank,
-PPC440EP_SDRAM_NR_BANKS);
 PCIBus *pcibus;
 PowerPCCPU *cpu;
 CPUPPCState *env;
@@ -204,11 +198,9 @@ static void bamboo_init(MachineState *machine)
qdev_get_gpio_in(DEVICE(cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
-ppc4xx_sdram_banks(machine->ram, PPC440EP_SDRAM_NR_BANKS, ram_banks,
-   ppc440ep_sdram_bank_sizes);
 /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. */
 ppc4xx_sdram_init(env, qdev_get_gpio_in(uicdev, 14),
-  PPC440EP_SDRAM_NR_BANKS, ram_banks);
+  PPC440EP_SDRAM_NR_BANKS, machine->ram);
 /* Enable SDRAM memory regions, this should be done by the firmware */
 if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x20) ||
 ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x8000)) {
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index e0b5931c04..eb3aa97b16 100644
--- a/hw/ppc/ppc4xx_devs.c
+++

[PATCH 04/20] ppc4xx: Use Ppc4xxSdramBank in ppc4xx_sdram_banks()

2022-08-19 Thread BALATON Zoltan

Change ppc4xx_sdram_banks() to take one Ppc4xxSdramBank array instead
of the separate arrays and adjust ppc4xx_sdram_init() and
ppc440_sdram_init() accordingly as well as machines using these.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc405.h |  4 +---
 hw/ppc/ppc405_uc.c  | 10 +-
 hw/ppc/ppc440.h |  5 ++---
 hw/ppc/ppc440_bamboo.c  | 15 ++-
 hw/ppc/ppc440_uc.c  |  9 -
 hw/ppc/ppc4xx_devs.c| 21 +
 hw/ppc/sam460ex.c   | 15 +--
 include/hw/ppc/ppc4xx.h |  9 +++--
 8 files changed, 35 insertions(+), 53 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 756865621b..ca0972b88b 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -167,9 +167,7 @@ struct Ppc405SoCState {
 DeviceState parent_obj;
 
 /* Public */
-MemoryRegion ram_banks[2];
-hwaddr ram_bases[2], ram_sizes[2];
-
+Ppc4xxSdramBank ram_banks[2];
 MemoryRegion *dram_mr;
 hwaddr ram_size;
 
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 2833d0d538..461d18c8a5 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -1071,14 +1071,14 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
 
 /* SDRAM controller */
 /* XXX 405EP has no ECC interrupt */
-s->ram_bases[0] = 0;
-s->ram_sizes[0] = s->ram_size;
-memory_region_init_alias(>ram_banks[0], OBJECT(s),
+s->ram_banks[0].base = 0;
+s->ram_banks[0].size = s->ram_size;
+memory_region_init_alias(>ram_banks[0].ram, OBJECT(s),
  "ppc405.sdram0", s->dram_mr,
- s->ram_bases[0], s->ram_sizes[0]);
+ s->ram_banks[0].base, s->ram_banks[0].size);
 
 ppc4xx_sdram_init(env, qdev_get_gpio_in(DEVICE(>uic), 17), 1,
-  s->ram_banks, s->ram_bases, s->ram_sizes);
+  s->ram_banks);
 
 /* External bus controller */
 if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(>ebc), >cpu, errp)) {
diff --git a/hw/ppc/ppc440.h b/hw/ppc/ppc440.h
index 7cef936125..5eb2f9a6b3 100644
--- a/hw/ppc/ppc440.h
+++ b/hw/ppc/ppc440.h
@@ -11,14 +11,13 @@
 #ifndef PPC440_H
 #define PPC440_H
 
-#include "hw/ppc/ppc.h"
+#include "hw/ppc/ppc4xx.h"
 
 void ppc4xx_l2sram_init(CPUPPCState *env);
 void ppc4xx_cpr_init(CPUPPCState *env);
 void ppc4xx_sdr_init(CPUPPCState *env);
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   MemoryRegion *ram_memories,
-   hwaddr *ram_bases, hwaddr *ram_sizes,
+   Ppc4xxSdramBank ram_banks[],
int do_init);
 void ppc4xx_ahb_init(CPUPPCState *env);
 void ppc4xx_dma_init(CPUPPCState *env, int dcr_base);
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index e3412c4fcd..2aac8a3fe9 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -168,9 +168,8 @@ static void bamboo_init(MachineState *machine)
 unsigned int pci_irq_nrs[4] = { 28, 27, 26, 25 };
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *isa = g_new(MemoryRegion, 1);
-MemoryRegion *ram_memories = g_new(MemoryRegion, PPC440EP_SDRAM_NR_BANKS);
-hwaddr ram_bases[PPC440EP_SDRAM_NR_BANKS] = {0};
-hwaddr ram_sizes[PPC440EP_SDRAM_NR_BANKS] = {0};
+Ppc4xxSdramBank *ram_banks = g_new0(Ppc4xxSdramBank,
+PPC440EP_SDRAM_NR_BANKS);
 PCIBus *pcibus;
 PowerPCCPU *cpu;
 CPUPPCState *env;
@@ -205,13 +204,11 @@ static void bamboo_init(MachineState *machine)
qdev_get_gpio_in(DEVICE(cpu), PPC40x_INPUT_CINT));
 
 /* SDRAM controller */
-ppc4xx_sdram_banks(machine->ram, PPC440EP_SDRAM_NR_BANKS, ram_memories,
-   ram_bases, ram_sizes, ppc440ep_sdram_bank_sizes);
+ppc4xx_sdram_banks(machine->ram, PPC440EP_SDRAM_NR_BANKS, ram_banks,
+   ppc440ep_sdram_bank_sizes);
 /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. */
-ppc4xx_sdram_init(env,
-  qdev_get_gpio_in(uicdev, 14),
-  PPC440EP_SDRAM_NR_BANKS, ram_memories,
-  ram_bases, ram_sizes);
+ppc4xx_sdram_init(env, qdev_get_gpio_in(uicdev, 14),
+  PPC440EP_SDRAM_NR_BANKS, ram_banks);
 /* Enable SDRAM memory regions, this should be done by the firmware */
 if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x20) ||
 ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x8000)) {
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 6ab0ad7985..3507c35b63 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -690,8 +690,7 @@ static void sdram_reset(void *opaque)
 }
 
 void ppc440_sdram_init(CPUPPCState *env, int nbanks,
-   MemoryRegion *ram_memories,
-   hwaddr *ram_bases, hwaddr *ram_sizes,
+   Ppc4xxSdramBank ram_banks[],

[PATCH 03/20] ppc4xx_sdram: Get rid of the init RAM hack

2022-08-19 Thread BALATON Zoltan

The do_init parameter of ppc4xx_sdram_init() is used to map memory
regions that is normally done by the firmware by programming the SDRAM
controller. This is needed when booting a kernel directly from -kernel
without a firmware. Do this from board code accesing normal SDRAM
controller registers the same way as firmware would do, so we can get
rid of this hack.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc405.h |  1 -
 hw/ppc/ppc405_boards.c  |  9 +++--
 hw/ppc/ppc405_uc.c  |  4 +---
 hw/ppc/ppc440_bamboo.c  |  8 +++-
 hw/ppc/ppc440_uc.c  |  2 --
 hw/ppc/ppc4xx_devs.c| 11 +--
 include/hw/ppc/ppc4xx.h |  8 ++--
 7 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 1e558c7831..756865621b 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -169,7 +169,6 @@ struct Ppc405SoCState {
 /* Public */
 MemoryRegion ram_banks[2];
 hwaddr ram_bases[2], ram_sizes[2];
-bool do_dram_init;
 
 MemoryRegion *dram_mr;
 hwaddr ram_size;
diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
index 083f12b23e..0a29ad97c7 100644
--- a/hw/ppc/ppc405_boards.c
+++ b/hw/ppc/ppc405_boards.c
@@ -263,6 +263,13 @@ static void boot_from_kernel(MachineState *machine, 
PowerPCCPU *cpu)
 boot_info.cmdline_size = bdloc + len;
 }
 
+/* Enable SDRAM memory regions */
+if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x20) ||
+ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x8000)) {
+error_report("Could not enable memory regions");
+exit(1);
+}
+
 /* Install our custom reset handler to start from Linux */
 qemu_register_reset(main_cpu_reset, cpu);
 env->load_info = _info;
@@ -288,8 +295,6 @@ static void ppc405_init(MachineState *machine)
  machine->ram_size, _fatal);
 object_property_set_link(OBJECT(>soc), "dram",
  OBJECT(machine->ram), _abort);
-object_property_set_bool(OBJECT(>soc), "dram-init",
- kernel_filename != NULL, _abort);
 object_property_set_uint(OBJECT(>soc), "sys-clk", ,
  _abort);
 qdev_realize(DEVICE(>soc), NULL, _fatal);
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 6296130936..2833d0d538 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -1078,8 +1078,7 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
  s->ram_bases[0], s->ram_sizes[0]);
 
 ppc4xx_sdram_init(env, qdev_get_gpio_in(DEVICE(>uic), 17), 1,
-  s->ram_banks, s->ram_bases, s->ram_sizes,
-  s->do_dram_init);
+  s->ram_banks, s->ram_bases, s->ram_sizes);
 
 /* External bus controller */
 if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(>ebc), >cpu, errp)) {
@@ -1157,7 +1156,6 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
 static Property ppc405_soc_properties[] = {
 DEFINE_PROP_LINK("dram", Ppc405SoCState, dram_mr, TYPE_MEMORY_REGION,
  MemoryRegion *),
-DEFINE_PROP_BOOL("dram-init", Ppc405SoCState, do_dram_init, 0),
 DEFINE_PROP_UINT64("ram-size", Ppc405SoCState, ram_size, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 5ec82fa8c2..e3412c4fcd 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -211,7 +211,13 @@ static void bamboo_init(MachineState *machine)
 ppc4xx_sdram_init(env,
   qdev_get_gpio_in(uicdev, 14),
   PPC440EP_SDRAM_NR_BANKS, ram_memories,
-  ram_bases, ram_sizes, 1);
+  ram_bases, ram_sizes);
+/* Enable SDRAM memory regions, this should be done by the firmware */
+if (ppc_dcr_write(env->dcr_env, SDRAM0_CFGADDR, 0x20) ||
+ppc_dcr_write(env->dcr_env, SDRAM0_CFGDATA, 0x8000)) {
+error_report("couldn't enable memory regions");
+exit(1);
+}
 
 /* PCI */
 dev = sysbus_create_varargs(TYPE_PPC4xx_PCI_HOST_BRIDGE,
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index db4e29..6ab0ad7985 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -489,8 +489,6 @@ typedef struct ppc440_sdram_t {
 } ppc440_sdram_t;
 
 enum {
-SDRAM0_CFGADDR = 0x10,
-SDRAM0_CFGDATA,
 SDRAM_R0BAS = 0x40,
 SDRAM_R1BAS,
 SDRAM_R2BAS,
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 1226ec4aa9..936d6f77fe 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -56,11 +56,6 @@ struct ppc4xx_sdram_t {
 qemu_irq irq;
 };
 
-enum {
-SDRAM0_CFGADDR = 0x010,
-SDRAM0_CFGDATA = 0x011,
-};
-
 /*
  * XXX: TOFIX: some patches have made this code become inconsistent:
  *  there are type inconsistencies, mixing hwaddr, target_ulong
@@ -350,8 +345,7 @@ static void sdram_reset(void *opaque)
 void ppc4xx_sdram_init(CPUPPCState *env,

[PATCH 1/7] vdpa: Skip the maps not in the iova tree

2022-08-19 Thread Eugenio Pérez

Next patch will skip the registering of dma maps that the vdpa device
rejects in the iova tree. We need to consider that here or we cause a
SIGSEGV accessing result.

Reported-by: Lei Yang 
Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 hw/virtio/vhost-vdpa.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3ff9ce3501..983d3697b0 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -289,6 +289,10 @@ static void vhost_vdpa_listener_region_del(MemoryListener 
*listener,
 };
 
 result = vhost_iova_tree_find_iova(v->iova_tree, _region);
+if (!result) {
+/* The memory listener map wasn't mapped */
+return;
+}
 iova = result->iova;
 vhost_iova_tree_remove(v->iova_tree, result);
 }
-- 
2.31.1

[PATCH 4/7] vdpa: Remove SVQ vring from iova_tree at shutdown

2022-08-19 Thread Eugenio Pérez

Although the device will be reset before usage, the right thing to do is
to clean it.

Reported-by: Lei Yang 
Fixes: 34e3c94eda ("vdpa: Add custom IOTLB translations to SVQ")
Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7e28d2f674..943799c17c 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -898,7 +898,12 @@ static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
 
 size = ROUND_UP(result->size, qemu_real_host_page_size());
 r = vhost_vdpa_dma_unmap(v, result->iova, size);
-return r == 0;
+if (unlikely(r < 0)) {
+return false;
+}
+
+vhost_iova_tree_remove(v->iova_tree, result);
+return 0;
 }
 
 static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
-- 
2.31.1

[PATCH 3/7] util: make a copy of iova_tree_remove_parameter

2022-08-19 Thread Eugenio Pérez

It's convenient to call iova_tree_remove from a map returned from
iova_tree_find or iova_tree_find_iova. With the current code this is not
possible, since we will free it, and then we will try to search for it
again.

Fix it making a copy of the argument. Not applying a fixes tag, since
there is no use like that at the moment.

Signed-off-by: Eugenio Pérez 
---
 util/iova-tree.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/util/iova-tree.c b/util/iova-tree.c
index fee530a579..713e3edd7b 100644
--- a/util/iova-tree.c
+++ b/util/iova-tree.c
@@ -166,9 +166,11 @@ void iova_tree_foreach(IOVATree *tree, iova_tree_iterator 
iterator)
 
 void iova_tree_remove(IOVATree *tree, const DMAMap *map)
 {
+/* Just in case caller is calling iova_tree_remove from a result of find */
+const DMAMap needle = *map;
 const DMAMap *overlap;
 
-while ((overlap = iova_tree_find(tree, map))) {
+while ((overlap = iova_tree_find(tree, ))) {
 g_tree_remove(tree->tree, overlap);
 }
 }
-- 
2.31.1

[PATCH 02/20] ppc4xx: Introduce Ppc4xxSdramBank struct

2022-08-19 Thread BALATON Zoltan

Instead of storing sdram bank parameters in unrelated arrays put them
in a struct so it's clear they belong to the same bank and simplify
the state struct using this bank type.

Signed-off-by: BALATON Zoltan 
---
 hw/ppc/ppc440_uc.c  | 49 +-
 hw/ppc/ppc4xx_devs.c| 59 -
 include/hw/ppc/ppc4xx.h |  8 ++
 3 files changed, 61 insertions(+), 55 deletions(-)

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 53e981ddf4..db4e29 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -16,7 +16,7 @@
 #include "qemu/module.h"
 #include "hw/irq.h"
 #include "exec/memory.h"
-#include "hw/ppc/ppc.h"
+#include "hw/ppc/ppc4xx.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci.h"
 #include "sysemu/block-backend.h"
@@ -485,11 +485,7 @@ void ppc4xx_sdr_init(CPUPPCState *env)
 typedef struct ppc440_sdram_t {
 uint32_t addr;
 int nbanks;
-MemoryRegion containers[4]; /* used for clipping */
-MemoryRegion *ram_memories;
-hwaddr ram_bases[4];
-hwaddr ram_sizes[4];
-uint32_t bcr[4];
+Ppc4xxSdramBank bank[4];
 } ppc440_sdram_t;
 
 enum {
@@ -570,23 +566,23 @@ static uint64_t sdram_size(uint32_t bcr)
 static void sdram_set_bcr(ppc440_sdram_t *sdram, int i,
   uint32_t bcr, int enabled)
 {
-if (sdram->bcr[i] & 1) {
+if (sdram->bank[i].bcr & 1) {
 /* First unmap RAM if enabled */
 memory_region_del_subregion(get_system_memory(),
->containers[i]);
-memory_region_del_subregion(>containers[i],
->ram_memories[i]);
-object_unparent(OBJECT(>containers[i]));
+>bank[i].container);
+memory_region_del_subregion(>bank[i].container,
+>bank[i].ram);
+object_unparent(OBJECT(>bank[i].container));
 }
-sdram->bcr[i] = bcr & 0xffe0ffc1;
+sdram->bank[i].bcr = bcr & 0xffe0ffc1;
 if (enabled && (bcr & 1)) {
-memory_region_init(>containers[i], NULL, "sdram-containers",
+memory_region_init(>bank[i].container, NULL, "sdram-container",
sdram_size(bcr));
-memory_region_add_subregion(>containers[i], 0,
->ram_memories[i]);
+memory_region_add_subregion(>bank[i].container, 0,
+>bank[i].ram);
 memory_region_add_subregion(get_system_memory(),
 sdram_base(bcr),
->containers[i]);
+>bank[i].container);
 }
 }
 
@@ -595,9 +591,9 @@ static void sdram_map_bcr(ppc440_sdram_t *sdram)
 int i;
 
 for (i = 0; i < sdram->nbanks; i++) {
-if (sdram->ram_sizes[i] != 0) {
-sdram_set_bcr(sdram, i, sdram_bcr(sdram->ram_bases[i],
-  sdram->ram_sizes[i]), 1);
+if (sdram->bank[i].size != 0) {
+sdram_set_bcr(sdram, i, sdram_bcr(sdram->bank[i].base,
+  sdram->bank[i].size), 1);
 } else {
 sdram_set_bcr(sdram, i, 0, 0);
 }
@@ -614,9 +610,9 @@ static uint32_t dcr_read_sdram(void *opaque, int dcrn)
 case SDRAM_R1BAS:
 case SDRAM_R2BAS:
 case SDRAM_R3BAS:
-if (sdram->ram_sizes[dcrn - SDRAM_R0BAS]) {
-ret = sdram_bcr(sdram->ram_bases[dcrn - SDRAM_R0BAS],
-sdram->ram_sizes[dcrn - SDRAM_R0BAS]);
+if (sdram->bank[dcrn - SDRAM_R0BAS].size) {
+ret = sdram_bcr(sdram->bank[dcrn - SDRAM_R0BAS].base,
+sdram->bank[dcrn - SDRAM_R0BAS].size);
 }
 break;
 case SDRAM_CONF1HB:
@@ -701,12 +697,15 @@ void ppc440_sdram_init(CPUPPCState *env, int nbanks,
int do_init)
 {
 ppc440_sdram_t *sdram;
+int i;
 
 sdram = g_malloc0(sizeof(*sdram));
 sdram->nbanks = nbanks;
-sdram->ram_memories = ram_memories;
-memcpy(sdram->ram_bases, ram_bases, nbanks * sizeof(hwaddr));
-memcpy(sdram->ram_sizes, ram_sizes, nbanks * sizeof(hwaddr));
+for (i = 0; i < nbanks; i++) {
+sdram->bank[i].ram = ram_memories[i];
+sdram->bank[i].base = ram_bases[i];
+sdram->bank[i].size = ram_sizes[i];
+}
 qemu_register_reset(_reset, sdram);
 ppc_dcr_register(env, SDRAM0_CFGADDR,
  sdram, _read_sdram, _write_sdram);
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index b4cd10f735..1226ec4aa9 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -42,10 +42,7 @@ typedef struct ppc4xx_sdram_t ppc4xx_sdram_t;
 struct ppc4xx_sdram_t {
 uint32_t addr;
 int nbanks;
-MemoryRegion containers[4]; /* used for clipping */
-MemoryRegion *ram_memories;
-hwaddr ram_bases[4];
-hwaddr ram_sizes[4];
+

[PATCH 6/7] vhost: Always store new kick fd on vhost_svq_set_svq_kick_fd

2022-08-19 Thread Eugenio Pérez

We can unbind twice a file descriptor if we call twice
vhost_svq_set_svq_kick_fd because of this. Since it comes from vhost and
not from SVQ, that file descriptor could be a different thing that
guest's vhost notifier.

Likewise, it can happens the same if a guest start and stop the device
multiple times.

Reported-by: Lei Yang 
Fixes: dff4426fa6 ("vhost: Add Shadow VirtQueue kick forwarding capabilities")
Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-shadow-virtqueue.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index e4956728dd..82a784d250 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -602,13 +602,13 @@ void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, 
int svq_kick_fd)
 event_notifier_set_handler(svq_kick, NULL);
 }
 
+event_notifier_init_fd(svq_kick, svq_kick_fd);
 /*
  * event_notifier_set_handler already checks for guest's notifications if
  * they arrive at the new file descriptor in the switch, so there is no
  * need to explicitly check for them.
  */
 if (poll_start) {
-event_notifier_init_fd(svq_kick, svq_kick_fd);
 event_notifier_set(svq_kick);
 event_notifier_set_handler(svq_kick, vhost_handle_guest_kick_notifier);
 }
@@ -655,7 +655,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, 
VirtIODevice *vdev,
  */
 void vhost_svq_stop(VhostShadowVirtqueue *svq)
 {
-event_notifier_set_handler(>svq_kick, NULL);
+vhost_svq_set_svq_kick_fd(svq, VHOST_FILE_UNBIND);
 g_autofree VirtQueueElement *next_avail_elem = NULL;
 
 if (!svq->vq) {
-- 
2.31.1

[PATCH 5/7] vdpa: Make SVQ vring unmapping return void

2022-08-19 Thread Eugenio Pérez

Nothing actually reads the return value, but an error in cleaning some
entries could cause device stop to abort, making a restart impossible.
Better ignore explicitely the return value.

Reported-by: Lei Yang 
Fixes: 34e3c94eda ("vdpa: Add custom IOTLB translations to SVQ")
Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 943799c17c..07d00f5284 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -884,7 +884,7 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
 /**
  * Unmap a SVQ area in the device
  */
-static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
+static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
   const DMAMap *needle)
 {
 const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, needle);
@@ -893,37 +893,32 @@ static bool vhost_vdpa_svq_unmap_ring(struct vhost_vdpa 
*v,
 
 if (unlikely(!result)) {
 error_report("Unable to find SVQ address to unmap");
-return false;
+return;
 }
 
 size = ROUND_UP(result->size, qemu_real_host_page_size());
 r = vhost_vdpa_dma_unmap(v, result->iova, size);
 if (unlikely(r < 0)) {
-return false;
+return;
 }
 
 vhost_iova_tree_remove(v->iova_tree, result);
-return 0;
 }
 
-static bool vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
+static void vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
const VhostShadowVirtqueue *svq)
 {
 DMAMap needle = {};
 struct vhost_vdpa *v = dev->opaque;
 struct vhost_vring_addr svq_addr;
-bool ok;
 
 vhost_svq_get_vring_addr(svq, _addr);
 
 needle.translated_addr = svq_addr.desc_user_addr;
-ok = vhost_vdpa_svq_unmap_ring(v, );
-if (unlikely(!ok)) {
-return false;
-}
+vhost_vdpa_svq_unmap_ring(v, );
 
 needle.translated_addr = svq_addr.used_user_addr;
-return vhost_vdpa_svq_unmap_ring(v, );
+vhost_vdpa_svq_unmap_ring(v, );
 }
 
 /**
@@ -1094,26 +1089,22 @@ err:
 return false;
 }
 
-static bool vhost_vdpa_svqs_stop(struct vhost_dev *dev)
+static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
 {
 struct vhost_vdpa *v = dev->opaque;
 
 if (!v->shadow_vqs) {
-return true;
+return;
 }
 
 for (unsigned i = 0; i < v->shadow_vqs->len; ++i) {
 VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, i);
-bool ok = vhost_vdpa_svq_unmap_rings(dev, svq);
-if (unlikely(!ok)) {
-return false;
-}
+vhost_vdpa_svq_unmap_rings(dev, svq);
 }
 
 if (v->migration_blocker) {
 migrate_del_blocker(v->migration_blocker);
 }
-return true;
 }
 
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
@@ -1130,10 +1121,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, 
bool started)
 }
 vhost_vdpa_set_vring_ready(dev);
 } else {
-ok = vhost_vdpa_svqs_stop(dev);
-if (unlikely(!ok)) {
-return -1;
-}
+vhost_vdpa_svqs_stop(dev);
 vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
 }
 
-- 
2.31.1

[PATCH 2/7] vdpa: do not save failed dma maps in SVQ iova tree

2022-08-19 Thread Eugenio Pérez

If a map fails for whatever reason, it must not be saved in the tree.
Otherwise, qemu will try to unmap it in cleanup, leaving to more errors.

Fixes: 34e3c94eda ("vdpa: Add custom IOTLB translations to SVQ")
Reported-by: Lei Yang 
Signed-off-by: Eugenio Pérez 
Acked-by: Jason Wang 
---
 hw/virtio/vhost-vdpa.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 983d3697b0..7e28d2f674 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -176,6 +176,7 @@ static void vhost_vdpa_listener_commit(MemoryListener 
*listener)
 static void vhost_vdpa_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
 {
+DMAMap mem_region = {};
 struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
 hwaddr iova;
 Int128 llend, llsize;
@@ -212,13 +213,13 @@ static void vhost_vdpa_listener_region_add(MemoryListener 
*listener,
 
 llsize = int128_sub(llend, int128_make64(iova));
 if (v->shadow_vqs_enabled) {
-DMAMap mem_region = {
-.translated_addr = (hwaddr)(uintptr_t)vaddr,
-.size = int128_get64(llsize) - 1,
-.perm = IOMMU_ACCESS_FLAG(true, section->readonly),
-};
+int r;
 
-int r = vhost_iova_tree_map_alloc(v->iova_tree, _region);
+mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
+mem_region.size = int128_get64(llsize) - 1,
+mem_region.perm = IOMMU_ACCESS_FLAG(true, section->readonly),
+
+r = vhost_iova_tree_map_alloc(v->iova_tree, _region);
 if (unlikely(r != IOVA_OK)) {
 error_report("Can't allocate a mapping (%d)", r);
 goto fail;
@@ -232,11 +233,16 @@ static void vhost_vdpa_listener_region_add(MemoryListener 
*listener,
  vaddr, section->readonly);
 if (ret) {
 error_report("vhost vdpa map fail!");
-goto fail;
+goto fail_map;
 }
 
 return;
 
+fail_map:
+if (v->shadow_vqs_enabled) {
+vhost_iova_tree_remove(v->iova_tree, _region);
+}
+
 fail:
 /*
  * On the initfn path, store the first error in the container so we
-- 
2.31.1

[PATCH 7/7] vdpa: Use ring hwaddr at vhost_vdpa_svq_unmap_ring

2022-08-19 Thread Eugenio Pérez

Reduce code duplication.

Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 07d00f5284..45d6e86b45 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -884,10 +884,12 @@ static int vhost_vdpa_svq_set_fds(struct vhost_dev *dev,
 /**
  * Unmap a SVQ area in the device
  */
-static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v,
-  const DMAMap *needle)
+static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa *v, hwaddr addr)
 {
-const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, needle);
+const DMAMap needle = {
+.translated_addr = addr,
+};
+const DMAMap *result = vhost_iova_tree_find_iova(v->iova_tree, );
 hwaddr size;
 int r;
 
@@ -908,17 +910,14 @@ static void vhost_vdpa_svq_unmap_ring(struct vhost_vdpa 
*v,
 static void vhost_vdpa_svq_unmap_rings(struct vhost_dev *dev,
const VhostShadowVirtqueue *svq)
 {
-DMAMap needle = {};
 struct vhost_vdpa *v = dev->opaque;
 struct vhost_vring_addr svq_addr;
 
 vhost_svq_get_vring_addr(svq, _addr);
 
-needle.translated_addr = svq_addr.desc_user_addr;
-vhost_vdpa_svq_unmap_ring(v, );
+vhost_vdpa_svq_unmap_ring(v, svq_addr.desc_user_addr);
 
-needle.translated_addr = svq_addr.used_user_addr;
-vhost_vdpa_svq_unmap_ring(v, );
+vhost_vdpa_svq_unmap_ring(v, svq_addr.used_user_addr);
 }
 
 /**
@@ -996,7 +995,7 @@ static bool vhost_vdpa_svq_map_rings(struct vhost_dev *dev,
 ok = vhost_vdpa_svq_map_ring(v, _region, errp);
 if (unlikely(!ok)) {
 error_prepend(errp, "Cannot create vq device region: ");
-vhost_vdpa_svq_unmap_ring(v, _region);
+vhost_vdpa_svq_unmap_ring(v, driver_region.translated_addr);
 }
 addr->used_user_addr = device_region.iova;
 
-- 
2.31.1

[PATCH 0/7] vDPA shadow virtqueue iova tree fixes.

2022-08-19 Thread Eugenio Pérez

Collection of iova tree fixes detected preparing live migration with real
devices and multiqueue.

These cannot be triggered in simple setups (vdpa_sim_net, no display, no
device reset with different features) but it's possible to trigger them with
real devices or if the kernel fails some step like memory mapping / unmapping.

First two patches are already in the list at [1]. Last one is not a fix by
itself but a straightforward merge of the same code.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00773.html

Eugenio Pérez (7):
  vdpa: Skip the maps not in the iova tree
  vdpa: do not save failed dma maps in SVQ iova tree
  util: make a copy of iova_tree_remove_parameter
  vdpa: Remove SVQ vring from iova_tree at shutdown
  vdpa: Make SVQ vring unmapping return void
  vhost: Always store new kick fd on vhost_svq_set_svq_kick_fd
  vdpa: Use ring hwaddr at vhost_vdpa_svq_unmap_ring

 hw/virtio/vhost-shadow-virtqueue.c |  4 +-
 hw/virtio/vhost-vdpa.c | 70 +++---
 util/iova-tree.c   |  4 +-
 3 files changed, 41 insertions(+), 37 deletions(-)

-- 
2.31.1

Python test QMP Server implementation / version reply

2022-08-19 Thread John Snow

Hi Eric:

Once upon a time I mentioned writing a QMP server implementation in Python.
Now that the dust of the fork has (mostly) settled, I've rebased and
uploaded that WIP here:

https://gitlab.com/jsnow/python-qemu-qmp/-/commits/create_server/

Tracking issue: https://gitlab.com/qemu-project/python-qemu-qmp/-/issues/2

It's still in prototype shape, but mostly works (Except the OOB support
...!). I wanted to ask again what you thought about what the version reply
ought to be for this test appliance.

I assume that it will be most helpful if the programmer can set their own
version reply, but it should likely have a default that indicates that it's
the qemu.qmp test server and what version it is. I want to ensure that the
reply is aligned with the QMP spec, but the spec is a little fuzzy on what
the requirements for the version reply are. (I can't find our previous
discussion on the topic right now, but I know it was discussed.)

My intended use case for this appliance is for unit testing the QMP library
itself without relying on QEMU, but it may have other uses for testing
other things, too.

Thanks,
--js

Re: [PULL 0/1] query-stats-schemas crash fix for QEMU 7.1

2022-08-19 Thread Richard Henderson


On 8/19/22 02:12, Paolo Bonzini wrote:

The following changes since commit c7208a6e0d049f9e8af15df908168a79b1f99685:

   Update version for v7.1.0-rc3 release (2022-08-16 20:45:19 -0500)

are available in the Git repository at:

   https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to a9197ad2101cfc885cc316af299f49ba89039e54:

   kvm: fix segfault with query-stats-schemas and -M none (2022-08-18 14:08:24 
+0200)


Fix SIGSEGV with query-stats-schema.

This allows management tools to query the statistics schemas without
worrying that some versions of QEMU will crash.



Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/7.1 as 
appropriate.


r~





Paolo Bonzini (1):
   kvm: fix segfault with query-stats-schemas and -M none

  accel/kvm/kvm-all.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

Re: [RFC PATCH v2 2/8] qapi: golang: Generate qapi's alternate types in Go

2022-08-19 Thread Andrea Bolognani

On Wed, Aug 17, 2022 at 04:04:19PM +0200, Victor Toso wrote:
> On Tue, Jul 05, 2022 at 08:45:06AM -0700, Andrea Bolognani wrote:
> > On Fri, Jun 17, 2022 at 02:19:26PM +0200, Victor Toso wrote:
> > > func (s *BlockdevRef) UnmarshalJSON(data []byte) error {
> > > // Check for json-null first
> > > if string(data) == "null" {
> > > return errors.New(`null not supported for BlockdevRef`)
> > > }
> > > // Check for BlockdevOptions
> > > {
> > > s.Definition = new(BlockdevOptions)
> > > if err := StrictDecode(s.Definition, data); err == nil {
> > > return nil
> > > }
> >
> > The use of StrictDecode() here means that we won't be able to
> > parse an alternate produced by a version of QEMU where
> > BlockdevOptions has gained additional fields, doesn't it?
>
> That's correct. This means that with this RFCv2 proposal, qapi-go
> based on qemu version 7.1 might not be able to decode a qmp
> message from qemu version 7.2 if it has introduced a new field.
>
> This needs fixing, not sure yet the way to go.
>
> > Considering that we will happily parse such a BlockdevOptions
> > outside of the context of BlockdevRef, I think we should be
> > consistent and allow the same to happen here.
>
> StrictDecode is only used with alternates because, unlike unions,
> Alternate types don't have a 'discriminator' field that would
> allow us to know what data type to expect.
>
> With this in mind, theoretically speaking, we could have very
> similar struct types as Alternate fields and we have to find on
> runtime which type is that underlying byte stream.
>
> So, to reply to your suggestion, if we allow BlockdevRef without
> StrictDecode we might find ourselves in a situation that it
> matched a few fields of BlockdevOptions but it the byte stream
> was actually another type.

IIUC your concern is that the QAPI schema could gain a new type,
TotallyNotBlockdevOptions, which looks exactly like BlockdevOptions
except for one or more extra fields.

If QEMU then produced a JSON like

  { "description": { /* a TotallyNotBlockdevOptions here */ } }

and we'd try to deserialize it with Go code like

  ref := BlockdevRef{}
  json.Unmarsal()

we'd end up mistakenly parsing the TotallyNotBlockdevOptions as a
valid BlockdevOptions, dropping the extra fields in the process.

Does that correctly describe the reason why you feel that the use of
StrictDecode is necessary?

If so, I respectfully disagree :)

If the client code is expecting a BlockdevRef as the return value of
a command and QEMU is producing something that is *not* a BlockdevRef
instead, that's an obvious bug in QEMU. If the client code is
expecting a BlockdevRef as the return value of a command that is
specified *not* to return a BlockdevRef, that's an obvious bug in the
client code.

In neither case it should be the responsibility of the SDK to
second-guess the declared intent, especially when it's perfectly
valid for a type to be extended in a backwards-compatible way by
adding fields to it.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [PATCH 0/4] Speed up migration tests

2022-08-19 Thread Alex Bennée



Thomas Huth  writes:

> We are currently facing the problem that the "gcov-gprof" CI jobs
> in the gitlab-CI are running way too long - which happens since
> the migration-tests have been enabled there recently.
>
> These patches now speed up the migration tests, so that the
> CI job should be fine again.

Awesome stuff, queued to testing/next, thanks.

-- 
Alex Bennée

Re: [PATCH v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation

2022-08-19 Thread Paolo Bonzini


On 8/18/22 17:01, Vitaly Kuznetsov wrote:

Changes since RFC:
- Call kvm_put_msr_feature_control() before kvm_put_sregs2() to achieve
  the same result [Paolo].
- Add Maxim's R-b to PATCH1.

It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
to reboot when QEMU's 'system_reset' command is issued. The problem appears
to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
VMX root operation (post-VMXON). kvm_arch_put_registers() does call
kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
operation, however, it only does so after kvm_put_sregs2() and there's
a good reason for that: 'real' nested state requires e.g. EFER.SVME to
be set.

The root cause of the issue seems to be that QEMU is doing quite a lot
to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
rather, it's super-set) yet. While all the numerous existing APIs for
setting a vCPU state work fine for a newly created vCPU, using them for
vCPU reset is a mess caused by various dependencies between different
components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
been possible to allow to set 'inconsistent' state and only validate it
upon VCPU_RUN from the very beginning but that ship has long sailed for
KVM. A new, dedicated API for vCPU reset is likely the way to go.

Resolve the immediate issue by setting MSR_IA32_FEATURE_CONTROL before
kvm_put_sregs2() (and kvm_put_nested_state()), this ensures vCPU gets
kicked out of VMX root operation.


Queued, thanks.

Paolo

Re: [qemu-web PATCH] Add signing pubkey for python-qemu-qmp package

2022-08-19 Thread Paolo Bonzini


On 8/18/22 18:53, John Snow wrote:

Add the pubkey currently used for signing PyPI releases of qemu.qmp to a
stable location where it can be referenced by e.g. Fedora RPM specfiles.

At present, the key happens to just simply be my own -- but future
releases may be signed by a different key. In that case, we can
increment '1.txt' to '2.txt' and so on. The old keys should be left in
place.

The format for the keyfile was chosen by copying what OpenStack was
doing:
https://releases.openstack.org/_static/0x2426b928085a020d8a90d0d879ab7008d0896c8a.txt

Generated with:

gpg --with-fingerprint --list-keys js...@redhat.com > pubkey
gpg --armor --export js...@redhat.com >> pubkey


Signed-off-by: John Snow 
---
  assets/keys/python-qemu-qmp.1.txt | 288 ++
  1 file changed, 288 insertions(+)
  create mode 100644 assets/keys/python-qemu-qmp.1.txt

diff --git a/assets/keys/python-qemu-qmp.1.txt 
b/assets/keys/python-qemu-qmp.1.txt
new file mode 100644
index 000..54edbbd
--- /dev/null
+++ b/assets/keys/python-qemu-qmp.1.txt
@@ -0,0 +1,288 @@
+pub   rsa4096 2015-01-29 [SC] [expires: 2023-05-28]
+  FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
+uid   [ultimate] John Snow (John Huston) 
+sub   rsa4096 2015-01-29 [E] [expires: 2023-05-28]
+sub   rsa4096 2015-01-29 [S] [expires: 2023-05-28]
+
+-BEGIN PGP PUBLIC KEY BLOCK-
+
+mQINBFTKefwBEAChvwqYC6saTzawbih87LqBYq0d5A8jXYXaiFMV/EvMSDqqY4EY
+6whXliNOIYzhgrPEe7ZmPxbCSe4iMykjhwMh5byIHDoPGDU+FsQty2KXuoxto+Zd
+rP9gymAgmyqdk3aVvzzmCa3cOppcqKvA0Kqr10UeX/z4OMVV390V+DVWUvzXpda4
+5/Sxup57pk+hyY52wxxjIqefrj8u5BN93s5uCVTus0oiVA6W+iXYzTvVDStMFVqn
+TxSxlpZoH5RGKvmoWV3uutByQyBPHW2U1Y6n6iEZ9MlP3hcDqlo0S8jeP03HaD4g
+OqCuqLceWF5+2WyHzNfylpNMFVi+Hp0H/nSDtCvQua7j+6Pt7q5rvqgHvRipkDDV
+sjqwasuNc3wyoHexrBeLU/iJBuDld5iLy+dHXoYMB3HmjMxj3K5/8XhGrDx6BDFe
+O3HIpi3u2z1jniB7RtyVEtdupED6lqsDj0oSz9NxaOFZrS3Jf6z/kHIfh42mM9Sx
+7+s4c07N2LieUxcfqhFTaa/voRibF4cmkBVUhOD1AKXNfhEsTvmcz9NbUchCkcvA
+T9119CrsxfVsE7bXiGvdXnzyGLXdsoosjzwacKdOrVaDmN3Uy+SHiQXo6TlkSdV0
+XH2PUxTMLsBFIO9qXO43Ai6J6iPAP/01l8fuZfpJE0/L/c25yyaND7xA3wARAQAB
+tCpKb2huIFNub3cgKEpvaG4gSHVzdG9uKSA8anNub3dAcmVkaGF0LmNvbT6JAj0E
+EwECACcCGwMCHgECF4AFCwkIBwMFFQoJCAsFFgIDAQAFAlTKigkFCQPCdwsACgkQ
+iKkGTRg1Yet1Pw/+KGEA0n30z1oSgFLPs2XyVvpeH8bpanTVufOHjwlcaBgmUEk8
+KnPRd7oL8y4cq9KjmJwip2hH2vjeBR1HtxmEx06GvGBA9X/YDMaihmJmIHSlxJfl
+YpaK52R1bJYWBTNyK7X5VCU+nQdhdz80X10MLQcdwX13HkP8DfxnbTSj1oSgoOwZ
+zb4ni9xOmwHOpdKUSCm6hJUlgsIHWB193CVpV9CHoU8ovUoGIDEt8l17tPtf/QcP
+wdW65Bfqq0k1WeVBjdq7birH216rcdP6FkEwyJcFBJWUk4U44iZPKJMiqhAysujH
++JCwOk3n4+/SUQd4uO8gdnkfTIqGu6wwOUq63B0B0qm50OOZ6Ir2tyQ44ae5X0PG
+13wJqvmWi9umlK1qiXDACCJX0xW6hRvLAnHYnGllidfZSopkFvxUvs+CpCwJYZuH
+DLbfUQnl/eF8oYR5QjQRxrFOr2l7TJVgxTEJQRuyWDFtJE4c1krB3IQPDA4f5jpM
+FagWp6J+oIzdLhMabxFlSTpDnrbkZxy1qra0FW1oWBoV83/nR+8rXY1q94/9+4ib
+cBKDdIYrQX22CCU3MRlksQVGPk7swNdlaucRuED6Ow5rQU/0GDWEkNsWrtb/EQ/e
+ZH4RcLifgEKfFvWhuxP3za/kWu0cmFtyhcxAMsJUolh4FzQf+LMJ8Y1/LimJARsE
+EAECAAYFAlXWjaEACgkQUhGOPAsp2mvr5gf49Dxc3tJ96er+pH/EoBZ4b+Q+0kWX
+NA2FQY8fDeNvHlvB7pn4mZ2wnFAhc94dbmFWe+Zd067tSC66wQboInaANSpt6PYC
+CazbxGtqxOimSpoPi1awQDk0rCJ3UBYnIPhiJUP52mH0hhgwo6Y9pWMCpNwyuVng
+XaZLWxnN0sL+k00DKpEnPJDDzux7B9dllIk1x91ux7rNWfM+EbUS/iLWUM5KxC/k
+9WTPC+38K46Erzhdd+ZwVH5/d+jXxQXYxPgDTTjmsq5Bq1gwzUlzZuKVt39G9rQW
+m3GJsWbCCtjJSQvYmHglm2t1A1A9aXiG982fsBQ3JZo1/w/8GJhU94MbiQEcBBAB
+AgAGBQJU045eAAoJEJykq7OBq3PInwgIAJ2VQOIdDZ0q6OohWchGZ5qdjk8f25wy
+kreyv7t+nZ/fWr3K4GvdRo9YboBPYe/A44oPBBc9E2JUp4nwlNVzqyuJDcS2T7cU
+lGcRcHdPg7mdq78V1HxRcgMXti8+dht/eReBnuc7Y0Whrst4336u8MoVcIuaix2X
+jMOt/qvZ4MYL7f9OXjT0I0k/FUXpThT/Lb5Yn60ZdeDvfTuSOtV5OIaevy9QgW3g
+tvRo5GHw0Mrn/IFY9ZFH9B9jqVqhm5om1l/9rcaZGGF4gsZ+Lnm6AKP04jGM3t5v
+PoeWYkG8k2Dt7KdpqgheK75U9NTR3E4PpHNSJ5vBnZEae05Prh+vTTqJARwEEgEI
+AAYFAlXbbx4ACgkQp6FrSiUnQ2ou7wf/TpH0oSP3KSn2bGN7+6fqq/OLQ1QXsrAn
+JUzDzf+/JdqvhoGRnFkWH4+6aSqp+tNSnmfqNFl4mSkFFVTCLc4Jg989zrGpGgzx
+G6qb3Dpx3zURGXW26x8b156dxUcCB339Uz0SiocDtwq/w54NQgZXWxob7XJIzx5z
+74biFVwncKGn4v9kr7CryI9bgf3BhSEmCzWCBvUYgeTGSV9qQZyQ01QFP84aS/2I
+I1lnsN0b1NrlBLhhmq8A/TLmGJhh8AbWc+6OC3ImWB/xSnLFebXvNGQfXiTOB7n8
+/p2n8yOaWMV7O/wn5s6tgpmbmArC3tcrRYoIq/2HyAwtFYfnRtO4HYkBHAQTAQgA
+BgUCVdtZyQAKCRD0B9sAYdXPQFQFCACNbrL26QDM2GkMlCXtC7MVyf6tRxF3diXv
+cnWil8BtP3b+Iqv35Udqx8Y49PLRDy7j2ATFDdIn3Pl/fu1mSmbai6hD6P07dwLc
+jzF8nimh/vTOFN1FgOAX3hTlmIyAn5eCW4nKshxsjaX5SwI7BKMELZ77Y1E823//
+yCvtSH1Nwq6sPTfhUiFlrLPJltCg0T7teg3nUsDaVE8FTuQXN/0HwGtpcGHjz3k0
+/vZH1vZd8W2vIzVaAnIUxU4H4myFT7V9vBBn1xlsLxmQALb8HGMQsuTP3zdTmReY
+WKgb3rtGAic18GtSIoqAoRLKqUZKh9AbT4AOHYT1OmFJYRzyix7NiQHwBBABAgAG
+BQJV7DxBAAoJEH4VEAzNNmmxTFkOoMTA5Af0IWHUg4EKbFHVO6gULGkAKw1ZjnIj
+UtfE+JZ8/bs6sAXCS7gGMa8llf6TprUefmkDsp8t6HjKyw6n7TBwfY7RSz9eqGFW
+3/DmAn0iai9c29gfBZXlzsVyrpgy/RcHYkDTW/e1rP3CKz7W2FMTlZGcHx3DFXJY
+fQDVRyt2lF1qUWrgByYXhcSVbva65M8gvUJRt2D6ODfxTU8+HgcA1XsbkPw6Yptu
+XbxRbGCv16hQsaN8dz8FuFCF5qxpeVje9w9N8vHxgEFjxyCGPvX9lHY48w9cOVtv
+6n+H5TFhkl56sWU2Zb0k2QnCwzRT68+o8RQZPDLD4fK7GEhGO9hdQtaV+S3DxoDz

Re: [PATCH 1/2] block: pass OnOffAuto instead of bool to block_acct_setup()

2022-08-19 Thread Vladimir Sementsov-Ogievskiy


On 8/17/22 12:28, Denis V. Lunev wrote:

We would have one more place for block_acct_setup() calling, which should
not corrupt original value.

Signed-off-by: Denis V. Lunev 
CC: Peter Krempa 
CC: Markus Armbruster 
CC: John Snow 
CC: Kevin Wolf 
CC: Hanna Reitz 
CC: Vladimir Sementsov-Ogievskiy 
---
  block/accounting.c | 24 
  blockdev.c | 17 ++---
  include/block/accounting.h |  6 +++---
  3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/block/accounting.c b/block/accounting.c
index 2030851d79..73ac639656 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -40,11 +40,27 @@ void block_acct_init(BlockAcctStats *stats)
  }
  }
  
-void block_acct_setup(BlockAcctStats *stats, bool account_invalid,

-  bool account_failed)
+static bool bool_from_onoffauto(OnOffAuto val, bool def)
  {
-stats->account_invalid = account_invalid;
-stats->account_failed = account_failed;
+switch (val) {
+case ON_OFF_AUTO_AUTO:
+return def;
+case ON_OFF_AUTO_ON:
+return true;
+case ON_OFF_AUTO_OFF:
+return false;
+default:
+abort();
+}
+}
+
+void block_acct_setup(BlockAcctStats *stats, enum OnOffAuto account_invalid,
+  enum OnOffAuto account_failed)
+{
+stats->account_invalid = bool_from_onoffauto(account_invalid,
+ stats->account_invalid);
+stats->account_failed = bool_from_onoffauto(account_failed,
+stats->account_failed);
  }
  
  void block_acct_cleanup(BlockAcctStats *stats)

diff --git a/blockdev.c b/blockdev.c
index 9230888e34..392d9476e6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -455,6 +455,17 @@ static void extract_common_blockdev_options(QemuOpts 
*opts, int *bdrv_flags,
  }
  }
  
+static OnOffAuto account_get_opt(QemuOpts *opts, const char *name)

+{
+if (!qemu_opt_find(opts, name)) {
+return ON_OFF_AUTO_AUTO;
+}
+if (qemu_opt_get_bool(opts, name, true)) {
+return ON_OFF_AUTO_ON;
+}
+return ON_OFF_AUTO_OFF;
+}
+
  /* Takes the ownership of bs_opts */
  static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 Error **errp)
@@ -462,7 +473,7 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
  const char *buf;
  int bdrv_flags = 0;
  int on_read_error, on_write_error;
-bool account_invalid, account_failed;
+OnOffAuto account_invalid, account_failed;
  bool writethrough, read_only;
  BlockBackend *blk;
  BlockDriverState *bs;
@@ -496,8 +507,8 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
  /* extract parameters */
  snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
  
-account_invalid = qemu_opt_get_bool(opts, "stats-account-invalid", true);

-account_failed = qemu_opt_get_bool(opts, "stats-account-failed", true);
+account_invalid = account_get_opt(opts, "stats-account-invalid");
+account_failed = account_get_opt(opts, "stats-account-failed");


Hm. Actually here you change default behavior from true to false.

In the next patch you fix it back, but better is to avoid the extra change 
somehow.

  
  writethrough = !qemu_opt_get_bool(opts, BDRV_OPT_CACHE_WB, true);
  
diff --git a/include/block/accounting.h b/include/block/accounting.h

index 878b4c3581..b9caad60d5 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -27,7 +27,7 @@
  
  #include "qemu/timed-average.h"

  #include "qemu/thread.h"
-#include "qapi/qapi-builtin-types.h"
+#include "qapi/qapi-types-common.h"
  
  typedef struct BlockAcctTimedStats BlockAcctTimedStats;

  typedef struct BlockAcctStats BlockAcctStats;
@@ -100,8 +100,8 @@ typedef struct BlockAcctCookie {
  } BlockAcctCookie;
  
  void block_acct_init(BlockAcctStats *stats);

-void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
- bool account_failed);
+void block_acct_setup(BlockAcctStats *stats, enum OnOffAuto account_invalid,
+  enum OnOffAuto account_failed);
  void block_acct_cleanup(BlockAcctStats *stats);
  void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length);
  BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,



--
Best regards,
Vladimir

Re: [PATCH 1/2] scsi: Add buf_len parameter to scsi_req_new()

2022-08-19 Thread Paolo Bonzini


On 8/17/22 07:34, John Millikin wrote:

The sigil SCSI_CMD_BUF_LEN_TODO() is used to indicate that the buffer
length calculation is TODO it should be replaced by a better value,
such as the length of a successful DMA read.


Let's just do it right:

diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index e8a4a705e7..05a43ec807 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -864,7 +864,7 @@ static void lsi_do_command(LSIState *s)
 s->current = g_new0(lsi_request, 1);
 s->current->tag = s->select_tag;
 s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun, buf,
-   SCSI_CMD_BUF_LEN_TODO(s->dbc), s->current);
+   s->dbc, s->current);
 
 n = scsi_req_enqueue(s->current->req);

 if (n) {
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index e887ae8adb..842edc3f9d 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1053,7 +1053,6 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, 
int lun,
 uint64_t pd_size;
 uint16_t pd_id = ((sdev->id & 0xFF) << 8) | (lun & 0xFF);
 uint8_t cmdbuf[6];
-size_t cmdbuf_len = SCSI_CMD_BUF_LEN_TODO(sizeof(cmdbuf));
 size_t len;
 dma_addr_t residual;
 
@@ -1063,7 +1062,7 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, int lun,

 info->inquiry_data[0] = 0x7f; /* Force PQual 0x3, PType 0x1f */
 info->vpd_page83[0] = 0x7f;
 megasas_setup_inquiry(cmdbuf, 0, sizeof(info->inquiry_data));
-cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, cmdbuf_len, 
cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, sizeof(cmdbuf), 
cmd);
 if (!cmd->req) {
 trace_megasas_dcmd_req_alloc_failed(cmd->index,
 "PD get info std inquiry");
@@ -1081,7 +1080,7 @@ static int megasas_pd_get_info_submit(SCSIDevice *sdev, 
int lun,
 return MFI_STAT_INVALID_STATUS;
 } else if (info->inquiry_data[0] != 0x7f && info->vpd_page83[0] == 0x7f) {
 megasas_setup_inquiry(cmdbuf, 0x83, sizeof(info->vpd_page83));
-cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, cmdbuf_len, 
cmd);
+cmd->req = scsi_req_new(sdev, cmd->index, lun, cmdbuf, sizeof(cmdbuf), 
cmd);
 if (!cmd->req) {
 trace_megasas_dcmd_req_alloc_failed(cmd->index,
 "PD get info vpd inquiry");
diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index 711613b5b1..a90c2546f1 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -324,8 +324,8 @@ static int mptsas_process_scsi_io_request(MPTSASState *s,
 }
 
 req->sreq = scsi_req_new(sdev, scsi_io->MsgContext,

-scsi_io->LUN[1], scsi_io->CDB,
-SCSI_CMD_BUF_LEN_TODO(SIZE_MAX), req);
+ scsi_io->LUN[1], scsi_io->CDB,
+ scsi_io->CDBLength, req);
 
 if (req->sreq->cmd.xfer > scsi_io->DataLength) {

 goto overrun;
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 288ea12969..cc71524024 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1707,7 +1707,6 @@ static int get_scsi_requests(QEMUFile *f, void *pv, 
size_t size,
 
 while ((sbyte = qemu_get_sbyte(f)) > 0) {

 uint8_t buf[SCSI_CMD_BUF_SIZE];
-size_t buf_len;
 uint32_t tag;
 uint32_t lun;
 SCSIRequest *req;
@@ -1715,8 +1714,11 @@ static int get_scsi_requests(QEMUFile *f, void *pv, 
size_t size,
 qemu_get_buffer(f, buf, sizeof(buf));
 qemu_get_be32s(f, );
 qemu_get_be32s(f, );
-buf_len = SCSI_CMD_BUF_LEN_TODO(sizeof(buf));
-req = scsi_req_new(s, tag, lun, buf, buf_len, NULL);
+/*
+ * A too-short CDB would have been rejected by scsi_req_new, so just 
use
+ * SCSI_CMD_BUF_SIZE as the CDB length.
+ */
+req = scsi_req_new(s, tag, lun, buf, sizeof(buf), NULL);
 req->retry = (sbyte == 1);
 if (bus->info->load_request) {
 req->hba_private = bus->info->load_request(f, req);
diff --git a/hw/scsi/spapr_vscsi.c b/hw/scsi/spapr_vscsi.c
index c17bb47f7b..0a8cbf5a4b 100644
--- a/hw/scsi/spapr_vscsi.c
+++ b/hw/scsi/spapr_vscsi.c
@@ -783,7 +783,7 @@ static int vscsi_queue_cmd(VSCSIState *s, vscsi_req *req)
 union srp_iu *srp = _iu(req)->srp;
 SCSIDevice *sdev;
 int n, lun;
-size_t cdb_len = SCSI_CMD_BUF_LEN_TODO(SIZE_MAX);
+size_t cdb_len = sizeof (srp->cmd.cdb) + (srp->cmd.add_cdb_len & ~3);
 
 if ((srp->cmd.lun == 0 || be64_to_cpu(srp->cmd.lun) == SRP_REPORT_LUNS_WLUN)

   && srp->cmd.cdb[0] == REPORT_LUNS) {
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 632e3aa58f..41f2a56301 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -673,7 +673,6 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI 
*s, VirtIOSCSIReq *req)

Re: [RFC PATCH 06/13] target/ppc: remove embedded interrupts from ppc_pending_interrupt_p9

2022-08-19 Thread Fabiano Rosas

"Matheus K. Ferst"  writes:

> On 15/08/2022 18:23, Fabiano Rosas wrote:
>> Matheus Ferst  writes:
>> 
>>> Critical Input, Watchdog Timer, and Fixed Interval Timer are only
>>> defined for embedded CPUs. The Programmable Interval Timer is 40x-only.
>>>
>>> Signed-off-by: Matheus Ferst 
>>> ---
>>>   target/ppc/excp_helper.c | 18 --
>>>   1 file changed, 18 deletions(-)
>>>
>>> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
>>> index 2ca6a917b2..42b57019ba 100644
>>> --- a/target/ppc/excp_helper.c
>>> +++ b/target/ppc/excp_helper.c
>>> @@ -1780,28 +1780,10 @@ static int ppc_pending_interrupt_p9(CPUPPCState 
>>> *env)
>>>   return PPC_INTERRUPT_EXT;
>>>   }
>>>   }
>>> -if (FIELD_EX64(env->msr, MSR, CE)) {
>>> -/* External critical interrupt */
>>> -if (env->pending_interrupts & PPC_INTERRUPT_CEXT) {
>>> -return PPC_INTERRUPT_CEXT;
>>> -}
>>> -}
>>>   if (async_deliver != 0) {
>>> -/* Watchdog timer on embedded PowerPC */
>>> -if (env->pending_interrupts & PPC_INTERRUPT_WDT) {
>>> -return PPC_INTERRUPT_WDT;
>>> -}
>>>   if (env->pending_interrupts & PPC_INTERRUPT_CDOORBELL) {
>>>   return PPC_INTERRUPT_CDOORBELL;
>>>   }
>> 
>> This one too.
>> 
>> And the Thermal.
>
> There are some other interrupts that I was not sure if we should remove:

I would keep them simply because that is an unrelated cleanup. Here you
are removing the ones that are not present in those CPUs at all. I think
the discussion about how/what QEMU emulates is a different one. If we
determine that they are indeed not used and that is not a mistake, we
could replace them with a placeholder comment or even some explanation
of why we don't need them.

> - PPC_INTERRUPT_PERFM doesn't seem to be used anywhere else. I guess it 
> will be used when we implement more PMU stuff, so I left it in all 
> ppc_pending_interrupt_* methods.
> - PPC_INTERRUPT_RESET was treated in cpu_has_work_POWER*, but AFAICS, 
> it's only used in ppc6xx_set_irq and ppc970_set_irq, which means it can 
> only be raised on 6xx, 7xx, 970, and POWER5+. Should we remove it too?

I'm not sure if we have an external interrupt source that affects the
CPU like that. I see that we simply call powerpc_excp to reset the CPUs
when we need it.

Cédric, any thoughts?

>> 
>>> -/* Fixed interval timer on embedded PowerPC */
>>> -if (env->pending_interrupts & PPC_INTERRUPT_FIT) {
>>> -return PPC_INTERRUPT_FIT;
>>> -}
>>> -/* Programmable interval timer on embedded PowerPC */
>>> -if (env->pending_interrupts & PPC_INTERRUPT_PIT) {
>>> -return PPC_INTERRUPT_PIT;
>>> -}
>>>   /* Decrementer exception */
>>>   if (env->pending_interrupts & PPC_INTERRUPT_DECR) {
>>>   return PPC_INTERRUPT_DECR;
>
> Tḧanks,
> Matheus K. Ferst
> Instituto de Pesquisas ELDORADO 
> Analista de Software
> Aviso Legal - Disclaimer

Re: [PATCH 1/2] block: use bdrv_is_sg() helper instead of raw bs->sg reading

2022-08-19 Thread Vladimir Sementsov-Ogievskiy


On 8/17/22 11:37, Denis V. Lunev wrote:

I believe that if the helper exists, it must be used always for reading
of the value. It breaks expectations in the other case.

Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Hanna Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Ronnie Sahlberg
CC: Paolo Bonzini
CC: Peter Lieven
CC: Vladimir Sementsov-Ogievskiy


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH 2/2] block: make serializing requests functions 'void'

2022-08-19 Thread Vladimir Sementsov-Ogievskiy


On 8/17/22 11:37, Denis V. Lunev wrote:

Return codes of the following functions are never used in the code:
* bdrv_wait_serialising_requests_locked
* bdrv_wait_serialising_requests
* bdrv_make_request_serialising

Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Hanna Reitz
CC: Stefan Hajnoczi
CC: Fam Zheng
CC: Ronnie Sahlberg
CC: Paolo Bonzini
CC: Peter Lieven
CC: Vladimir Sementsov-Ogievskiy


Oh, good cleanup!


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH v2] target/arm: Add cortex-a35

2022-08-19 Thread Peter Maydell

On Fri, 19 Aug 2022 at 01:20, Hao Wu  wrote:
>
> Add cortex A35 core and enable it for virt board.
>
> Signed-off-by: Hao Wu 
> Reviewed-by: Joe Komlodi 

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH v10 18/21] job.c: enable job lock/unlock and remove Aiocontext locks

2022-08-19 Thread Vladimir Sementsov-Ogievskiy


On 8/18/22 10:46, Emanuele Giuseppe Esposito wrote:



Am 17/08/2022 um 20:54 schrieb Vladimir Sementsov-Ogievskiy:

On 8/16/22 15:52, Emanuele Giuseppe Esposito wrote:

    }
@@ -501,8 +481,12 @@ void job_unref_locked(Job *job)>
assert(!job->txn);
      if (job->driver->free) {
+    AioContext *aio_context = job->aio_context;
    job_unlock();
+    /* FIXME: aiocontext lock is required because cb calls
blk_unref */
+    aio_context_acquire(aio_context);
    job->driver->free(job);
+    aio_context_release(aio_context);

So, job_unref_locked() must be called with aio_context not locked,
otherwise we dead-lock here? That should be documented in function
declaration comment.

For example in qemu-img.c in run_block_job() we do exactly that: call
job_unref_locked()  inside aio-context lock critical seaction..

No, job_unref_locked has also status and refcnt and all the other fields
that need to be protectd. Only free must be called without job lock held.



I mean, don't we try here to acquire aio_context twice, when we call
job_unref_locked() with aio_context _already_ acquired?


You're right, so why do we need the AioContext lock in qemu-img then?
All jobs API don't need it, aio_poll seems not to need it either, and
progress API has its own lock.

I guess I could remove it?



Not sure, but hope that yes.

Anyway, trying to lock it twice will dead-lock, as I understand.


--
Best regards,
Vladimir

Re: [PATCH] esp: Handle CMD_BUSRESET by resetting the SCSI bus

2022-08-19 Thread Paolo Bonzini

Queued, thanks.

Paolo

[PATCH v2 11/11] tests/unit/test-vmstate: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
---
 tests/unit/test-vmstate.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tests/unit/test-vmstate.c b/tests/unit/test-vmstate.c
index 72077b57800..541bb4f63e3 100644
--- a/tests/unit/test-vmstate.c
+++ b/tests/unit/test-vmstate.c
@@ -87,17 +87,16 @@ static void save_buffer(const uint8_t *buf, size_t buf_size)
 static void compare_vmstate(const uint8_t *wire, size_t size)
 {
 QEMUFile *f = open_test_file(false);
-uint8_t result[size];
+g_autofree uint8_t *result = g_malloc(size);
 
 /* read back as binary */
 
-g_assert_cmpint(qemu_get_buffer(f, result, sizeof(result)), ==,
-sizeof(result));
+g_assert_cmpint(qemu_get_buffer(f, result, size), ==, size);
 g_assert(!qemu_file_get_error(f));
 
 /* Compare that what is on the file is the same that what we
expected to be there */
-SUCCESS(memcmp(result, wire, sizeof(result)));
+SUCCESS(memcmp(result, wire, size));
 
 /* Must reach EOF */
 qemu_get_byte(f);
-- 
2.25.1

[PATCH v2 10/11] ui/curses: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
---
 ui/curses.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ui/curses.c b/ui/curses.c
index 861d63244c7..de962faa7cd 100644
--- a/ui/curses.c
+++ b/ui/curses.c
@@ -69,7 +69,7 @@ static void curses_update(DisplayChangeListener *dcl,
   int x, int y, int w, int h)
 {
 console_ch_t *line;
-cchar_t curses_line[width];
+g_autofree cchar_t *curses_line = g_new(cchar_t, width);
 wchar_t wch[CCHARW_MAX];
 attr_t attrs;
 short colors;
-- 
2.25.1

[PATCH v2 04/11] io/channel-websock: Replace strlen(const_str) by sizeof(const_str) - 1

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

The combined_key[... QIO_CHANNEL_WEBSOCK_GUID_LEN ...] array in
qio_channel_websock_handshake_send_res_ok() expands to a call
to strlen(QIO_CHANNEL_WEBSOCK_GUID), and the compiler doesn't
realize the string is const, so consider combined_key[] being
a variable-length array.

To remove the variable-length array, we provide it a hint to
the compiler by using sizeof() - 1 instead of strlen().

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Peter Maydell 
---
 io/channel-websock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io/channel-websock.c b/io/channel-websock.c
index 9619906ac36..fb4932ade70 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -32,7 +32,7 @@
 
 #define QIO_CHANNEL_WEBSOCK_CLIENT_KEY_LEN 24
 #define QIO_CHANNEL_WEBSOCK_GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
-#define QIO_CHANNEL_WEBSOCK_GUID_LEN strlen(QIO_CHANNEL_WEBSOCK_GUID)
+#define QIO_CHANNEL_WEBSOCK_GUID_LEN (sizeof(QIO_CHANNEL_WEBSOCK_GUID) - 1)
 
 #define QIO_CHANNEL_WEBSOCK_HEADER_PROTOCOL "sec-websocket-protocol"
 #define QIO_CHANNEL_WEBSOCK_HEADER_VERSION "sec-websocket-version"
-- 
2.25.1

[PATCH v2 06/11] hw/ppc/pnv: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: David Gibson 
Signed-off-by: Peter Maydell 
Reviewed-by: Peter Maydell 
---
 hw/ppc/pnv.c   | 4 ++--
 hw/ppc/spapr.c | 8 
 hw/ppc/spapr_pci_nvlink2.c | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index d3f77c83672..dd4101e5b65 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -137,7 +137,7 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void 
*fdt)
 int smt_threads = CPU_CORE(pc)->nr_threads;
 CPUPPCState *env = >env;
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-uint32_t servers_prop[smt_threads];
+g_autofree uint32_t *servers_prop = g_new(uint32_t, smt_threads);
 int i;
 uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
0x, 0x};
@@ -240,7 +240,7 @@ static void pnv_dt_core(PnvChip *chip, PnvCore *pc, void 
*fdt)
 servers_prop[i] = cpu_to_be32(pc->pir + i);
 }
 _FDT((fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
-   servers_prop, sizeof(servers_prop;
+   servers_prop, sizeof(*servers_prop) * smt_threads)));
 }
 
 static void pnv_dt_icp(PnvChip *chip, void *fdt, uint32_t pir,
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc9ba6e6dcf..28626efd479 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -177,8 +177,8 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
PowerPCCPU *cpu,
   int smt_threads)
 {
 int i, ret = 0;
-uint32_t servers_prop[smt_threads];
-uint32_t gservers_prop[smt_threads * 2];
+g_autofree uint32_t *servers_prop = g_new(uint32_t, smt_threads);
+g_autofree uint32_t *gservers_prop = g_new(uint32_t, smt_threads * 2);
 int index = spapr_get_vcpu_id(cpu);
 
 if (cpu->compat_pvr) {
@@ -196,12 +196,12 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
PowerPCCPU *cpu,
 gservers_prop[i*2 + 1] = 0;
 }
 ret = fdt_setprop(fdt, offset, "ibm,ppc-interrupt-server#s",
-  servers_prop, sizeof(servers_prop));
+  servers_prop, sizeof(*servers_prop) * smt_threads);
 if (ret < 0) {
 return ret;
 }
 ret = fdt_setprop(fdt, offset, "ibm,ppc-interrupt-gserver#s",
-  gservers_prop, sizeof(gservers_prop));
+  gservers_prop, sizeof(*gservers_prop) * smt_threads * 2);
 
 return ret;
 }
diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c
index 63b476c8f72..2a8a11be1d6 100644
--- a/hw/ppc/spapr_pci_nvlink2.c
+++ b/hw/ppc/spapr_pci_nvlink2.c
@@ -397,7 +397,7 @@ void spapr_phb_nvgpu_populate_pcidev_dt(PCIDevice *dev, 
void *fdt, int offset,
 continue;
 }
 if (dev == nvslot->gpdev) {
-uint32_t npus[nvslot->linknum];
+g_autofree uint32_t *npus = g_new(uint32_t, nvslot->linknum);
 
 for (j = 0; j < nvslot->linknum; ++j) {
 PCIDevice *npdev = nvslot->links[j].npdev;
-- 
2.25.1

[PATCH v2 08/11] hw/i386/multiboot: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length array on
the stack. Replace the snprintf() call by g_strdup_printf().

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
---
 hw/i386/multiboot.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index 0a10089f14b..963e29362e4 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -163,6 +163,7 @@ int load_multiboot(X86MachineState *x86ms,
 uint8_t *mb_bootinfo_data;
 uint32_t cmdline_len;
 GList *mods = NULL;
+g_autofree char *kcmdline = NULL;
 
 /* Ok, let's see if it is a multiboot image.
The header is 12x32bit long, so the latest entry may be 8192 - 48. */
@@ -362,9 +363,7 @@ int load_multiboot(X86MachineState *x86ms,
 }
 
 /* Commandline support */
-char kcmdline[strlen(kernel_filename) + strlen(kernel_cmdline) + 2];
-snprintf(kcmdline, sizeof(kcmdline), "%s %s",
- kernel_filename, kernel_cmdline);
+kcmdline = g_strdup_printf("%s %s", kernel_filename, kernel_cmdline);
 stl_p(bootinfo + MBI_CMDLINE, mb_add_cmdline(, kcmdline));
 
 stl_p(bootinfo + MBI_BOOTLOADER, mb_add_bootloader(, bootloader_name));
-- 
2.25.1

[PATCH v2 07/11] hw/intc/xics: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: David Gibson 
Reviewed-by: Greg Kurz 
Signed-off-by: Peter Maydell 
---
 hw/intc/xics.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5b0b4d96242..dcd021af668 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -567,8 +567,8 @@ static void ics_reset_irq(ICSIRQState *irq)
 static void ics_reset(DeviceState *dev)
 {
 ICSState *ics = ICS(dev);
+g_autofree uint8_t *flags = g_malloc(ics->nr_irqs);
 int i;
-uint8_t flags[ics->nr_irqs];
 
 for (i = 0; i < ics->nr_irqs; i++) {
 flags[i] = ics->irqs[i].flags;
-- 
2.25.1

[PATCH v2 01/11] chardev/baum: Replace magic values by X_MAX / Y_MAX definitions

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Replace '84' magic value by the X_MAX definition, and '1' by Y_MAX.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Samuel Thibault 
Signed-off-by: Peter Maydell 
---
 chardev/baum.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/chardev/baum.c b/chardev/baum.c
index 79d618e3504..6d538808a0f 100644
--- a/chardev/baum.c
+++ b/chardev/baum.c
@@ -87,6 +87,9 @@
 
 #define BUF_SIZE 256
 
+#define X_MAX   84
+#define Y_MAX   1
+
 struct BaumChardev {
 Chardev parent;
 
@@ -244,11 +247,11 @@ static int baum_deferred_init(BaumChardev *baum)
 brlapi_perror("baum: brlapi__getDisplaySize");
 return 0;
 }
-if (baum->y > 1) {
-baum->y = 1;
+if (baum->y > Y_MAX) {
+baum->y = Y_MAX;
 }
-if (baum->x > 84) {
-baum->x = 84;
+if (baum->x > X_MAX) {
+baum->x = X_MAX;
 }
 
 con = qemu_console_lookup_by_index(0);
-- 
2.25.1

[PATCH v2 05/11] hw/net/e1000e_core: Use definition to avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

The compiler isn't clever enough to figure 'min_buf_size'
is a constant, so help it by using a definitions instead.

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Jason Wang 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
---
 hw/net/e1000e_core.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 208e3e0d798..82aa61fedcd 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1622,15 +1622,16 @@ e1000e_rx_fix_l4_csum(E1000ECore *core, struct NetRxPkt 
*pkt)
 }
 }
 
+/* Min. octets in an ethernet frame sans FCS */
+#define MIN_BUF_SIZE 60
+
 ssize_t
 e1000e_receive_iov(E1000ECore *core, const struct iovec *iov, int iovcnt)
 {
 static const int maximum_ethernet_hdr_len = (14 + 4);
-/* Min. octets in an ethernet frame sans FCS */
-static const int min_buf_size = 60;
 
 uint32_t n = 0;
-uint8_t min_buf[min_buf_size];
+uint8_t min_buf[MIN_BUF_SIZE];
 struct iovec min_iov;
 uint8_t *filter_buf;
 size_t size, orig_size;
-- 
2.25.1

[PATCH v2 09/11] hw/usb/hcd-ohci: Use definition to avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

The compiler isn't clever enough to figure 'width' is a constant,
so help it by using a definitions instead.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Peter Maydell 
---
 hw/usb/hcd-ohci.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 895b29fb865..5585fd32ccf 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -805,13 +805,14 @@ static int ohci_service_iso_td(OHCIState *ohci, struct 
ohci_ed *ed)
 return 1;
 }
 
+#define HEX_CHAR_PER_LINE 16
+
 static void ohci_td_pkt(const char *msg, const uint8_t *buf, size_t len)
 {
 bool print16;
 bool printall;
-const int width = 16;
 int i;
-char tmp[3 * width + 1];
+char tmp[3 * HEX_CHAR_PER_LINE + 1];
 char *p = tmp;
 
 print16 = !!trace_event_get_state_backends(TRACE_USB_OHCI_TD_PKT_SHORT);
@@ -822,7 +823,7 @@ static void ohci_td_pkt(const char *msg, const uint8_t 
*buf, size_t len)
 }
 
 for (i = 0; ; i++) {
-if (i && (!(i % width) || (i == len))) {
+if (i && (!(i % HEX_CHAR_PER_LINE) || (i == len))) {
 if (!printall) {
 trace_usb_ohci_td_pkt_short(msg, tmp);
 break;
-- 
2.25.1

[PATCH v2 03/11] chardev/baum: Avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Samuel Thibault 
Signed-off-by: Peter Maydell 
---
 chardev/baum.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/chardev/baum.c b/chardev/baum.c
index 6a210ffd815..0a0d12661a4 100644
--- a/chardev/baum.c
+++ b/chardev/baum.c
@@ -299,7 +299,8 @@ static void baum_chr_accept_input(struct Chardev *chr)
 static void baum_write_packet(BaumChardev *baum, const uint8_t *buf, int len)
 {
 Chardev *chr = CHARDEV(baum);
-uint8_t io_buf[1 + 2 * len], *cur = io_buf;
+g_autofree uint8_t *io_buf = g_malloc(1 + 2 * len);
+uint8_t *cur = io_buf;
 int room;
 *cur++ = ESC;
 while (len--)
-- 
2.25.1

[PATCH v2 00/11] misc: Remove variable-length arrays on the stack

2022-08-19 Thread Peter Maydell

This is a resend of a subset of patches from a series that
Philippe sent out last year:
https://patchew.org/QEMU/20210505211047.1496765-1-phi...@redhat.com/

Basically I just pulled out the patches which:
 (1) trivially applied on a rebase
 (2) had got a Reviewed-by: or at least an Acked-by:

since these should be good to just apply immediately
(well, as soon as we reopen for 7.2 development).

Given they're a mixed bag, I propose to take these via
the target-arm.next tree, unless anybody specifically
wishes to grab specific patches via some other route.

I might come back and have another look at the other
left-behind patches later, but this gets rid of more
than half of the complaints that a -Wvla build reports.

thanks
-- PMM

Philippe Mathieu-Daudé (11):
  chardev/baum: Replace magic values by X_MAX / Y_MAX definitions
  chardev/baum: Use definitions to avoid dynamic stack allocation
  chardev/baum: Avoid dynamic stack allocation
  io/channel-websock: Replace strlen(const_str) by sizeof(const_str) - 1
  hw/net/e1000e_core: Use definition to avoid dynamic stack allocation
  hw/ppc/pnv: Avoid dynamic stack allocation
  hw/intc/xics: Avoid dynamic stack allocation
  hw/i386/multiboot: Avoid dynamic stack allocation
  hw/usb/hcd-ohci: Use definition to avoid dynamic stack allocation
  ui/curses: Avoid dynamic stack allocation
  tests/unit/test-vmstate: Avoid dynamic stack allocation

 chardev/baum.c | 22 +-
 hw/i386/multiboot.c|  5 ++---
 hw/intc/xics.c |  2 +-
 hw/net/e1000e_core.c   |  7 ---
 hw/ppc/pnv.c   |  4 ++--
 hw/ppc/spapr.c |  8 
 hw/ppc/spapr_pci_nvlink2.c |  2 +-
 hw/usb/hcd-ohci.c  |  7 ---
 io/channel-websock.c   |  2 +-
 tests/unit/test-vmstate.c  |  7 +++
 ui/curses.c|  2 +-
 11 files changed, 36 insertions(+), 32 deletions(-)

-- 
2.25.1

[PATCH v2 02/11] chardev/baum: Use definitions to avoid dynamic stack allocation

2022-08-19 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

We know 'x * y' will be at most 'X_MAX * Y_MAX' (which is not
a big value, it is actually 84). Instead of having the compiler
use variable-length array, declare an array able to hold the
maximum 'x * y'.

Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Samuel Thibault 
Signed-off-by: Peter Maydell 
---
 chardev/baum.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/chardev/baum.c b/chardev/baum.c
index 6d538808a0f..6a210ffd815 100644
--- a/chardev/baum.c
+++ b/chardev/baum.c
@@ -383,9 +383,9 @@ static int baum_eat_packet(BaumChardev *baum, const uint8_t 
*buf, int len)
 switch (req) {
 case BAUM_REQ_DisplayData:
 {
-uint8_t cells[baum->x * baum->y], c;
-uint8_t text[baum->x * baum->y];
-uint8_t zero[baum->x * baum->y];
+uint8_t cells[X_MAX * Y_MAX], c;
+uint8_t text[X_MAX * Y_MAX];
+uint8_t zero[X_MAX * Y_MAX];
 int cursor = BRLAPI_CURSOR_OFF;
 int i;
 
@@ -408,7 +408,7 @@ static int baum_eat_packet(BaumChardev *baum, const uint8_t 
*buf, int len)
 }
 timer_del(baum->cellCount_timer);
 
-memset(zero, 0, sizeof(zero));
+memset(zero, 0, baum->x * baum->y);
 
 brlapi_writeArguments_t wa = {
 .displayNumber = BRLAPI_DISPLAY_DEFAULT,
-- 
2.25.1

Re: [RFC PATCH v2 4/8] qapi: golang: Generate qapi's union types in Go

2022-08-19 Thread Andrea Bolognani

On Fri, Aug 19, 2022 at 09:20:52AM +0200, Victor Toso wrote:
> > > On Wed, Jul 06, 2022 at 10:37:54AM +0100, Daniel P. Berrangé wrote:
> > > > On Wed, Jul 06, 2022 at 04:28:16AM -0500, Andrea Bolognani wrote:
> > > > > You're right, that is undesirable. What about something like this?
> > > > >
> > > > >   type GuestPanicInformation struct {
> > > > >   HyperV *GuestPanicInformationHyperV
> > > > >   S390   *GuestPanicInformationS390
> > > > >   }
> > > > >
> > > > >   type jsonGuestPanicInformation struct {
> > > > >   Discriminator string   `json:"type"`
> > > > >   HyperV*GuestPanicInformationHyperV `json:"hyper-v"`
> > > > >   S390  *GuestPanicInformationS390   `json:"s390"`
> > > > >   }
> > > >
> > > > It can possibly be even simpler with just embedding the real
> > > > struct
> > > >
> > > >type jsonGuestPanicInformation struct {
> > > >Discriminator string
> > > >GuestPanicInformation
> > > >}
> >
> > Similar to what I said in previous email (same thread) to Andrea,
> > this would not work because the end result does not match with
> > QAPI spec, where HyperV or S390 fields should be at the same
> > level as 'type'.

Yeah, you're absolutely correct, I was misreading the schema and
suggested an implementation that couldn't possibly work.

> > If we embed either HyperV or S390, then it should work, like:
> >
> > tmp := struct {
> > GuestPanicInformationHyperV
> > Discriminator string "type"
> > }{}
>
> For the same reason, I could not use json.RawMessage, sadly.
>
> As Andrea pointed out before, json.RawMessage is just an alias
> for []byte. We need a field for that (so, it can't be embed) and
> the contents of the JSON need to match that field's name.

Right. It would work if unions looked like

  { "type": "...", "data": { ... }}

with the "data" part having different fields based on the value of
"type", but not for the flat JSON dictionaries that are used for QAPI
unions.

> func (s QCryptoBlockOpenOptions) MarshalJSON() ([]byte, error) {
>   var bytes []byte
>   var err error
>   if s.Qcow != nil {
>   tmp := struct {
>   QCryptoBlockOptionsQCow
>   Discriminator string `json:"format"`
>   }{
>   QCryptoBlockOptionsQCow: *s.Qcow,
>   Discriminator:   "qcow",
>   }
>   if bytes, err = json.Marshal(tmp); err != nil {
>   return nil, err
>   }
>   }
>   if s.Luks != nil {
>   if len(bytes) != 0 {
>   return nil, errors.New(`multiple fields set for 
> QCryptoBlockOpenOptions`)
>   }

Why are you checking this here? I would just have a check like

  if s.Qcow != nil && s.Luks != nil {
  return nil, errors.New(`multiple fields set for QCryptoBlockOpenOptions`)
  }

at the top of the function, so that you can abort early and cheaply
if the user has provided invalid input instead of having to wait
after one of the fields has already been serialized.

>   tmp := struct {
>   QCryptoBlockOptionsLUKS
>   Discriminator string `json:"format"`
>   }{
>   QCryptoBlockOptionsLUKS: *s.Luks,
>   Discriminator:   "luks",
>   }
>   if bytes, err = json.Marshal(tmp); err != nil {
>   return nil, err
>   }
>   }
>   if len(bytes) == 0 {
>   return nil, errors.New(`null not supported for 
> QCryptoBlockOpenOptions`)
>   }

Similarly, this should be checked early. So the complete check could
be turned into

  if (s.Qcow != nil && s.Luks != nil) || (s.Qcow == nil && s.Luks == nil) {
  return nil, errors.New("must set exactly one field")
  }

You should have enough information to produce such a check when
generating the function, right? If making sure all possible
combinations are covered up ends up being too complicated, something
like

  var setFields int
  if s.Field1 != nil {
  setFields += 1
  }
  if s.Field2 != nil {
  setFields += 1
  }
  // ...
  if setFields != 1 {
  return nil, errors.New("must set exactly one field")
  }

would also work.

> func (s *QCryptoBlockOpenOptions) UnmarshalJSON(data []byte) error {
>   tmp := struct {
>   Discriminator string `json:"format"`
>   }{}
>   if err := json.Unmarshal(data, ); err != nil {
>   return err
>   }
>   switch tmp.Discriminator {
>   case "qcow":
>   s.Qcow = {}
>   if err := json.Unmarshal(data, s.Qcow); err != nil {
>   s.Qcow = nil
>   return err
>   }
>   case "luks":
>   s.Luks = {}
>   if err := json.Unmarshal(data, s.Luks); err != nil {
>   s.Luks =

Re: [PATCH v2] target/riscv: Use official extension names for AIA CSRs

2022-08-19 Thread Richard Henderson


On 8/19/22 00:31, Anup Patel wrote:

  static int aia_hmode(CPURISCVState *env, int csrno)
  {
-if (!riscv_feature(env, RISCV_FEATURE_AIA)) {
+CPUState *cs = env_cpu(env);
+RISCVCPU *cpu = RISCV_CPU(cs);


Better as

RISCVCPU *cpu = env_archcpu(env);


r~

1 2 >

1 - 100 of 149 matches

Mail list logo