Re: [PATCH v2] Move the libssh setup from configure to meson.build

2021-12-12 Thread Thomas Huth

On 10/12/2021 14.05, Philippe Mathieu-Daudé wrote:

On 12/9/21 16:22, Richard W.M. Jones wrote:

On Thu, Dec 09, 2021 at 04:08:24PM +0100, Thomas Huth wrote:

On 09/12/2021 15.55, Richard W.M. Jones wrote:

On Thu, Dec 09, 2021 at 03:48:01PM +0100, Thomas Huth wrote:

It's easier to do this in meson.build now.

Signed-off-by: Thomas Huth 
---
  v2: Added the missing "config_host_data.set('CONFIG_LIBSSH', libssh.found())"

  configure | 27 ---
  meson.build   | 13 +
  meson_options.txt |  2 ++
  scripts/meson-buildoptions.sh |  3 +++
  4 files changed, 14 insertions(+), 31 deletions(-)



I should say that my interest in the ssh driver in qemu is not that
much these days.  I've been telling people to use nbdkit-ssh-plugin
instead.  It's more featureful and running it in a separate process is
probably safer too.


Then it's maybe time to deprecate the ssh driver in QEMU?


Wll ...  I didn't necessarily want to say that.  Others may be
using it, and deprecating working software causes trouble for some.
But I'll let others have their say on this.


The deprecation process is slow, users have 8 months to notice it,
and we might discover contributors willing to maintain it. IOW more
PROs than CONs IMHO.


Right - one of the ideas of the deprecation process is that this is a way to 
find out if a feature is still used in practice, and whether someone still 
cares about it being maintained. So if you think that there is a better 
alternative these days and don't want to maintain the feature forever 
anymore, just send a patch to docs/about/deprecated.rst to mark it as 
deprecated there (and the status in MAINTAINERS should maybe rather be "Odd 
Fixes" than "Supported", I guess?).


 Thomas




[PATCH] meson: be strict for boolean options

2021-12-12 Thread Антон Кочков
This patch allows to proceed further to be able to build with Muon buildsystem

https://sr.ht/~lattis/muon/

There are still few bugs remain, but they are on the Muon side:
https://todo.sr.ht/~lattis/muon/21

Best regards,
Anton Kochkov.

>From fa80e0c17b14b8f5067d13ad7bc63e0d2cbb94ce Mon Sep 17 00:00:00 2001
From: Anton Kochkov 
Date: Fri, 10 Dec 2021 21:10:34 +0800
Subject: [PATCH] meson: be strict for boolean options

While Meson buildsystem accepts the 'false' as a value
for boolean options, it's not covered by the specification
and in general invalid. Some alternative Meson implementations,
like Muon, do not accept 'false' or 'true' as a valid value
for the boolean options.

See https://mesonbuild.com/Build-options.html

Signed-off-by: Anton Kochkov 
---
 meson_options.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/meson_options.txt b/meson_options.txt
index e392323732..4ca770c1bd 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -60,9 +60,9 @@ option('tcg', type: 'feature', value: 'auto',
description: 'TCG support')
 option('tcg_interpreter', type: 'boolean', value: false,
description: 'TCG with bytecode interpreter (slow)')
-option('cfi', type: 'boolean', value: 'false',
+option('cfi', type: 'boolean', value: false,
description: 'Control-Flow Integrity (CFI)')
-option('cfi_debug', type: 'boolean', value: 'false',
+option('cfi_debug', type: 'boolean', value: false,
description: 'Verbose errors in case of CFI violation')
 option('multiprocess', type: 'feature', value: 'auto',
description: 'Out of process device emulation support')
-- 
2.33.1



Re: [PATCH v8 2/7] net/vmnet: add vmnet backends to qapi/net

2021-12-12 Thread Markus Armbruster
Vladislav Yaroshchuk  writes:

> Create separate netdevs for each vmnet operating mode:
> - vmnet-host
> - vmnet-shared
> - vmnet-bridged
>
> Signed-off-by: Vladislav Yaroshchuk 

QAPI schema
Acked-by: Markus Armbruster 




Re: [PATCH v2 10/12] target/riscv: Add kvm_riscv_get/put_regs_timer

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Add kvm_riscv_get/put_regs_timer to synchronize virtual time context
> from KVM.
>
> To set register of RISCV_TIMER_REG(state) will occur a error from KVM
> on kvm_timer_state == 0. It's better to adapt in KVM, but it doesn't matter
> that adaping in QEMU.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> ---
>  target/riscv/cpu.h |  7 +
>  target/riscv/kvm.c | 72 ++
>  2 files changed, 79 insertions(+)
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index e7dba35acb..c892a2c8b7 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -259,6 +259,13 @@ struct CPURISCVState {
>
>  hwaddr kernel_addr;
>  hwaddr fdt_addr;
> +
> +/* kvm timer */
> +bool kvm_timer_dirty;
> +uint64_t kvm_timer_time;
> +uint64_t kvm_timer_compare;
> +uint64_t kvm_timer_state;
> +uint64_t kvm_timer_frequency;
>  };
>
>  OBJECT_DECLARE_TYPE(RISCVCPU, RISCVCPUClass,
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 171a32adf9..802c076b22 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -64,6 +64,9 @@ static uint64_t kvm_riscv_reg_id(CPURISCVState *env, 
> uint64_t type, uint64_t idx
>  #define RISCV_CSR_REG(env, name)  kvm_riscv_reg_id(env, KVM_REG_RISCV_CSR, \
>   KVM_REG_RISCV_CSR_REG(name))
>
> +#define RISCV_TIMER_REG(env, name)  kvm_riscv_reg_id(env, 
> KVM_REG_RISCV_TIMER, \
> + KVM_REG_RISCV_TIMER_REG(name))
> +
>  #define RISCV_FP_F_REG(env, idx)  kvm_riscv_reg_id(env, KVM_REG_RISCV_FP_F, 
> idx)
>
>  #define RISCV_FP_D_REG(env, idx)  kvm_riscv_reg_id(env, KVM_REG_RISCV_FP_D, 
> idx)
> @@ -235,6 +238,75 @@ static int kvm_riscv_put_regs_fp(CPUState *cs)
>  return ret;
>  }
>
> +static void kvm_riscv_get_regs_timer(CPUState *cs)
> +{
> +int ret;
> +uint64_t reg;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +if (env->kvm_timer_dirty) {
> +return;
> +}
> +
> +ret = kvm_get_one_reg(cs, RISCV_TIMER_REG(env, time), ®);
> +if (ret) {
> +abort();
> +}
> +env->kvm_timer_time = reg;
> +
> +ret = kvm_get_one_reg(cs, RISCV_TIMER_REG(env, compare), ®);
> +if (ret) {
> +abort();
> +}
> +env->kvm_timer_compare = reg;
> +
> +ret = kvm_get_one_reg(cs, RISCV_TIMER_REG(env, state), ®);
> +if (ret) {
> +abort();
> +}
> +env->kvm_timer_state = reg;

Please read the timer frequency here.

> +
> +env->kvm_timer_dirty = true;
> +}
> +
> +static void kvm_riscv_put_regs_timer(CPUState *cs)
> +{
> +int ret;
> +uint64_t reg;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +if (!env->kvm_timer_dirty) {
> +return;
> +}

Over here, we should get the timer frequency and abort() with an
error message if it does not match env->kvm_timer_frequency

For now, migration will not work between Hosts with different
timer frequency.

Regards,
Anup

> +
> +reg = env->kvm_timer_time;
> +ret = kvm_set_one_reg(cs, RISCV_TIMER_REG(env, time), ®);
> +if (ret) {
> +abort();
> +}
> +
> +reg = env->kvm_timer_compare;
> +ret = kvm_set_one_reg(cs, RISCV_TIMER_REG(env, compare), ®);
> +if (ret) {
> +abort();
> +}
> +
> +/*
> + * To set register of RISCV_TIMER_REG(state) will occur a error from KVM
> + * on env->kvm_timer_state == 0, It's better to adapt in KVM, but it
> + * doesn't matter that adaping in QEMU now.
> + * TODO If KVM changes, adapt here.
> + */
> +if (env->kvm_timer_state) {
> +reg = env->kvm_timer_state;
> +ret = kvm_set_one_reg(cs, RISCV_TIMER_REG(env, state), ®);
> +if (ret) {
> +abort();
> +}
> +}
> +
> +env->kvm_timer_dirty = false;
> +}
>
>  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  KVM_CAP_LAST_INFO
> --
> 2.19.1
>



[PATCH] MAINTAINERS: Add a separate entry for acpi/VIOT tables

2021-12-12 Thread Ani Sinha
All work related to VIOT tables are being done by Jean. Adding him as the
maintainer for acpi VIOT table code in qemu.

Signed-off-by: Ani Sinha 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7543eb4d59..f9580f2fe2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1776,6 +1776,13 @@ F: docs/specs/acpi_mem_hotplug.rst
 F: docs/specs/acpi_pci_hotplug.rst
 F: docs/specs/acpi_hw_reduced_hotplug.rst
 
+ACPI/VIOT
+M: Jean-Philippe Brucker 
+R: Ani Sinha 
+S: Supported
+F: hw/acpi/viot.c
+F: hw/acpi/viot.h
+
 ACPI/HEST/GHES
 R: Dongjiu Geng 
 L: qemu-...@nongnu.org
-- 
2.25.1




Re: [PATCH v2 11/12] target/riscv: Implement virtual time adjusting with vm state changing

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:38 PM Yifei Jiang  wrote:
>
> We hope that virtual time adjusts with vm state changing. When a vm
> is stopped, guest virtual time should stop counting and kvm_timer
> should be stopped. When the vm is resumed, guest virtual time should
> continue to count and kvm_timer should be restored.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  target/riscv/kvm.c | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 802c076b22..be95dbc3f3 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -40,6 +40,7 @@
>  #include "kvm_riscv.h"
>  #include "sbi_ecall_interface.h"
>  #include "semihosting/console.h"
> +#include "sysemu/runstate.h"
>
>  static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type, uint64_t 
> idx)
>  {
> @@ -377,6 +378,17 @@ unsigned long kvm_arch_vcpu_id(CPUState *cpu)
>  return cpu->cpu_index;
>  }
>
> +static void kvm_riscv_vm_state_change(void *opaque, bool running, RunState 
> state)
> +{
> +CPUState *cs = opaque;
> +
> +if (running) {
> +kvm_riscv_put_regs_timer(cs);
> +} else {
> +kvm_riscv_get_regs_timer(cs);
> +}
> +}
> +
>  void kvm_arch_init_irq_routing(KVMState *s)
>  {
>  }
> @@ -389,6 +401,8 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  CPURISCVState *env = &cpu->env;
>  uint64_t id;
>
> +qemu_add_vm_change_state_handler(kvm_riscv_vm_state_change, cs);
> +
>  id = kvm_riscv_reg_id(env, KVM_REG_RISCV_CONFIG, 
> KVM_REG_RISCV_CONFIG_REG(isa));
>  ret = kvm_get_one_reg(cs, id, &isa);
>  if (ret) {
> --
> 2.19.1
>



Re: [PATCH v2 12/12] target/riscv: Support virtual time context synchronization

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:38 PM Yifei Jiang  wrote:
>
> Add virtual time context description to vmstate_kvmtimer. After cpu being
> loaded, virtual time context is updated to KVM.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> ---
>  target/riscv/machine.c | 37 +++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/machine.c b/target/riscv/machine.c
> index ad8248ebfd..f46443c316 100644
> --- a/target/riscv/machine.c
> +++ b/target/riscv/machine.c
> @@ -164,10 +164,42 @@ static const VMStateDescription vmstate_pointermasking 
> = {
>  }
>  };
>
> +static bool kvmtimer_needed(void *opaque)
> +{
> +return kvm_enabled();
> +}
> +
> +

Remove extra newline from here.

> +static const VMStateDescription vmstate_kvmtimer = {
> +.name = "cpu/kvmtimer",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = kvmtimer_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT64(env.kvm_timer_time, RISCVCPU),
> +VMSTATE_UINT64(env.kvm_timer_compare, RISCVCPU),
> +VMSTATE_UINT64(env.kvm_timer_state, RISCVCPU),
> +
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
> +static int cpu_post_load(void *opaque, int version_id)
> +{
> +RISCVCPU *cpu = opaque;
> +CPURISCVState *env = &cpu->env;
> +
> +if (kvm_enabled()) {
> +env->kvm_timer_dirty = true;
> +}
> +return 0;
> +}
> +
>  const VMStateDescription vmstate_riscv_cpu = {
>  .name = "cpu",
> -.version_id = 3,
> -.minimum_version_id = 3,
> +.version_id = 4,
> +.minimum_version_id = 4,
> +.post_load = cpu_post_load,
>  .fields = (VMStateField[]) {
>  VMSTATE_UINTTL_ARRAY(env.gpr, RISCVCPU, 32),
>  VMSTATE_UINT64_ARRAY(env.fpr, RISCVCPU, 32),
> @@ -218,6 +250,7 @@ const VMStateDescription vmstate_riscv_cpu = {
>  &vmstate_hyper,
>  &vmstate_vector,
>  &vmstate_pointermasking,
> +&vmstate_kvmtimer,
>  NULL
>  }
>  };
> --
> 2.19.1
>
>
> --
> kvm-riscv mailing list
> kvm-ri...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kvm-riscv

Otherwise, it looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup



Re: [PATCH v2 08/12] target/riscv: Handle KVM_EXIT_RISCV_SBI exit

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Use char-fe to handle console sbi call, which implement early
> console io while apply 'earlycon=sbi' into kernel parameters.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  target/riscv/kvm.c | 38 +++-
>  target/riscv/sbi_ecall_interface.h | 72 ++
>  2 files changed, 109 insertions(+), 1 deletion(-)
>  create mode 100644 target/riscv/sbi_ecall_interface.h
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 0027f11f45..171a32adf9 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -38,6 +38,8 @@
>  #include "qemu/log.h"
>  #include "hw/loader.h"
>  #include "kvm_riscv.h"
> +#include "sbi_ecall_interface.h"
> +#include "semihosting/console.h"
>
>  static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type, uint64_t 
> idx)
>  {
> @@ -365,9 +367,43 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cs)
>  return true;
>  }
>
> +static int kvm_riscv_handle_sbi(CPUState *cs, struct kvm_run *run)
> +{
> +int ret = 0;
> +unsigned char ch;
> +switch (run->riscv_sbi.extension_id) {
> +case SBI_EXT_0_1_CONSOLE_PUTCHAR:
> +ch = run->riscv_sbi.args[0];
> +qemu_semihosting_log_out((const char *)&ch, sizeof(ch));
> +break;
> +case SBI_EXT_0_1_CONSOLE_GETCHAR:
> +run->riscv_sbi.args[0] =
> +(unsigned long)qemu_semihosting_console_inc(cs->env_ptr);
> +break;
> +default:
> +qemu_log_mask(LOG_UNIMP,
> +  "%s: un-handled SBI EXIT, specific reasons is %lu\n",
> +  __func__, run->riscv_sbi.extension_id);
> +ret = -1;
> +break;
> +}
> +return ret;
> +}
> +
>  int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>  {
> -return 0;
> +int ret = 0;
> +switch (run->exit_reason) {
> +case KVM_EXIT_RISCV_SBI:
> +ret = kvm_riscv_handle_sbi(cs, run);
> +break;
> +default:
> +qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n",
> +  __func__, run->exit_reason);
> +ret = -1;
> +break;
> +}
> +return ret;
>  }
>
>  void kvm_riscv_reset_vcpu(RISCVCPU *cpu)
> diff --git a/target/riscv/sbi_ecall_interface.h 
> b/target/riscv/sbi_ecall_interface.h
> new file mode 100644
> index 00..fb1a3fa8f2
> --- /dev/null
> +++ b/target/riscv/sbi_ecall_interface.h
> @@ -0,0 +1,72 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright (c) 2019 Western Digital Corporation or its affiliates.
> + *
> + * Authors:
> + *   Anup Patel 
> + */
> +
> +#ifndef __SBI_ECALL_INTERFACE_H__
> +#define __SBI_ECALL_INTERFACE_H__
> +
> +/* clang-format off */
> +
> +/* SBI Extension IDs */
> +#define SBI_EXT_0_1_SET_TIMER   0x0
> +#define SBI_EXT_0_1_CONSOLE_PUTCHAR 0x1
> +#define SBI_EXT_0_1_CONSOLE_GETCHAR 0x2
> +#define SBI_EXT_0_1_CLEAR_IPI   0x3
> +#define SBI_EXT_0_1_SEND_IPI0x4
> +#define SBI_EXT_0_1_REMOTE_FENCE_I  0x5
> +#define SBI_EXT_0_1_REMOTE_SFENCE_VMA   0x6
> +#define SBI_EXT_0_1_REMOTE_SFENCE_VMA_ASID 0x7
> +#define SBI_EXT_0_1_SHUTDOWN0x8
> +#define SBI_EXT_BASE0x10
> +#define SBI_EXT_TIME0x54494D45
> +#define SBI_EXT_IPI 0x735049
> +#define SBI_EXT_RFENCE  0x52464E43
> +#define SBI_EXT_HSM 0x48534D
> +
> +/* SBI function IDs for BASE extension*/
> +#define SBI_EXT_BASE_GET_SPEC_VERSION   0x0
> +#define SBI_EXT_BASE_GET_IMP_ID 0x1
> +#define SBI_EXT_BASE_GET_IMP_VERSION0x2
> +#define SBI_EXT_BASE_PROBE_EXT  0x3
> +#define SBI_EXT_BASE_GET_MVENDORID  0x4
> +#define SBI_EXT_BASE_GET_MARCHID0x5
> +#define SBI_EXT_BASE_GET_MIMPID 0x6
> +
> +/* SBI function IDs for TIME extension*/
> +#define SBI_EXT_TIME_SET_TIMER  0x0
> +
> +/* SBI function IDs for IPI extension*/
> +#define SBI_EXT_IPI_SEND_IPI0x0
> +
> +/* SBI function IDs for RFENCE extension*/
> +#define SBI_EXT_RFENCE_REMOTE_FENCE_I   0x0
> +#define SBI_EXT_RFENCE_REMOTE_SFENCE_VMA0x1
> +#define SBI_EXT_RFENCE_REMOTE_SFENCE_VMA_ASID  0x2
> +#define SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA   0x3
> +#define SBI_EXT_RFENCE_REMOTE_HFENCE_GVMA_VMID 0x4
> +#define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA   0x5
> +#define SBI_EXT_RFENCE_REMOTE_HFENCE_VVMA_ASID 0x6
> +
> +/* SBI function IDs for HSM extension */
> +#define SBI_EXT_HSM_HART_START  0x0
> +#define SBI_EXT_HSM_HART_STOP   0x1
> +#define SBI_EXT_HSM_HART_GET_STATUS 0x2
> +
> +#define SBI_HSM_HART_STATUS_STARTED 0x0
> +#define SBI_HSM_HART_STATUS_STOPPED 0x1
> +#define SBI_HSM_HART_STATUS_START_PENDING   0x2
> +#define SBI_HSM_HART_STATUS_STOP_PENDING0x3
> +
> +#define SBI_SPEC_VERSION_MAJOR_OFFSET   24
> +#define SBI_

Re: [PATCH v2 07/12] target/riscv: Support setting external interrupt by KVM

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> When KVM is enabled, set the S-mode external interrupt through
> kvm_riscv_set_irq function.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> Reviewed-by: Alistair Francis 
> ---
>  target/riscv/cpu.c   |  6 +-
>  target/riscv/kvm-stub.c  |  5 +
>  target/riscv/kvm.c   | 17 +
>  target/riscv/kvm_riscv.h |  1 +
>  4 files changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 1c944872a3..71a7ac6831 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -603,7 +603,11 @@ static void riscv_cpu_set_irq(void *opaque, int irq, int 
> level)
>  case IRQ_S_EXT:
>  case IRQ_VS_EXT:
>  case IRQ_M_EXT:
> -riscv_cpu_update_mip(cpu, 1 << irq, BOOL_TO_MASK(level));
> +if (kvm_enabled() && (irq & IRQ_M_EXT) ) {
> +kvm_riscv_set_irq(cpu, IRQ_S_EXT, level);
> +} else {
> +riscv_cpu_update_mip(cpu, 1 << irq, BOOL_TO_MASK(level));
> +}

This does not look right.

I suggest the following:

if (kvm_enabled()) {
kvm_riscv_set_irq(cpu, irq, level);
} else {
   riscv_cpu_update_mip(cpu, 1 << irq, BOOL_TO_MASK(level));
}

>  break;
>  default:
>  g_assert_not_reached();
> diff --git a/target/riscv/kvm-stub.c b/target/riscv/kvm-stub.c
> index 39b96fe3f4..4e8fc31a21 100644
> --- a/target/riscv/kvm-stub.c
> +++ b/target/riscv/kvm-stub.c
> @@ -23,3 +23,8 @@ void kvm_riscv_reset_vcpu(RISCVCPU *cpu)
>  {
>  abort();
>  }
> +
> +void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level)
> +{
> +abort();
> +}
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index db6d8a5b6e..0027f11f45 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -383,6 +383,23 @@ void kvm_riscv_reset_vcpu(RISCVCPU *cpu)
>  env->satp = 0;
>  }
>
> +void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level)
> +{
> +int ret;
> +unsigned virq = level ? KVM_INTERRUPT_SET : KVM_INTERRUPT_UNSET;
> +
> +if (irq != IRQ_S_EXT) {
> +perror("kvm riscv set irq != IRQ_S_EXT\n");
> +abort();
> +}
> +
> +ret = kvm_vcpu_ioctl(CPU(cpu), KVM_INTERRUPT, &virq);
> +if (ret < 0) {
> +perror("Set irq failed");
> +abort();
> +}
> +}
> +
>  bool kvm_arch_cpu_check_are_resettable(void)
>  {
>  return true;
> diff --git a/target/riscv/kvm_riscv.h b/target/riscv/kvm_riscv.h
> index f38c82bf59..ed281bdce0 100644
> --- a/target/riscv/kvm_riscv.h
> +++ b/target/riscv/kvm_riscv.h
> @@ -20,5 +20,6 @@
>  #define QEMU_KVM_RISCV_H
>
>  void kvm_riscv_reset_vcpu(RISCVCPU *cpu);
> +void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level);
>
>  #endif
> --
> 2.19.1
>

Regards,
Anup



Re: [PATCH v2 06/12] target/riscv: Support start kernel directly by KVM

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Get kernel and fdt start address in virt.c, and pass them to KVM
> when cpu reset. In addition, add kvm_riscv.h to place riscv specific
> interface.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> Reviewed-by: Alistair Francis 
> ---
>  hw/riscv/boot.c  | 11 
>  hw/riscv/virt.c  | 54 
>  include/hw/riscv/boot.h  |  1 +
>  target/riscv/cpu.c   |  8 ++
>  target/riscv/cpu.h   |  3 +++
>  target/riscv/kvm-stub.c  | 25 +++
>  target/riscv/kvm.c   | 14 +++
>  target/riscv/kvm_riscv.h | 24 ++
>  target/riscv/meson.build |  2 +-
>  9 files changed, 125 insertions(+), 17 deletions(-)
>  create mode 100644 target/riscv/kvm-stub.c
>  create mode 100644 target/riscv/kvm_riscv.h
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 519fa455a1..00df6d7810 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -317,3 +317,14 @@ void riscv_setup_rom_reset_vec(MachineState *machine, 
> RISCVHartArrayState *harts
>
>  return;
>  }
> +
> +void riscv_setup_direct_kernel(hwaddr kernel_addr, hwaddr fdt_addr)
> +{
> +CPUState *cs;
> +
> +for (cs = first_cpu; cs; cs = CPU_NEXT(cs)) {
> +RISCVCPU *riscv_cpu = RISCV_CPU(cs);
> +riscv_cpu->env.kernel_addr = kernel_addr;
> +riscv_cpu->env.fdt_addr = fdt_addr;
> +}
> +}
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 3af074148e..c8bcd9d9e5 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -38,6 +38,7 @@
>  #include "chardev/char.h"
>  #include "sysemu/device_tree.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/kvm.h"
>  #include "hw/pci/pci.h"
>  #include "hw/pci-host/gpex.h"
>  #include "hw/display/ramfb.h"
> @@ -801,23 +802,25 @@ static void virt_machine_init(MachineState *machine)
>  hart_count, &error_abort);
>  sysbus_realize(SYS_BUS_DEVICE(&s->soc[i]), &error_abort);
>
> -/* Per-socket CLINT */
> -riscv_aclint_swi_create(
> -memmap[VIRT_CLINT].base + i * memmap[VIRT_CLINT].size,
> -base_hartid, hart_count, false);
> -riscv_aclint_mtimer_create(
> -memmap[VIRT_CLINT].base + i * memmap[VIRT_CLINT].size +
> -RISCV_ACLINT_SWI_SIZE,
> -RISCV_ACLINT_DEFAULT_MTIMER_SIZE, base_hartid, hart_count,
> -RISCV_ACLINT_DEFAULT_MTIMECMP, RISCV_ACLINT_DEFAULT_MTIME,
> -RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ, true);
> -
> -/* Per-socket ACLINT SSWI */
> -if (s->have_aclint) {
> +if (!kvm_enabled()) {
> +/* Per-socket CLINT */
>  riscv_aclint_swi_create(
> -memmap[VIRT_ACLINT_SSWI].base +
> -i * memmap[VIRT_ACLINT_SSWI].size,
> -base_hartid, hart_count, true);
> +memmap[VIRT_CLINT].base + i * memmap[VIRT_CLINT].size,
> +base_hartid, hart_count, false);
> +riscv_aclint_mtimer_create(
> +memmap[VIRT_CLINT].base + i * memmap[VIRT_CLINT].size +
> +RISCV_ACLINT_SWI_SIZE,
> +RISCV_ACLINT_DEFAULT_MTIMER_SIZE, base_hartid, hart_count,
> +RISCV_ACLINT_DEFAULT_MTIMECMP, RISCV_ACLINT_DEFAULT_MTIME,
> +RISCV_ACLINT_DEFAULT_TIMEBASE_FREQ, true);
> +
> +/* Per-socket ACLINT SSWI */
> +if (s->have_aclint) {
> +riscv_aclint_swi_create(
> +memmap[VIRT_ACLINT_SSWI].base +
> +i * memmap[VIRT_ACLINT_SSWI].size,
> +base_hartid, hart_count, true);
> +}

Along with this, we should not generate FDT nodes of ACLINT (or SiFive CLINT)
when KVM is enabled.

>  }
>
>  /* Per-socket PLIC hart topology configuration string */
> @@ -884,6 +887,16 @@ static void virt_machine_init(MachineState *machine)
>  memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
>  mask_rom);
>
> +/*
> + * Only direct boot kernel is currently supported for KVM VM,
> + * so the "-bios" parameter is ignored and treated like "-bios none"
> + * when KVM is enabled.
> + */
> +if (kvm_enabled()) {
> +g_free(machine->firmware);
> +machine->firmware = g_strdup("none");
> +}
> +
>  if (riscv_is_32bit(&s->soc[0])) {
>  firmware_end_addr = riscv_find_and_load_firmware(machine,
>  RISCV32_BIOS_BIN, start_addr, NULL);
> @@ -941,6 +954,15 @@ static void virt_machine_init(MachineState *machine)
>virt_memmap[VIRT_MROM].size, kernel_entry,
>fdt_load_addr, machine->fdt);
>
> +/*
> + * Only direct boot kernel is currently supported for KVM VM,
> + * So here setup kernel start address and fdt addr

Re: [PATCH v2 05/12] target/riscv: Implement kvm_arch_put_registers

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Put GPR CSR and FP registers to kvm by KVM_SET_ONE_REG ioctl
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> Reviewed-by: Alistair Francis 

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  target/riscv/kvm.c | 104 -
>  1 file changed, 103 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 6d4df0ef6d..e695b91dc7 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -73,6 +73,14 @@ static uint64_t kvm_riscv_reg_id(CPURISCVState *env, 
> uint64_t type, uint64_t idx
>  } \
>  } while(0)
>
> +#define KVM_RISCV_SET_CSR(cs, env, csr, reg) \
> +do { \
> +int ret = kvm_set_one_reg(cs, RISCV_CSR_REG(env, csr), ®); \
> +if (ret) { \
> +return ret; \
> +} \
> +} while(0)
> +
>  static int kvm_riscv_get_regs_core(CPUState *cs)
>  {
>  int ret = 0;
> @@ -98,6 +106,31 @@ static int kvm_riscv_get_regs_core(CPUState *cs)
>  return ret;
>  }
>
> +static int kvm_riscv_put_regs_core(CPUState *cs)
> +{
> +int ret = 0;
> +int i;
> +target_ulong reg;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +reg = env->pc;
> +ret = kvm_set_one_reg(cs, RISCV_CORE_REG(env, regs.pc), ®);
> +if (ret) {
> +return ret;
> +}
> +
> +for (i = 1; i < 32; i++) {
> +uint64_t id = kvm_riscv_reg_id(env, KVM_REG_RISCV_CORE, i);
> +reg = env->gpr[i];
> +ret = kvm_set_one_reg(cs, id, ®);
> +if (ret) {
> +return ret;
> +}
> +}
> +
> +return ret;
> +}
> +
>  static int kvm_riscv_get_regs_csr(CPUState *cs)
>  {
>  int ret = 0;
> @@ -115,6 +148,24 @@ static int kvm_riscv_get_regs_csr(CPUState *cs)
>  return ret;
>  }
>
> +static int kvm_riscv_put_regs_csr(CPUState *cs)
> +{
> +int ret = 0;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus);
> +KVM_RISCV_SET_CSR(cs, env, sie, env->mie);
> +KVM_RISCV_SET_CSR(cs, env, stvec, env->stvec);
> +KVM_RISCV_SET_CSR(cs, env, sscratch, env->sscratch);
> +KVM_RISCV_SET_CSR(cs, env, sepc, env->sepc);
> +KVM_RISCV_SET_CSR(cs, env, scause, env->scause);
> +KVM_RISCV_SET_CSR(cs, env, stval, env->stval);
> +KVM_RISCV_SET_CSR(cs, env, sip, env->mip);
> +KVM_RISCV_SET_CSR(cs, env, satp, env->satp);
> +
> +return ret;
> +}
> +
>  static int kvm_riscv_get_regs_fp(CPUState *cs)
>  {
>  int ret = 0;
> @@ -148,6 +199,40 @@ static int kvm_riscv_get_regs_fp(CPUState *cs)
>  return ret;
>  }
>
> +static int kvm_riscv_put_regs_fp(CPUState *cs)
> +{
> +int ret = 0;
> +int i;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +if (riscv_has_ext(env, RVD)) {
> +uint64_t reg;
> +for (i = 0; i < 32; i++) {
> +reg = env->fpr[i];
> +ret = kvm_set_one_reg(cs, RISCV_FP_D_REG(env, i), ®);
> +if (ret) {
> +return ret;
> +}
> +}
> +return ret;
> +}
> +
> +if (riscv_has_ext(env, RVF)) {
> +uint32_t reg;
> +for (i = 0; i < 32; i++) {
> +reg = env->fpr[i];
> +ret = kvm_set_one_reg(cs, RISCV_FP_F_REG(env, i), ®);
> +if (ret) {
> +return ret;
> +}
> +}
> +return ret;
> +}
> +
> +return ret;
> +}
> +
> +
>  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  KVM_CAP_LAST_INFO
>  };
> @@ -176,7 +261,24 @@ int kvm_arch_get_registers(CPUState *cs)
>
>  int kvm_arch_put_registers(CPUState *cs, int level)
>  {
> -return 0;
> +int ret = 0;
> +
> +ret = kvm_riscv_put_regs_core(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +ret = kvm_riscv_put_regs_csr(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +ret = kvm_riscv_put_regs_fp(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +return ret;
>  }
>
>  int kvm_arch_release_virq_post(int virq)
> --
> 2.19.1
>



Re: [PATCH v2 04/12] target/riscv: Implement kvm_arch_get_registers

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Get GPR CSR and FP registers from kvm by KVM_GET_ONE_REG ioctl.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> Reviewed-by: Alistair Francis 

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  target/riscv/kvm.c | 112 -
>  1 file changed, 111 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index ccf3753048..6d4df0ef6d 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -55,13 +55,123 @@ static uint64_t kvm_riscv_reg_id(CPURISCVState *env, 
> uint64_t type, uint64_t idx
>  return id;
>  }
>
> +#define RISCV_CORE_REG(env, name)  kvm_riscv_reg_id(env, KVM_REG_RISCV_CORE, 
> \
> + KVM_REG_RISCV_CORE_REG(name))
> +
> +#define RISCV_CSR_REG(env, name)  kvm_riscv_reg_id(env, KVM_REG_RISCV_CSR, \
> + KVM_REG_RISCV_CSR_REG(name))
> +
> +#define RISCV_FP_F_REG(env, idx)  kvm_riscv_reg_id(env, KVM_REG_RISCV_FP_F, 
> idx)
> +
> +#define RISCV_FP_D_REG(env, idx)  kvm_riscv_reg_id(env, KVM_REG_RISCV_FP_D, 
> idx)
> +
> +#define KVM_RISCV_GET_CSR(cs, env, csr, reg) \
> +do { \
> +int ret = kvm_get_one_reg(cs, RISCV_CSR_REG(env, csr), ®); \
> +if (ret) { \
> +return ret; \
> +} \
> +} while(0)
> +
> +static int kvm_riscv_get_regs_core(CPUState *cs)
> +{
> +int ret = 0;
> +int i;
> +target_ulong reg;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +ret = kvm_get_one_reg(cs, RISCV_CORE_REG(env, regs.pc), ®);
> +if (ret) {
> +return ret;
> +}
> +env->pc = reg;
> +
> +for (i = 1; i < 32; i++) {
> +uint64_t id = kvm_riscv_reg_id(env, KVM_REG_RISCV_CORE, i);
> +ret = kvm_get_one_reg(cs, id, ®);
> +if (ret) {
> +return ret;
> +}
> +env->gpr[i] = reg;
> +}
> +
> +return ret;
> +}
> +
> +static int kvm_riscv_get_regs_csr(CPUState *cs)
> +{
> +int ret = 0;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +KVM_RISCV_GET_CSR(cs, env, sstatus, env->mstatus);
> +KVM_RISCV_GET_CSR(cs, env, sie, env->mie);
> +KVM_RISCV_GET_CSR(cs, env, stvec, env->stvec);
> +KVM_RISCV_GET_CSR(cs, env, sscratch, env->sscratch);
> +KVM_RISCV_GET_CSR(cs, env, sepc, env->sepc);
> +KVM_RISCV_GET_CSR(cs, env, scause, env->scause);
> +KVM_RISCV_GET_CSR(cs, env, stval, env->stval);
> +KVM_RISCV_GET_CSR(cs, env, sip, env->mip);
> +KVM_RISCV_GET_CSR(cs, env, satp, env->satp);
> +return ret;
> +}
> +
> +static int kvm_riscv_get_regs_fp(CPUState *cs)
> +{
> +int ret = 0;
> +int i;
> +CPURISCVState *env = &RISCV_CPU(cs)->env;
> +
> +if (riscv_has_ext(env, RVD)) {
> +uint64_t reg;
> +for (i = 0; i < 32; i++) {
> +ret = kvm_get_one_reg(cs, RISCV_FP_D_REG(env, i), ®);
> +if (ret) {
> +return ret;
> +}
> +env->fpr[i] = reg;
> +}
> +return ret;
> +}
> +
> +if (riscv_has_ext(env, RVF)) {
> +uint32_t reg;
> +for (i = 0; i < 32; i++) {
> +ret = kvm_get_one_reg(cs, RISCV_FP_F_REG(env, i), ®);
> +if (ret) {
> +return ret;
> +}
> +env->fpr[i] = reg;
> +}
> +return ret;
> +}
> +
> +return ret;
> +}
> +
>  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  KVM_CAP_LAST_INFO
>  };
>
>  int kvm_arch_get_registers(CPUState *cs)
>  {
> -return 0;
> +int ret = 0;
> +
> +ret = kvm_riscv_get_regs_core(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +ret = kvm_riscv_get_regs_csr(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +ret = kvm_riscv_get_regs_fp(cs);
> +if (ret) {
> +return ret;
> +}
> +
> +return ret;
>  }
>
>  int kvm_arch_put_registers(CPUState *cs, int level)
> --
> 2.19.1
>



Re: [PATCH v2 03/12] target/riscv: Implement function kvm_arch_init_vcpu

2021-12-12 Thread Anup Patel
On Fri, Dec 10, 2021 at 3:37 PM Yifei Jiang  wrote:
>
> Get isa info from kvm while kvm init.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Mingwang Li 
> Reviewed-by: Alistair Francis 

Looks good to me.

Reviewed-by: Anup Patel 

Regards,
Anup

> ---
>  target/riscv/kvm.c | 32 +++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
> index 687dd4b621..ccf3753048 100644
> --- a/target/riscv/kvm.c
> +++ b/target/riscv/kvm.c
> @@ -38,6 +38,23 @@
>  #include "qemu/log.h"
>  #include "hw/loader.h"
>
> +static uint64_t kvm_riscv_reg_id(CPURISCVState *env, uint64_t type, uint64_t 
> idx)
> +{
> +uint64_t id = KVM_REG_RISCV | type | idx;
> +
> +switch (riscv_cpu_mxl(env)) {
> +case MXL_RV32:
> +id |= KVM_REG_SIZE_U32;
> +break;
> +case MXL_RV64:
> +id |= KVM_REG_SIZE_U64;
> +break;
> +default:
> +g_assert_not_reached();
> +}
> +return id;
> +}
> +
>  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  KVM_CAP_LAST_INFO
>  };
> @@ -79,7 +96,20 @@ void kvm_arch_init_irq_routing(KVMState *s)
>
>  int kvm_arch_init_vcpu(CPUState *cs)
>  {
> -return 0;
> +int ret = 0;
> +target_ulong isa;
> +RISCVCPU *cpu = RISCV_CPU(cs);
> +CPURISCVState *env = &cpu->env;
> +uint64_t id;
> +
> +id = kvm_riscv_reg_id(env, KVM_REG_RISCV_CONFIG, 
> KVM_REG_RISCV_CONFIG_REG(isa));
> +ret = kvm_get_one_reg(cs, id, &isa);
> +if (ret) {
> +return ret;
> +}
> +env->misa_ext = isa;
> +
> +return ret;
>  }
>
>  int kvm_arch_msi_data_to_gsi(uint32_t data)
> --
> 2.19.1
>



Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-12 Thread Jason Wang
On Sat, Dec 11, 2021 at 1:23 PM Longpeng (Mike, Cloud Infrastructure
Service Product Dept.)  wrote:
>
>
>
> > -Original Message-
> > From: Jason Wang [mailto:jasow...@redhat.com]
> > Sent: Wednesday, December 8, 2021 2:27 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > 
> > Cc: mst ; Parav Pandit ; Yongji Xie
> > ; Stefan Hajnoczi ; Stefano
> > Garzarella ; Yechuan ; Gonglei 
> > (Arei)
> > ; qemu-devel 
> > Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> >
> > On Wed, Dec 8, 2021 at 1:20 PM Longpeng(Mike)  wrote:
> > >
> > > From: Longpeng 
> > >
> > > Hi guys,
> > >
> > > This patch introduces vhost-vdpa-net device, which is inspired
> > > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> > >
> > > I've tested this patch on Huawei's offload card:
> > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > >
> > > For virtio hardware offloading, the most important requirement for us
> > > is to support live migration between offloading cards from different
> > > vendors, the combination of netdev and virtio-net seems too heavy, we
> > > prefer a lightweight way.
> >
> > Could you elaborate more on this? It's mainly the control path when
> > using with netdev, and it provides a lot of other benefits:
> >
> > - decouple the transport specific stuff out of the vhost abstraction,
> > mmio device is supported with 0 line of code
> > - migration compatibility, reuse the migration stream that is already
> > supported by Qemu virtio-net, this will allow migration among
> > different vhost backends.
> > - software mediation facility, not all the virtqueues are assigned to
> > guests directly. One example is the virtio-net cvq, qemu may want to
> > intercept and record the device state for migration. Reusing the
> > current virtio-net codes simplifies a lot of codes.
> > - transparent failover (in the future), the nic model can choose to
> > switch between vhost backends etc.
> >
>
> We want to use the vdpa framework instead of the vfio-pci framework in
> the virtio hardware offloading case, so maybe some of the benefits above
> are not needed in our case. But we need to migrate between different
> hardware, so I am not sure whether this approach would be harmful to the
> requirement.

It should not, but it needs to build the migration facility for the
net from the ground. And if we want to have a general migration
solution instead of a vendor specific one, it may duplicate some logic
of existing virtio-net implementation. The CVQ migration is an
example, we don't provide a dedicated migration facility in the spec.
So a more general way for live migration currently is using the shadow
virtqueue which is what Eugenio is doing. So thanks to the design
where we tried to do all the work in the vhost layer, this might not
be a problem for this approach. But talking about the CVQ migration,
things will be interesting. Qemu needs to decode the cvq commands in
the middle thus it can record the device state. For having a general
migration solution, vhost-vdpa-pci needs to do this as well.
Virtio-net has the full CVQ logic so it's much easier, for
vhost-vdpa-pci, it needs to duplicate them all in its own logic.

>
> > >
> > > Maybe we could support both in the future ?
> >
> > For the net, we need to figure out the advantages of this approach
> > first. Note that we didn't have vhost-user-net-pci or vhost-pci in the
> > past.
> >
>
> Why didn't support vhost-user-net-pci in history ? Because its control
> path is much more complex than the block ?

I don't know, it may be simply because no one tries to do that.

>
> > For the block, I will leave Stefan and Stefano to comment.
> >
> > > Such as:
> > >
> > > * Lightweight
> > >  Net: vhost-vdpa-net
> > >  Storage: vhost-vdpa-blk
> > >
> > > * Heavy but more powerful
> > >  Net: netdev + virtio-net + vhost-vdpa
> > >  Storage: bdrv + virtio-blk + vhost-vdpa
> > >
> > > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > >
> > > Signed-off-by: Longpeng(Mike) 
> > > ---
> > >  hw/net/meson.build |   1 +
> > >  hw/net/vhost-vdpa-net.c| 338
> > +
> > >  hw/virtio/Kconfig  |   5 +
> > >  hw/virtio/meson.build  |   1 +
> > >  hw/virtio/vhost-vdpa-net-pci.c | 118 +
> >
> > I'd expect there's no device type specific code in this approach and
> > any kind of vDPA devices could be used with a general pci device.
> >
> > Any reason for having net specific types here?
> >
>
> No, just because there already has the proposal of vhost-vdpa-blk, so I
> developed the vhost-vdpa-net correspondingly.
>
> I pretty agree with your suggestion. If feasible, likes vfio-pci, we don't
> need to maintain the device type specific code in QEMU, what's more, it's
> possible to support the live migration of different virtio hardware.
>

See above, we 

Re: [RFC PATCH v3 00/27] Add LoongArch softmmu support.

2021-12-12 Thread yangxiaojuan
Ping!

Please help review the V3 patch, thank you!

On 12/04/2021 08:06 PM, Xiaojuan Yang wrote:
> This series patch add softmmu support for LoongArch.
> Base on the linux-user emulation support V13 patch.
>   * 
> https://patchew.org/QEMU/1638610165-15036-1-git-send-email-gaos...@loongson.cn/
> The latest kernel:
>   * https://github.com/loongson/linux/tree/loongarch-next
> The manual:
>   * 
> https://github.com/loongson/LoongArch-Documentation/releases/tag/2021.10.11
> 
> Changes for v3:
> 1.Target code mainly follow Richard's code review comments.
> 2.Put the csr and iocsr read/write instruction emulate into 2 different patch.
> 3.Simply the tlb emulation.
> 4.Delete some unused csr registers defintion.
> 5.Machine and board code mainly follow Mark's advice, discard the obsolete 
> interface.
> 6.NUMA function is removed for it is not completed.
> 7.Adjust some format problem and the Naming problem 
> 
> Changes for v2:
> 1.Combine patch 2 and 3 into one.
> 2.Adjust the order of the patch.
> 3.Put all the binaries on the github.
> 4.Modify some emulate errors when use the kernel from the github.
> 5.Adjust some format problem and the Naming problem 
> 6.Others mainly follow Richard's code review comments.
> 
> Please help review!
> 
> Thanks
> 
> Xiaojuan Yang (27):
>   target/loongarch: Update README
>   target/loongarch: Add CSR registers definition
>   target/loongarch: Add basic vmstate description of CPU.
>   target/loongarch: Implement qmp_query_cpu_definitions()
>   target/loongarch: Add stabletimer support
>   target/loongarch: Add MMU support for LoongArch CPU.
>   target/loongarch: Add LoongArch CSR instruction
>   target/loongarch: Add LoongArch IOCSR instruction
>   target/loongarch: Add TLB instruction support
>   target/loongarch: Add other core instructions support
>   target/loongarch: Add LoongArch interrupt and exception handle
>   target/loongarch: Add timer related instructions support.
>   target/loongarch: Add gdb support.
>   hw/pci-host: Add ls7a1000 PCIe Host bridge support for Loongson3
> Platform
>   hw/loongarch: Add support loongson3-ls7a machine type.
>   hw/loongarch: Add LoongArch cpu interrupt support(CPUINTC)
>   hw/loongarch: Add LoongArch ipi interrupt support(IPI)
>   hw/intc: Add LoongArch ls7a interrupt controller support(PCH-PIC)
>   hw/intc: Add LoongArch ls7a msi interrupt controller support(PCH-MSI)
>   hw/intc: Add LoongArch extioi interrupt controller(EIOINTC)
>   hw/loongarch: Add irq hierarchy for the system
>   hw/loongarch: Add some devices support for 3A5000.
>   hw/loongarch: Add LoongArch ls7a rtc device support
>   hw/loongarch: Add default bios startup support.
>   hw/loongarch: Add -kernel and -initrd options support
>   hw/loongarch: Add LoongArch smbios support
>   hw/loongarch: Add LoongArch acpi support
> 
>  .../devices/loongarch64-softmmu/default.mak   |   3 +
>  configs/targets/loongarch64-softmmu.mak   |   4 +
>  gdb-xml/loongarch-base64.xml  |  43 +
>  gdb-xml/loongarch-fpu64.xml   |  57 ++
>  hw/Kconfig|   1 +
>  hw/acpi/Kconfig   |   4 +
>  hw/acpi/ls7a.c| 349 
>  hw/acpi/meson.build   |   1 +
>  hw/intc/Kconfig   |  15 +
>  hw/intc/loongarch_extioi.c| 499 +++
>  hw/intc/loongarch_ipi.c   | 162 
>  hw/intc/loongarch_pch_msi.c   |  67 ++
>  hw/intc/loongarch_pch_pic.c   | 357 
>  hw/intc/meson.build   |   4 +
>  hw/intc/trace-events  |  21 +
>  hw/loongarch/Kconfig  |  23 +
>  hw/loongarch/acpi-build.c | 637 ++
>  hw/loongarch/fw_cfg.c |  33 +
>  hw/loongarch/fw_cfg.h |  15 +
>  hw/loongarch/loongson3.c  | 509 +++
>  hw/loongarch/meson.build  |   6 +
>  hw/meson.build|   1 +
>  hw/pci-host/Kconfig   |   4 +
>  hw/pci-host/ls7a.c| 214 +
>  hw/pci-host/meson.build   |   1 +
>  hw/rtc/Kconfig|   3 +
>  hw/rtc/ls7a_rtc.c | 323 +++
>  hw/rtc/meson.build|   1 +
>  include/exec/poison.h |   2 +
>  include/hw/acpi/ls7a.h|  53 ++
>  include/hw/intc/loongarch_extioi.h|  69 ++
>  include/hw/intc/loongarch_ipi.h   |  47 ++
>  include/hw/intc/loongarch_pch_msi.h   |  21 +
>  include/hw/intc/loongarch_pch_pic.h   |  61 ++
>  include/hw/loongarch/loongarch.h  |  68 ++
>  include/hw/pci-host/ls7a.h|  79 ++
>  include/sysemu/arch_init.h|   1 +

Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-12 Thread Jason Wang
On Sun, Dec 12, 2021 at 5:30 PM Michael S. Tsirkin  wrote:
>
> On Sat, Dec 11, 2021 at 03:00:27AM +, Longpeng (Mike, Cloud 
> Infrastructure Service Product Dept.) wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Thursday, December 9, 2021 5:17 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > 
> > > Cc: jasow...@redhat.com; m...@redhat.com; pa...@nvidia.com;
> > > xieyon...@bytedance.com; sgarz...@redhat.com; Yechuan 
> > > ;
> > > Gonglei (Arei) ; qemu-devel@nongnu.org
> > > Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> > >
> > > On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> > > > From: Longpeng 
> > > >
> > > > Hi guys,
> > > >
> > > > This patch introduces vhost-vdpa-net device, which is inspired
> > > > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> > > >
> > > > I've tested this patch on Huawei's offload card:
> > > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > > >
> > > > For virtio hardware offloading, the most important requirement for us
> > > > is to support live migration between offloading cards from different
> > > > vendors, the combination of netdev and virtio-net seems too heavy, we
> > > > prefer a lightweight way.
> > > >
> > > > Maybe we could support both in the future ? Such as:
> > > >
> > > > * Lightweight
> > > >  Net: vhost-vdpa-net
> > > >  Storage: vhost-vdpa-blk
> > > >
> > > > * Heavy but more powerful
> > > >  Net: netdev + virtio-net + vhost-vdpa
> > > >  Storage: bdrv + virtio-blk + vhost-vdpa
> > > >
> > > > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > >
> > > Stefano presented a plan for vdpa-blk at KVM Forum 2021:
> > > https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-and-sof
> > > tware-offload-for-virtio-blk-stefano-garzarella-red-hat
> > >
> > > It's closer to today's virtio-net + vhost-net approach than the
> > > vhost-vdpa-blk device you have mentioned. The idea is to treat vDPA as
> > > an offload feature rather than a completely separate code path that
> > > needs to be maintained and tested. That way QEMU's block layer features
> > > and live migration work with vDPA devices and re-use the virtio-blk
> > > code. The key functionality that has not been implemented yet is a "fast
> > > path" mechanism that allows the QEMU virtio-blk device's virtqueue to be
> > > offloaded to vDPA.
> > >
> > > The unified vdpa-blk architecture should deliver the same performance
> > > as the vhost-vdpa-blk device you mentioned but with more features, so I
> > > wonder what aspects of the vhost-vdpa-blk idea are important to you?
> > >
> > > QEMU already has vhost-user-blk, which takes a similar approach as the
> > > vhost-vdpa-blk device you are proposing. I'm not against the
> > > vhost-vdpa-blk approach in priciple, but would like to understand your
> > > requirements and see if there is a way to collaborate on one vdpa-blk
> > > implementation instead of dividing our efforts between two.
> > >
> >
> > We prefer a simple way in the virtio hardware offloading case, it could 
> > reduce
> > our maintenance workload, we no need to maintain the virtio-net, netdev,
> > virtio-blk, bdrv and ... any more. If we need to support other vdpa devices
> > (such as virtio-crypto, virtio-fs) in the future, then we also need to 
> > maintain
> > the corresponding device emulation code?
> >
> > For the virtio hardware offloading case, we usually use the vfio-pci 
> > framework,
> > it saves a lot of our maintenance work in QEMU, we don't need to touch the 
> > device
> > types. Inspired by Jason, what we really prefer is "vhost-vdpa-pci/mmio", 
> > use it to
> > instead of the vfio-pci, it could provide the same performance as vfio-pci, 
> > but it's
> > *possible* to support live migrate between offloading cards from different 
> > vendors.
>
> OK, so the features you are dropping would be migration between
> a vdpa, vhost and virtio backends. I think given vhost-vdpa-blk is seems
> fair enough... What do others think?

I think it should be fine, and it would be even better to make it not
specific to device type.

Thanks

>
> > > Stefan
>




Re: [PATCH v10 06/10] ACPI ERST: build the ACPI ERST table

2021-12-12 Thread Michael S. Tsirkin
On Thu, Dec 09, 2021 at 12:57:31PM -0500, Eric DeVolder wrote:
> This builds the ACPI ERST table to inform OSPM how to communicate
> with the acpi-erst device.
> 
> Signed-off-by: Eric DeVolder 
> ---
>  hw/acpi/erst.c | 241 
> +
>  1 file changed, 241 insertions(+)
> 
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> index 81f5435..753425a 100644
> --- a/hw/acpi/erst.c
> +++ b/hw/acpi/erst.c
> @@ -711,6 +711,247 @@ static const MemoryRegionOps erst_reg_ops = {
>  .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>  
> +
> +/***/
> +/***/
> +
> +/* ACPI 4.0: Table 17-19 Serialization Instructions */
> +#define INST_READ_REGISTER 0x00
> +#define INST_READ_REGISTER_VALUE   0x01
> +#define INST_WRITE_REGISTER0x02
> +#define INST_WRITE_REGISTER_VALUE  0x03
> +#define INST_NOOP  0x04
> +#define INST_LOAD_VAR1 0x05
> +#define INST_LOAD_VAR2 0x06
> +#define INST_STORE_VAR10x07
> +#define INST_ADD   0x08
> +#define INST_SUBTRACT  0x09
> +#define INST_ADD_VALUE 0x0A
> +#define INST_SUBTRACT_VALUE0x0B
> +#define INST_STALL 0x0C
> +#define INST_STALL_WHILE_TRUE  0x0D
> +#define INST_SKIP_NEXT_INSTRUCTION_IF_TRUE 0x0E
> +#define INST_GOTO  0x0F
> +#define INST_SET_SRC_ADDRESS_BASE  0x10
> +#define INST_SET_DST_ADDRESS_BASE  0x11
> +#define INST_MOVE_DATA 0x12
> +

I would create wrappers for the specific uses that we do have. Leave the
rest alone.
You just use 4 of these right? And a bunch of parameters are
always the same. E.g. flags always 0, address always same.

> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
> +static void build_serialization_instruction_entry(GArray *table_data,
> +uint8_t serialization_action,
> +uint8_t instruction,
> +uint8_t flags,
> +uint8_t register_bit_width,

maybe make it width in bytes, then you do not need a switch.

> +uint64_t register_address,
> +uint64_t value,
> +uint64_t mask)
> +{
> +/* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
> +struct AcpiGenericAddress gas;
> +
> +/* Serialization Action */
> +build_append_int_noprefix(table_data, serialization_action, 1);
> +/* Instruction */
> +build_append_int_noprefix(table_data, instruction , 1);
> +/* Flags */
> +build_append_int_noprefix(table_data, flags   , 1);
> +/* Reserved */
> +build_append_int_noprefix(table_data, 0   , 1);
> +/* Register Region */
> +gas.space_id = AML_SYSTEM_MEMORY;
> +gas.bit_width = register_bit_width;
> +gas.bit_offset = 0;
> +switch (register_bit_width) {
> +case 8:
> +gas.access_width = 1;
> +break;
> +case 16:
> +gas.access_width = 2;
> +break;
> +case 32:
> +gas.access_width = 3;
> +break;
> +case 64:
> +gas.access_width = 4;
> +break;
> +default:
> +gas.access_width = 0;
> +break;

does this default actually work?

> +}
> +gas.address = register_address;
> +build_append_gas_from_struct(table_data, &gas);
> +/* Value */
> +build_append_int_noprefix(table_data, value  , 8);
> +/* Mask */
> +build_append_int_noprefix(table_data, mask   , 8);
> +}
> +
> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
> +const char *oem_id, const char *oem_table_id)
> +{
> +GArray *table_instruction_data;
> +unsigned action;
> +pcibus_t bar0, bar1;
> +AcpiTable table = { .sig = "ERST", .rev = 1, .oem_id = oem_id,
> +.oem_table_id = oem_table_id };
> +
> +bar0 = (pcibus_t)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
> +trace_acpi_erst_pci_bar_0(bar0);
> +bar1 = (pcibus_t)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);

why do we need the cast here?
Also assignment at declaration point will be cleaner, won't it?

> +trace_acpi_erst_pci_bar_1(bar1);

bar1 seems unused ... why do we bother with it just for trace?

> +
> +#define MASK8  0x00FFUL
> +#define MASK16 0xUL
> +#define MASK32 0xUL
> +#define MASK64 0xUL


can't we just pass # of bytes?

> +
> +/*
> + * Serialization Action Table
> + * The serialization action table must be generated first
> + * so that its size can be known in order to populate the
> + * Instruction Entry Count field.
> + */
> +table_instruction_data = g_array_new(FALSE, FALSE, sizeof(char));
> +
> +/* Serialization Instructio

Re: [PATCH 26/26] hw/intc/arm_gicv3_its: Factor out "find address of table entry" code

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The ITS has several tables which all share a similar format,
described by the TableDesc struct: the guest may configure them
to be a single-level table or a two-level table. Currently we
open-code the process of finding the table entry in all the
functions which read or write the device table or the collection
table. Factor out the "get the address of the table entry"
logic into a new function, so that the code which needs to
read or write a table entry only needs to call table_entry_addr()
and then perform a suitable load or store to that address.

Note that the error handling is slightly complicated because
we want to handle two cases differently:
  * failure to read the L1 table entry should end up causing
a command stall, like other kinds of DMA error
  * an L1 table entry that says there is no L2 table for this
index (ie whose valid bit is 0) must result in us treating
the table entry as not-valid on read, and discarding
writes (this is mandated by the spec)

Signed-off-by: Peter Maydell 
---
This is a worthwhile refactoring on its own, but still more
so given that GICv4 adds another table in this format.
---


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 25/26] hw/intc/arm_gicv3_its: Fix return codes in process_mapd()

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Fix process_mapd() to consistently return CMD_STALL for memory
errors and CMD_CONTINUE for parameter errors, as we claim in the
comments that we do.

Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 24/26] hw/intc/arm_gicv3_its: Fix return codes in process_mapc()

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Fix process_mapc() to consistently return CMD_STALL for memory
errors and CMD_CONTINUE for parameter errors, as we claim in the
comments that we do.

Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 23/26] hw/intc/arm_gicv3_its: Fix return codes in process_mapti()

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Fix process_mapti() to consistently return CMD_STALL for memory
errors and CMD_CONTINUE for parameter errors, as we claim in the
comments that we do.

Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 28 +---
  1 file changed, 13 insertions(+), 15 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 22/26] hw/intc/arm_gicv3_its: Refactor process_its_cmd() to reduce nesting

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Refactor process_its_cmd() so that it consistently uses
the structure
   do thing;
   if (error condition) {
   return early;
   }
   do next thing;

rather than doing some of the work nested inside if (not error)
code blocks.

Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 103 +++-
  1 file changed, 50 insertions(+), 53 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 21/26] hw/intc/arm_gicv3_its: Fix return codes in process_its_cmd()

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Fix process_its_cmd() to consistently return CMD_STALL for
memory errors and CMD_CONTINUE for parameter errors, as
we claim in the comments that we do.

Signed-off-by: Peter Maydell 


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 10/26] hw/intc/arm_gicv3_its: Use FIELD macros for DTEs

2021-12-12 Thread Philippe Mathieu-Daudé
On 12/11/21 20:11, Peter Maydell wrote:
> Currently the ITS code that reads and writes DTEs uses open-coded
> shift-and-mask to assemble the various fields into the 64-bit DTE
> word.  The names of the macros used for mask and shift values are
> also somewhat inconsistent, and don't follow our usual convention
> that a MASK macro should specify the bits in their place in the word.
> Replace all these with use of the FIELD macro.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/intc/gicv3_internal.h |  7 ---
>  hw/intc/arm_gicv3_its.c  | 20 +---
>  2 files changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
> index 5a63e9ed5ce..6a3b145f377 100644
> --- a/hw/intc/gicv3_internal.h
> +++ b/hw/intc/gicv3_internal.h
> @@ -393,9 +393,10 @@ FIELD(ITE_H, VPEID, 16, 16)
>   * Valid = 1 bit,ITTAddr = 44 bits,Size = 5 bits
>   */
>  #define GITS_DTE_SIZE (0x8ULL)
> -#define GITS_DTE_ITTADDR_SHIFT   6
> -#define GITS_DTE_ITTADDR_MASK 
> MAKE_64BIT_MASK(GITS_DTE_ITTADDR_SHIFT, \
> -  ITTADDR_LENGTH)

Side note, both ITTADDR_LENGTH & ITTADDR_MASK definitions are now unused.



Re: [PATCH 20/26] hw/intc/arm_gicv3_its: Use enum for return value of process_* functions

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

When an ITS detects an error in a command, it has an
implementation-defined (CONSTRAINED UNPREDICTABLE) choice of whether
to ignore the command, proceeding to the next one in the queue, or to
stall the ITS command queue, processing nothing further.  The
behaviour required when the read of the command packet from memory
fails is less clearly documented, but the same set of choices as for
command errors seem reasonable.

The intention of the QEMU implementation, as documented in the
comments, is that if we encounter a memory error reading the command
packet or one of the various data tables then we should stall, but
for command parameter errors we should ignore the queue and continue.
However, we don't actually do this.  To get the desired behaviour,
the various process_* functions need to return true to cause
process_cmdq() to advance to the next command and keep processing,
and false to stall command processing.  What they mostly do is return
false for any kind of error.

To make the code clearer, replace the 'bool' return from the process_
functions with an enum which may be either CMD_STALL or CMD_CONTINUE.
In this commit no behaviour changes; in subsequent commits we will
adjust the error-return paths for the process_ functions one by one.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 59 ++---
  1 file changed, 38 insertions(+), 21 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 19/26] hw/intc/arm_gicv3_its: Don't use data if reading command failed

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

In process_cmdq(), we read 64 bits of the command packet, which
contain the command identifier, which we then switch() on to dispatch
to an appropriate sub-function.  However, if address_space_ldq_le()
reports a memory transaction failure, we still read the command
identifier out of the data and switch() on it.  Restructure the code
so that we stop immediately (stalling the command queue) in this
case.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 18/26] hw/intc/arm_gicv3_its: Fix handling of process_its_cmd() return value

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

process_its_cmd() returns a bool, like all the other process_ functions.
However we were putting its return value into 'res', not 'result',
which meant we would ignore it when deciding whether to continue
or stall the command queue. Fix the typo.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 06/26] hw/intc/arm_gicv3_its: Reduce code duplication in extract_table_params()

2021-12-12 Thread Philippe Mathieu-Daudé
On 12/11/21 20:11, Peter Maydell wrote:
> The extract_table_params() decodes the fields in the GITS_BASER
> registers into TableDesc structs.  Since the fields are the same for
> all the GITS_BASER registers, there is currently a lot of code
> duplication within the switch (type) statement.  Refactor so that the
> cases include only what is genuinely different for each type:
> the calculation of the number of bits in the ID value that indexes
> into the table.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/intc/arm_gicv3_its.c | 97 +
>  1 file changed, 40 insertions(+), 57 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 03/26] hw/intc/arm_gicv3_its: Remove redundant ITS_CTLR_ENABLED define

2021-12-12 Thread Philippe Mathieu-Daudé
On 12/11/21 20:11, Peter Maydell wrote:
> We currently define a bitmask for the GITS_CTLR ENABLED bit in
> two ways: as ITS_CTLR_ENABLED, and via the FIELD() macro as
> R_GITS_CTLR_ENABLED_MASK. Consistently use the FIELD macro version
> everywhere and remove the redundant ITS_CTLR_ENABLED define.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/intc/gicv3_internal.h |  2 --
>  hw/intc/arm_gicv3_its.c  | 20 ++--
>  2 files changed, 10 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH 11/26] hw/intc/arm_gicv3_its: Use 1ULL when shifting by (DTE.SIZE + 1)

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:
  
  if (dte_valid) {

-max_eventid = 1UL << (FIELD_EX64(dte, DTE, SIZE) + 1);
+max_eventid = 1ULL << (FIELD_EX64(dte, DTE, SIZE) + 1);


Without changing the type of max_eventid, I think it'd be easiest to fix the off-by-one 
bug by not changing the comparisions, but changing this computation.  E.g.


  max_eventid = (2 << FIELD_EX64(dte, DTE, SIZE)) - 1;

so that the value becomes UINT32_MAX for SIZE=31.



r~



Re: [PATCH 15/26] hw/intc/arm_gicv3_its: Rename max_l2_entries to num_l2_entries

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

In several places we have a local variable max_l2_entries which is
the number of entries which will fit in a level 2 table.  The
calculations done on this value are correct; rename it to
num_l2_entries to fit the convention we're using in this code.

Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 13/26] hw/intc/arm_gicv3_its: Use FIELD macros for CTEs

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Use FIELD macros to handle CTEs, rather than ad-hoc mask-and-shift.

Signed-off-by: Peter Maydell 
---
  hw/intc/gicv3_internal.h | 3 ++-
  hw/intc/arm_gicv3_its.c  | 7 ---
  2 files changed, 6 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 12/26] hw/intc/arm_gicv3_its: Correct comment about CTE RDBase field size

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The comment says that in our CTE format the RDBase field is 36 bits;
in fact for us it is only 16 bits, because we use the RDBase format
where it specifies a 16-bit CPU number. The code already uses
RDBASE_PROCNUM_LENGTH (16) as the field width, so fix the comment
to match it.

Signed-off-by: Peter Maydell 


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 11/26] hw/intc/arm_gicv3_its: Use 1ULL when shifting by (DTE.SIZE + 1)

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The DTE.SIZE field is 5 bits, which means that DTE.SIZE + 1
might in theory be 32. When calculating 1 << (DTE.SIZE + 1)
use 1ULL to ensure that we don't do this arithmetic at 32 bits
and shift the 1 off the end in this case.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


But then you assign the result to a uint32_t, so the patch is incomplete.


r~



Re: [PATCH 10/26] hw/intc/arm_gicv3_its: Use FIELD macros for DTEs

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

Currently the ITS code that reads and writes DTEs uses open-coded
shift-and-mask to assemble the various fields into the 64-bit DTE
word.  The names of the macros used for mask and shift values are
also somewhat inconsistent, and don't follow our usual convention
that a MASK macro should specify the bits in their place in the word.
Replace all these with use of the FIELD macro.

Signed-off-by: Peter Maydell
---
  hw/intc/gicv3_internal.h |  7 ---
  hw/intc/arm_gicv3_its.c  | 20 +---
  2 files changed, 13 insertions(+), 14 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 09/26] hw/intc/arm_gicv3_its: Correct handling of MAPI

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The MAPI command takes arguments DeviceID, EventID, ICID, and is
defined to be equivalent to MAPTI DeviceID, EventID, EventID, ICID.
(That is, where MAPTI takes an explicit pINTID, MAPI uses the EventID
as the pINTID.)

We didn't quite get this right.  In particular the error checks for
MAPI include "EventID does not specify a valid LPI identifier", which
is the same as MAPTI's error check for the pINTID field.  QEMU's code
skips the pINTID error check entirely in the MAPI case.

We can fix this bug and in the process simplify the code by switching
to the obvious implementation of setting pIntid = eventid early
if ignore_pInt is true.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 18 +++---
  1 file changed, 7 insertions(+), 11 deletions(-)



Reviewed-by: Richard Henderson 


r~



Re: [PATCH 08/26] hw/intc/arm_gicv3_its: Don't misuse GITS_TYPE_PHYSICAL define

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The GITS_TYPE_PHYSICAL define is the value we set the
GITS_TYPER.Physical field to -- this is 1 to indicate that we support
physical LPIs.  (Support for virtual LPIs is the GITS_TYPER.Virtual
field.) We also use this define as the *value* that we write into an
interrupt translation table entry's INTTYPE field, which should be 1
for a physical interrupt and 0 for a virtual interrupt.  Finally, we
use it as a *mask* when we read the interrupt translation table entry
INTTYPE field.

Untangle this confusion: define an ITE_INTTYPE_VIRTUAL and
ITE_INTTYPE_PHYSICAL to be the valid values of the ITE INTTYPE
field, and replace the ad-hoc collection of ITE_ENTRY_* defines with
use of the FIELD() macro to define the fields of an ITE and the
FIELD_EX64() and FIELD_DP64() macros to read and write them.
We use ITE in the new setup, rather than ITE_ENTRY, because
ITE stands for "Interrupt translation entry" and so the extra
"entry" would be redundant.

We take the opportunity to correct the name of the field that holds
the GICv4 'doorbell' interrupt ID (this is always the value 1023 in a
GICv3, which is why we were calling it the 'spurious' field).

The GITS_TYPE_PHYSICAL define is then used in only one place, where
we set the initial GITS_TYPER value.  Since GITS_TYPER.Physical is
essentially a boolean, hiding the '1' value behind a macro is more
confusing than helpful, so expand out the macro there and remove the
define entirely.

Signed-off-by: Peter Maydell 
---
  hw/intc/gicv3_internal.h | 26 ++
  hw/intc/arm_gicv3_its.c  | 30 +-
  2 files changed, 27 insertions(+), 29 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 07/26] hw/intc/arm_gicv3_its: Correct setting of TableDesc entry_sz

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

We set the TableDesc entry_sz field from the appropriate
GITS_BASER.ENTRYSIZE field.  That ID register field specifies the
number of bytes per table entry minus one.  However when we use
td->entry_sz we assume it to be the number of bytes per table entry
(for instance we calculate the number of entries in a page by
dividing the page size by the entry size).

The effects of this bug are:
  * we miscalculate the maximum number of entries in the table,
so our checks on guest index values are wrong (too lax)
  * when looking up an entry in the second level of an indirect
table, we calculate an incorrect index into the L2 table.
Because we make the same incorrect calculation on both
reads and writes of the L2 table, the guest won't notice
unless it's unlucky enough to use an index value that
causes us to index off the end of the L2 table page and
cause guest memory corruption in whatever follows

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 06/26] hw/intc/arm_gicv3_its: Reduce code duplication in extract_table_params()

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The extract_table_params() decodes the fields in the GITS_BASER
registers into TableDesc structs.  Since the fields are the same for
all the GITS_BASER registers, there is currently a lot of code
duplication within the switch (type) statement.  Refactor so that the
cases include only what is genuinely different for each type:
the calculation of the number of bits in the ID value that indexes
into the table.

Signed-off-by: Peter Maydell 
---
  hw/intc/arm_gicv3_its.c | 97 +
  1 file changed, 40 insertions(+), 57 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 05/26] hw/intc/arm_gicv3_its: Don't return early in extract_table_params() loop

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

In extract_table_params() we process each GITS_BASER register.  If
the register's Valid bit is not set, this means there is no
in-guest-memory table and so we should not try to interpret the other
fields in the register.  This was incorrectly coded as a 'return'
rather than a 'break', so instead of looping round to process the
next GITS_BASER we would stop entirely, treating any later tables
as being not valid also.

This has no real guest-visible effects because (since we don't have
GITS_TYPER.HCC != 0) the guest must in any case set up all the
GITS_BASER to point to valid tables, so this only happens in an
odd misbehaving-guest corner case.

Fix the check to 'break', so that we leave the case statement and
loop back around to the next GITS_BASER.

Signed-off-by: Peter Maydell 
---
I suspect this was an accidental result from a refactoring at
some point in the development of the ITS code.


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 04/26] hw/intc/arm_gicv3_its: Remove maxids union from TableDesc

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The TableDesc struct defines properties of the in-guest-memory tables
which the guest tells us about by writing to the GITS_BASER
registers.  This struct currently has a union 'maxids', but all the
fields of the union have the same type (uint32_t) and do the same
thing (record one-greater-than the maximum ID value that can be used
as an index into the table).

We're about to add another table type (the GICv4 vPE table); rather
than adding another specifically-named union field for that table
type with the same type as the other union fields, remove the union
entirely and just have a 'uint32_t max_ids' struct field.

Signed-off-by: Peter Maydell
---
  include/hw/intc/arm_gicv3_its_common.h |  5 +
  hw/intc/arm_gicv3_its.c| 20 ++--
  2 files changed, 11 insertions(+), 14 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 03/26] hw/intc/arm_gicv3_its: Remove redundant ITS_CTLR_ENABLED define

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

We currently define a bitmask for the GITS_CTLR ENABLED bit in
two ways: as ITS_CTLR_ENABLED, and via the FIELD() macro as
R_GITS_CTLR_ENABLED_MASK. Consistently use the FIELD macro version
everywhere and remove the redundant ITS_CTLR_ENABLED define.

Signed-off-by: Peter Maydell
---
  hw/intc/gicv3_internal.h |  2 --
  hw/intc/arm_gicv3_its.c  | 20 ++--
  2 files changed, 10 insertions(+), 12 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 02/26] hw/intc/arm_gicv3_its: Correct off-by-one bounds check on rdbase

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

The checks in the ITS on the rdbase values in guest commands are
off-by-one: they permit the guest to pass us a value equal to
s->gicv3->num_cpu, but the valid values are 0...num_cpu-1.  This
meant the guest could cause us to index off the end of the
s->gicv3->cpu[] array when calling gicv3_redist_process_lpi(), and we
would probably crash.

Cc:qemu-sta...@nongnu.org
Fixes: 17fb5e36aabd4b ("hw/intc: GICv3 redistributor ITS processing")
Signed-off-by: Peter Maydell
---
Not a security bug, because only usable with emulation.
---
  hw/intc/arm_gicv3_its.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 01/26] hw/intc: clean-up error reporting for failed ITS cmd

2021-12-12 Thread Richard Henderson

On 12/11/21 11:11 AM, Peter Maydell wrote:

From: Alex Bennée

While trying to debug a GIC ITS failure I saw some guest errors that
had poor formatting as well as leaving me confused as to what failed.
As most of the checks aren't possible without a valid dte split that
check apart and then check the other conditions in steps. This avoids
us relying on undefined data.

I still get a failure with the current kvm-unit-tests but at least I
know (partially) why now:

   Exception return from AArch64 EL1 to AArch64 EL1 PC 0x40080588
   PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
   ITS: MAPD devid=2 size = 0x8 itt=0x4043 valid=0
   INT dev_id=2 event_id=20
   process_its_cmd: invalid command attributes: invalid dte: 0 for 2 (MEM_TX: 0)
   PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
   SUMMARY: 6 tests, 1 unexpected failures

Signed-off-by: Alex Bennée
Cc: Shashi Mallela
Cc: Peter Maydell
Reviewed-by: Peter Maydell
Signed-off-by: Peter Maydell
---
  hw/intc/arm_gicv3_its.c | 39 +++
  1 file changed, 27 insertions(+), 12 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v10 06/10] ACPI ERST: build the ACPI ERST table

2021-12-12 Thread Ani Sinha
.

On Thu, Dec 9, 2021 at 11:28 PM Eric DeVolder  wrote:
>
> This builds the ACPI ERST table to inform OSPM how to communicate
> with the acpi-erst device.

This patch starts in the middle of pci device code addition, between
erst_reg_ops and erst_post_load. I do not like this :(

>
> Signed-off-by: Eric DeVolder 
> ---
>  hw/acpi/erst.c | 241 
> +
>  1 file changed, 241 insertions(+)
>
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> index 81f5435..753425a 100644
> --- a/hw/acpi/erst.c
> +++ b/hw/acpi/erst.c
> @@ -711,6 +711,247 @@ static const MemoryRegionOps erst_reg_ops = {
>  .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>
> +
> +/***/
> +/***/
> +
> +/* ACPI 4.0: Table 17-19 Serialization Instructions */
> +#define INST_READ_REGISTER 0x00
> +#define INST_READ_REGISTER_VALUE   0x01
> +#define INST_WRITE_REGISTER0x02
> +#define INST_WRITE_REGISTER_VALUE  0x03
> +#define INST_NOOP  0x04
> +#define INST_LOAD_VAR1 0x05
> +#define INST_LOAD_VAR2 0x06
> +#define INST_STORE_VAR10x07
> +#define INST_ADD   0x08
> +#define INST_SUBTRACT  0x09
> +#define INST_ADD_VALUE 0x0A
> +#define INST_SUBTRACT_VALUE0x0B
> +#define INST_STALL 0x0C
> +#define INST_STALL_WHILE_TRUE  0x0D
> +#define INST_SKIP_NEXT_INSTRUCTION_IF_TRUE 0x0E
> +#define INST_GOTO  0x0F
> +#define INST_SET_SRC_ADDRESS_BASE  0x10
> +#define INST_SET_DST_ADDRESS_BASE  0x11
> +#define INST_MOVE_DATA 0x12

I prefer these definitions to come at the top of the file along with
other definitions.

> +
> +/* ACPI 4.0: 17.4.1.2 Serialization Instruction Entries */
> +static void build_serialization_instruction_entry(GArray *table_data,

This function and buiild_erst() can come at the end of erst.c. They go
together and are doing a common but different operation from the
operations of the pci device - building the erst table. Hence, ther
code should be separate from pci device code. A new file would be an
overkill at this state IMHO but in the future if erst table generation
code gains more weight, it can be split into two files.

> +uint8_t serialization_action,
> +uint8_t instruction,
> +uint8_t flags,
> +uint8_t register_bit_width,
> +uint64_t register_address,
> +uint64_t value,
> +uint64_t mask)
> +{
> +/* ACPI 4.0: Table 17-18 Serialization Instruction Entry */
> +struct AcpiGenericAddress gas;
> +
> +/* Serialization Action */
> +build_append_int_noprefix(table_data, serialization_action, 1);
> +/* Instruction */
> +build_append_int_noprefix(table_data, instruction , 1);
> +/* Flags */
> +build_append_int_noprefix(table_data, flags   , 1);
> +/* Reserved */
> +build_append_int_noprefix(table_data, 0   , 1);
> +/* Register Region */
> +gas.space_id = AML_SYSTEM_MEMORY;
> +gas.bit_width = register_bit_width;
> +gas.bit_offset = 0;
> +switch (register_bit_width) {
> +case 8:
> +gas.access_width = 1;
> +break;
> +case 16:
> +gas.access_width = 2;
> +break;
> +case 32:
> +gas.access_width = 3;
> +break;
> +case 64:
> +gas.access_width = 4;
> +break;
> +default:
> +gas.access_width = 0;
> +break;
> +}
> +gas.address = register_address;
> +build_append_gas_from_struct(table_data, &gas);
> +/* Value */
> +build_append_int_noprefix(table_data, value  , 8);
> +/* Mask */
> +build_append_int_noprefix(table_data, mask   , 8);
> +}
> +
> +/* ACPI 4.0: 17.4.1 Serialization Action Table */
> +void build_erst(GArray *table_data, BIOSLinker *linker, Object *erst_dev,
> +const char *oem_id, const char *oem_table_id)
> +{
> +GArray *table_instruction_data;
> +unsigned action;
> +pcibus_t bar0, bar1;
> +AcpiTable table = { .sig = "ERST", .rev = 1, .oem_id = oem_id,
> +.oem_table_id = oem_table_id };
> +
> +bar0 = (pcibus_t)pci_get_bar_addr(PCI_DEVICE(erst_dev), 0);
> +trace_acpi_erst_pci_bar_0(bar0);
> +bar1 = (pcibus_t)pci_get_bar_addr(PCI_DEVICE(erst_dev), 1);
> +trace_acpi_erst_pci_bar_1(bar1);
> +
> +#define MASK8  0x00FFUL
> +#define MASK16 0xUL
> +#define MASK32 0xUL
> +#define MASK64 0xUL
> +
> +/*
> + * Serialization Action Table
> + * The serialization action table must be generated first
> + * so that its size can be known in order to populate the
> + * Instruction Entry Count field.
> + */
>

Re: [PATCH 5/8] standard-headers: Add virtio_video.h

2021-12-12 Thread Michael S. Tsirkin
On Fri, Dec 10, 2021 at 01:09:46PM +, Peter Griffin wrote:
> Hi Michael,
> 
> On Fri, 10 Dec 2021, Michael S. Tsirkin wrote:
> 
> > On Thu, Dec 09, 2021 at 02:55:58PM +, Peter Griffin wrote:
> > > Signed-off-by: Peter Griffin 
> > > ---
> > >  include/standard-headers/linux/virtio_video.h | 483 ++
> > >  1 file changed, 483 insertions(+)
> > >  create mode 100644 include/standard-headers/linux/virtio_video.h
> > 
> > We generally inherit these files from Linux.
> > Was the driver posted for inclusion in Linux?
> 
> Thanks for reviewing. Yes the Linux virtio-video frontend driver was posted
> sometime back on the linux-media ML [1].
> 
> One piece of pushback then was not supporting vicodec/FWHT and also no Qemu
> support [2] which is what this series is trying to address.
> 
> The virtio-video spec however is now at rfc v5. So my rough plan was now I 
> have
> something working with Qemu and vicodec I can move both the frontend driver
> and the vhost-user-video to latest v5 spec.
> 
> I'm a bit unclear what the process is to get the virtio-video spec merged 
> though.
> I think I read somewhere they expect a matching frontend driver 
> implementation?
> 
> Thanks,
> 
> Peter.

No, just that it all looks on track to be merged, and got some acks from
Linux driver maintainers. This is because we don't have experts in
all fields on the TC, so input from linux maintainers is useful.

To get a change into spec the TC needs to vote
on it. The simplest way to do that is described here.

https://github.com/oasis-tcs/virtio-spec/blob/master/README.md#use-of-github-issues


> [1] 
> https://patchwork.kernel.org/project/linux-media/cover/20200218202753.652093-1-dmitry.s...@opensynergy.com/
> [2] https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg02204.html




Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-12 Thread Michael S. Tsirkin
On Sat, Dec 11, 2021 at 03:00:27AM +, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:
> 
> 
> > -Original Message-
> > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > Sent: Thursday, December 9, 2021 5:17 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > 
> > Cc: jasow...@redhat.com; m...@redhat.com; pa...@nvidia.com;
> > xieyon...@bytedance.com; sgarz...@redhat.com; Yechuan ;
> > Gonglei (Arei) ; qemu-devel@nongnu.org
> > Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> > 
> > On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> > > From: Longpeng 
> > >
> > > Hi guys,
> > >
> > > This patch introduces vhost-vdpa-net device, which is inspired
> > > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> > >
> > > I've tested this patch on Huawei's offload card:
> > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > >
> > > For virtio hardware offloading, the most important requirement for us
> > > is to support live migration between offloading cards from different
> > > vendors, the combination of netdev and virtio-net seems too heavy, we
> > > prefer a lightweight way.
> > >
> > > Maybe we could support both in the future ? Such as:
> > >
> > > * Lightweight
> > >  Net: vhost-vdpa-net
> > >  Storage: vhost-vdpa-blk
> > >
> > > * Heavy but more powerful
> > >  Net: netdev + virtio-net + vhost-vdpa
> > >  Storage: bdrv + virtio-blk + vhost-vdpa
> > >
> > > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > 
> > Stefano presented a plan for vdpa-blk at KVM Forum 2021:
> > https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-and-sof
> > tware-offload-for-virtio-blk-stefano-garzarella-red-hat
> > 
> > It's closer to today's virtio-net + vhost-net approach than the
> > vhost-vdpa-blk device you have mentioned. The idea is to treat vDPA as
> > an offload feature rather than a completely separate code path that
> > needs to be maintained and tested. That way QEMU's block layer features
> > and live migration work with vDPA devices and re-use the virtio-blk
> > code. The key functionality that has not been implemented yet is a "fast
> > path" mechanism that allows the QEMU virtio-blk device's virtqueue to be
> > offloaded to vDPA.
> > 
> > The unified vdpa-blk architecture should deliver the same performance
> > as the vhost-vdpa-blk device you mentioned but with more features, so I
> > wonder what aspects of the vhost-vdpa-blk idea are important to you?
> > 
> > QEMU already has vhost-user-blk, which takes a similar approach as the
> > vhost-vdpa-blk device you are proposing. I'm not against the
> > vhost-vdpa-blk approach in priciple, but would like to understand your
> > requirements and see if there is a way to collaborate on one vdpa-blk
> > implementation instead of dividing our efforts between two.
> > 
> 
> We prefer a simple way in the virtio hardware offloading case, it could reduce
> our maintenance workload, we no need to maintain the virtio-net, netdev,
> virtio-blk, bdrv and ... any more. If we need to support other vdpa devices
> (such as virtio-crypto, virtio-fs) in the future, then we also need to 
> maintain
> the corresponding device emulation code?
> 
> For the virtio hardware offloading case, we usually use the vfio-pci 
> framework,
> it saves a lot of our maintenance work in QEMU, we don't need to touch the 
> device
> types. Inspired by Jason, what we really prefer is "vhost-vdpa-pci/mmio", use 
> it to
> instead of the vfio-pci, it could provide the same performance as vfio-pci, 
> but it's
> *possible* to support live migrate between offloading cards from different 
> vendors.

OK, so the features you are dropping would be migration between
a vdpa, vhost and virtio backends. I think given vhost-vdpa-blk is seems
fair enough... What do others think?

> > Stefan