Re: virtio capabilities

2019-12-13 Thread Alexey Kardashevskiy



On 13/12/2019 18:24, Michael S. Tsirkin wrote:
> On Fri, Dec 13, 2019 at 05:05:05PM +1100, Alexey Kardashevskiy wrote:
>> Hi!
>>
>> I am having an issue with capabilities (hopefully the chunk formatting
>> won't break).
>>
>> The problem is that when virtio_pci_find_capability() reads
>> pci_find_capability(dev, PCI_CAP_ID_VNDR), 0 is returned; if repeated,
>> it returns a valid number (0x84). Timing seems to matter. pci_cfg_read
>> trace shows that that first time read does not reach QEMU but others do
>> reach QEMU and return what is expected.
>>
>> How to debug this, any quick ideas?
>> The config space is not a MMIO BAR
>> or KVM memory slot or anything like this, right? :) Thanks,
> 
> Depends on the platform.
> 
> E.g. on x86, when using cf8/cfc pair, if guest doesn't


Is there an easy way to tell if it is this "cf8/cfc" case?

I have these bars, is any of them related to cf8/cfc? Thanks,

root@le-dbg:~# (qemu) info mtree -f
FlatView #0
 AS "memory", root: system
 AS "cpu-memory-0", root: system
 Root memory region: system
  - (prio 0, ram): ppc_spapr.ram kvm
  20008000-2000802f (prio 0, i/o): msix-table
  20008800-20008807 (prio 0, i/o): msix-pba
  21000000-21000fff (prio 0, i/o): virtio-pci-common
  21001000-21001fff (prio 0, i/o): virtio-pci-isr
  21002000-21002fff (prio 0, i/o): virtio-pci-device
  21003000-21003fff (prio 0, i/o): virtio-pci-notify




> have a lock around programming the pair of registers,
> then one access can conflict with another one.
> 
> When using express it's MMIO so shouldn't be a problem.
> 
>>
>> [3.489492] ___K___ (0) virtio_pci_modern_probe 642
>> [3.489697] ___K___ (0) virtio_pci_find_capability 492: FIND a cap
>> [3.490070] ___K___ (0) virtio_pci_find_capability 494: cap is at 0
>> [3.490335] ___K___ (0) virtio_pci_find_capability 492: FIND a cap
>> 10909@1576216763.643271:pci_cfg_read virtio-net-pci 00:0 @0x6 -> 0x10
>> 10909@1576216763.643431:pci_cfg_read virtio-net-pci 00:0 @0x34 -> 0x98
>> 10909@1576216763.643591:pci_cfg_read virtio-net-pci 00:0 @0x98 -> 0x8411
>> 10909@1576216763.643747:pci_cfg_read virtio-net-pci 00:0 @0x84 -> 0x7009
>> [3.491264] ___K___ (0) virtio_pci_find_capability 494: cap is at 132
>> 10909@1576216763.644140:pci_cfg_read virtio-net-pci 00:0 @0x87 -> 0x5
>> 10909@1576216763.644287:pci_cfg_read virtio-net-pci 00:0 @0x88 -> 0x0
>> [3.491803] ___K___ (0) virtio_pci_find_capability 506: 5 0
>> 10909@1576216763.644632:pci_cfg_read virtio-net-pci 00:0 @0x85 -> 0x70
>> 10909@1576216763.644786:pci_cfg_read virtio-net-pci 00:0 @0x70 -> 0x6009
>> 10909@1576216763.644942:pci_cfg_read virtio-net-pci 00:0 @0x73 -> 0x2
>> 10909@1576216763.645092:pci_cfg_read virtio-net-pci 00:0 @0x74 -> 0x4
>> [3.492607] ___K___ (0) virtio_pci_find_capability 506: 2 4
>>
>>
>>
>>
>>
>> diff --git a/drivers/virtio/virtio_pci_modern.c
>> b/drivers/virtio/virtio_pci_modern.c
>> index 7abcc50838b8..85b2a7ce96e9 100644
>> --- a/drivers/virtio/virtio_pci_modern.c
>> +++ b/drivers/virtio/virtio_pci_modern.c
>> @@ -486,9 +486,14 @@ static const struct virtio_config_ops
>> virtio_pci_config_ops = {
>>  static inline int virtio_pci_find_capability(struct pci_dev *dev, u8
>> cfg_type,
>>  u32 ioresource_types, int
>> *bars)
>>  {
>> -   int pos;
>> +   int pos = 0;// = pci_find_capability(dev, PCI_CAP_ID_VNDR);
>>
>> -   for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
>> +   while (!pos) {
>> +   pr_err("___K___ (%u) %s %u: FIND a cap\n",
>> smp_processor_id(), __func__, __LINE__);
>> +   pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
>> +   pr_err("___K___ (%u) %s %u: cap is at %d\n",
>> smp_processor_id(), __func__, __LINE__, pos);
>> +   }
>> +   for (;
>>  pos > 0;
>>  pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
>> u8 type, bar;
>>
>>
>> -- 
>> Alexey
> 

-- 
Alexey
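
For reference, the four config reads in the trace (status at 0x6, capability
pointer at 0x34, then the entries at 0x98 and 0x84) are the usual walk of the
PCI capability list. Below is a minimal sketch of that walk in plain C. It
does not use the kernel's code: struct fake_cfg and the helper names are
hypothetical stand-ins, the byte values are copied from the trace, and the
content at 0x60 is an assumption (the trace does not show it).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCI_STATUS           0x06
#define PCI_STATUS_CAP_LIST  0x10
#define PCI_CAPABILITY_LIST  0x34
#define PCI_CAP_ID_VNDR      0x09

/* Hypothetical stand-in for one device's 256-byte config space. */
struct fake_cfg {
    uint8_t bytes[256];
};

/* Populate the sketch with the values from the pci_cfg_read trace. */
void fake_cfg_init(struct fake_cfg *c)
{
    memset(c, 0, sizeof(*c));
    c->bytes[PCI_STATUS] = PCI_STATUS_CAP_LIST;    /* @0x6  -> 0x10   */
    c->bytes[PCI_CAPABILITY_LIST] = 0x98;          /* @0x34 -> 0x98   */
    c->bytes[0x98] = 0x11; c->bytes[0x99] = 0x84;  /* @0x98 -> 0x8411 */
    c->bytes[0x84] = 0x09; c->bytes[0x85] = 0x70;  /* @0x84 -> 0x7009 */
    c->bytes[0x70] = 0x09; c->bytes[0x71] = 0x60;  /* @0x70 -> 0x6009 */
    /* The trace stops here; 0x60 stays zero in this sketch (assumption). */
}

/* Follow the linked list starting at 'pos'; return 0 if 'id' is absent. */
static uint8_t find_cap_from(const struct fake_cfg *c, uint8_t pos, uint8_t id)
{
    int ttl = 48;                       /* bound the walk against loops */

    while (pos >= 0x40 && ttl-- > 0) {
        pos &= 0xfc;                    /* entries are dword aligned */
        if (c->bytes[pos] == id) {
            return pos;
        }
        pos = c->bytes[pos + 1];        /* next pointer */
    }
    return 0;
}

uint8_t find_capability(const struct fake_cfg *c, uint8_t id)
{
    if (!(c->bytes[PCI_STATUS] & PCI_STATUS_CAP_LIST)) {
        return 0;                       /* no list advertised: give up early */
    }
    return find_cap_from(c, c->bytes[PCI_CAPABILITY_LIST], id);
}

uint8_t find_next_capability(const struct fake_cfg *c, uint8_t pos, uint8_t id)
{
    assert(pos >= 0x40);                /* must come from an earlier lookup */
    return find_cap_from(c, c->bytes[pos + 1], id);
}
```

With the trace values, find_capability(&c, PCI_CAP_ID_VNDR) gives 0x84 (132,
matching "cap is at 132") and the next vendor capability is at 0x70. In this
sketch, a status read that comes back without bit 0x10 makes the lookup
return 0 without touching anything else, which is one way a lost first read
could end up as "cap is at 0".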



Re: virtio capabilities

2019-12-13 Thread Michael S. Tsirkin
On Fri, Dec 13, 2019 at 07:29:40PM +1100, Alexey Kardashevskiy wrote:
> 
> 
> On 13/12/2019 18:24, Michael S. Tsirkin wrote:
> > On Fri, Dec 13, 2019 at 05:05:05PM +1100, Alexey Kardashevskiy wrote:
> >> Hi!
> >>
> >> I am having an issue with capabilities (hopefully the chunk formatting
> >> won't break).
> >>
> >> The problem is that when virtio_pci_find_capability() reads
> >> pci_find_capability(dev, PCI_CAP_ID_VNDR), 0 is returned; if repeated,
> >> it returns a valid number (0x84). Timing seems to matter. pci_cfg_read
> >> trace shows that that first time read does not reach QEMU but others do
> >> reach QEMU and return what is expected.
> >>
> >> How to debug this, any quick ideas?
> >> The config space is not a MMIO BAR
> >> or KVM memory slot or anything like this, right? :) Thanks,
> > 
> > Depends on the platform.
> > 
> > E.g. on x86, when using cf8/cfc pair, if guest doesn't
> 
> 
> Is there an easy way to tell if it is this "cf8/cfc" case?
> 
> I have these bars, is any of them related to cf8/cfc? Thanks,
> 
> root@le-dbg:~# (qemu) info mtree -f
> FlatView #0
>  AS "memory", root: system
>  AS "cpu-memory-0", root: system
>  Root memory region: system
>   - (prio 0, ram): ppc_spapr.ram kvm
>   20008000-2000802f (prio 0, i/o): msix-table
>   20008800-20008807 (prio 0, i/o): msix-pba
>   21000000-21000fff (prio 0, i/o): virtio-pci-common
>   21001000-21001fff (prio 0, i/o): virtio-pci-isr
>   21002000-21002fff (prio 0, i/o): virtio-pci-device
>   21003000-21003fff (prio 0, i/o): virtio-pci-notify
> 


No, you want stuff in hw/ppc/spapr_pci.c

> 
> 
> > have a lock around programming the pair of registers,
> > then one access can conflict with another one.
> > 
> > When using express it's MMIO so shouldn't be a problem.
> > 
> >>
> >> [3.489492] ___K___ (0) virtio_pci_modern_probe 642
> >> [3.489697] ___K___ (0) virtio_pci_find_capability 492: FIND a cap
> >> [3.490070] ___K___ (0) virtio_pci_find_capability 494: cap is at 0
> >> [3.490335] ___K___ (0) virtio_pci_find_capability 492: FIND a cap
> >> 10909@1576216763.643271:pci_cfg_read virtio-net-pci 00:0 @0x6 -> 0x10
> >> 10909@1576216763.643431:pci_cfg_read virtio-net-pci 00:0 @0x34 -> 0x98
> >> 10909@1576216763.643591:pci_cfg_read virtio-net-pci 00:0 @0x98 -> 0x8411
> >> 10909@1576216763.643747:pci_cfg_read virtio-net-pci 00:0 @0x84 -> 0x7009
> >> [3.491264] ___K___ (0) virtio_pci_find_capability 494: cap is at 132
> >> 10909@1576216763.644140:pci_cfg_read virtio-net-pci 00:0 @0x87 -> 0x5
> >> 10909@1576216763.644287:pci_cfg_read virtio-net-pci 00:0 @0x88 -> 0x0
> >> [3.491803] ___K___ (0) virtio_pci_find_capability 506: 5 0
> >> 10909@1576216763.644632:pci_cfg_read virtio-net-pci 00:0 @0x85 -> 0x70
> >> 10909@1576216763.644786:pci_cfg_read virtio-net-pci 00:0 @0x70 -> 0x6009
> >> 10909@1576216763.644942:pci_cfg_read virtio-net-pci 00:0 @0x73 -> 0x2
> >> 10909@1576216763.645092:pci_cfg_read virtio-net-pci 00:0 @0x74 -> 0x4
> >> [3.492607] ___K___ (0) virtio_pci_find_capability 506: 2 4
> >>
> >>
> >>
> >>
> >>
> >> diff --git a/drivers/virtio/virtio_pci_modern.c
> >> b/drivers/virtio/virtio_pci_modern.c
> >> index 7abcc50838b8..85b2a7ce96e9 100644
> >> --- a/drivers/virtio/virtio_pci_modern.c
> >> +++ b/drivers/virtio/virtio_pci_modern.c
> >> @@ -486,9 +486,14 @@ static const struct virtio_config_ops
> >> virtio_pci_config_ops = {
> >>  static inline int virtio_pci_find_capability(struct pci_dev *dev, u8
> >> cfg_type,
> >>  u32 ioresource_types, int
> >> *bars)
> >>  {
> >> -   int pos;
> >> +   int pos = 0;// = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> >>
> >> -   for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> >> +   while (!pos) {
> >> +   pr_err("___K___ (%u) %s %u: FIND a cap\n",
> >> smp_processor_id(), __func__, __LINE__);
> >> +   pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> >> +   pr_err("___K___ (%u) %s %u: cap is at %d\n",
> >> smp_processor_id(), __func__, __LINE__, pos);
> >> +   }
> >> +   for (;
> >>  pos > 0;
> >>  pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> >> u8 type, bar;
> >>
> >>
> >> -- 
> >> Alexey
> > 
> 
> -- 
> Alexey
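
To make the cf8/cfc hazard mentioned above concrete: with an address/data
register pair, a config read is two steps (select the offset, then read the
data), and a second accessor that slips in between the two steps changes what
the first one reads. The sketch below models only that x86-style pair with
hypothetical names (struct cfg_port and friends); it is deterministic, it does
not use real port I/O, and it says nothing about how sPAPR performs config
accesses.

```c
#include <stdint.h>

/* Hypothetical model of an address/data register pair (cf8/cfc style). */
struct cfg_port {
    uint8_t  addr;        /* last offset written to the "address" port */
    uint16_t regs[256];   /* backing config space, one value per offset */
};

static void port_set_addr(struct cfg_port *p, uint8_t off)
{
    p->addr = off;
}

static uint16_t port_read_data(const struct cfg_port *p)
{
    return p->regs[p->addr];
}

/* Serialized access: select and read with nothing in between. */
uint16_t cfg_read_serialized(struct cfg_port *p, uint8_t off)
{
    port_set_addr(p, off);
    return port_read_data(p);
}

/* Unlocked access by reader A, with reader B's select landing in between. */
uint16_t cfg_read_raced(struct cfg_port *p, uint8_t off_a, uint8_t off_b)
{
    port_set_addr(p, off_a);    /* A selects its register          */
    port_set_addr(p, off_b);    /* B slips in and selects another  */
    return port_read_data(p);   /* A reads, but gets B's register  */
}
```

If regs[0x34] holds 0x98 and regs[0x06] holds 0x10, the serialized read of
0x34 returns 0x98, while the raced read of 0x34 returns 0x10. A lock held
across both steps rules out the second interleaving.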




Re: [PATCH] mos6522: remove anh register

2019-12-13 Thread Laurent Vivier
On 13/12/2019 at 02:44, David Gibson wrote:
> On Thu, Dec 12, 2019 at 08:43:59PM +0100, Laurent Vivier wrote:
>> Register addr 1 is defined as buffer A with handshake (vBufAH),
>> register addr 15 is also defined as buffer A without handshake (vBufA).
>>
>> Linux kernel has a big "DON'T USE!" next to the register 1 addr
>> definition (vBufAH), and only uses register 15 (vBufA).
>>
>> So remove the definition of 'anh' and use only 'a' (with VIA_REG_ANH and
>> VIA_REG_A).
> 
> I'm not actually following the rationale for removing the register.
> Linux doesn't use it, but if it's part of the real hardware model we
> should keep it, no?

It's actually two methods to access the same register (with handshake,
without handshake).

In the datasheet, register 15 is described as "Same as register 1 except
no handshake".

Thanks,
Laurent
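
A minimal sketch of what "two methods to access the same register" means for
a device model that does not implement the handshake. The register numbers
(1 and 15) come from the commit message; struct via_sketch and the function
names are hypothetical and much smaller than the real MOS6522State.

```c
#include <stdint.h>

#define VIA_REG_A    0x01   /* buffer A, with handshake    */
#define VIA_REG_ANH  0x0f   /* buffer A, without handshake */

/* Hypothetical, reduced device state: port A data and its direction bits. */
struct via_sketch {
    uint8_t a;
    uint8_t dira;
};

uint8_t via_read(const struct via_sketch *s, unsigned addr)
{
    switch (addr) {
    case VIA_REG_A:         /* handshake is not modelled, so both   */
    case VIA_REG_ANH:       /* addresses return the same port value */
        return s->a;
    default:
        return 0;           /* other registers omitted in this sketch */
    }
}

void via_write(struct via_sketch *s, unsigned addr, uint8_t val)
{
    switch (addr) {
    case VIA_REG_A:
    case VIA_REG_ANH:
        /* Only bits configured as outputs in dira are updated. */
        s->a = (uint8_t)((s->a & ~s->dira) | (val & s->dira));
        break;
    default:
        break;              /* other registers omitted in this sketch */
    }
}
```

A value written through address 15 reads back through address 1 and the other
way round, so no separate 'anh' storage is needed in this model.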





Re: [PATCH] linux-user:Fix align mistake when mmap guest space

2019-12-13 Thread Laurent Vivier
On 13/12/2019 at 03:29, Xinyu Li wrote:
> In init_guest_space, we need to mmap the guest space. If the address
> returned by the first mmap is not aligned to align, which is set to
> MAX(SHMLBA, qemu_host_page_size), we need to unmap it and mmap again
> with a larger size. The new size is named real_size, which is
> aligned_size + qemu_host_page_size, where aligned_size is the guest
> space size. The extra qemu_host_page_size is meant to avoid a memory
> error when we align real_start manually (ROUND_UP(real_start, align)).
> But when SHMLBA > qemu_host_page_size, the added size is smaller than
> the amount needed for alignment, which can cause an error (it shows up
> on a MIPS machine). Changing real_size from
> aligned_size + qemu_host_page_size to aligned_size + align solves it.
> 
> Signed-off-by: Xinyu Li 
> ---
>  linux-user/elfload.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index f6693e5760..312ded0779 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -2189,7 +2189,7 @@ unsigned long init_guest_space(unsigned long host_start,
>   * to where we need to put the commpage.
>   */
>  munmap((void *)real_start, host_size);
> -real_size = aligned_size + qemu_host_page_size;
> +real_size = aligned_size + align;
>  real_start = (unsigned long)
>  mmap((void *)real_start, real_size, PROT_NONE, flags, -1, 0);
>  if (real_start == (unsigned long)-1) {
> 

Your change seems correct to me.

Richard, did you miss this in your patch
30ab9ef2967d ("linux-user: Fix shmat emulation by honoring host SHMLBA")
or was keeping it intentional?

Thanks,
Laurent
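
The arithmetic behind the fix can be checked without calling mmap at all.
The sketch below only tests whether the aligned region still fits inside the
reservation; aligned_region_fits is a hypothetical helper, and the numbers in
the note after it (4 KiB pages, a 256 KiB alignment, a 1 MiB guest space) are
illustrative assumptions, not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define ROUND_UP(n, d) (((n) + (d) - 1) / (d) * (d))

/*
 * Returns 1 if [ROUND_UP(real_start, align), + aligned_size) lies entirely
 * inside the reservation [real_start, real_start + real_size).
 */
int aligned_region_fits(uint64_t real_start, uint64_t real_size,
                        uint64_t aligned_size, uint64_t align)
{
    uint64_t aligned_start;

    assert(align != 0);
    aligned_start = ROUND_UP(real_start, align);
    return aligned_start + aligned_size <= real_start + real_size;
}
```

Take real_start = 0x7f0000001000, page size 0x1000, align = 0x40000 and
aligned_size = 0x100000. Rounding up moves the start by 0x3f000, so a
reservation of aligned_size + page size is too short, while
aligned_size + align is enough. Since real_start is page aligned, the shift is
at most align minus one page, so adding align always leaves room.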



Re: [PATCH 2/2] numa: properly check if numa is supported

2019-12-13 Thread Igor Mammedov
On Fri, 13 Dec 2019 09:33:10 +0800
Tao Xu  wrote:

> On 12/12/2019 8:48 PM, Igor Mammedov wrote:
> > Commit aa57020774b, by mistake used MachineClass::numa_mem_supported
> > to check if NUMA is supported by machine and also as unrelated change
> > set it to true for sbsa-ref board.
> > 
> > Luckily change didn't break machines that support NUMA, as the field
> > is set to true for them.
> > 
> > But the field is not intended for checking if NUMA is supported and
> > will be flipped to false within this release for new machine types.
> > 
> > Fix it:
> >   - by using previously used condition
> >!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id
> > the first time and then use MachineState::numa_state down the road
> > to check if NUMA is supported
> >   - dropping stray sbsa-ref chunk
> > 
> > Fixes: aa57020774b690a22be72453b8e91c9b5a68c516
> > Signed-off-by: Igor Mammedov 
> > ---
> > CC: Radoslaw Biernacki 
> > CC: Peter Maydell 
> > CC: Leif Lindholm 
> > CC: qemu-...@nongnu.org
> > CC: qemu-sta...@nongnu.org
> > 
> > 
> >   hw/arm/sbsa-ref.c | 1 -
> >   hw/core/machine.c | 4 ++--
> >   2 files changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
> > index 27046cc..c6261d4 100644
> > --- a/hw/arm/sbsa-ref.c
> > +++ b/hw/arm/sbsa-ref.c
> > @@ -791,7 +791,6 @@ static void sbsa_ref_class_init(ObjectClass *oc, void 
> > *data)
> >   mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
> >   mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
> >   mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
> > -mc->numa_mem_supported = true;
> >   }
> >   
> >   static const TypeInfo sbsa_ref_info = {
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 1689ad3..aa63231 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -958,7 +958,7 @@ static void machine_initfn(Object *obj)
> >   NULL);
> >   }
> >   
> > -if (mc->numa_mem_supported) {
> > +if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
> >   ms->numa_state = g_new0(NumaState, 1);
> >   }  
> 
> I am wondering whether @numa_mem_supported is now unused in QEMU,
> since its only use was to initialize @numa_state. Or is there
> another use? If not, should it be removed from struct MachineClass?
You are wrong, it's not intended for numa_state initialization,
read doc comment for it in include/hw/boards.h
(for full story look at commit cd5ff8333a3)
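
To spell out the distinction, here is a small sketch of the check the patch
restores: NUMA support is inferred from the presence of the two CPU topology
callbacks, and numa_mem_supported plays no part in it. The struct and the
callback signatures are simplified, hypothetical stand-ins, not the real
MachineClass from include/hw/boards.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for MachineClass; callback signatures are invented. */
struct machine_class_sketch {
    int (*cpu_index_to_instance_props)(int cpu_index);
    int (*get_default_cpu_node_id)(int idx);
    bool numa_mem_supported;    /* separate knob, not the support check */
};

/* Example callbacks a NUMA-capable board would provide. */
int example_cpu_index_to_props(int cpu_index) { return cpu_index; }
int example_default_node_id(int idx) { return idx % 2; }

/* Same condition as the restored check in machine_initfn(). */
bool machine_supports_numa(const struct machine_class_sketch *mc)
{
    return mc->cpu_index_to_instance_props != NULL &&
           mc->get_default_cpu_node_id != NULL;
}
```

A board with both callbacks set counts as NUMA capable even when
numa_mem_supported is false, and a board without them does not, whatever the
flag says.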




Re: [PATCH v26 20/21] Add rx-softmmu

2019-12-13 Thread Philippe Mathieu-Daudé

On 10/14/19 1:57 PM, Yoshinori Sato wrote:

Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yoshinori Sato 
Message-Id: <20190607091116.49044-17-ys...@users.sourceforge.jp>
Signed-off-by: Richard Henderson 
pick ed65c02993 target/rx: Add RX to SysEmuTarget
pick 01372568ae tests: Add rx to machine-none-test.c
[PMD: Squashed patches from Richard Henderson modifying
   qapi/common.json and tests/machine-none-test.c]
Signed-off-by: Philippe Mathieu-Daudé 
---
  configure  | 8 
  default-configs/rx-softmmu.mak | 3 +++
  qapi/machine.json  | 3 ++-
  include/exec/poison.h  | 1 +
  include/sysemu/arch_init.h | 1 +
  arch_init.c| 2 ++
  tests/machine-none-test.c  | 1 +
  hw/Kconfig | 1 +
  8 files changed, 19 insertions(+), 1 deletion(-)
  create mode 100644 default-configs/rx-softmmu.mak

diff --git a/configure b/configure
index 08ca4bcb46..fa5d4274b6 100755
--- a/configure
+++ b/configure
@@ -7521,6 +7521,11 @@ case "$target_name" in
  mttcg=yes
  gdb_xml_files="riscv-64bit-cpu.xml riscv-64bit-fpu.xml 
riscv-64bit-csr.xml"
;;
+  rx)
+TARGET_ARCH=rx
+bflt="yes"
+target_compiler=$cross_cc_rx
+  ;;
sh4|sh4eb)
  TARGET_ARCH=sh4
  bflt="yes"
@@ -7702,6 +7707,9 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
riscv*)
  disas_config "RISCV"
;;
+  rx)
+disas_config "RX"
+  ;;
s390*)
  disas_config "S390"
;;
diff --git a/default-configs/rx-softmmu.mak b/default-configs/rx-softmmu.mak
new file mode 100644
index 0000000000..a3eecefb11
--- /dev/null
+++ b/default-configs/rx-softmmu.mak
@@ -0,0 +1,3 @@
+# Default configuration for rx-softmmu
+
+CONFIG_RX_VIRT=y
diff --git a/qapi/machine.json b/qapi/machine.json
index ca26779f1a..4409c113c2 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -21,6 +21,7 @@
  #is true even for "qemu-system-x86_64".
  #
  # ppcemb: dropped in 3.1
+# rx: added in 4.2


Richard, if you take this series, do you mind changing 4.2 -> 5.0?


  #
  # Since: 3.0
  ##
@@ -28,7 +29,7 @@
'data' : [ 'aarch64', 'alpha', 'arm', 'cris', 'hppa', 'i386', 'lm32',
   'm68k', 'microblaze', 'microblazeel', 'mips', 'mips64',
   'mips64el', 'mipsel', 'moxie', 'nios2', 'or1k', 'ppc',
- 'ppc64', 'riscv32', 'riscv64', 's390x', 'sh4',
+ 'ppc64', 'riscv32', 'riscv64', 'rx', 's390x', 'sh4',
   'sh4eb', 'sparc', 'sparc64', 'tricore', 'unicore32',
   'x86_64', 'xtensa', 'xtensaeb' ] }
  
diff --git a/include/exec/poison.h b/include/exec/poison.h

index 955eb863ab..7b9ac361dc 100644
--- a/include/exec/poison.h
+++ b/include/exec/poison.h
@@ -26,6 +26,7 @@
  #pragma GCC poison TARGET_PPC
  #pragma GCC poison TARGET_PPC64
  #pragma GCC poison TARGET_ABI32
+#pragma GCC poison TARGET_RX
  #pragma GCC poison TARGET_S390X
  #pragma GCC poison TARGET_SH4
  #pragma GCC poison TARGET_SPARC
diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h
index 62c6fe4cf1..6c011acc52 100644
--- a/include/sysemu/arch_init.h
+++ b/include/sysemu/arch_init.h
@@ -24,6 +24,7 @@ enum {
  QEMU_ARCH_NIOS2 = (1 << 17),
  QEMU_ARCH_HPPA = (1 << 18),
  QEMU_ARCH_RISCV = (1 << 19),
+QEMU_ARCH_RX = (1 << 20),
  };
  
  extern const uint32_t arch_type;

diff --git a/arch_init.c b/arch_init.c
index 0a1531124c..7a37fb2c34 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -73,6 +73,8 @@ int graphic_depth = 32;
  #define QEMU_ARCH QEMU_ARCH_PPC
  #elif defined(TARGET_RISCV)
  #define QEMU_ARCH QEMU_ARCH_RISCV
+#elif defined(TARGET_RX)
+#define QEMU_ARCH QEMU_ARCH_RX
  #elif defined(TARGET_S390X)
  #define QEMU_ARCH QEMU_ARCH_S390X
  #elif defined(TARGET_SH4)
diff --git a/tests/machine-none-test.c b/tests/machine-none-test.c
index 5953d31755..8bb54a6360 100644
--- a/tests/machine-none-test.c
+++ b/tests/machine-none-test.c
@@ -56,6 +56,7 @@ static struct arch2cpu cpus_map[] = {
  { "hppa", "hppa" },
  { "riscv64", "rv64gcsu-v1.10.0" },
  { "riscv32", "rv32gcsu-v1.9.1" },
+{ "rx", "rx62n" },
  };
  
  static const char *get_cpu_model_by_arch(const char *arch)

diff --git a/hw/Kconfig b/hw/Kconfig
index b45db3c813..77bbc59cc7 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -54,6 +54,7 @@ source nios2/Kconfig
  source openrisc/Kconfig
  source ppc/Kconfig
  source riscv/Kconfig
+source rx/Kconfig
  source s390x/Kconfig
  source sh4/Kconfig
  source sparc/Kconfig






Re: [PATCH] mos6522: remove anh register

2019-12-13 Thread Philippe Mathieu-Daudé

On 12/12/19 8:43 PM, Laurent Vivier wrote:

Register addr 1 is defined as buffer A with handshake (vBufAH),
register addr 15 is also defined as buffer A without handshake (vBufA).


Maybe add "IOW both addresses access the same register."



Linux kernel has a big "DON'T USE!" next to the register 1 addr
definition (vBufAH), and only uses register 15 (vBufA).


I agree with David, the Linux reference is confusing. I'd omit it.



So remove the definition of 'anh' and use only 'a' (with VIA_REG_ANH and
VIA_REG_A).

Signed-off-by: Laurent Vivier 
---
  hw/misc/mos6522.c | 12 
  include/hw/misc/mos6522.h |  1 -
  2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/hw/misc/mos6522.c b/hw/misc/mos6522.c
index cecf0be59e..86ede4005c 100644
--- a/hw/misc/mos6522.c
+++ b/hw/misc/mos6522.c
@@ -244,6 +244,7 @@ uint64_t mos6522_read(void *opaque, hwaddr addr, unsigned 
size)
  val = s->b;
  break;
  case VIA_REG_A:


Maybe add:

   /* As we do not model handshake, fall through no handshake. */


+case VIA_REG_ANH:
  val = s->a;
  break;
  case VIA_REG_DIRB:
@@ -297,9 +298,7 @@ uint64_t mos6522_read(void *opaque, hwaddr addr, unsigned 
size)
  val = s->ier | 0x80;
  break;
  default:
-case VIA_REG_ANH:
-val = s->anh;


Oops, default was buggy.

Maybe worth:

Fixes: 51f233ec92c


-break;
+g_assert_not_reached();
  }
  
  if (addr != VIA_REG_IFR || val != 0) {

@@ -322,6 +321,7 @@ void mos6522_write(void *opaque, hwaddr addr, uint64_t val, 
unsigned size)
  mdc->portB_write(s);
  break;
  case VIA_REG_A:


   /* As we do not model handshake, fall through no handshake. */


+case VIA_REG_ANH:
  s->a = (s->a & ~s->dira) | (val & s->dira);
  mdc->portA_write(s);
  break;
@@ -395,9 +395,7 @@ void mos6522_write(void *opaque, hwaddr addr, uint64_t val, 
unsigned size)
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL));
  break;
  default:
-case VIA_REG_ANH:
-s->anh = val;


Buggy again.


-break;
+g_assert_not_reached();
  }
  }
  
@@ -439,7 +437,6 @@ const VMStateDescription vmstate_mos6522 = {

  VMSTATE_UINT8(pcr, MOS6522State),
  VMSTATE_UINT8(ifr, MOS6522State),
  VMSTATE_UINT8(ier, MOS6522State),
-VMSTATE_UINT8(anh, MOS6522State),


Don't you need to increase .version_id?


  VMSTATE_STRUCT_ARRAY(timers, MOS6522State, 2, 0,
   vmstate_mos6522_timer, MOS6522Timer),
  VMSTATE_END_OF_LIST()
@@ -460,7 +457,6 @@ static void mos6522_reset(DeviceState *dev)
  s->ifr = 0;
  s->ier = 0;
  /* s->ier = T1_INT | SR_INT; */
-s->anh = 0;
  
  s->timers[0].frequency = s->frequency;

  s->timers[0].latch = 0xffff;
diff --git a/include/hw/misc/mos6522.h b/include/hw/misc/mos6522.h
index 493c907537..97384c6e02 100644
--- a/include/hw/misc/mos6522.h
+++ b/include/hw/misc/mos6522.h
@@ -115,7 +115,6 @@ typedef struct MOS6522State {
  uint8_t pcr;
  uint8_t ifr;
  uint8_t ier;
-uint8_t anh;
  
  MOS6522Timer timers[2];

  uint64_t frequency;






Re: [PATCH RESEND v20 2/8] numa: Extend CLI to provide memory latency and bandwidth information

2019-12-13 Thread Igor Mammedov
On Fri, 13 Dec 2019 09:19:23 +0800
Tao Xu  wrote:

> From: Liu Jingqi 
> 
> Add -numa hmat-lb option to provide System Locality Latency and
> Bandwidth Information. These memory attributes help to build
> System Locality Latency and Bandwidth Information Structure(s)
> in ACPI Heterogeneous Memory Attribute Table (HMAT). Before using
> hmat-lb option, enable HMAT with -machine hmat=on.
> 
> Acked-by: Markus Armbruster 
> Signed-off-by: Liu Jingqi 
> Signed-off-by: Tao Xu 

Reviewed-by: Igor Mammedov 

> ---
> 
> Changes in v20:
> - Update the QAPI description (Markus)
> - Keep base and bitmap unchanged when latency or bandwidth
>   out of range
> 
> Changes in v19:
> - Add description about the machine property 'hmat' in commit
>   message (Markus)
> 
> Changes in v18:
> - Use qapi type uint64 and only nanosecond for latency (Markus)
> 
> Changes in v17:
> - Add check when user input latency or bandwidth 0, the
>   lb_info_provided should also be 0. Because in ACPI 6.3 5.2.27.4,
>   0 means the corresponding latency or bandwidth information is
>   not provided.
> - Fix the infinite loop when node->latency is 0.
> ---
>  hw/core/numa.c| 194 ++
>  include/sysemu/numa.h |  53 
>  qapi/machine.json |  93 +++-
>  qemu-options.hx   |  47 +-
>  4 files changed, 384 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index e60da99293..34eb413f5d 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -23,6 +23,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include "sysemu/hostmem.h"
>  #include "sysemu/numa.h"
>  #include "sysemu/sysemu.h"
> @@ -198,6 +199,186 @@ void parse_numa_distance(MachineState *ms, 
> NumaDistOptions *dist, Error **errp)
>  ms->numa_state->have_numa_distance = true;
>  }
>  
> +void parse_numa_hmat_lb(NumaState *numa_state, NumaHmatLBOptions *node,
> +Error **errp)
> +{
> +int i, first_bit, last_bit;
> +uint64_t max_entry, temp_base, bitmap_copy;
> +NodeInfo *numa_info = numa_state->nodes;
> +HMAT_LB_Info *hmat_lb =
> +numa_state->hmat_lb[node->hierarchy][node->data_type];
> +HMAT_LB_Data lb_data = {};
> +HMAT_LB_Data *lb_temp;
> +
> +/* Error checking */
> +if (node->initiator > numa_state->num_nodes) {
> +error_setg(errp, "Invalid initiator=%d, it should be less than %d",
> +   node->initiator, numa_state->num_nodes);
> +return;
> +}
> +if (node->target > numa_state->num_nodes) {
> +error_setg(errp, "Invalid target=%d, it should be less than %d",
> +   node->target, numa_state->num_nodes);
> +return;
> +}
> +if (!numa_info[node->initiator].has_cpu) {
> +error_setg(errp, "Invalid initiator=%d, it isn't an "
> +   "initiator proximity domain", node->initiator);
> +return;
> +}
> +if (!numa_info[node->target].present) {
> +error_setg(errp, "The target=%d should point to an existing node",
> +   node->target);
> +return;
> +}
> +
> +if (!hmat_lb) {
> +hmat_lb = g_malloc0(sizeof(*hmat_lb));
> +numa_state->hmat_lb[node->hierarchy][node->data_type] = hmat_lb;
> +hmat_lb->list = g_array_new(false, true, sizeof(HMAT_LB_Data));
> +}
> +hmat_lb->hierarchy = node->hierarchy;
> +hmat_lb->data_type = node->data_type;
> +lb_data.initiator = node->initiator;
> +lb_data.target = node->target;
> +
> +if (node->data_type <= HMATLB_DATA_TYPE_WRITE_LATENCY) {
> +/* Input latency data */
> +
> +if (!node->has_latency) {
> +error_setg(errp, "Missing 'latency' option");
> +return;
> +}
> +if (node->has_bandwidth) {
> +error_setg(errp, "Invalid option 'bandwidth' since "
> +   "the data type is latency");
> +return;
> +}
> +
> +/* Detect duplicate configuration */
> +for (i = 0; i < hmat_lb->list->len; i++) {
> +lb_temp = &g_array_index(hmat_lb->list, HMAT_LB_Data, i);
> +
> +if (node->initiator == lb_temp->initiator &&
> +node->target == lb_temp->target) {
> +error_setg(errp, "Duplicate configuration of the latency for 
> "
> +"initiator=%d and target=%d", node->initiator,
> +node->target);
> +return;
> +}
> +}
> +
> +hmat_lb->base = hmat_lb->base ? hmat_lb->base : UINT64_MAX;
> +
> +if (node->latency) {
> +/* Calculate the temporary base and compressed latency */
> +max_entry = node->latency;
> +temp_base = 1;
> +while (QEMU_IS_ALIGNED(max_entry, 10)) {
> +max_entry /= 10;
> +temp_base *= 10;
> +
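
The loop that is visible before the hunk is cut off factors powers of ten out
of the latency value. A standalone sketch of just that step follows; struct
compressed and factor_out_tens are hypothetical names, and what the patch does
with the result afterwards is not shown in the quoted text, so it is not
reproduced here.

```c
#include <stdint.h>

/* Hypothetical result type: latency == entry * base. */
struct compressed {
    uint64_t entry;
    uint64_t base;
};

/*
 * Divide by 10 for as long as the value is a multiple of 10, and collect
 * the factors in 'base'. A zero input is returned unchanged; the patch
 * guards the loop with "if (node->latency)" for the same reason, since
 * 0 is always a multiple of 10 and the loop would never end.
 */
struct compressed factor_out_tens(uint64_t latency)
{
    struct compressed c = { latency, 1 };

    while (c.entry != 0 && c.entry % 10 == 0) {
        c.entry /= 10;
        c.base *= 10;
    }
    return c;
}
```

For example, a latency of 5000 becomes entry 5 with base 1000, while 123
stays at entry 123 with base 1.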

Re: [PATCH RESEND v20 3/8] numa: Extend CLI to provide memory side cache information

2019-12-13 Thread Igor Mammedov
On Fri, 13 Dec 2019 09:19:24 +0800
Tao Xu  wrote:

> From: Liu Jingqi 
> 
> Add -numa hmat-cache option to provide Memory Side Cache Information.
> These memory attributes help to build Memory Side Cache Information
> Structure(s) in ACPI Heterogeneous Memory Attribute Table (HMAT).
> Before using hmat-cache option, enable HMAT with -machine hmat=on.
> 
> Acked-by: Markus Armbruster 
> Signed-off-by: Liu Jingqi 
> Signed-off-by: Tao Xu 

Reviewed-by: Igor Mammedov 


> ---
> 
> Changes in v20:
> - Disable cache level 0 in hmat-cache option (Igor)
> - Update the QAPI description (Markus)
> 
> Changes in v19:
> - Add description about the machine property 'hmat' in commit
>   message (Markus)
> - Update the QAPI comments
> - Add a check for no memory side cache
> 
> Changes in v18:
> - Update the error message (Igor)
> 
> Changes in v17:
> - Use NumaHmatCacheOptions to replace HMAT_Cache_Info (Igor)
> - Add check for unordered cache level input (Igor)
> 
> Changes in v16:
> - Add cross check with hmat_lb data (Igor)
> - Drop total_levels in struct HMAT_Cache_Info (Igor)
> - Correct the error table number (Igor)
> ---
>  hw/core/numa.c| 80 ++
>  include/sysemu/numa.h |  5 +++
>  qapi/machine.json | 81 +--
>  qemu-options.hx   | 17 +++--
>  4 files changed, 179 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index 34eb413f5d..33fda31a4c 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -379,6 +379,73 @@ void parse_numa_hmat_lb(NumaState *numa_state, 
> NumaHmatLBOptions *node,
>  g_array_append_val(hmat_lb->list, lb_data);
>  }
>  
> +void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
> +   Error **errp)
> +{
> +int nb_numa_nodes = ms->numa_state->num_nodes;
> +NodeInfo *numa_info = ms->numa_state->nodes;
> +NumaHmatCacheOptions *hmat_cache = NULL;
> +
> +if (node->node_id >= nb_numa_nodes) {
> +error_setg(errp, "Invalid node-id=%" PRIu32 ", it should be less "
> +   "than %d", node->node_id, nb_numa_nodes);
> +return;
> +}
> +
> +if (numa_info[node->node_id].lb_info_provided != (BIT(0) | BIT(1))) {
> +error_setg(errp, "The latency and bandwidth information of "
> +   "node-id=%" PRIu32 " should be provided before memory 
> side "
> +   "cache attributes", node->node_id);
> +return;
> +}
> +
> +if (node->level < 1 || node->level >= HMAT_LB_LEVELS) {
> +error_setg(errp, "Invalid level=%" PRIu8 ", it should be larger than 
> 0 "
> +   "and less than or equal to %d", node->level,
> +   HMAT_LB_LEVELS - 1);
> +return;
> +}
> +
> +assert(node->associativity < HMAT_CACHE_ASSOCIATIVITY__MAX);
> +assert(node->policy < HMAT_CACHE_WRITE_POLICY__MAX);
> +if (ms->numa_state->hmat_cache[node->node_id][node->level]) {
> +error_setg(errp, "Duplicate configuration of the side cache for "
> +   "node-id=%" PRIu32 " and level=%" PRIu8,
> +   node->node_id, node->level);
> +return;
> +}
> +
> +if ((node->level > 1) &&
> +ms->numa_state->hmat_cache[node->node_id][node->level - 1] &&
> +(node->size >=
> +ms->numa_state->hmat_cache[node->node_id][node->level - 
> 1]->size)) {
> +error_setg(errp, "Invalid size=%" PRIu64 ", the size of level=%" 
> PRIu8
> +   " should be less than the size(%" PRIu64 ") of "
> +   "level=%" PRIu8, node->size, node->level,
> +   ms->numa_state->hmat_cache[node->node_id]
> + [node->level - 1]->size,
> +   node->level - 1);
> +return;
> +}
> +
> +if ((node->level < HMAT_LB_LEVELS - 1) &&
> +ms->numa_state->hmat_cache[node->node_id][node->level + 1] &&
> +(node->size <=
> +ms->numa_state->hmat_cache[node->node_id][node->level + 
> 1]->size)) {
> +error_setg(errp, "Invalid size=%" PRIu64 ", the size of level=%" 
> PRIu8
> +   " should be larger than the size(%" PRIu64 ") of "
> +   "level=%" PRIu8, node->size, node->level,
> +   ms->numa_state->hmat_cache[node->node_id]
> + [node->level + 1]->size,
> +   node->level + 1);
> +return;
> +}
> +
> +hmat_cache = g_malloc0(sizeof(*hmat_cache));
> +memcpy(hmat_cache, node, sizeof(*hmat_cache));
> +ms->numa_state->hmat_cache[node->node_id][node->level] = hmat_cache;
> +}
> +
>  void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
>  {
>  Error *err = NULL;
> @@ -430,6 +497,19 @@ void set_numa_options(MachineState *ms, NumaOptions 
> *object, Error **errp
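
The size checks in parse_numa_hmat_cache() enforce that a cache level is
strictly smaller than the level below it and strictly larger than the level
above it, when those neighbours are already configured. A reduced sketch of
the same rule, using a plain array where 0 means "not configured" (the patch
uses pointers instead); LEVELS and cache_size_ok are hypothetical names, with
LEVELS standing in for HMAT_LB_LEVELS.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEVELS 4    /* hypothetical; valid levels are 1 .. LEVELS - 1 */

/*
 * sizes[l] holds the size already configured for level l, or 0 if none.
 * Returns true if 'size' may be configured for 'level'.
 */
bool cache_size_ok(const uint64_t sizes[LEVELS], unsigned level, uint64_t size)
{
    if (level < 1 || level >= LEVELS) {
        return false;                   /* level out of range */
    }
    if (sizes[level]) {
        return false;                   /* duplicate configuration */
    }
    if (level > 1 && sizes[level - 1] && size >= sizes[level - 1]) {
        return false;                   /* must be smaller than level - 1 */
    }
    if (level < LEVELS - 1 && sizes[level + 1] && size <= sizes[level + 1]) {
        return false;                   /* must be larger than level + 1 */
    }
    return true;
}
```

With only level 2 configured at 4096 bytes, level 1 is accepted at 8192 but
rejected at 4096, and level 3 is accepted at 1024 but rejected at 4096.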

Re: [PATCH v2 3/8] hw: replace hw/i386/pc.h with a header just for the i8259

2019-12-13 Thread Philippe Mathieu-Daudé

On 12/12/19 9:05 PM, Paolo Bonzini wrote:
On Thu, 12 Dec 2019 at 20:04, Philippe Mathieu-Daudé wrote:


On 12/12/19 6:29 PM, Paolo Bonzini wrote:
 > Remove the need to include i386/pc.h to get to the i8259 functions.
 > This is enough to remove the inclusion of hw/i386/pc.h from all
 > non-x86 files.

Eh, this is very similar to the patch I've staged for 5.0, now that the
Malta/PC split got merged.


Ok, these patches are not urgent so I will just wait for yours to go in 
and rebase.


Oh, I don't want to delay your series, this was more of a "comment to 
self" while reviewing yours.


The MicroVM series introduced changes that outdated my work, and since 
having MicroVM was more important than global cleanup, I didn't insist at 
that time. Now GSI and IOAPIC are more exposed so my previous work 
doesn't apply at all. Well, current code diverged.
I'll need some time to figure out if it is worthwhile salvaging, so 
don't wait for that.





Re: [PATCH RESEND v20 7/8] tests/numa: Add case for QMP build HMAT

2019-12-13 Thread Igor Mammedov
On Fri, 13 Dec 2019 09:19:28 +0800
Tao Xu  wrote:

> Check configuring HMAT usecase
> 
> Acked-by: Markus Armbruster 
> Suggested-by: Igor Mammedov 
> Signed-off-by: Tao Xu 

Reviewed-by: Igor Mammedov 

> ---
> 
> Changes in v20:
> - Fix the wrong target in pc_hmat_erange_cfg
> - Use g_assert_true and g_assert_false to replace g_assert
>   (Thomas and Markus)
> 
> Changes in v19:
> - Add some fail cases for hmat-cache when level=0
> 
> Changes in v18:
> - Rewrite the lines over 80 characters
> 
> Changes in v17:
> - Add some fail test cases (Igor)
> ---
>  tests/numa-test.c | 213 ++
>  1 file changed, 213 insertions(+)
> 
> diff --git a/tests/numa-test.c b/tests/numa-test.c
> index 8de8581231..17dd807d2a 100644
> --- a/tests/numa-test.c
> +++ b/tests/numa-test.c
> @@ -327,6 +327,216 @@ static void pc_dynamic_cpu_cfg(const void *data)
>  qtest_quit(qs);
>  }
>  
> +static void pc_hmat_build_cfg(const void *data)
> +{
> +QTestState *qs = qtest_initf("%s -nodefaults --preconfig -machine 
> hmat=on "
> + "-smp 2,sockets=2 "
> + "-m 128M,slots=2,maxmem=1G "
> + "-object memory-backend-ram,size=64M,id=m0 "
> + "-object memory-backend-ram,size=64M,id=m1 "
> + "-numa node,nodeid=0,memdev=m0 "
> + "-numa node,nodeid=1,memdev=m1,initiator=0 "
> + "-numa cpu,node-id=0,socket-id=0 "
> + "-numa cpu,node-id=0,socket-id=1",
> + data ? (char *)data : "");
> +
> +/* Fail: Initiator should be less than the number of nodes */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 2, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
> +
> +/* Fail: Target should be less than the number of nodes */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 2,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
> +
> +/* Fail: Initiator should contain cpu */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 1, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\" } }")));
> +
> +/* Fail: Data-type mismatch */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"write-latency\","
> +" 'bandwidth': 524288000 } }")));
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"read-bandwidth\","
> +" 'latency': 5 } }")));
> +
> +/* Fail: Bandwidth should be 1MB (1048576) aligned */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
> +" 'bandwidth': 1048575 } }")));
> +
> +/* Configuring HMAT bandwidth and latency details */
> +g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\","
> +" 'latency': 1 } }")));/* 1 ns */
> +g_assert_true(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\","
> +" 'latency': 5 } }")));/* Fail: Duplicate configuration */
> +g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 0,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
> +" 'bandwidth': 68717379584 } }")));/* 65534 MB/s */
> +g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-latency\","
> +" 'latency': 65534 } }")));/* 65534 ns */
> +g_assert_false(qmp_rsp_is_err(qtest_qmp(qs, "{ 'execute': 
> 'set-numa-node',"
> +" 'arguments': { 'type': 'hmat-lb', 'initiator': 0, 'target': 1,"
> +" 'hierarchy': \"memory\", 'data-type': \"access-bandwidth\","
> +" 'bandwidth': 34358689792 } 

[PATCH] virtio-mmio: Clear v2 transport state on soft reset

2019-12-13 Thread Jean-Philippe Brucker
At the moment when the guest writes a status of 0, we only reset the
virtio core state but not the virtio-mmio state. The virtio-mmio
specification says (v1.1 cs01, 4.2.2.1 Device Requirements:
MMIO Device Register Layout):

Upon reset, the device MUST clear all bits in InterruptStatus and
ready bits in the QueueReady register for all queues in the device.

The core already takes care of InterruptStatus by clearing isr, but we
still need to clear QueueReady.

It would be tempting to clean all registers, but since the specification
doesn't say anything more, guests could rely on the registers keeping
their state across reset. Linux, for example, relies on this for
GuestPageSize in the legacy MMIO transport.

Fixes: 44e687a4d9ab ("virtio-mmio: implement modern (v2) personality 
(virtio-1)")
Signed-off-by: Jean-Philippe Brucker 
---
This fixes kexec of a Linux guest that uses the modern virtio-mmio
transport.
---
 hw/virtio/virtio-mmio.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 94d934c44b..ef40b7a9b2 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -65,6 +65,19 @@ static void virtio_mmio_stop_ioeventfd(VirtIOMMIOProxy 
*proxy)
 virtio_bus_stop_ioeventfd(&proxy->bus);
 }
 
+static void virtio_mmio_soft_reset(VirtIOMMIOProxy *proxy)
+{
+int i;
+
+if (proxy->legacy) {
+return;
+}
+
+for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+proxy->vqs[i].enabled = 0;
+}
+}
+
 static uint64_t virtio_mmio_read(void *opaque, hwaddr offset, unsigned size)
 {
 VirtIOMMIOProxy *proxy = (VirtIOMMIOProxy *)opaque;
@@ -378,6 +391,7 @@ static void virtio_mmio_write(void *opaque, hwaddr offset, 
uint64_t value,
 
 if (vdev->status == 0) {
 virtio_reset(vdev);
+virtio_mmio_soft_reset(proxy);
 }
 break;
 case VIRTIO_MMIO_QUEUE_DESC_LOW:
-- 
2.24.0




Re: [PATCH for-5.0 v11 12/20] qapi: Introduce DEFINE_PROP_INTERVAL

2019-12-13 Thread Markus Armbruster
Auger Eric  writes:

> Hi Markus,
>
> On 12/12/19 1:17 PM, Markus Armbruster wrote:
>> Eric Auger  writes:
>> 
>>> Introduce a new property defining a labelled interval:
>>> ,,label.
>>>
>>> This will be used to encode reserved IOVA regions. The label
>>> is left undefined to ease reuse across use cases.
>> 
>> What does the last sentence mean?
> The dilemma was: shall I specialize this property, e.g. as ReservedRegion,
> or shall I leave it generic enough to serve somebody else's use case? I
> first chose the latter, but now I think I should rather call it something
> like ReservedRegion, as in any case it has addresses and an integer label.
>> 
>>> For instance, in virtio-iommu use case, reserved IOVA regions
>>> will be passed by the machine code to the virtio-iommu-pci
>>> device (an array of those). The label will match the
>>> virtio_iommu_probe_resv_mem subtype value:
>>> - VIRTIO_IOMMU_RESV_MEM_T_RESERVED (0)
>>> - VIRTIO_IOMMU_RESV_MEM_T_MSI (1)
>>>
>>> This is used to inform the virtio-iommu-pci device it should
>>> bypass the MSI region: 0xfee0, 0xfeef, 1.
>> 
>> So the "label" part of ",,label" is a number?
> yes it is.
>> 
>> Is a number appropriate for your use case, or would an enum be better?
> I think a number is OK. There might be other types of reserved regions
> in the future. Also if we want to allow somebody else to reuse that
> property in another context, I would rather leave it open?

I'd prioritize the user interface over possible reuse (which might never
happen).  Mind, I'm not telling you using numbers is a bad user
interface.  In general, enums are nicer, but I don't know enough about
this particular case.

>> 
>>>
>>> Signed-off-by: Eric Auger
>>> ---
[...]
>>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>>> index e499dc215b..e238d1c352 100644
>>> --- a/include/exec/memory.h
>>> +++ b/include/exec/memory.h
>>> @@ -57,6 +57,12 @@ struct MemoryRegionMmio {
>>>  CPUWriteMemoryFunc *write[3];
>>>  };
>>>  
>>> +struct Interval {
>>> +hwaddr low;
>>> +hwaddr high;
>>> +unsigned int type;
>>> +};
>> 
>> This isn't an interval.  An interval consists of two values, not three.
>> 
>> The third one is called "type" here, and "label" elsewhere.  Pick one
>> and stick to it.
>> 
>> Then pick a name for the triple.  Elsewhere, you call it "labelled
>> interval".
> I would tend to use ReservedRegion now if nobody objects.

Sounds good to me.
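
To make the naming and enum-versus-number discussion concrete, here is a rough standalone sketch of what a ReservedRegion triple with an enum-typed third field and a "low,high,type" string parser might look like. Everything in it is an assumption for illustration: ResvRegionType, the RESV_REGION_* names and reserved_region_parse() are hypothetical and not taken from the series; only the two numeric subtype values (0 and 1) come from the commit message quoted above.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical enum; the values mirror the subtypes quoted above. */
typedef enum {
    RESV_REGION_RESERVED = 0,   /* VIRTIO_IOMMU_RESV_MEM_T_RESERVED */
    RESV_REGION_MSI = 1,        /* VIRTIO_IOMMU_RESV_MEM_T_MSI */
} ResvRegionType;

typedef struct {
    uint64_t low;               /* first address of the region */
    uint64_t high;              /* last address of the region */
    ResvRegionType type;
} ReservedRegion;

/* Parse one unsigned field that must be followed by 'term'. */
static int parse_field(const char **p, char term, uint64_t *out)
{
    char *end;
    unsigned long long v;

    if (!isdigit((unsigned char)**p)) {
        return -1;              /* rejects empty, signed or padded input */
    }
    errno = 0;
    v = strtoull(*p, &end, 0);  /* base 0: accepts decimal and 0x... */
    if (errno != 0 || *end != term) {
        return -1;
    }
    *out = v;
    *p = (term == '\0') ? end : end + 1;
    return 0;
}

/* Parse "low,high,type"; returns 0 on success, -1 on malformed input. */
static int reserved_region_parse(const char *str, ReservedRegion *r)
{
    uint64_t low, high, type;

    assert(str != NULL && r != NULL);
    if (parse_field(&str, ',', &low) < 0 ||
        parse_field(&str, ',', &high) < 0 ||
        parse_field(&str, '\0', &type) < 0) {
        return -1;
    }
    if (low > high || type > RESV_REGION_MSI) {
        return -1;
    }
    r->low = low;
    r->high = high;
    r->type = (ResvRegionType)type;
    return 0;
}
```

With this shape, a string such as "0x1000,0x1fff,1" yields an MSI-typed region, and an unknown type number is rejected at parse time rather than being passed on to the device. Whether rejecting unknown numbers is desirable depends on how open-ended the label is meant to be, which is exactly the open question in this thread.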

> Thank you for the review!

You're welcome!




Re: [PATCH RESEND v20 0/8] Build ACPI Heterogeneous Memory Attribute Table (HMAT)

2019-12-13 Thread Michael S. Tsirkin
On Fri, Dec 13, 2019 at 09:19:21AM +0800, Tao Xu wrote:
> This series of patches will build Heterogeneous Memory Attribute Table (HMAT)
> according to the command line. The ACPI HMAT describes the memory attributes,
> such as memory side cache attributes and bandwidth and latency details,
> related to the Memory Proximity Domain.
> The software is expected to use HMAT information as hint for optimization.
> 
> In the linux kernel, the codes in drivers/acpi/hmat/hmat.c parse and report
> the platform's HMAT tables.
> 
> The V19 patches link:
> https://patchwork.kernel.org/cover/11265525/

Looks good to me, I'll queue it for merge after the release. If possible
please ping me after the release to help make sure it didn't get
dropped.



> Changelog:
> v20:
> - Resend to fix the wrong target in pc_hmat_erange_cfg()
> - Use g_assert_true and g_assert_false to replace g_assert
>   (Thomas and Markus)
> - Rename assoc as associativity, update the QAPI description (Markus)
> - Disable cache level 0 in hmat-cache option (Igor)
> - Keep base and bitmap unchanged when latency or bandwidth
>   out of range
> - Fix the broken CI case when user input latency or bandwidth
>   less than required.
> v19:
> - Add description about the machine property 'hmat' in commit
>   message (Markus)
> - Update the QAPI comments
> - Add a check for no memory side cache
> - Add some fail cases for hmat-cache when level=0
> v18:
> - Defer patches 01/14~06/14 of V17, use qapi type uint64 and
>   only nanosecond for latency (Markus)
> - Rewrite the lines over 80 characters(Igor)
> v17:
> - Add check when user input latency or bandwidth 0, the
>   lb_info_provided should also be 0. Because in ACPI 6.3 5.2.27.4,
>   0 means the corresponding latency or bandwidth information is
>   not provided.
> - Fix the infinite loop when node->latency is 0.
> - Use NumaHmatCacheOptions to replace HMAT_Cache_Info (Igor)
> - Add check for unordered cache level input (Igor)
> - Add some fail test cases (Igor)
> v16:
> - Add and use qemu_strtold_finite to parse size, support full
>   64bit precision, modify related test cases (Eduardo and Markus)
> - Simplify struct HMAT_LB_Info and related code, unify latency
>   and bandwidth (Igor)
> - Add cross check with hmat_lb data (Igor)
> - Fields in Cache Attributes are promoted to uint32_t before
>   shifting (Igor)
> - Add case for QMP build HMAT (Igor)
> v15:
> - Add a new patch to refactor do_strtosz() (Eduardo)
> - Make tests without breaking CI (Michael)
> v14:
> - Reuse the codes of do_strtosz to build qemu_strtotime_ns
>   (Eduardo)
> - Squash patch v13 01/12 and 02/12 together (Daniel and Eduardo)
> - Drop time unit picosecond (Eric)
> - Use qemu ctz64 and clz64 instead of builtin function
> v13:
> - Modify some text description
> - Drop "initiator_valid" field in struct NodeInfo
> - Reuse Garray to store the raw bandwidth and bandwidth data
> - Calculate common base unit using range bitmap
> - Add a patch to calculate hmat latency and bandwidth entry list
> - Drop the total_levels option and use readable cache size
> - Remove the unnecessary head file
> - Use decimal notation with appropriate suffix for cache size
> 
> Liu Jingqi (5):
>   numa: Extend CLI to provide memory latency and bandwidth information
>   numa: Extend CLI to provide memory side cache information
>   hmat acpi: Build Memory Proximity Domain Attributes Structure(s)
>   hmat acpi: Build System Locality Latency and Bandwidth Information
> Structure(s)
>   hmat acpi: Build Memory Side Cache Information Structure(s)
> 
> Tao Xu (3):
>   numa: Extend CLI to provide initiator information for numa nodes
>   tests/numa: Add case for QMP build HMAT
>   tests/bios-tables-test: add test cases for ACPI HMAT
> 
>  hw/acpi/Kconfig   |   7 +-
>  hw/acpi/Makefile.objs |   1 +
>  hw/acpi/hmat.c| 268 +++
>  hw/acpi/hmat.h|  42 
>  hw/core/machine.c |  64 ++
>  hw/core/numa.c| 297 ++
>  hw/i386/acpi-build.c  |   5 +
>  include/sysemu/numa.h |  63 ++
>  qapi/machine.json | 180 +++-
>  qemu-options.hx   |  95 +++-
>  tests/bios-tables-test-allowed-diff.h |   8 +
>  tests/bios-tables-test.c  |  44 
>  tests/data/acpi/pc/APIC.acpihmat  |   0
>  tests/data/acpi/pc/DSDT.acpihmat  |   0
>  tests/data/acpi/pc/HMAT.acpihmat  |   0
>  tests/data/acpi/pc/SRAT.acpihmat  |   0
>  tests/data/acpi/q35/APIC.acpihmat |   0
>  tests/data/acpi/q35/DSDT.acpihmat |   0
>  tests/data/acpi/q35/HMAT.acpihmat |   0
>  tests/data/acpi/q35/SRAT.acpihmat |   0
>  tests/numa-t

Re: [PATCH 1/2] vhost-user: add VHOST_USER_RESET_DEVICE to reset devices

2019-12-13 Thread Michael S. Tsirkin
On Tue, Oct 29, 2019 at 05:38:02PM -0400, Raphael Norwitz wrote:
> Add a VHOST_USER_RESET_DEVICE message which will reset the vhost user
> backend, disabling all rings and resetting all internal state, ready
> for the backend to be reinitialized.
> 
> A backend has to report it supports this feature with the
> VHOST_USER_PROTOCOL_F_RESET_DEVICE protocol feature bit. If it does
> so, the new message is used instead of sending a RESET_OWNER which has
> had inconsistent implementations.
> 
> Signed-off-by: David Vrabel 
> Signed-off-by: Raphael Norwitz 

Looks good to me, I'll queue it for merge after the release. If possible
please ping me after the release to help make sure it didn't get
dropped. Same for 2/2.


> ---
>  docs/interop/vhost-user.rst | 15 +++
>  hw/virtio/vhost-user.c  |  8 +++-
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 7827b71..d213d4a 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -785,6 +785,7 @@ Protocol features
>#define VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD  10
>#define VHOST_USER_PROTOCOL_F_HOST_NOTIFIER  11
>#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
> +  #define VHOST_USER_PROTOCOL_F_RESET_DEVICE   13
>  
>  Master message types
>  
> @@ -1190,6 +1191,20 @@ Master message types
>ancillary data. The GPU protocol is used to inform the master of
>rendering state and updates. See vhost-user-gpu.rst for details.
>  
> +``VHOST_USER_RESET_DEVICE``
> +  :id: 34
> +  :equivalent ioctl: N/A
> +  :master payload: N/A
> +  :slave payload: N/A
> +
> +  Ask the vhost user backend to disable all rings and reset all
> +  internal device state to the initial state, ready to be
> +  reinitialized. The backend retains ownership of the device
> +  throughout the reset operation.
> +
> +  Only valid if the ``VHOST_USER_PROTOCOL_F_RESET_DEVICE`` protocol
> +  feature is set by the backend.
> +
>  Slave message types
>  ---
>  
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 02a9b25..d27a10f 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -58,6 +58,7 @@ enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD = 10,
>  VHOST_USER_PROTOCOL_F_HOST_NOTIFIER = 11,
>  VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD = 12,
> +VHOST_USER_PROTOCOL_F_RESET_DEVICE = 13,
>  VHOST_USER_PROTOCOL_F_MAX
>  };
>  
> @@ -98,6 +99,7 @@ typedef enum VhostUserRequest {
>  VHOST_USER_GET_INFLIGHT_FD = 31,
>  VHOST_USER_SET_INFLIGHT_FD = 32,
>  VHOST_USER_GPU_SET_SOCKET = 33,
> +VHOST_USER_RESET_DEVICE = 34,
>  VHOST_USER_MAX
>  } VhostUserRequest;
>  
> @@ -890,10 +892,14 @@ static int vhost_user_set_owner(struct vhost_dev *dev)
>  static int vhost_user_reset_device(struct vhost_dev *dev)
>  {
>  VhostUserMsg msg = {
> -.hdr.request = VHOST_USER_RESET_OWNER,
>  .hdr.flags = VHOST_USER_VERSION,
>  };
>  
> +msg.hdr.request = virtio_has_feature(dev->protocol_features,
> + VHOST_USER_PROTOCOL_F_RESET_DEVICE)
> +? VHOST_USER_RESET_DEVICE
> +: VHOST_USER_RESET_OWNER;
> +
>  if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
>  return -1;
>  }
> -- 
> 1.8.3.1
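
The request selection in the last hunk can be exercised in isolation. Below is a small standalone sketch of the same decision; the SKETCH_* names and both helper functions are hypothetical. The feature bit number (13) and the RESET_DEVICE request id (34) are taken from the patch above, while the RESET_OWNER id used here (4) is not shown in the patch and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number taken from the patch above. */
enum { SKETCH_PROTOCOL_F_RESET_DEVICE = 13 };

enum {
    SKETCH_RESET_OWNER = 4,     /* assumption: id not shown in the patch */
    SKETCH_RESET_DEVICE = 34,   /* id taken from the patch above */
};

/* Test a single bit in a 64-bit feature mask. */
static bool sketch_has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64);
    return (features & (UINT64_C(1) << bit)) != 0;
}

/* Pick the reset request to send for a given negotiated feature mask. */
static int sketch_reset_request(uint64_t protocol_features)
{
    return sketch_has_feature(protocol_features,
                              SKETCH_PROTOCOL_F_RESET_DEVICE)
           ? SKETCH_RESET_DEVICE
           : SKETCH_RESET_OWNER;
}
```

A backend that does not advertise the new protocol feature keeps receiving the old request, so existing backends should see no change in behaviour.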




Re: [PATCH] virtio-mmio: Clear v2 transport state on soft reset

2019-12-13 Thread Sergio Lopez

Jean-Philippe Brucker  writes:

> At the moment when the guest writes a status of 0, we only reset the
> virtio core state but not the virtio-mmio state. The virtio-mmio
> specification says (v1.1 cs01, 4.2.2.1 Device Requirements:
> MMIO Device Register Layout):
>
> Upon reset, the device MUST clear all bits in InterruptStatus and
> ready bits in the QueueReady register for all queues in the device.
>
> The core already takes care of InterruptStatus by clearing isr, but we
> still need to clear QueueReady.
>
> It would be tempting to clean all registers, but since the specification
> doesn't say anything more, guests could rely on the registers keeping
> their state across reset. Linux, for example, relies on this for
> GuestPageSize in the legacy MMIO transport.
>
> Fixes: 44e687a4d9ab ("virtio-mmio: implement modern (v2) personality 
> (virtio-1)")
> Signed-off-by: Jean-Philippe Brucker 
> ---
> This fixes kexec of a Linux guest that uses the modern virtio-mmio
> transport.
> ---
>  hw/virtio/virtio-mmio.c | 14 ++
>  1 file changed, 14 insertions(+)

LGTM, thanks!

Reviewed-by: Sergio Lopez 




Re: [PATCH v0 2/2] block: allow to set 'drive' property on a realized block device

2019-12-13 Thread Kevin Wolf
Am 18.11.2019 um 11:50 hat Denis Plotnikov geschrieben:
> 
> 
> On 10.11.2019 22:08, Denis Plotnikov wrote:
> >
> > On 10.11.2019 22:03, Denis Plotnikov wrote:
> >> This allows to change (replace) the file on a block device and is useful
> >> to work around exclusive file access restrictions, e.g. to implement VM
> >> migration with a shared disk stored on some storage with the exclusive
> >> file opening model: a destination VM is started waiting for incoming
> >> migration with a fake image drive, and later, on the last migration
> >> phase, the fake image file is replaced with the real one.
> >>
> >> Signed-off-by: Denis Plotnikov 
> >> ---
> >>   hw/core/qdev-properties-system.c | 89 +++-
> >>   1 file changed, 77 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/hw/core/qdev-properties-system.c 
> >> b/hw/core/qdev-properties-system.c
> >> index c534590dcd..aaab1370a4 100644
> >> --- a/hw/core/qdev-properties-system.c
> >> +++ b/hw/core/qdev-properties-system.c
> >> @@ -79,8 +79,55 @@ static void set_pointer(Object *obj, Visitor *v, 
> >> Property *prop,
> >>     /* --- drive --- */
> >>   -static void do_parse_drive(DeviceState *dev, const char *str, void 
> >> **ptr,
> >> -   const char *propname, bool iothread, 
> >> Error **errp)
> >> +static void do_parse_drive_realized(DeviceState *dev, const char *str,
> >> +    void **ptr, const char *propname,
> >> +    bool iothread, Error **errp)
> >> +{
> >> +    BlockBackend *blk = *ptr;
> >> +    BlockDriverState *bs = bdrv_lookup_bs(NULL, str, NULL);
> >> +    int ret;
> >> +    bool blk_created = false;
> >> +
> >> +    if (!bs) {
> >> +    error_setg(errp, "Can't find blockdev '%s'", str);
> >> +    return;
> >> +    }
> >> +
> >> +    if (!blk) {
> >> +    AioContext *ctx = iothread ? bdrv_get_aio_context(bs) :
> >> + qemu_get_aio_context();
> >> +    blk = blk_new(ctx, BLK_PERM_ALL, BLK_PERM_ALL);
> >> +    blk_created = true;
> >
> > Actually, I have concerns about situation where blk=null.
> >
> > Is there any case when scsi-hd (or others) doesn't have a blk assigned 
> > and it's legal?

No, block devices will always have a BlockBackend, even if it doesn't
have a root node inserted.

> >> +    } else {
> >> +    if (blk_bs(blk)) {
> >> +    blk_remove_bs(blk);
> >> +    }
> >> +    }
> >> +
> >> +    ret = blk_insert_bs(blk, bs, errp);
> >> +
> >> +    if (!ret && blk_created) {
> >> +    if (blk_attach_dev(blk, dev) < 0) {
> >> +    /*
> >> + * Shouldn't be any errors here since we just created
> >> + * the new blk because the device doesn't have any.
> >> + * Leave the message here in case blk_attach_dev is changed
> >> + */
> >> + error_setg(errp, "Can't attach drive '%s' to device '%s'",
> >> +    str, object_get_typename(OBJECT(dev)));
> >> +    } else {
> >> +    *ptr = blk;
> >> +    }
> >> +    }
> Another problem here, is that the "size" of the device dev may not match 
> after setting a drive.
> So, we should update it after the drive setting.
> It was found, that it could be done by calling 
> BlockDevOps.bdrv_parent_cb_resize.
> 
> But I have some concerns about doing it so. In the case of virtio scsi 
> disk we have the following callstack
> 
>      bdrv_parent_cb_resize calls() ->
>      scsi_device_report_change(dev, SENSE_CODE(CAPACITY_CHANGED)) ->
>              virtio_scsi_change ->
>      virtio_scsi_push_event(s, dev, VIRTIO_SCSI_T_PARAM_CHANGE,
>                              sense.asc | 
> (sense.ascq << 8));

I think the safest option for now (and which should solve the case you
want to address) is checking whether old and new size match and
returning an error otherwise.

> virtio_scsi_change pushes the event to the guest to make the guest
> ask for size refreshing.  If I'm not mistaken, here we can get a race
> condition when another request is processed with an unchanged
> size and then the size changing request is processed.

I think this is actually a problem even without resizing: We need to
quiesce the device between removing the old root and inserting the new
one. The way to achieve this is probably by splitting blk_drain() into
a blk_drain_begin()/end() and then draining the BlockBackend here while
we're working on it.
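
As a rough illustration of the two suggestions (refuse a size change, and keep the backend quiesced while the root is swapped), here is a standalone model. It does not use the QEMU block layer: SketchNode, SketchBackend and all sketch_* functions are hypothetical, and the begin/end pair merely stands in for the blk_drain() split proposed above, which does not exist yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a block node and a block backend. */
typedef struct {
    int64_t length;             /* size of the node in bytes */
} SketchNode;

typedef struct {
    SketchNode *root;           /* currently inserted root node, or NULL */
    int drained;                /* nesting depth of drained sections */
} SketchBackend;

static void sketch_drain_begin(SketchBackend *b)
{
    b->drained++;               /* a real version would wait for requests */
}

static void sketch_drain_end(SketchBackend *b)
{
    assert(b->drained > 0);
    b->drained--;
}

/*
 * Replace the root node. Fails with -EINVAL and leaves the old root in
 * place if the new node has a different size.
 */
static int sketch_replace_root(SketchBackend *b, SketchNode *new_root)
{
    int ret = 0;

    assert(b != NULL && new_root != NULL);

    sketch_drain_begin(b);
    if (b->root != NULL && b->root->length != new_root->length) {
        ret = -EINVAL;          /* the guest would see a size change */
    } else {
        b->root = new_root;     /* remove old root and insert new one */
    }
    sketch_drain_end(b);

    return ret;
}
```

The point of the drained section is that no request can observe the window in which the backend has no root node, and the size check sidesteps the resize notification race until that path is better understood.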

Kevin




Re: [PATCH] qcow2: Use offset_into_cluster()

2019-12-13 Thread Kevin Wolf
Am 12.12.2019 um 11:01 hat Alberto Garcia geschrieben:
> There's a couple of places left in the qcow2 code that still do the
> calculation manually, so let's replace them.
> 
> Signed-off-by: Alberto Garcia 

Thanks, applied to the block branch.
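
For context, the calculation in question is the offset of a byte position within its cluster. The patch body is not quoted here, so the following is only a standalone sketch of that computation under the assumption that the cluster size is a power of two; sketch_offset_into_cluster() is a hypothetical name that takes the cluster size directly rather than a driver state pointer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the offset of 'offset' within its cluster. Assumes that
 * cluster_size is a power of two, so the remainder can be computed
 * with a mask instead of a division.
 */
static int64_t sketch_offset_into_cluster(int64_t cluster_size,
                                          int64_t offset)
{
    assert(cluster_size > 0 && (cluster_size & (cluster_size - 1)) == 0);
    return offset & (cluster_size - 1);
}
```

With 64 KiB clusters, an offset of 65536 + 512 lands 512 bytes into the second cluster. Having a single named helper instead of an open-coded mask at each call site keeps the power-of-two assumption in one place.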

Kevin




Re: [RFC v4 PATCH 00/49] Initial support of multi-process qemu - status update

2019-12-13 Thread Stefan Hajnoczi
On Mon, Dec 09, 2019 at 10:47:17PM -0800, Elena Ufimtseva wrote:
> At this moment we are working on the first stage of the project with the
> help of the Nutanix developers.
> The questions we have gathered so far will be addressed with muser
> and QEMU developers after we finish the first stage and make sure we
> understand what it will take for us to move onto the next stage.
> 
> We will also incorporate relevant review from Stefan that he provided
> on the series 4 of the patchset. Thank you Stefan.
> 
> If anyone has any further suggestions or questions about the status,
> please reply to this email.

Hi Elena,
At KVM Forum we discussed spending 1 or 2 weeks trying out muser.  A few
weeks have passed and from your email it sounds like this "next stage"
might be a lot of work.

Is there a work-in-progress muser patch series you can post to start the
discussion early?  That way we can avoid reviewers like myself asking
you to make changes after you have invested a lot of time.

It's good that you are in touch with the muser developers (via private
discussion?  I haven't seen much activity on #muser IRC).

Stefan




[PATCH 1/4] hw/i386/pc: Convert DPRINTF() to trace events

2019-12-13 Thread Philippe Mathieu-Daudé
Convert the deprecated DPRINTF() macro to trace events.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/pc.c | 19 +--
 hw/i386/trace-events |  6 ++
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ac08e63604..66a30cfdf5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -90,16 +90,7 @@
 #include "config-devices.h"
 #include "e820_memory_layout.h"
 #include "fw_cfg.h"
-
-/* debug PC/ISA interrupts */
-//#define DEBUG_IRQ
-
-#ifdef DEBUG_IRQ
-#define DPRINTF(fmt, ...)   \
-do { printf("CPUIRQ: " fmt , ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...)
-#endif
+#include "trace.h"
 
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
@@ -348,7 +339,7 @@ void gsi_handler(void *opaque, int n, int level)
 {
 GSIState *s = opaque;
 
-DPRINTF("pc: %s GSI %d\n", level ? "raising" : "lowering", n);
+trace_pc_gsi_interrupt(n, level);
 if (n < ISA_NUM_IRQS) {
 qemu_set_irq(s->i8259_irq[n], level);
 }
@@ -426,7 +417,7 @@ static void pic_irq_request(void *opaque, int irq, int 
level)
 CPUState *cs = first_cpu;
 X86CPU *cpu = X86_CPU(cs);
 
-DPRINTF("pic_irqs: %s irq %d\n", level? "raise" : "lower", irq);
+trace_pc_pic_interrupt(irq, level);
 if (cpu->apic_state && !kvm_irqchip_in_kernel()) {
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
@@ -760,7 +751,7 @@ static void port92_write(void *opaque, hwaddr addr, 
uint64_t val,
 Port92State *s = opaque;
 int oldval = s->outport;
 
-DPRINTF("port92: write 0x%02" PRIx64 "\n", val);
+trace_port92_write(val);
 s->outport = val;
 qemu_set_irq(s->a20_out, (val >> 1) & 1);
 if ((val & 1) && !(oldval & 1)) {
@@ -775,7 +766,7 @@ static uint64_t port92_read(void *opaque, hwaddr addr,
 uint32_t ret;
 
 ret = s->outport;
-DPRINTF("port92: read 0x%02x\n", ret);
+trace_port92_read(ret);
 return ret;
 }
 
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index c8bc464bc5..43f33cf7e2 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -111,3 +111,9 @@ amdvi_ir_irte_ga_val(uint64_t hi, uint64_t lo) "hi 
0x%"PRIx64" lo 0x%"PRIx64
 # vmport.c
 vmport_register(unsigned char command, void *func, void *opaque) "command: 
0x%02x func: %p opaque: %p"
 vmport_command(unsigned char command) "command: 0x%02x"
+
+# pc.c
+pc_gsi_interrupt(int irqn, int level) "GSI interrupt #%d level:%d"
+pc_pic_interrupt(int irqn, int level) "PIC interrupt #%d level:%d"
+port92_read(uint8_t val) "port92: read 0x%02x"
+port92_write(uint8_t val) "port92: write 0x%02x"
-- 
2.21.0




[PATCH 0/4] hw/i386/pc: Extract the port92 device

2019-12-13 Thread Philippe Mathieu-Daudé
In this series we
- remove the old DPRINTF() macro in hw/i386/pc.c
- extract the TYPE_PORT92 device from the same file,
  reducing it by 5%.

Philippe Mathieu-Daudé (4):
  hw/i386/pc: Convert DPRINTF() to trace events
  hw/i386/pc: Use TYPE_PORT92 instead of hardcoded string
  hw/i386/pc: Inline port92_init()
  hw/i386/pc: Extract the port92 device

 include/hw/i386/pc.h  |   3 +
 hw/i386/pc.c  | 138 ++
 hw/i386/port92.c  | 126 ++
 hw/i386/Makefile.objs |   1 +
 hw/i386/trace-events  |   8 +++
 5 files changed, 144 insertions(+), 132 deletions(-)
 create mode 100644 hw/i386/port92.c

-- 
2.21.0




[PATCH 2/4] hw/i386/pc: Use TYPE_PORT92 instead of hardcoded string

2019-12-13 Thread Philippe Mathieu-Daudé
By using the TYPE_* definitions for devices, we can:
- quickly find where devices are used with 'git-grep'
- easily rename a device (one-line change).

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/pc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 66a30cfdf5..2c2ae27447 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1353,7 +1353,7 @@ static void pc_superio_init(ISABus *isa_bus, bool 
create_fdctrl, bool no_vmport)
 qdev_prop_set_ptr(dev, "ps2_mouse", i8042);
 qdev_init_nofail(dev);
 }
-port92 = isa_create_simple(isa_bus, "port92");
+port92 = isa_create_simple(isa_bus, TYPE_PORT92);
 
 a20_line = qemu_allocate_irqs(handle_a20_line_change, first_cpu, 2);
 i8042_setup_a20_line(i8042, a20_line[0]);
-- 
2.21.0




[PATCH 3/4] hw/i386/pc: Inline port92_init()

2019-12-13 Thread Philippe Mathieu-Daudé
This one-line function is not very helpful, so remove it
by inlining the call to qdev_connect_gpio_out_named().

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/pc.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2c2ae27447..2e8992c7d0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -770,11 +770,6 @@ static uint64_t port92_read(void *opaque, hwaddr addr,
 return ret;
 }
 
-static void port92_init(ISADevice *dev, qemu_irq a20_out)
-{
-qdev_connect_gpio_out_named(DEVICE(dev), PORT92_A20_LINE, 0, a20_out);
-}
-
 static const VMStateDescription vmstate_port92_isa = {
 .name = "port92",
 .version_id = 1,
@@ -830,8 +825,8 @@ static void port92_class_initfn(ObjectClass *klass, void 
*data)
 dc->vmsd = &vmstate_port92_isa;
 /*
  * Reason: unlike ordinary ISA devices, this one needs additional
- * wiring: its A20 output line needs to be wired up by
- * port92_init().
+ * wiring: its A20 output line needs to be wired up with
+ * qdev_connect_gpio_out_named().
  */
 dc->user_creatable = false;
 }
@@ -1357,7 +1352,8 @@ static void pc_superio_init(ISABus *isa_bus, bool 
create_fdctrl, bool no_vmport)
 
 a20_line = qemu_allocate_irqs(handle_a20_line_change, first_cpu, 2);
 i8042_setup_a20_line(i8042, a20_line[0]);
-port92_init(port92, a20_line[1]);
+qdev_connect_gpio_out_named(DEVICE(port92),
+PORT92_A20_LINE, 0, a20_line[1]);
 g_free(a20_line);
 }
 
-- 
2.21.0




Re: [PATCH] vhost-user-fs: remove "vhostfd" property

2019-12-13 Thread Dr. David Alan Gilbert
* Marc-André Lureau (marcandre.lur...@redhat.com) wrote:
> The property doesn't make much sense for a vhost-user device.
> 
> Signed-off-by: Marc-André Lureau 

Queued for virtiofs

> ---
>  hw/virtio/vhost-user-fs.c | 1 -
>  include/hw/virtio/vhost-user-fs.h | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> index f0df7f4746..ca0b7fc9de 100644
> --- a/hw/virtio/vhost-user-fs.c
> +++ b/hw/virtio/vhost-user-fs.c
> @@ -263,7 +263,6 @@ static Property vuf_properties[] = {
>  DEFINE_PROP_UINT16("num-request-queues", VHostUserFS,
> conf.num_request_queues, 1),
>  DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
> -DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/virtio/vhost-user-fs.h 
> b/include/hw/virtio/vhost-user-fs.h
> index 539885b458..9ff1bdb7cf 100644
> --- a/include/hw/virtio/vhost-user-fs.h
> +++ b/include/hw/virtio/vhost-user-fs.h
> @@ -28,7 +28,6 @@ typedef struct {
>  char *tag;
>  uint16_t num_request_queues;
>  uint16_t queue_size;
> -char *vhostfd;
>  } VHostUserFSConf;
>  
>  typedef struct {
> -- 
> 2.24.0
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH 4/4] hw/i386/pc: Extract the port92 device

2019-12-13 Thread Philippe Mathieu-Daudé
This device is only used by the PC machines. The pc.c file is
already big enough, with 2255 lines. By removing 113 lines of
it, we reduced it by 5%. It is now a bit easier to navigate
the file.

Signed-off-by: Philippe Mathieu-Daudé 
---
checkpatch warning:

  WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
  #142:
  new file mode 100644

is harmless because MAINTAINERS PC entry matches the directory:

  PC
  ...
  F: hw/i386/
---
 include/hw/i386/pc.h  |   3 +
 hw/i386/pc.c  | 113 -
 hw/i386/port92.c  | 126 ++
 hw/i386/Makefile.objs |   1 +
 hw/i386/trace-events  |   2 +
 5 files changed, 132 insertions(+), 113 deletions(-)
 create mode 100644 hw/i386/port92.c

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 1f86eba3f9..7e8d18d6fa 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -224,8 +224,11 @@ int cmos_get_fd_drive_type(FloppyDriveType fd0);
 
 #define FW_CFG_IO_BASE 0x510
 
+/* port92.c */
 #define PORT92_A20_LINE "a20"
 
+#define TYPE_PORT92 "port92"
+
 /* hpet.c */
 extern int no_hpet;
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2e8992c7d0..15efcb29d5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -733,119 +733,6 @@ void pc_cmos_init(PCMachineState *pcms,
 qemu_register_reset(pc_cmos_init_late, &arg);
 }
 
-#define TYPE_PORT92 "port92"
-#define PORT92(obj) OBJECT_CHECK(Port92State, (obj), TYPE_PORT92)
-
-/* port 92 stuff: could be split off */
-typedef struct Port92State {
-ISADevice parent_obj;
-
-MemoryRegion io;
-uint8_t outport;
-qemu_irq a20_out;
-} Port92State;
-
-static void port92_write(void *opaque, hwaddr addr, uint64_t val,
- unsigned size)
-{
-Port92State *s = opaque;
-int oldval = s->outport;
-
-trace_port92_write(val);
-s->outport = val;
-qemu_set_irq(s->a20_out, (val >> 1) & 1);
-if ((val & 1) && !(oldval & 1)) {
-qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
-}
-}
-
-static uint64_t port92_read(void *opaque, hwaddr addr,
-unsigned size)
-{
-Port92State *s = opaque;
-uint32_t ret;
-
-ret = s->outport;
-trace_port92_read(ret);
-return ret;
-}
-
-static const VMStateDescription vmstate_port92_isa = {
-.name = "port92",
-.version_id = 1,
-.minimum_version_id = 1,
-.fields = (VMStateField[]) {
-VMSTATE_UINT8(outport, Port92State),
-VMSTATE_END_OF_LIST()
-}
-};
-
-static void port92_reset(DeviceState *d)
-{
-Port92State *s = PORT92(d);
-
-s->outport &= ~1;
-}
-
-static const MemoryRegionOps port92_ops = {
-.read = port92_read,
-.write = port92_write,
-.impl = {
-.min_access_size = 1,
-.max_access_size = 1,
-},
-.endianness = DEVICE_LITTLE_ENDIAN,
-};
-
-static void port92_initfn(Object *obj)
-{
-Port92State *s = PORT92(obj);
-
-memory_region_init_io(&s->io, OBJECT(s), &port92_ops, s, "port92", 1);
-
-s->outport = 0;
-
-qdev_init_gpio_out_named(DEVICE(obj), &s->a20_out, PORT92_A20_LINE, 1);
-}
-
-static void port92_realizefn(DeviceState *dev, Error **errp)
-{
-ISADevice *isadev = ISA_DEVICE(dev);
-Port92State *s = PORT92(dev);
-
-isa_register_ioport(isadev, &s->io, 0x92);
-}
-
-static void port92_class_initfn(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-
-dc->realize = port92_realizefn;
-dc->reset = port92_reset;
-dc->vmsd = &vmstate_port92_isa;
-/*
- * Reason: unlike ordinary ISA devices, this one needs additional
- * wiring: its A20 output line needs to be wired up with
- * qdev_connect_gpio_out_named().
- */
-dc->user_creatable = false;
-}
-
-static const TypeInfo port92_info = {
-.name  = TYPE_PORT92,
-.parent= TYPE_ISA_DEVICE,
-.instance_size = sizeof(Port92State),
-.instance_init = port92_initfn,
-.class_init= port92_class_initfn,
-};
-
-static void port92_register_types(void)
-{
-type_register_static(&port92_info);
-}
-
-type_init(port92_register_types)
-
 static void handle_a20_line_change(void *opaque, int irq, int level)
 {
 X86CPU *cpu = opaque;
diff --git a/hw/i386/port92.c b/hw/i386/port92.c
new file mode 100644
index 00..19866c44ef
--- /dev/null
+++ b/hw/i386/port92.c
@@ -0,0 +1,126 @@
+/*
+ * QEMU I/O port 0x92 (System Control Port A, to handle Fast Gate A20)
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ *
+ * SPDX-License-Identifier: MIT
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/runstate.h"
+#include "migration/vmstate.h"
+#include "hw/irq.h"
+#include "hw/i386/pc.h"
+#include "trace.h"
+
+#define PORT92(obj) OBJECT_CHECK(Port92State, (obj), TYPE_PORT92)
+
+typedef struct Port92State {
+ISADevice parent_obj;
+
+MemoryRegion io;
+uint8_t outport;
+qemu_irq a20_out;
+} Port92State;
+
+static void port92_write(void *opaque, hwaddr add
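
For readers who want to check the register behaviour being moved, here is a minimal sketch of the port 0x92 logic in Python, modeled on the removed pc.c code above. The class and attribute names are illustrative only (not QEMU API), and `reset_requests` simply stands in for the call to qemu_system_reset_request().

```python
class Port92Model:
    """Toy model of the port 0x92 register logic shown in the diff."""

    def __init__(self):
        self.outport = 0         # mirrors Port92State.outport
        self.a20 = 0             # last level driven on the A20 output line
        self.reset_requests = 0  # stand-in for qemu_system_reset_request()

    def write(self, val):
        oldval = self.outport
        self.outport = val & 0xFF      # outport is a uint8_t in the C code
        self.a20 = (val >> 1) & 1      # bit 1 drives the A20 line
        if (val & 1) and not (oldval & 1):
            self.reset_requests += 1   # rising edge on bit 0 requests a reset

    def read(self):
        return self.outport

    def reset(self):
        self.outport &= ~1             # device reset clears bit 0 only
```

Writing 0x02 raises A20 without a reset; a reset is only requested when bit 0 goes from 0 to 1.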

Re: [PATCH] virtio-fs: fix MSI-X nvectors calculation

2019-12-13 Thread Dr. David Alan Gilbert
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
> The following MSI-X vectors are required:
>  * VIRTIO Configuration Change
>  * hiprio virtqueue
>  * requests virtqueues
> 
> Fix the calculation to reserve enough MSI-X vectors.  Otherwise guest
> drivers fall back to a sub-optimal configuration where all virtqueues
> share a single vector.
> 
> This change does not break live migration compatibility since
> vhost-user-fs-pci devices are not migratable yet.
> 
> Reported-by: Vivek Goyal 
> Signed-off-by: Stefan Hajnoczi 

Queued for virtiofs

> ---
>  hw/virtio/vhost-user-fs-pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
> index 933a3f265b..e3a649d4a6 100644
> --- a/hw/virtio/vhost-user-fs-pci.c
> +++ b/hw/virtio/vhost-user-fs-pci.c
> @@ -40,7 +40,8 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy 
> *vpci_dev, Error **errp)
>  DeviceState *vdev = DEVICE(&dev->vdev);
>  
>  if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
> -vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 1;
> +/* Also reserve config change and hiprio queue vectors */
> +vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
>  }
>  
>  qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> -- 
> 2.23.0
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
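
The vector count from the commit message can be written out as a small sketch. The function name is hypothetical; only the arithmetic (one vector for config change, one for the hiprio queue, one per request queue) comes from the patch.

```python
def default_nvectors(num_request_queues):
    """Default MSI-X vector count for vhost-user-fs-pci, per the patch."""
    if num_request_queues < 0:
        raise ValueError("num_request_queues must be non-negative")
    config_change = 1   # VIRTIO Configuration Change
    hiprio = 1          # hiprio virtqueue
    return num_request_queues + config_change + hiprio
```

With a single request queue this gives 3 vectors, where the old "+ 1" calculation would have reserved only 2.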




Re: How to extend QEMU's vhost-user tests after implementing vhost-user-blk device backend

2019-12-13 Thread Stefan Hajnoczi
On Wed, Dec 11, 2019 at 11:25:32PM +0800, Coiby Xu wrote:
> I'm now writing the tests for vhost-user-blk device based on
> tests/virtio-blk-test.c. But block_resize command doesn't apply to
> vhost-user-blk device.
> 
> After launching vhost-user backend server, I type the following
> command to connect to it
> 
> (qemu) chardev-add socket,id=char1,path=/tmp/vhost-user-blk_vhost.socket
> (qemu) object_add memory-backend-memfd,id=mem,size=256M,share=on
> (qemu) device_add vhost-user-blk-pci,id=blk0,chardev=char1
> (qemu) block_resize blk0 512
> Error: Cannot find device=blk0 nor node_name=
> 
> QEMU can't find the device although in the guest OS I can already
> mount /dev/vda. And `info block` doesn't list the newly added
> vhost-user-blk device,
> (qemu) info block
> disk (#block154): dpdk.img (raw)
> Attached to:  /machine/peripheral-anon/device[0]
> Cache mode:   writeback
> 
> floppy0: [not inserted]
> Attached to:  /machine/unattached/device[17]
> Removable device: not locked, tray closed
> 
> sd0: [not inserted]
> Removable device: not locked, tray close
> 
> It seems `info block` and `block_resize` only work with `drive_add`
> which is not necessary for vhost-user-blk device.

Yes, -device vhost-user-blk doesn't have a BlockDriverState (-drive or
-blockdev) because it communicates with the vhost-user device backend
over a character device instead.

> Should I let QEMU
> support adding vhost-user backend device in the way similar to adding
> NBD device(`drive_add -n buddy
> file.driver=nbd,file.host=localhost,file.port=49153,file.export=disk,node-name=nbd_client1`),
> i.e., a drive can be added via `drive_add -n buddy
> file.driver=vhost-user,file.sock=/tmp/vhost-user-blk_vhost.socket,node-name=vhost_user_client1`?

That is probably too much work.  It's fine to skip test cases that
resize the disk.

Stefan




[PATCH] hw/isa/isa-bus: Use ISA_NUM_IRQS instead of magic number

2019-12-13 Thread Philippe Mathieu-Daudé
We have a definition for the number of ISA IRQs, use it.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/isa/isa-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index 388800603b..1d79ed133c 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -85,7 +85,7 @@ void isa_bus_irqs(ISABus *bus, qemu_irq *irqs)
 qemu_irq isa_get_irq(ISADevice *dev, int isairq)
 {
 assert(!dev || ISA_BUS(qdev_get_parent_bus(DEVICE(dev))) == isabus);
-if (isairq < 0 || isairq > 15) {
+if (isairq < 0 || isairq >= ISA_NUM_IRQS) {
 hw_error("isa irq %d invalid", isairq);
 }
 return isabus->irqs[isairq];
-- 
2.21.0
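
A quick sketch to confirm that the named constant keeps the same bounds as the literal. It assumes ISA_NUM_IRQS is 16, which is what the replaced "> 15" comparison implies; the helper names are made up for illustration.

```python
ISA_NUM_IRQS = 16  # assumed value, implied by the old "isairq > 15" check


def old_check_is_valid(isairq):
    # original condition: reject when isairq < 0 or isairq > 15
    return not (isairq < 0 or isairq > 15)


def new_check_is_valid(isairq):
    # intended condition: reject when isairq < 0 or isairq >= ISA_NUM_IRQS
    return not (isairq < 0 or isairq >= ISA_NUM_IRQS)
```

Both predicates accept exactly 0..15, so the change is behaviour-preserving under that assumption.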




[PATCH] hw/i386: De-duplicate gsi_handler() to remove kvm_pc_gsi_handler()

2019-12-13 Thread Philippe Mathieu-Daudé
Both gsi_handler() and kvm_pc_gsi_handler() have the same content,
except one comment. Move the comment, and de-duplicate the code.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/sysemu/kvm.h |  1 -
 hw/i386/kvm/ioapic.c | 12 
 hw/i386/pc.c |  5 ++---
 3 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 9fe233b9bf..f5d0d0d710 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -515,7 +515,6 @@ int kvm_irqchip_add_irqfd_notifier(KVMState *s, 
EventNotifier *n,
 int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n,
   qemu_irq irq);
 void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi);
-void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
 
diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c
index f94729c565..bae7413a39 100644
--- a/hw/i386/kvm/ioapic.c
+++ b/hw/i386/kvm/ioapic.c
@@ -48,18 +48,6 @@ void kvm_pc_setup_irq_routing(bool pci_enabled)
 }
 }
 
-void kvm_pc_gsi_handler(void *opaque, int n, int level)
-{
-GSIState *s = opaque;
-
-if (n < ISA_NUM_IRQS) {
-/* Kernel will forward to both PIC and IOAPIC */
-qemu_set_irq(s->i8259_irq[n], level);
-} else {
-qemu_set_irq(s->ioapic_irq[n], level);
-}
-}
-
 typedef struct KVMIOAPICState KVMIOAPICState;
 
 struct KVMIOAPICState {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ac08e63604..97e9049b71 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -350,6 +350,7 @@ void gsi_handler(void *opaque, int n, int level)
 
 DPRINTF("pc: %s GSI %d\n", level ? "raising" : "lowering", n);
 if (n < ISA_NUM_IRQS) {
+/* Under KVM, Kernel will forward to both PIC and IOAPIC */
 qemu_set_irq(s->i8259_irq[n], level);
 }
 qemu_set_irq(s->ioapic_irq[n], level);
@@ -362,10 +363,8 @@ GSIState *pc_gsi_create(qemu_irq **irqs, bool pci_enabled)
 s = g_new0(GSIState, 1);
 if (kvm_ioapic_in_kernel()) {
 kvm_pc_setup_irq_routing(pci_enabled);
-*irqs = qemu_allocate_irqs(kvm_pc_gsi_handler, s, GSI_NUM_PINS);
-} else {
-*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
 }
+*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
 
 return s;
 }
-- 
2.21.0
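
To make it easy to compare the two handlers, here is a toy model of which lines each one drives for a given GSI, transcribed from the hunks above. It assumes ISA_NUM_IRQS is 16 and returns the driven lines as (controller, pin) tuples instead of calling qemu_set_irq(); it is a reading aid, not QEMU code.

```python
ISA_NUM_IRQS = 16  # assumed value


def gsi_handler(n):
    """Lines driven by gsi_handler() in hw/i386/pc.c, per the diff context."""
    lines = []
    if n < ISA_NUM_IRQS:
        lines.append(("i8259", n))
    lines.append(("ioapic", n))   # unconditional in the pc.c version
    return lines


def kvm_pc_gsi_handler(n):
    """Lines driven by the removed kvm_pc_gsi_handler()."""
    if n < ISA_NUM_IRQS:
        return [("i8259", n)]     # kernel forwards to PIC and IOAPIC
    return [("ioapic", n)]
```

In this model the two agree for n >= ISA_NUM_IRQS. For lower n, gsi_handler() additionally drives the ioapic line. Whether that extra call matters with an in-kernel irqchip is not something this sketch can settle.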




[PATCH] hw/i386/pc: Simplify ioapic_init_gsi()

2019-12-13 Thread Philippe Mathieu-Daudé
All callers of ioapic_init_gsi() provide a parent. We want new
uses to follow the same good practice and provide the parent
name, so do not make this optional: assert the parent name is
provided, and simplify the code.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/i386/pc.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ac08e63604..234945d328 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1488,15 +1488,14 @@ void ioapic_init_gsi(GSIState *gsi_state, const char 
*parent_name)
 SysBusDevice *d;
 unsigned int i;
 
+assert(parent_name);
 if (kvm_ioapic_in_kernel()) {
 dev = qdev_create(NULL, TYPE_KVM_IOAPIC);
 } else {
 dev = qdev_create(NULL, TYPE_IOAPIC);
 }
-if (parent_name) {
-object_property_add_child(object_resolve_path(parent_name, NULL),
-  "ioapic", OBJECT(dev), NULL);
-}
+object_property_add_child(object_resolve_path(parent_name, NULL),
+  "ioapic", OBJECT(dev), NULL);
 qdev_init_nofail(dev);
 d = SYS_BUS_DEVICE(dev);
 sysbus_mmio_map(d, 0, IO_APIC_DEFAULT_ADDRESS);
-- 
2.21.0




Re: [PATCH v9 01/13] accel/tcg: introduce TBStatistics structure

2019-12-13 Thread Alex Bennée


Richard Henderson  writes:

> On 10/7/19 11:28 AM, Alex Bennée wrote:
>> From: "Vanderson M. do Rosario" 
>> 
>> To store statistics for each TB, we created a TBStatistics structure
>> which is linked with the TBs. TBStatistics can stay alive after
>> tb_flush and be relinked to a regenerated TB. So the statistics can
>> be accumulated even through flushes.
>> 
>> The goal is to have all present and future qemu/tcg statistics and
>> meta-data stored in this new structure.
>> 
>> Reviewed-by: Alex Bennée 
>> Signed-off-by: Vanderson M. do Rosario 
>> Message-Id: <20190829173437.5926-2-vanderson...@gmail.com>
>> [AJB: fix git author, review comments]
>> Signed-off-by: Alex Bennée 
>> 
>> ---
>> AJB
>>   - move tcg_collect_tb_stats inside tb-stats.c
>>   - add spdx header
>>   - drop tb from tbstats and associated functions
>> ---
>
> The only quibble I have is with
>
>> +void init_tb_stats_htable_if_not(void);
>
> If not what?
>
> I can only imagine that this phrase is intended to finish "if not 
> initialized".
>  But I think it's clearer to just call this "init_tb_stats_htable".

Fixed.

>
>> +void enable_collect_tb_stats(void)
>> +{
>> +init_tb_stats_htable_if_not();
>
> Why do we need to do this again, since we did this in tb_htable_init?

This is the route if we dynamically enable tb-stats with an already
running system emulation.

-- 
Alex Bennée
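
The "initialise once, from either path" behaviour discussed above can be sketched as follows. All names here are hypothetical and the hash table is just a dict; the point is only that the init call is idempotent, so both the start-up path and the dynamic-enable path can call it safely.

```python
class TBStatsRegistry:
    """Toy model: the stats table may be set up at start-up or on demand."""

    def __init__(self):
        self.htable = None
        self.collecting = False
        self.init_calls = 0   # how many times the table was actually created

    def init_htable(self):
        # idempotent: only creates the table if it does not exist yet
        if self.htable is None:
            self.htable = {}
            self.init_calls += 1

    def enable_collect(self):
        # dynamic enable on a running system: table may or may not exist
        self.init_htable()
        self.collecting = True
```

Calling enable_collect() after an earlier init_htable() leaves existing entries untouched.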



Re: [PATCH for-5.0 v2 15/23] mirror: Prevent loops

2019-12-13 Thread Vladimir Sementsov-Ogievskiy
09.12.2019 17:43, Max Reitz wrote:
> On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
>> 11.11.2019 19:02, Max Reitz wrote:
>>> While bdrv_replace_node() will not follow through with it, a specific
>>> @replaces asks the mirror job to create a loop.
>>>
>>> For example, say both the source and the target share a child where the
>>> source is a filter; by letting @replaces point to the common child, you
>>> ask for a loop.
>>>
>>> Or if you use @replaces in drive-mirror with sync=none and
>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>> to a child of the source, and sync=none makes the source the backing
>>> file of the target after the job).
>>>
>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>> ignores the user-requested configuration, which is not ideal either.
>>> (In the first example above, the target's child will remain what it was,
>>> which may still be reasonable.  But in the second example, the target
>>> will just not become a child of the source, which is precisely what was
>>> requested with @replaces.)
>>>
>>> So prevent such configurations, both before the job, and before it
>>> actually completes.
>>>
>>> Signed-off-by: Max Reitz 
>>> ---
>>>block.c   | 30 
>>>block/mirror.c| 19 +++-
>>>blockdev.c| 48 ++-
>>>include/block/block_int.h |  3 +++
>>>4 files changed, 98 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 0159f8e510..e3922a0474 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -6259,6 +6259,36 @@ out:
>>>return to_replace_bs;
>>>}
>>>
>>> +/*
>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>> + * least @min_level edges between them.
>>> + *
>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>> + * @min_level == 1, @child needs to be at least a real child; for
>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>> + */
>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>> +  int min_level)
>>> +{
>>> +BdrvChild *c;
>>> +
>>> +if (child == parent && min_level <= 0) {
>>> +return true;
>>> +}
>>> +
>>> +if (!parent) {
>>> +return false;
>>> +}
>>> +
>>> +QLIST_FOREACH(c, &parent->children, next) {
>>> +if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>> +return true;
>>> +}
>>> +}
>>> +
>>> +return false;
>>> +}
>>> +
>>>/**
>>> * Iterates through the list of runtime option keys that are said to
>>> * be "strong" for a BDS.  An option is called "strong" if it changes
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 68a4404666..b258c7e98b 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>> * there.
>>> */
>>>if (bdrv_recurse_can_replace(src, to_replace)) {
>>> -bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +/*
>>> + * It is OK for @to_replace to be an immediate child of
>>> + * @target_bs, because that is what happens with
>>> + * drive-mirror sync=none mode=absolute-paths: target_bs's
>>> + * backing file will be the source node, which is also
>>> + * to_replace (by default).
>>> + * bdrv_replace_node() handles this case by not letting
>>> + * target_bs->backing point to itself, but to the source
>>> + * still.
>>> + */
>>> +if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>> +bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +} else {
>>> +error_setg(&local_err, "Can no longer replace '%s' by 
>>> '%s', "
>>> +   "because the former is now a child of the 
>>> latter, "
>>> +   "and doing so would thus create a loop",
>>> +   to_replace->node_name, target_bs->node_name);
>>> +}
>>
>> you may swap if and else branch, dropping "!" mark..
> 
> Yes, but I just personally prefer to have the error case in the else branch.
> 
>>>} else {
>>>error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>   "because it can no longer be guaranteed that 
>>> doing so "
>>> diff --git a/blockdev.c b/blockdev.c
>>> index 9dc2238bf3..d29f147f72 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char 
>>> *job_id, BlockDriverState *bs,
>>>}
>>>
>>>if (has_replaces) {
>>> -BlockDriverState *to_replace_bs;
>>> +BlockDriverState *to_replace_bs, *target_backing_bs;
>>>AioContext *repl
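
The @min_level semantics of bdrv_is_child_of() quoted above can be tried out with a small stand-alone model. The Node class is a stand-in for BlockDriverState with a plain list of children; nothing here is QEMU API.

```python
class Node:
    """Minimal stand-in for a block node with a list of child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def is_child_of(child, parent, min_level):
    """True iff child is reachable from parent via at least min_level edges."""
    if child is parent and min_level <= 0:
        return True
    if parent is None:
        return False
    # descend one edge; each step lowers the remaining required depth
    return any(is_child_of(child, c, min_level - 1) for c in parent.children)
```

With a chain top -> mid -> base, `is_child_of(mid, top, 2)` is false while `is_child_of(base, top, 2)` is true, which matches the mirror.c use of min_level 2 to tolerate an immediate child but reject anything deeper.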

Re: [PATCH] virtio: don't enable notifications during polling

2019-12-13 Thread Stefan Hajnoczi
On Mon, Dec 09, 2019 at 09:09:57PM +, Stefan Hajnoczi wrote:
> Virtqueue notifications are not necessary during polling, so we disable
> them.  This allows the guest driver to avoid MMIO vmexits.
> Unfortunately the virtio-blk and virtio-scsi handler functions re-enable
> notifications, defeating this optimization.
> 
> Fix virtio-blk and virtio-scsi emulation so they leave notifications
> disabled.  The key thing to remember for correctness is that polling
> always checks one last time after ending its loop, therefore it's safe
> to lose the race when re-enabling notifications at the end of polling.
> 
> There is a measurable performance improvement of 5-10% with the null-co
> block driver.  Real-life storage configurations will see a smaller
> improvement because the MMIO vmexit overhead contributes less to
> latency.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/block/virtio-blk.c  |  9 +++--
>  hw/scsi/virtio-scsi.c  |  9 +++--
>  hw/virtio/virtio.c | 12 ++--
>  include/hw/virtio/virtio.h |  1 +
>  4 files changed, 21 insertions(+), 10 deletions(-)

Post-release ping :)

> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 4c357d2928..c4e55fb3de 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -764,13 +764,16 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
>  {
>  VirtIOBlockReq *req;
>  MultiReqBuffer mrb = {};
> +bool suppress_notifications = virtio_queue_get_notification(vq);
>  bool progress = false;
>  
>  aio_context_acquire(blk_get_aio_context(s->blk));
>  blk_io_plug(s->blk);
>  
>  do {
> -virtio_queue_set_notification(vq, 0);
> +if (suppress_notifications) {
> +virtio_queue_set_notification(vq, 0);
> +}
>  
>  while ((req = virtio_blk_get_request(s, vq))) {
>  progress = true;
> @@ -781,7 +784,9 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
>  }
>  }
>  
> -virtio_queue_set_notification(vq, 1);
> +if (suppress_notifications) {
> +virtio_queue_set_notification(vq, 1);
> +}
>  } while (!virtio_queue_empty(vq));
>  
>  if (mrb.num_reqs) {
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index e8b2b64d09..f080545f48 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -597,12 +597,15 @@ bool virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue 
> *vq)
>  {
>  VirtIOSCSIReq *req, *next;
>  int ret = 0;
> +bool suppress_notifications = virtio_queue_get_notification(vq);
>  bool progress = false;
>  
>  QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
>  
>  do {
> -virtio_queue_set_notification(vq, 0);
> +if (suppress_notifications) {
> +virtio_queue_set_notification(vq, 0);
> +}
>  
>  while ((req = virtio_scsi_pop_req(s, vq))) {
>  progress = true;
> @@ -622,7 +625,9 @@ bool virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue 
> *vq)
>  }
>  }
>  
> -virtio_queue_set_notification(vq, 1);
> +if (suppress_notifications) {
> +virtio_queue_set_notification(vq, 1);
> +}
>  } while (ret != -EINVAL && !virtio_queue_empty(vq));
>  
>  QTAILQ_FOREACH_SAFE(req, &reqs, next, next) {
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 762df12f4c..78e5852296 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -431,6 +431,11 @@ static void 
> virtio_queue_packed_set_notification(VirtQueue *vq, int enable)
>  }
>  }
>  
> +bool virtio_queue_get_notification(VirtQueue *vq)
> +{
> +return vq->notification;
> +}
> +
>  void virtio_queue_set_notification(VirtQueue *vq, int enable)
>  {
>  vq->notification = enable;
> @@ -3382,17 +3387,12 @@ static bool virtio_queue_host_notifier_aio_poll(void 
> *opaque)
>  {
>  EventNotifier *n = opaque;
>  VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
> -bool progress;
>  
>  if (!vq->vring.desc || virtio_queue_empty(vq)) {
>  return false;
>  }
>  
> -progress = virtio_queue_notify_aio_vq(vq);
> -
> -/* In case the handler function re-enabled notifications */
> -virtio_queue_set_notification(vq, 0);
> -return progress;
> +return virtio_queue_notify_aio_vq(vq);
>  }
>  
>  static void virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 3448d67d2a..8ee93873a4 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -224,6 +224,7 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
> version_id);
>  
>  void virtio_notify_config(VirtIODevice *vdev);
>  
> +bool virtio_queue_get_notification(VirtQueue *vq);
>  void virtio_queue_set_notification(VirtQueue *vq, int enable);
>  
>  int virtio_queue_ready(VirtQueue *vq);
> -- 
> 2.23.0
> 
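
The handler pattern in the patch (remember whether notifications were enabled on entry, and only toggle them in that case) can be modeled in a few lines. The class below is a toy queue, not the virtio API; `toggles` counts calls that would correspond to virtio_queue_set_notification().

```python
class VirtQueueModel:
    """Toy virtqueue: a list of pending requests plus a notification flag."""

    def __init__(self, requests, notification=True):
        self.requests = list(requests)
        self.notification = notification
        self.toggles = 0   # number of set_notification() calls

    def set_notification(self, enable):
        self.notification = bool(enable)
        self.toggles += 1

    def pop(self):
        return self.requests.pop(0) if self.requests else None

    def empty(self):
        return not self.requests


def handle_vq(vq):
    """Drain vq, touching the notification flag only if it was set on entry."""
    suppress_notifications = vq.notification
    progress = False
    while True:
        if suppress_notifications:
            vq.set_notification(False)
        while vq.pop() is not None:
            progress = True
        if suppress_notifications:
            vq.set_notification(True)
        if vq.empty():
            break
    return progress
```

When the queue is entered with notifications already disabled (the polling case), the handler never re-enables them, which is the behaviour the patch is after.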



Re: [PATCH for-5.0 v2 18/23] iotests: Add VM.assert_block_path()

2019-12-13 Thread Vladimir Sementsov-Ogievskiy
09.12.2019 18:10, Max Reitz wrote:
> On 03.12.19 13:59, Vladimir Sementsov-Ogievskiy wrote:
>> 11.11.2019 19:02, Max Reitz wrote:
>>> Signed-off-by: Max Reitz 
>>> ---
>>>tests/qemu-iotests/iotests.py | 59 +++
>>>1 file changed, 59 insertions(+)
>>>
>>> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
>>> index d34305ce69..3e03320ce3 100644
>>> --- a/tests/qemu-iotests/iotests.py
>>> +++ b/tests/qemu-iotests/iotests.py
>>> @@ -681,6 +681,65 @@ class VM(qtest.QEMUQtestMachine):
>>>
>>>return fields.items() <= ret.items()
>>>
>>> +"""
>>> +Check whether the node under the given path in the block graph is
>>> +@expected_node.
>>> +
>>> +@root is the node name of the node where the @path is rooted.
>>> +
>>> +@path is a string that consists of child names separated by
>>> +slashes.  It must begin with a slash.
>>
>> Why do you need this slash?
> 
> I don’t.  It just looked better to me.
> 
> (One reason would be so it could be empty to refer to @root, but as I
> said that isn’t very useful.)
> 
>> To stress that we are starting from root?
>> But root is not global, it's selected by previous argument, so for me the
>> path is more like relative than absolute..
>>
>>> +
>>> +Examples for @root + @path:
>>> +  - root="qcow2-node", path="/backing/file"
>>> +  - root="quorum-node", path="/children.2/file"
>>> +
>>> +Hypothetically, @path could be empty, in which case it would point
>>> +to @root.  However, in practice this case is not useful and hence
>>> +not allowed.
>>
>> 1. path can't be empty since, according to the previous point, it must start with
>> '/'
> 
> Hence “hypothetically”.
> 
>> 2. path can be '/', which does exactly what you don't allow, and I don't see,
>> where it is restricted in code
> 
> No, it doesn’t.  That refers to a child of @root with an empty name.

Hmm, yes, OK.

> 
>>> +
>>> +@expected_node may be None.
>>
>> Which means that we assert the existence of the path except its last element,
>> yes? Worth mentioning this behavior here.
> 
> “(All elements of the path but the leaf must still exist.)”?  OK.

OK

> 
>>> +
>>> +@graph may be None or the result of an x-debug-query-block-graph
>>> +call that has already been performed.
>>> +"""
>>> +def assert_block_path(self, root, path, expected_node, graph=None):
>>> +if graph is None:
>>> +graph = self.qmp('x-debug-query-block-graph')['return']
>>> +
>>> +iter_path = iter(path.split('/'))
>>> +
>>> +# Must start with a /
>>> +assert next(iter_path) == ''
>>> +
>>> +node = next((node for node in graph['nodes'] if node['name'] == 
>>> root),
>>> +None)
>>> +
>>> +for path_node in iter_path:
>>> +assert node is not None, 'Cannot follow path %s' % path
>>> +
>>> +try:
>>> +node_id = next(edge['child'] for edge in graph['edges'] \
>>> + if edge['parent'] == 
>>> node['id'] and
>>> +edge['name'] == path_node)
>>> +
>>> +node = next(node for node in graph['nodes'] \
>>> + if node['id'] == node_id)
>>
>> this line can't fail. If it fails, it means there is a bug in
>> x-debug-query-block-graph, so
>> I'd prefer to move it out of the try:except block.
> 
> But that makes the code uglier, in my opinion.  We’d then have to set
> node_id to e.g. None in the except branch (or rather just abolish the
> try-except then) and check whether it’s None before assigning node.
> Like this:
> 
> node_id = next(..., None)
> 
> if node_id is not None:
>  node = next(...)
> else:
>  node = None
> 
> I prefer the current try-except construct over that.

OK

> 
>>> +except StopIteration:
>>> +node = None
>>> +
>>> +assert node is not None or expected_node is None, \
>>> +   'No node found under %s (but expected %s)' % \
>>> +   (path, expected_node)
>>> +
>>> +assert expected_node is not None or node is None, \
>>> +   'Found node %s under %s (but expected none)' % \
>>> +   (node['name'], path)
>>> +
>>> +if node is not None and expected_node is not None:
>>
>> [1]
>> second part of condition already asserted by previous assertion
> 
> Yes, but I wanted to cover all four cases explicitly.  (In the usual 00,
> 01, 10, 11 manner.  Well, except it’s 10, 01, 11, 00.)
> 
>>> +assert node['name'] == expected_node, \
>>> +   'Found node %s under %s (but expected %s)' % \
>>> +   (node['name'], path, expected_node)
>>
>> IMHO, it would be easier to read like:
>>
>> if node is None:
>> assert  expected_node is None, \
>>'No node found under %s (but expected %s)' % \
>>(path, e

Re: [PATCH for-5.0 v2 18/23] iotests: Add VM.assert_block_path()

2019-12-13 Thread Vladimir Sementsov-Ogievskiy
11.11.2019 19:02, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>   tests/qemu-iotests/iotests.py | 59 +++
>   1 file changed, 59 insertions(+)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index d34305ce69..3e03320ce3 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -681,6 +681,65 @@ class VM(qtest.QEMUQtestMachine):
>   
>   return fields.items() <= ret.items()
>   
> +"""
> +Check whether the node under the given path in the block graph is
> +@expected_node.
> +
> +@root is the node name of the node where the @path is rooted.
> +
> +@path is a string that consists of child names separated by
> +slashes.  It must begin with a slash.
> +
> +Examples for @root + @path:
> +  - root="qcow2-node", path="/backing/file"
> +  - root="quorum-node", path="/children.2/file"
> +
> +Hypothetically, @path could be empty, in which case it would point
> +to @root.  However, in practice this case is not useful and hence
> +not allowed.
> +
> +@expected_node may be None.
> +
> +@graph may be None or the result of an x-debug-query-block-graph
> +call that has already been performed.
> +"""
> +def assert_block_path(self, root, path, expected_node, graph=None):
> +if graph is None:
> +graph = self.qmp('x-debug-query-block-graph')['return']
> +
> +iter_path = iter(path.split('/'))
> +
> +# Must start with a /
> +assert next(iter_path) == ''
> +
> +node = next((node for node in graph['nodes'] if node['name'] == 
> root),
> +None)
> +
> +for path_node in iter_path:

I'd rename path_node to child or edge, to not interfere with block nodes here.

> +assert node is not None, 'Cannot follow path %s' % path
> +
> +try:
> +node_id = next(edge['child'] for edge in graph['edges'] \
> + if edge['parent'] == node['id'] 
> and
> +edge['name'] == path_node)
> +
> +node = next(node for node in graph['nodes'] \
> + if node['id'] == node_id)
> +except StopIteration:
> +node = None
> +
> +assert node is not None or expected_node is None, \
> +   'No node found under %s (but expected %s)' % \
> +   (path, expected_node)
> +
> +assert expected_node is not None or node is None, \
> +   'Found node %s under %s (but expected none)' % \
> +   (node['name'], path)
> +
> +if node is not None and expected_node is not None:
> +assert node['name'] == expected_node, \
> +   'Found node %s under %s (but expected %s)' % \
> +   (node['name'], path, expected_node)
>   
>   index_re = re.compile(r'([^\[]+)\[([^\]]+)\]')
>   
> 


-- 
Best regards,
Vladimir
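
For experimenting with the path walk outside of the test harness, here is a stand-alone sketch of the lookup part of assert_block_path(). It assumes the graph layout used in the quoted code (a 'nodes' list with 'id' and 'name', an 'edges' list with 'parent', 'child' and 'name'); the function name and the use of exceptions instead of assertions are my own choices, and the loop variable is called child_name as suggested above.

```python
def find_node_by_path(graph, root, path):
    """Return the name of the node under root + path, or None if the leaf
    does not exist. Raises LookupError if an inner element is missing."""
    parts = path.split('/')
    if parts[0] != '':
        raise ValueError('path must begin with a slash: %r' % path)

    node = next((n for n in graph['nodes'] if n['name'] == root), None)

    for child_name in parts[1:]:
        if node is None:
            raise LookupError('Cannot follow path %s' % path)
        node_id = next((e['child'] for e in graph['edges']
                        if e['parent'] == node['id']
                        and e['name'] == child_name),
                       None)
        node = next((n for n in graph['nodes'] if n['id'] == node_id), None)

    return None if node is None else node['name']
```

As in the original, only the last path element may be absent; a missing intermediate element is an error.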



Re: [PATCH] virtio-fs: fix MSI-X nvectors calculation

2019-12-13 Thread Stefan Hajnoczi
On Mon, Dec 09, 2019 at 11:07:59AM +, Stefan Hajnoczi wrote:
> The following MSI-X vectors are required:
>  * VIRTIO Configuration Change
>  * hiprio virtqueue
>  * requests virtqueues
> 
> Fix the calculation to reserve enough MSI-X vectors.  Otherwise guest
> drivers fall back to a sub-optimal configuration where all virtqueues
> share a single vector.
> 
> This change does not break live migration compatibility since
> vhost-user-fs-pci devices are not migratable yet.
> 
> Reported-by: Vivek Goyal 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/virtio/vhost-user-fs-pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan




Re: [PATCH for-5.0 v2 19/23] iotests: Resolve TODOs in 041

2019-12-13 Thread Vladimir Sementsov-Ogievskiy
09.12.2019 18:15, Max Reitz wrote:
> On 03.12.19 14:33, Vladimir Sementsov-Ogievskiy wrote:
>> 03.12.2019 16:32, Vladimir Sementsov-Ogievskiy wrote:
>>> 11.11.2019 19:02, Max Reitz wrote:
>>>> Signed-off-by: Max Reitz
>>>
>>>
>>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>>>
>>
>>
>> Oops, stop. Why do you remove line "self.vm.shutdown()" ?
> 
> Because we don’t need it.  tearDown() does it anyway.  I suppose I
> should mention it in the commit message.
> 

Yes...

But actually, it would be better to remove the extra shutdown from all test
cases, not just one, and then it would be a separate patch.

Extra shutdown is left in (considering only class TestRepairQuorum):
test_pause
test_cancel_after_ready
test_cancel
test_complete


-- 
Best regards,
Vladimir



Re: [PATCH 0/2] RFC: add -mem-shared option

2019-12-13 Thread Stefan Hajnoczi
On Fri, Nov 29, 2019 at 10:23:25AM +0100, Igor Mammedov wrote:
> On Thu, 28 Nov 2019 16:59:33 +
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Marc-André Lureau (marcandre.lur...@redhat.com) wrote:
> > > Hi,
> > > 
> > > Setting up shared memory for vhost-user is a bit complicated from
> > > command line, as it requires NUMA setup such as: -m 4G -object
> > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > node,memdev=mem.
> > > 
> > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > that will make the -mem-path or anonymous memory shareable.
> > > 
> > > Comments welcome,  
> > 
> > It's worth checking with Igor (cc'd) - he said he was going to work on
> > something similar.
> > 
> > One other thing this fixes is that it lets you potentially do vhost-user
> > on s390, since it currently has no NUMA.
> Switching to memdev will let vhost-user on s390 work as well.
> This is a convenience option that works around the inability to set main RAM
> properties in the current implementation.

Gong Su asked about virtio-fs (vhost-user) on s390.  This patch series
might be the first step to enabling it.

Stefan




Re: [PATCH v9 03/13] accel: collecting JIT statistics

2019-12-13 Thread Alex Bennée


Richard Henderson  writes:

> On 10/7/19 11:28 AM, Alex Bennée wrote:
>> @@ -1795,6 +1799,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>  if (flag & TB_EXEC_STATS) {
>>  tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
>>  }
>> +
>> +if (flag & TB_JIT_STATS) {
>> +tb->tb_stats->stats_enabled |= TB_JIT_STATS;
>> +}
>
> So, assuming that you really meant this, and not replacing as I was guessing 
> vs
> patch 2, then this is
>
> tb->tb_stats->stats_enabled |=
> flag & (TB_EXEC_STATS | TB_JIT_STATS);
>
> But I still think it's weird to be wanting to modify the shared structure.
> What does that mean for concurrent code generation?

The idea was to have per translation area granularity on collecting the
stats so we didn't have to burden all areas with the overhead. Currently
this only takes effect when qemu_log_in_addr_range is in effect. However
as the run goes on we could make decisions to disable some or all stats
for stuff that doesn't come up that frequently.

However the current positioning doesn't work as we keep setting the flag
so I think we need to apply get_default_tbstats_flag() inside
tb_get_stats only when we first create the data block.

>
>> +/*
>> + * Collect JIT stats when enabled. We batch them all up here to
>> + * avoid spamming the cache with atomic accesses
>> + */
>> +if (tb_stats_enabled(tb, TB_JIT_STATS)) {
>> +TBStatistics *ts = tb->tb_stats;
>> +qemu_mutex_lock(&ts->jit_stats_lock);
>> +
>> +ts->code.num_guest_inst += prof->translation.nb_guest_insns;
>> +ts->code.num_tcg_ops += prof->translation.nb_ops_pre_opt;
>> +ts->code.num_tcg_ops_opt += tcg_ctx->nb_ops;
>> +ts->code.spills += prof->translation.nb_spills;
>> +ts->code.out_len += tb->tc.size;
>> +
>> +ts->translations.total++;
>> +if (phys_page2 != -1) {
>> +ts->translations.spanning++;
>> +}
>> +
>> +g_ptr_array_add(ts->tbs, tb);
>> +
>> +qemu_mutex_unlock(&ts->jit_stats_lock);
>> +}
>
> Hmm.  So we're to interpret all of code.field as sums, the average of which can
> be obtained by dividing by translations.total.  Ok, but that should probably be
> mentioned in a comment in/near the structure definition.

OK
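
For reference, the sums-versus-averages reading can be sketched as below. The helper name is hypothetical and only illustrates the division described above, with a guard for blocks that have not been translated yet.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: each code.* field is a running sum over all
 * translations of a block, so the per-translation average is the sum
 * divided by translations.total. Returns 0.0 when nothing has been
 * translated yet, to avoid dividing by zero.
 */
static double per_translation_avg(uint64_t sum, uint64_t total)
{
    if (total == 0) {
        return 0.0;
    }
    return (double)sum / (double)total;
}
```

For example, a block with code.num_guest_inst == 30 after 3 translations averages 10 guest instructions per translation.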

> What are we planning to do with the set of all tb's collected here?

Originally we kept track for the coverset calculation as we need to know
where each individual TB goes next. The code was racy, so I dropped it
from the series. Tracking this now is possibly redundant, although it
might be useful in the future.

>
>> @@ -3125,6 +3126,7 @@ static void temp_sync
>>  case TEMP_VAL_REG:
>>  tcg_out_st(s, ts->type, ts->reg,
>> ts->mem_base->reg, ts->mem_offset);
>> +s->prof.translation.nb_spills++;
>>  break;
>>  
>>  case TEMP_VAL_MEM:
>
> This is not a spill in the common compiler definition.
>
> This is "write the temp to its backing storage".  While this does happen in the
> course of spilling, the vast majority of these are because we've finished
> modifying a global temp and must now update memory.  Which is not nearly the
> same thing as "spill".
>
> A spill in the compiler definition happens in tcg_reg_alloc, right after the
> comment "We must spill something".  ;-)

OK I'll fix that.

-- 
Alex Bennée



Re: [PATCH] hw/i386: De-duplicate gsi_handler() to remove kvm_pc_gsi_handler()

2019-12-13 Thread Paolo Bonzini
On 13/12/19 12:07, Philippe Mathieu-Daudé wrote:
> Both gsi_handler() and kvm_pc_gsi_handler() have the same content,
> except one comment. Move the comment, and de-duplicate the code.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/sysemu/kvm.h |  1 -
>  hw/i386/kvm/ioapic.c | 12 
>  hw/i386/pc.c |  5 ++---
>  3 files changed, 2 insertions(+), 16 deletions(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 9fe233b9bf..f5d0d0d710 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -515,7 +515,6 @@ int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n,
>  int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n,
>qemu_irq irq);
>  void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi);
> -void kvm_pc_gsi_handler(void *opaque, int n, int level);
>  void kvm_pc_setup_irq_routing(bool pci_enabled);
>  void kvm_init_irq_routing(KVMState *s);
>  
> diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c
> index f94729c565..bae7413a39 100644
> --- a/hw/i386/kvm/ioapic.c
> +++ b/hw/i386/kvm/ioapic.c
> @@ -48,18 +48,6 @@ void kvm_pc_setup_irq_routing(bool pci_enabled)
>  }
>  }
>  
> -void kvm_pc_gsi_handler(void *opaque, int n, int level)
> -{
> -GSIState *s = opaque;
> -
> -if (n < ISA_NUM_IRQS) {
> -/* Kernel will forward to both PIC and IOAPIC */
> -qemu_set_irq(s->i8259_irq[n], level);
> -} else {
> -qemu_set_irq(s->ioapic_irq[n], level);
> -}
> -}
> -
>  typedef struct KVMIOAPICState KVMIOAPICState;
>  
>  struct KVMIOAPICState {
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index ac08e63604..97e9049b71 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -350,6 +350,7 @@ void gsi_handler(void *opaque, int n, int level)
>  
>  DPRINTF("pc: %s GSI %d\n", level ? "raising" : "lowering", n);
>  if (n < ISA_NUM_IRQS) {
> +/* Under KVM, Kernel will forward to both PIC and IOAPIC */
>  qemu_set_irq(s->i8259_irq[n], level);
>  }
>  qemu_set_irq(s->ioapic_irq[n], level);
> @@ -362,10 +363,8 @@ GSIState *pc_gsi_create(qemu_irq **irqs, bool pci_enabled)
>  s = g_new0(GSIState, 1);
>  if (kvm_ioapic_in_kernel()) {
>  kvm_pc_setup_irq_routing(pci_enabled);
> -*irqs = qemu_allocate_irqs(kvm_pc_gsi_handler, s, GSI_NUM_PINS);
> -} else {
> -*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
>  }
> +*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
>  
>  return s;
>  }
> 

Queued, thanks.

Paolo




Re: [PATCH] hw/i386/pc: Simplify ioapic_init_gsi()

2019-12-13 Thread Paolo Bonzini
On 13/12/19 12:11, Philippe Mathieu-Daudé wrote:
> All callers of ioapic_init_gsi() provide a parent. We want new
> uses to follow the same good practice and provide the parent
> name, so do not make this optional: assert the parent name is
> provided, and simplify the code.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/i386/pc.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index ac08e63604..234945d328 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1488,15 +1488,14 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
>  SysBusDevice *d;
>  unsigned int i;
>  
> +assert(parent_name);
>  if (kvm_ioapic_in_kernel()) {
>  dev = qdev_create(NULL, TYPE_KVM_IOAPIC);
>  } else {
>  dev = qdev_create(NULL, TYPE_IOAPIC);
>  }
> -if (parent_name) {
> -object_property_add_child(object_resolve_path(parent_name, NULL),
> -  "ioapic", OBJECT(dev), NULL);
> -}
> +object_property_add_child(object_resolve_path(parent_name, NULL),
> +  "ioapic", OBJECT(dev), NULL);
>  qdev_init_nofail(dev);
>  d = SYS_BUS_DEVICE(dev);
>  sysbus_mmio_map(d, 0, IO_APIC_DEFAULT_ADDRESS);
> 

Queued, thanks.

Paolo




[PULL 1/2] vhost-user-fs: remove "vhostfd" property

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: Marc-André Lureau 

The property doesn't make much sense for a vhost-user device.

Signed-off-by: Marc-André Lureau 
Message-Id: <20191116112016.14872-1-marcandre.lur...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost-user-fs.c | 1 -
 include/hw/virtio/vhost-user-fs.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
index f0df7f4746..ca0b7fc9de 100644
--- a/hw/virtio/vhost-user-fs.c
+++ b/hw/virtio/vhost-user-fs.c
@@ -263,7 +263,6 @@ static Property vuf_properties[] = {
 DEFINE_PROP_UINT16("num-request-queues", VHostUserFS,
conf.num_request_queues, 1),
 DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
-DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/vhost-user-fs.h b/include/hw/virtio/vhost-user-fs.h
index 539885b458..9ff1bdb7cf 100644
--- a/include/hw/virtio/vhost-user-fs.h
+++ b/include/hw/virtio/vhost-user-fs.h
@@ -28,7 +28,6 @@ typedef struct {
 char *tag;
 uint16_t num_request_queues;
 uint16_t queue_size;
-char *vhostfd;
 } VHostUserFSConf;
 
 typedef struct {
-- 
2.23.0




[PULL 0/2] virtiofs queue

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

The following changes since commit b0ca999a43a22b38158a33d3f5881648bb4f:

  Update version for v4.2.0 release (2019-12-12 16:45:57 +)

are available in the Git repository at:

  git://github.com/dagrh/qemu.git tags/pull-virtiofs-20191213a

for you to fetch changes up to 366844f3d1329c6423dd752891a28ccb3ee8fddd:

  virtio-fs: fix MSI-X nvectors calculation (2019-12-13 10:53:57 +)


virtiofs pull 2019-12-13: Minor fixes and cleanups

Cleanup from Marc-André and MSI-X fix from Stefan.


Marc-André Lureau (1):
  vhost-user-fs: remove "vhostfd" property

Stefan Hajnoczi (1):
  virtio-fs: fix MSI-X nvectors calculation

 hw/virtio/vhost-user-fs-pci.c | 3 ++-
 hw/virtio/vhost-user-fs.c | 1 -
 include/hw/virtio/vhost-user-fs.h | 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)




[PATCH 01/13] ppc: Drop useless extern annotation for functions

2019-12-13 Thread Greg Kurz
Signed-off-by: Greg Kurz 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/hw/ppc/pnv_xscom.h |   22 +++---
 include/hw/ppc/spapr_vio.h |6 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index 09188d74b06b..2bdb7ae84fd3 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -113,16 +113,16 @@ typedef struct PnvXScomInterfaceClass {
 #define PNV10_XSCOM_PSIHB_BASE 0x3011D00
 #define PNV10_XSCOM_PSIHB_SIZE 0x100
 
-extern void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
-extern int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
-
-extern void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
-MemoryRegion *mr);
-extern void pnv_xscom_region_init(MemoryRegion *mr,
-  struct Object *owner,
-  const MemoryRegionOps *ops,
-  void *opaque,
-  const char *name,
-  uint64_t size);
+void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
+int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
+
+void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
+ MemoryRegion *mr);
+void pnv_xscom_region_init(MemoryRegion *mr,
+   struct Object *owner,
+   const MemoryRegionOps *ops,
+   void *opaque,
+   const char *name,
+   uint64_t size);
 
 #endif /* PPC_PNV_XSCOM_H */
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index 72762ed16b70..ce6d9b0c6605 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -80,10 +80,10 @@ struct SpaprVioBus {
 uint32_t next_reg;
 };
 
-extern SpaprVioBus *spapr_vio_bus_init(void);
-extern SpaprVioDevice *spapr_vio_find_by_reg(SpaprVioBus *bus, uint32_t reg);
+SpaprVioBus *spapr_vio_bus_init(void);
+SpaprVioDevice *spapr_vio_find_by_reg(SpaprVioBus *bus, uint32_t reg);
 void spapr_dt_vdevice(SpaprVioBus *bus, void *fdt);
-extern gchar *spapr_vio_stdout_path(SpaprVioBus *bus);
+gchar *spapr_vio_stdout_path(SpaprVioBus *bus);
 
 static inline void spapr_vio_irq_pulse(SpaprVioDevice *dev)
 {




[PULL 2/2] virtio-fs: fix MSI-X nvectors calculation

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: Stefan Hajnoczi 

The following MSI-X vectors are required:
 * VIRTIO Configuration Change
 * hiprio virtqueue
 * requests virtqueues

Fix the calculation to reserve enough MSI-X vectors.  Otherwise guest
drivers fall back to a sub-optimal configuration where all virtqueues
share a single vector.

This change does not break live migration compatibility since
vhost-user-fs-pci devices are not migratable yet.

Reported-by: Vivek Goyal 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <20191209110759.35227-1-stefa...@redhat.com>
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost-user-fs-pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user-fs-pci.c b/hw/virtio/vhost-user-fs-pci.c
index 933a3f265b..e3a649d4a6 100644
--- a/hw/virtio/vhost-user-fs-pci.c
+++ b/hw/virtio/vhost-user-fs-pci.c
@@ -40,7 +40,8 @@ static void vhost_user_fs_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
 DeviceState *vdev = DEVICE(&dev->vdev);
 
 if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
-vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 1;
+/* Also reserve config change and hiprio queue vectors */
+vpci_dev->nvectors = dev->vdev.conf.num_request_queues + 2;
 }
 
 qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
-- 
2.23.0
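
The vector accounting from the commit message above can be sketched as a small standalone function. The function name is hypothetical; the arithmetic follows the list in the commit message (one vector for configuration changes, one for the hiprio virtqueue, one per request virtqueue).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the fixed default: one MSI-X vector
 * per request virtqueue, plus one for VIRTIO configuration changes
 * and one for the hiprio virtqueue.
 */
static uint32_t vuf_default_nvectors(uint16_t num_request_queues)
{
    return (uint32_t)num_request_queues + 2u;
}
```

With the default of a single request queue this yields 3 vectors, where the old "+ 1" calculation reserved only 2.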




[PATCH 00/13] ppc/pnv: Get rid of chip_type attributes

2019-12-13 Thread Greg Kurz
The PnvChipClass type has a chip_type attribute which identifies various
POWER CPU chip types that can be used in a powernv machine.

typedef enum PnvChipType {
PNV_CHIP_POWER8E, /* AKA Murano (default) */
PNV_CHIP_POWER8,  /* AKA Venice */
PNV_CHIP_POWER8NVL,   /* AKA Naples */
PNV_CHIP_POWER9,  /* AKA Nimbus */
PNV_CHIP_POWER10, /* AKA TBD */
} PnvChipType;

This attribute is used in many places where we want a different behaviour
depending on the CPU type, either directly like:

switch (PNV_CHIP_GET_CLASS(chip)->chip_type) {
case PNV_CHIP_POWER8E:
case PNV_CHIP_POWER8:
case PNV_CHIP_POWER8NVL:
return ((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf);
case PNV_CHIP_POWER9:
case PNV_CHIP_POWER10:
return addr >> 3;
default:
g_assert_not_reached();
}

or through various helpers that rely on it:

/* Each core has an XSCOM MMIO region */
if (pnv_chip_is_power10(chip)) {
xscom_core_base = PNV10_XSCOM_EC_BASE(core_hwid);
} else if (pnv_chip_is_power9(chip)) {
xscom_core_base = PNV9_XSCOM_EC_BASE(core_hwid);
} else {
xscom_core_base = PNV_XSCOM_EX_BASE(core_hwid);
}

The chip_type is also duplicated in the PnvPsiClass type.

It looks a bit unfortunate to implement manually something that falls into
the scope of QOM. Especially since we don't seem to need a finer grain than
the CPU family, i.e. POWER8, POWER9, POWER10, ..., and we already have
specialized versions of PnvChipClass and PnvPsiClass for these.

This series basically QOM-ifies all the places where we check on the chip
type, and gets rid of the chip_type attributes and the is_powerXX() helpers.
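
As a rough illustration of the idea, outside of QOM: the switch on chip_type becomes a per-class method. This is a simplified sketch in plain C, not the code from the series. Chip, ChipClass and chip_xscom_pcba() are hypothetical stand-ins, and the address masking done before the computation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for a QOM class and instance. */
typedef struct Chip Chip;

typedef struct ChipClass {
    const char *name;
    /* Per-family method, replacing a switch on a chip_type enum. */
    uint32_t (*xscom_pcba)(const Chip *chip, uint64_t addr);
} ChipClass;

struct Chip {
    const ChipClass *klass;
};

/* POWER8 family: same expression as the POWER8 cases of the switch. */
static uint32_t power8_xscom_pcba(const Chip *chip, uint64_t addr)
{
    (void)chip;
    return (uint32_t)(((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf));
}

/* POWER9 and later: same expression as the POWER9/POWER10 cases. */
static uint32_t power9_xscom_pcba(const Chip *chip, uint64_t addr)
{
    (void)chip;
    return (uint32_t)(addr >> 3);
}

static const ChipClass power8_class = { "power8", power8_xscom_pcba };
static const ChipClass power9_class = { "power9", power9_xscom_pcba };

/* Callers no longer need to know which family they are dealing with. */
static uint32_t chip_xscom_pcba(const Chip *chip, uint64_t addr)
{
    assert(chip != NULL && chip->klass != NULL);
    assert(chip->klass->xscom_pcba != NULL);
    return chip->klass->xscom_pcba(chip, addr);
}
```

Adding a new chip family then means providing one more class with its own method, rather than touching every switch and helper that inspects the chip type.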

Patch 1 was recently posted to the list but it isn't available in David's
ppc-for-5.0 tree yet, so I include it in this series for convenience.

--
Greg

---

Greg Kurz (13):
  ppc: Drop useless extern annotation for functions
  ppc/pnv: Introduce PnvPsiClass::compat
  ppc/pnv: Drop PnvPsiClass::chip_type
  ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat
  ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()
  ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers
  ppc/pnv: Introduce PnvChipClass::intc_print_info() method
  ppc/pnv: Introduce PnvChipClass::xscom_core_base() method
  ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()
  ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()
  ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers
  ppc/pnv: Introduce PnvChipClass::xscom_pcba() method
  ppc/pnv: Drop PnvChipClass::type


 hw/ppc/pnv.c   |  150 +---
 hw/ppc/pnv_psi.c   |   28 +++-
 hw/ppc/pnv_xscom.c |   48 ++
 include/hw/ppc/pnv.h   |   53 ++--
 include/hw/ppc/pnv_psi.h   |3 +
 include/hw/ppc/pnv_xscom.h |   24 ---
 include/hw/ppc/spapr_vio.h |6 +-
 7 files changed, 169 insertions(+), 143 deletions(-)




[PATCH 02/13] ppc/pnv: Introduce PnvPsiClass::compat

2019-12-13 Thread Greg Kurz
The Processor Service Interface (PSI) model has a chip_type class level
attribute, which is used to generate the content of the "compatible" DT
property according to the CPU type.

Since the PSI model already has specialized classes for each supported
CPU type, it seems cleaner to achieve this with QOM. Provide the content
of the "compatible" property with a new class level attribute.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv_psi.c |   25 +++--
 include/hw/ppc/pnv_psi.h |2 ++
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 572924388b3c..98a82b25e01f 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -536,10 +536,6 @@ static void pnv_psi_power8_realize(DeviceState *dev, Error **errp)
 qemu_register_reset(pnv_psi_reset, dev);
 }
 
-static const char compat_p8[] = "ibm,power8-psihb-x\0ibm,psihb-x";
-static const char compat_p9[] = "ibm,power9-psihb-x\0ibm,psihb-x";
-static const char compat_p10[] = "ibm,power10-psihb-x\0ibm,psihb-x";
-
 static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 {
 PnvPsiClass *ppc = PNV_PSI_GET_CLASS(dev);
@@ -558,16 +554,8 @@ static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 _FDT(fdt_setprop(fdt, offset, "reg", reg, sizeof(reg)));
 _FDT(fdt_setprop_cell(fdt, offset, "#address-cells", 2));
 _FDT(fdt_setprop_cell(fdt, offset, "#size-cells", 1));
-if (ppc->chip_type == PNV_CHIP_POWER10) {
-_FDT(fdt_setprop(fdt, offset, "compatible", compat_p10,
- sizeof(compat_p10)));
-} else if (ppc->chip_type == PNV_CHIP_POWER9) {
-_FDT(fdt_setprop(fdt, offset, "compatible", compat_p9,
- sizeof(compat_p9)));
-} else {
-_FDT(fdt_setprop(fdt, offset, "compatible", compat_p8,
- sizeof(compat_p8)));
-}
+_FDT(fdt_setprop(fdt, offset, "compatible", ppc->compat,
+ ppc->compat_size));
 return 0;
 }
 
@@ -581,6 +569,7 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
+static const char compat[] = "ibm,power8-psihb-x\0ibm,psihb-x";
 
 dc->desc= "PowerNV PSI Controller POWER8";
 dc->realize = pnv_psi_power8_realize;
@@ -590,6 +579,8 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, void *data)
 ppc->xscom_size = PNV_XSCOM_PSIHB_SIZE;
 ppc->bar_mask   = PSIHB_BAR_MASK;
 ppc->irq_set= pnv_psi_power8_irq_set;
+ppc->compat = compat;
+ppc->compat_size = sizeof(compat);
 }
 
 static const TypeInfo pnv_psi_power8_info = {
@@ -888,6 +879,7 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
 XiveNotifierClass *xfc = XIVE_NOTIFIER_CLASS(klass);
+static const char compat[] = "ibm,power9-psihb-x\0ibm,psihb-x";
 
 dc->desc= "PowerNV PSI Controller POWER9";
 dc->realize = pnv_psi_power9_realize;
@@ -897,6 +889,8 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, void *data)
 ppc->xscom_size = PNV9_XSCOM_PSIHB_SIZE;
 ppc->bar_mask   = PSIHB9_BAR_MASK;
 ppc->irq_set= pnv_psi_power9_irq_set;
+ppc->compat = compat;
+ppc->compat_size = sizeof(compat);
 
 xfc->notify  = pnv_psi_notify;
 }
@@ -917,12 +911,15 @@ static void pnv_psi_power10_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
+static const char compat[] = "ibm,power10-psihb-x\0ibm,psihb-x";
 
 dc->desc= "PowerNV PSI Controller POWER10";
 
 ppc->chip_type  = PNV_CHIP_POWER10;
 ppc->xscom_pcba = PNV10_XSCOM_PSIHB_BASE;
 ppc->xscom_size = PNV10_XSCOM_PSIHB_SIZE;
+ppc->compat = compat;
+ppc->compat_size = sizeof(compat);
 }
 
 static const TypeInfo pnv_psi_power10_info = {
diff --git a/include/hw/ppc/pnv_psi.h b/include/hw/ppc/pnv_psi.h
index a044aab304ae..fc068c95e543 100644
--- a/include/hw/ppc/pnv_psi.h
+++ b/include/hw/ppc/pnv_psi.h
@@ -83,6 +83,8 @@ typedef struct PnvPsiClass {
 uint32_t xscom_pcba;
 uint32_t xscom_size;
 uint64_t bar_mask;
+const char *compat;
+int compat_size;
 
 void (*irq_set)(PnvPsi *psi, int, bool state);
 } PnvPsiClass;
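
A note on why the size travels with the pointer: a "compatible" value is a list of NUL-separated strings, so strlen() stops at the first entry, and sizeof only gives the full length while the array type is still visible. The sketch below illustrates this with hypothetical names (CompatList, compat_count()); only the string literal is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair mirroring the compat/compat_size class fields. */
typedef struct CompatList {
    const char *data;
    size_t size;
} CompatList;

static CompatList power8_psi_compat(void)
{
    static const char compat[] = "ibm,power8-psihb-x\0ibm,psihb-x";
    /*
     * sizeof(compat) counts every byte of the array, including the
     * embedded NUL and the terminating one. Applied to a plain
     * const char * it would give the size of the pointer instead.
     */
    CompatList list = { compat, sizeof(compat) };
    return list;
}

/*
 * Count the NUL-terminated entries; assumes the list is well formed,
 * i.e. the last byte within size is a NUL.
 */
static size_t compat_count(CompatList list)
{
    size_t n = 0;
    size_t off = 0;

    assert(list.data != NULL);
    while (off < list.size) {
        off += strlen(list.data + off) + 1;
        n++;
    }
    return n;
}
```

Here the list is 31 bytes long and holds two entries, while strlen() on the same pointer reports only the 18 characters of the first one.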




[PATCH 08/13] ppc/pnv: Introduce PnvChipClass::xscom_core_base() method

2019-12-13 Thread Greg Kurz
The pnv_chip_core_realize() function configures the XSCOM MMIO subregion
for each core of a single chip. The base address of the subregion depends
on the CPU type. Its computation is currently open-coded using the
pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce
a method for this in the base chip class and implement it in child classes.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |   31 ---
 include/hw/ppc/pnv.h |1 +
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 2a53e99bda2e..88efa755e611 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -616,6 +616,24 @@ static void pnv_chip_power9_pic_print_info(PnvChip *chip, Monitor *mon)
 pnv_psi_pic_print_info(&chip9->psi, mon);
 }
 
+static uint64_t pnv_chip_power8_xscom_core_base(PnvChip *chip,
+uint32_t core_id)
+{
+return PNV_XSCOM_EX_BASE(core_id);
+}
+
+static uint64_t pnv_chip_power9_xscom_core_base(PnvChip *chip,
+uint32_t core_id)
+{
+return PNV9_XSCOM_EC_BASE(core_id);
+}
+
+static uint64_t pnv_chip_power10_xscom_core_base(PnvChip *chip,
+ uint32_t core_id)
+{
+return PNV10_XSCOM_EC_BASE(core_id);
+}
+
 static bool pnv_match_cpu(const char *default_type, const char *cpu_type)
 {
 PowerPCCPUClass *ppc_default =
@@ -1107,6 +1125,7 @@ static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
+k->xscom_core_base = pnv_chip_power8_xscom_core_base;
 dc->desc = "PowerNV Chip POWER8E";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1129,6 +1148,7 @@ static void pnv_chip_power8_class_init(ObjectClass *klass, void *data)
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
+k->xscom_core_base = pnv_chip_power8_xscom_core_base;
 dc->desc = "PowerNV Chip POWER8";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1151,6 +1171,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
 k->isa_create = pnv_chip_power8nvl_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
+k->xscom_core_base = pnv_chip_power8_xscom_core_base;
 dc->desc = "PowerNV Chip POWER8NVL";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1323,6 +1344,7 @@ static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
 k->isa_create = pnv_chip_power9_isa_create;
 k->dt_populate = pnv_chip_power9_dt_populate;
 k->pic_print_info = pnv_chip_power9_pic_print_info;
+k->xscom_core_base = pnv_chip_power9_xscom_core_base;
 dc->desc = "PowerNV Chip POWER9";
 
 device_class_set_parent_realize(dc, pnv_chip_power9_realize,
@@ -1404,6 +1426,7 @@ static void pnv_chip_power10_class_init(ObjectClass *klass, void *data)
 k->isa_create = pnv_chip_power10_isa_create;
 k->dt_populate = pnv_chip_power10_dt_populate;
 k->pic_print_info = pnv_chip_power10_pic_print_info;
+k->xscom_core_base = pnv_chip_power10_xscom_core_base;
 dc->desc = "PowerNV Chip POWER10";
 
 device_class_set_parent_realize(dc, pnv_chip_power10_realize,
@@ -1491,13 +1514,7 @@ static void pnv_chip_core_realize(PnvChip *chip, Error **errp)
  &error_fatal);
 
 /* Each core has an XSCOM MMIO region */
-if (pnv_chip_is_power10(chip)) {
-xscom_core_base = PNV10_XSCOM_EC_BASE(core_hwid);
-} else if (pnv_chip_is_power9(chip)) {
-xscom_core_base = PNV9_XSCOM_EC_BASE(core_hwid);
-} else {
-xscom_core_base = PNV_XSCOM_EX_BASE(core_hwid);
-}
+xscom_core_base = pcc->xscom_core_base(chip, core_hwid);
 
 pnv_xscom_add_subregion(chip, xscom_core_base,
 &pnv_core->xscom_regs);
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 7d2402784d4b..17ca9a14ac8f 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -137,6 +137,7 @@ typedef struct PnvChipClass {
 ISABus *(*isa_create)(PnvChip *chip, Error **errp);
 void (*dt_populate)(PnvChip *chip, void *fdt);
 void (*pic_print_info)(PnvChip *chip, Monitor *mon);
+uint64_t (*xscom_core_base)(PnvChip *chip, uint32_t core_id);
 } PnvChipClass;
 
 #define PNV_CHIP_TYPE_SUFFIX "-" TYPE_PNV_CHIP




[PATCH 04/13] ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat

2019-12-13 Thread Greg Kurz
The pnv_dt_create() function generates different contents for the
"compatible" property of the root node in the DT, depending on the
CPU type. This is open coded with multiple ifs using pnv_is_powerXX()
helpers.

It seems cleaner to achieve this with QOM. Introduce a base class for the
powernv machine and a compat attribute that each child class can use
to provide the value for the "compatible" property.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |   33 +++--
 include/hw/ppc/pnv.h |   13 +
 2 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 0be0b6b411c3..5ac149b149d8 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -484,9 +484,7 @@ static void pnv_dt_power_mgt(void *fdt)
 
 static void *pnv_dt_create(MachineState *machine)
 {
-const char plat_compat8[] = "qemu,powernv8\0qemu,powernv\0ibm,powernv";
-const char plat_compat9[] = "qemu,powernv9\0ibm,powernv";
-const char plat_compat10[] = "qemu,powernv10\0ibm,powernv";
+PnvMachineClass *pmc = PNV_MACHINE_GET_CLASS(machine);
 PnvMachineState *pnv = PNV_MACHINE(machine);
 void *fdt;
 char *buf;
@@ -504,17 +502,8 @@ static void *pnv_dt_create(MachineState *machine)
 _FDT((fdt_setprop_cell(fdt, 0, "#size-cells", 0x2)));
 _FDT((fdt_setprop_string(fdt, 0, "model",
  "IBM PowerNV (emulated by qemu)")));
-if (pnv_is_power10(pnv)) {
-_FDT((fdt_setprop(fdt, 0, "compatible", plat_compat10,
-  sizeof(plat_compat10))));
-} else if (pnv_is_power9(pnv)) {
-_FDT((fdt_setprop(fdt, 0, "compatible", plat_compat9,
-  sizeof(plat_compat9))));
-} else {
-_FDT((fdt_setprop(fdt, 0, "compatible", plat_compat8,
-  sizeof(plat_compat8))));
-}
-
+_FDT((fdt_setprop(fdt, 0, "compatible", pmc->compat,
+  sizeof(pmc->compat))));
 
 buf =  qemu_uuid_unparse_strdup(&qemu_uuid);
 _FDT((fdt_setprop_string(fdt, 0, "vm,uuid", buf)));
@@ -1692,6 +1681,8 @@ static void pnv_machine_power8_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
 XICSFabricClass *xic = XICS_FABRIC_CLASS(oc);
+PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
+static const char compat[] = "qemu,powernv8\0qemu,powernv\0ibm,powernv";
 
 mc->desc = "IBM PowerNV (Non-Virtualized) POWER8";
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
@@ -1699,26 +1690,39 @@ static void pnv_machine_power8_class_init(ObjectClass *oc, void *data)
 xic->icp_get = pnv_icp_get;
 xic->ics_get = pnv_ics_get;
 xic->ics_resend = pnv_ics_resend;
+
+pmc->compat = compat;
+pmc->compat_size = sizeof(compat);
 }
 
 static void pnv_machine_power9_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
 XiveFabricClass *xfc = XIVE_FABRIC_CLASS(oc);
+PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
+static const char compat[] = "qemu,powernv9\0ibm,powernv";
 
 mc->desc = "IBM PowerNV (Non-Virtualized) POWER9";
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
 xfc->match_nvt = pnv_match_nvt;
 
 mc->alias = "powernv";
+
+pmc->compat = compat;
+pmc->compat_size = sizeof(compat);
 }
 
 static void pnv_machine_power10_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
+PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
+static const char compat[] = "qemu,powernv10\0ibm,powernv";
 
 mc->desc = "IBM PowerNV (Non-Virtualized) POWER10";
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power10_v1.0");
+
+pmc->compat = compat;
+pmc->compat_size = sizeof(compat);
 }
 
 static void pnv_machine_class_init(ObjectClass *oc, void *data)
@@ -1796,6 +1800,7 @@ static const TypeInfo types[] = {
 .instance_size = sizeof(PnvMachineState),
 .instance_init = pnv_machine_instance_init,
 .class_init= pnv_machine_class_init,
+.class_size= sizeof(PnvMachineClass),
 .interfaces = (InterfaceInfo[]) {
 { TYPE_INTERRUPT_STATS_PROVIDER },
 { },
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 92f80b1ccead..d534746bd493 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -185,6 +185,19 @@ PowerPCCPU *pnv_chip_find_cpu(PnvChip *chip, uint32_t pir);
 #define TYPE_PNV_MACHINE   MACHINE_TYPE_NAME("powernv")
 #define PNV_MACHINE(obj) \
 OBJECT_CHECK(PnvMachineState, (obj), TYPE_PNV_MACHINE)
+#define PNV_MACHINE_GET_CLASS(obj) \
+OBJECT_GET_CLASS(PnvMachineClass, obj, TYPE_PNV_MACHINE)
+#define PNV_MACHINE_CLASS(klass) \
+OBJECT_CLASS_CHECK(PnvMachineClass, klass, TYPE_PNV_MACHINE)
+
+typedef struct PnvMachineClass {
+/*< private >*/
+MachineClass parent_class;
+
+/*< public >*/
+const char *compat;
+int compat_size;
+} PnvMachineClass;
 
 typedef struct PnvMachin

[PATCH 12/13] ppc/pnv: Introduce PnvChipClass::xscom_pcba() method

2019-12-13 Thread Greg Kurz
The XSCOM bus is implemented with a QOM interface, which is mostly
generic from a CPU type standpoint, except for the computation of
addresses on the Pervasive Connect Bus (PCB) network. This is handled
by the pnv_xscom_pcba() function with a switch statement based on
the chip_type class level attribute of the CPU chip.

This can be achieved using QOM. Also the address argument is masked with
PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different
sizes with other CPU types. Have each CPU chip type handle the appropriate
computation with a QOM xscom_pcba() method.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |   23 +++
 hw/ppc/pnv_xscom.c   |   14 +-
 include/hw/ppc/pnv.h |1 +
 3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 0447b534b8c5..cc40b90e9cd2 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1121,6 +1121,12 @@ static void pnv_chip_power8_realize(DeviceState *dev, Error **errp)
 &chip8->homer.regs);
 }
 
+static uint32_t pnv_chip_power8_xscom_pcba(PnvChip *chip, uint64_t addr)
+{
+addr &= (PNV_XSCOM_SIZE - 1);
+return ((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf);
+}
+
 static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -1138,6 +1144,7 @@ static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_core_base = pnv_chip_power8_xscom_core_base;
+k->xscom_pcba = pnv_chip_power8_xscom_pcba;
 dc->desc = "PowerNV Chip POWER8E";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1161,6 +1168,7 @@ static void pnv_chip_power8_class_init(ObjectClass *klass, void *data)
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_core_base = pnv_chip_power8_xscom_core_base;
+k->xscom_pcba = pnv_chip_power8_xscom_pcba;
 dc->desc = "PowerNV Chip POWER8";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1184,6 +1192,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_core_base = pnv_chip_power8_xscom_core_base;
+k->xscom_pcba = pnv_chip_power8_xscom_pcba;
 dc->desc = "PowerNV Chip POWER8NVL";
 
 device_class_set_parent_realize(dc, pnv_chip_power8_realize,
@@ -1340,6 +1349,12 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 &chip9->homer.regs);
 }
 
+static uint32_t pnv_chip_power9_xscom_pcba(PnvChip *chip, uint64_t addr)
+{
+addr &= (PNV9_XSCOM_SIZE - 1);
+return addr >> 3;
+}
+
 static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -1357,6 +1372,7 @@ static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
 k->dt_populate = pnv_chip_power9_dt_populate;
 k->pic_print_info = pnv_chip_power9_pic_print_info;
 k->xscom_core_base = pnv_chip_power9_xscom_core_base;
+k->xscom_pcba = pnv_chip_power9_xscom_pcba;
 dc->desc = "PowerNV Chip POWER9";
 
 device_class_set_parent_realize(dc, pnv_chip_power9_realize,
@@ -1422,6 +1438,12 @@ static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
 (uint64_t) PNV10_LPCM_BASE(chip));
 }
 
+static uint32_t pnv_chip_power10_xscom_pcba(PnvChip *chip, uint64_t addr)
+{
+addr &= (PNV10_XSCOM_SIZE - 1);
+return addr >> 3;
+}
+
 static void pnv_chip_power10_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -1439,6 +1461,7 @@ static void pnv_chip_power10_class_init(ObjectClass *klass, void *data)
 k->dt_populate = pnv_chip_power10_dt_populate;
 k->pic_print_info = pnv_chip_power10_pic_print_info;
 k->xscom_core_base = pnv_chip_power10_xscom_core_base;
+k->xscom_pcba = pnv_chip_power10_xscom_pcba;
 dc->desc = "PowerNV Chip POWER10";
 
 device_class_set_parent_realize(dc, pnv_chip_power10_realize,
diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
index 5ae9dfbb88ad..b681c72575b2 100644
--- a/hw/ppc/pnv_xscom.c
+++ b/hw/ppc/pnv_xscom.c
@@ -57,19 +57,7 @@ static void xscom_complete(CPUState *cs, uint64_t hmer_bits)
 
 static uint32_t pnv_xscom_pcba(PnvChip *chip, uint64_t addr)
 {
-addr &= (PNV_XSCOM_SIZE - 1);
-
-switch (PNV_CHIP_GET_CLASS(chip)->chip_type) {
-case PNV_CHIP_POWER8E:
-case PNV_CHIP_POWER8:
-case PNV_CHIP_POWER8NVL:
-return ((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf);
-case PNV_CHIP_POWER9:
-case PNV_CHIP_POWER10:
-return addr >> 3;
-default:
-g_assert_not_reached();
-}
+return PNV_
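
Not part of the patch: a standalone sketch of the two address-to-PCBA
translations that the removed switch selected between, handy for checking
values by hand. The function names and the explicit xscom_size parameter
are made up for illustration; QEMU uses the per-chip PNV*_XSCOM_SIZE macros.

```c
#include <assert.h>
#include <stdint.h>

/*
 * POWER8 layout, as in the removed switch: the low four PCBA bits come
 * from address bits 3..6, the upper bits from address bits 8 and above.
 */
static uint32_t xscom_pcba_p8(uint64_t addr, uint64_t xscom_size)
{
    /* The mask below only works for a power-of-two window size. */
    assert((xscom_size & (xscom_size - 1)) == 0);
    addr &= xscom_size - 1;
    return (uint32_t)(((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf));
}

/* POWER9 and POWER10 layout: a plain 8-byte stride. */
static uint32_t xscom_pcba_p9(uint64_t addr, uint64_t xscom_size)
{
    assert((xscom_size & (xscom_size - 1)) == 0);
    addr &= xscom_size - 1;
    return (uint32_t)(addr >> 3);
}
```

With a QOM method per chip class, each class simply points at the variant
it needs, so no chip_type lookup is required at access time.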

[PATCH 13/13] ppc/pnv: Drop PnvChipClass::type

2019-12-13 Thread Greg Kurz
It isn't used anymore.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |5 -
 include/hw/ppc/pnv.h |9 -
 2 files changed, 14 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index cc40b90e9cd2..232b4a25603c 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1132,7 +1132,6 @@ static void pnv_chip_power8e_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvChipClass *k = PNV_CHIP_CLASS(klass);
 
-k->chip_type = PNV_CHIP_POWER8E;
 k->chip_cfam_id = 0x221ef0498000ull;  /* P8 Murano DD2.1 */
 k->cores_mask = POWER8E_CORE_MASK;
 k->core_pir = pnv_chip_core_pir_p8;
@@ -1156,7 +1155,6 @@ static void pnv_chip_power8_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvChipClass *k = PNV_CHIP_CLASS(klass);
 
-k->chip_type = PNV_CHIP_POWER8;
 k->chip_cfam_id = 0x220ea0498000ull; /* P8 Venice DD2.0 */
 k->cores_mask = POWER8_CORE_MASK;
 k->core_pir = pnv_chip_core_pir_p8;
@@ -1180,7 +1178,6 @@ static void pnv_chip_power8nvl_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvChipClass *k = PNV_CHIP_CLASS(klass);
 
-k->chip_type = PNV_CHIP_POWER8NVL;
 k->chip_cfam_id = 0x120d30498000ull;  /* P8 Naples DD1.0 */
 k->cores_mask = POWER8_CORE_MASK;
 k->core_pir = pnv_chip_core_pir_p8;
@@ -1360,7 +1357,6 @@ static void pnv_chip_power9_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvChipClass *k = PNV_CHIP_CLASS(klass);
 
-k->chip_type = PNV_CHIP_POWER9;
 k->chip_cfam_id = 0x220d10498000ull; /* P9 Nimbus DD2.0 */
 k->cores_mask = POWER9_CORE_MASK;
 k->core_pir = pnv_chip_core_pir_p9;
@@ -1449,7 +1445,6 @@ static void pnv_chip_power10_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PnvChipClass *k = PNV_CHIP_CLASS(klass);
 
-k->chip_type = PNV_CHIP_POWER10;
 k->chip_cfam_id = 0x120da0498000ull; /* P10 DD1.0 (with NX) */
 k->cores_mask = POWER10_CORE_MASK;
 k->core_pir = pnv_chip_core_pir_p10;
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 4972e93c2619..f78fd0dd967c 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -38,14 +38,6 @@
 #define PNV_CHIP_GET_CLASS(obj) \
  OBJECT_GET_CLASS(PnvChipClass, (obj), TYPE_PNV_CHIP)
 
-typedef enum PnvChipType {
-PNV_CHIP_POWER8E, /* AKA Murano (default) */
-PNV_CHIP_POWER8,  /* AKA Venice */
-PNV_CHIP_POWER8NVL,   /* AKA Naples */
-PNV_CHIP_POWER9,  /* AKA Nimbus */
-PNV_CHIP_POWER10, /* AKA TBD */
-} PnvChipType;
-
 typedef struct PnvChip {
 /*< private >*/
 SysBusDevice parent_obj;
@@ -123,7 +115,6 @@ typedef struct PnvChipClass {
 SysBusDeviceClass parent_class;
 
 /*< public >*/
-PnvChipType  chip_type;
 uint64_t chip_cfam_id;
 uint64_t cores_mask;
 




[PATCH 03/13] ppc/pnv: Drop PnvPsiClass::chip_type

2019-12-13 Thread Greg Kurz
It isn't used anymore.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv_psi.c |3 ---
 include/hw/ppc/pnv_psi.h |1 -
 2 files changed, 4 deletions(-)

diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 98a82b25e01f..75e20d9da08b 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -574,7 +574,6 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, 
void *data)
 dc->desc= "PowerNV PSI Controller POWER8";
 dc->realize = pnv_psi_power8_realize;
 
-ppc->chip_type =  PNV_CHIP_POWER8;
 ppc->xscom_pcba = PNV_XSCOM_PSIHB_BASE;
 ppc->xscom_size = PNV_XSCOM_PSIHB_SIZE;
 ppc->bar_mask   = PSIHB_BAR_MASK;
@@ -884,7 +883,6 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, 
void *data)
 dc->desc= "PowerNV PSI Controller POWER9";
 dc->realize = pnv_psi_power9_realize;
 
-ppc->chip_type  = PNV_CHIP_POWER9;
 ppc->xscom_pcba = PNV9_XSCOM_PSIHB_BASE;
 ppc->xscom_size = PNV9_XSCOM_PSIHB_SIZE;
 ppc->bar_mask   = PSIHB9_BAR_MASK;
@@ -915,7 +913,6 @@ static void pnv_psi_power10_class_init(ObjectClass *klass, 
void *data)
 
 dc->desc= "PowerNV PSI Controller POWER10";
 
-ppc->chip_type  = PNV_CHIP_POWER10;
 ppc->xscom_pcba = PNV10_XSCOM_PSIHB_BASE;
 ppc->xscom_size = PNV10_XSCOM_PSIHB_SIZE;
 ppc->compat = compat;
diff --git a/include/hw/ppc/pnv_psi.h b/include/hw/ppc/pnv_psi.h
index fc068c95e543..f0f5b5519767 100644
--- a/include/hw/ppc/pnv_psi.h
+++ b/include/hw/ppc/pnv_psi.h
@@ -79,7 +79,6 @@ typedef struct Pnv9Psi {
 typedef struct PnvPsiClass {
 SysBusDeviceClass parent_class;
 
-int chip_type;
 uint32_t xscom_pcba;
 uint32_t xscom_size;
 uint64_t bar_mask;




[PATCH 05/13] ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()

2019-12-13 Thread Greg Kurz
We add an extra node to advertise power management on some machines,
namely powernv9 and powernv10. This is achieved by using the
pnv_is_power9() and pnv_is_power10() helpers.

This can be achieved with QOM. Add a method to the base class for
powernv machines and have it implemented by machine types that
support power management instead.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |   10 ++
 include/hw/ppc/pnv.h |8 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)
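
For readers less used to QOM: the idea is an optional method on the class,
left NULL by machine types without power management. A minimal sketch in
plain C follows; every name in it is hypothetical and the counter merely
stands in for the real device tree calls.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Machine Machine;

typedef struct MachineClass {
    const char *name;
    /* Optional hook: NULL when the machine type has no power management. */
    void (*dt_power_mgt)(Machine *m, int *nodes_added);
} MachineClass;

struct Machine {
    const MachineClass *klass;
};

static void add_power_mgt_node(Machine *m, int *nodes_added)
{
    (void)m;
    (*nodes_added)++;   /* stand-in for adding the power-mgt node */
}

static const MachineClass power8_class = { "powernv8", NULL };
static const MachineClass power9_class = { "powernv9", add_power_mgt_node };

/* The caller tests the hook instead of asking which chip it is running on. */
static int build_device_tree(Machine *m)
{
    int nodes_added = 0;

    assert(m && m->klass);
    if (m->klass->dt_power_mgt) {
        m->klass->dt_power_mgt(m, &nodes_added);
    }
    return nodes_added;
}
```

Adding a new machine type then only means filling in (or leaving out) the
hook in its class_init, with no change to the common code.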

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 5ac149b149d8..efc00f4cb67a 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -472,7 +472,7 @@ static void pnv_dt_isa(PnvMachineState *pnv, void *fdt)
&args);
 }
 
-static void pnv_dt_power_mgt(void *fdt)
+static void pnv_dt_power_mgt(PnvMachineState *pnv, void *fdt)
 {
 int off;
 
@@ -540,9 +540,9 @@ static void *pnv_dt_create(MachineState *machine)
 pnv_dt_bmc_sensors(pnv->bmc, fdt);
 }
 
-/* Create an extra node for power management on Power9 and Power10 */
-if (pnv_is_power9(pnv) || pnv_is_power10(pnv)) {
-pnv_dt_power_mgt(fdt);
+/* Create an extra node for power management on machines that support it */
+if (pmc->dt_power_mgt) {
+pmc->dt_power_mgt(pnv, fdt);
 }
 
 return fdt;
@@ -1710,6 +1710,7 @@ static void pnv_machine_power9_class_init(ObjectClass 
*oc, void *data)
 
 pmc->compat = compat;
 pmc->compat_size = sizeof(compat);
+pmc->dt_power_mgt = pnv_dt_power_mgt;
 }
 
 static void pnv_machine_power10_class_init(ObjectClass *oc, void *data)
@@ -1723,6 +1724,7 @@ static void pnv_machine_power10_class_init(ObjectClass 
*oc, void *data)
 
 pmc->compat = compat;
 pmc->compat_size = sizeof(compat);
+pmc->dt_power_mgt = pnv_dt_power_mgt;
 }
 
 static void pnv_machine_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index d534746bd493..8a42c199b65c 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -190,6 +190,8 @@ PowerPCCPU *pnv_chip_find_cpu(PnvChip *chip, uint32_t pir);
 #define PNV_MACHINE_CLASS(klass) \
 OBJECT_CLASS_CHECK(PnvMachineClass, klass, TYPE_PNV_MACHINE)
 
+typedef struct PnvMachineState PnvMachineState;
+
 typedef struct PnvMachineClass {
 /*< private >*/
 MachineClass parent_class;
@@ -197,9 +199,11 @@ typedef struct PnvMachineClass {
 /*< public >*/
 const char *compat;
 int compat_size;
+
+void (*dt_power_mgt)(PnvMachineState *pnv, void *fdt);
 } PnvMachineClass;
 
-typedef struct PnvMachineState {
+struct PnvMachineState {
 /*< private >*/
 MachineState parent_obj;
 
@@ -216,7 +220,7 @@ typedef struct PnvMachineState {
 Notifier powerdown_notifier;
 
 PnvPnor  *pnor;
-} PnvMachineState;
+};
 
 static inline bool pnv_chip_is_power9(const PnvChip *chip)
 {




Re: [PATCH] hw/i386: De-duplicate gsi_handler() to remove kvm_pc_gsi_handler()

2019-12-13 Thread Paolo Bonzini
On 13/12/19 12:07, Philippe Mathieu-Daudé wrote:
> Both gsi_handler() and kvm_pc_gsi_handler() have the same content,
> except one comment. Move the comment, and de-duplicate the code.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/sysemu/kvm.h |  1 -
>  hw/i386/kvm/ioapic.c | 12 
>  hw/i386/pc.c |  5 ++---
>  3 files changed, 2 insertions(+), 16 deletions(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 9fe233b9bf..f5d0d0d710 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -515,7 +515,6 @@ int kvm_irqchip_add_irqfd_notifier(KVMState *s, 
> EventNotifier *n,
>  int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n,
>qemu_irq irq);
>  void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi);
> -void kvm_pc_gsi_handler(void *opaque, int n, int level);
>  void kvm_pc_setup_irq_routing(bool pci_enabled);
>  void kvm_init_irq_routing(KVMState *s);
>  
> diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c
> index f94729c565..bae7413a39 100644
> --- a/hw/i386/kvm/ioapic.c
> +++ b/hw/i386/kvm/ioapic.c
> @@ -48,18 +48,6 @@ void kvm_pc_setup_irq_routing(bool pci_enabled)
>  }
>  }
>  
> -void kvm_pc_gsi_handler(void *opaque, int n, int level)
> -{
> -GSIState *s = opaque;
> -
> -if (n < ISA_NUM_IRQS) {
> -/* Kernel will forward to both PIC and IOAPIC */
> -qemu_set_irq(s->i8259_irq[n], level);
> -} else {
> -qemu_set_irq(s->ioapic_irq[n], level);
> -}
> -}
> -
>  typedef struct KVMIOAPICState KVMIOAPICState;
>  
>  struct KVMIOAPICState {
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index ac08e63604..97e9049b71 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -350,6 +350,7 @@ void gsi_handler(void *opaque, int n, int level)
>  
>  DPRINTF("pc: %s GSI %d\n", level ? "raising" : "lowering", n);
>  if (n < ISA_NUM_IRQS) {
> +/* Under KVM, Kernel will forward to both PIC and IOAPIC */
>  qemu_set_irq(s->i8259_irq[n], level);
>  }
>  qemu_set_irq(s->ioapic_irq[n], level);
> @@ -362,10 +363,8 @@ GSIState *pc_gsi_create(qemu_irq **irqs, bool 
> pci_enabled)
>  s = g_new0(GSIState, 1);
>  if (kvm_ioapic_in_kernel()) {
>  kvm_pc_setup_irq_routing(pci_enabled);
> -*irqs = qemu_allocate_irqs(kvm_pc_gsi_handler, s, GSI_NUM_PINS);
> -} else {
> -*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
>  }
> +*irqs = qemu_allocate_irqs(gsi_handler, s, GSI_NUM_PINS);
>  
>  return s;
>  }
> 

Queued, thanks.

Paolo




[PATCH 11/13] ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers

2019-12-13 Thread Greg Kurz
They aren't used anymore.

Signed-off-by: Greg Kurz 
---
 include/hw/ppc/pnv.h |   10 --
 1 file changed, 10 deletions(-)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 17ca9a14ac8f..7a134a15d3b5 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -224,21 +224,11 @@ struct PnvMachineState {
 PnvPnor  *pnor;
 };
 
-static inline bool pnv_chip_is_power9(const PnvChip *chip)
-{
-return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER9;
-}
-
 PnvChip *pnv_get_chip(uint32_t chip_id);
 
 #define PNV_FDT_ADDR  0x0100
 #define PNV_TIMEBASE_FREQ 51200ULL
 
-static inline bool pnv_chip_is_power10(const PnvChip *chip)
-{
-return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER10;
-}
-
 /*
  * BMC helpers
  */




[PATCH 06/13] ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers

2019-12-13 Thread Greg Kurz
They aren't used anymore.

Signed-off-by: Greg Kurz 
---
 include/hw/ppc/pnv.h |   10 --
 1 file changed, 10 deletions(-)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 8a42c199b65c..c213bdd5ecd3 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -227,11 +227,6 @@ static inline bool pnv_chip_is_power9(const PnvChip *chip)
 return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER9;
 }
 
-static inline bool pnv_is_power9(PnvMachineState *pnv)
-{
-return pnv_chip_is_power9(pnv->chips[0]);
-}
-
 PnvChip *pnv_get_chip(uint32_t chip_id);
 
 #define PNV_FDT_ADDR  0x0100
@@ -242,11 +237,6 @@ static inline bool pnv_chip_is_power10(const PnvChip *chip)
 return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER10;
 }
 
-static inline bool pnv_is_power10(PnvMachineState *pnv)
-{
-return pnv_chip_is_power10(pnv->chips[0]);
-}
-
 /*
  * BMC helpers
  */




[PATCH 09/13] ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()

2019-12-13 Thread Greg Kurz
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the "reg"
property. Just pass the base address and address size as arguments.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c   |   12 +---
 hw/ppc/pnv_xscom.c |   16 +++-
 include/hw/ppc/pnv_xscom.h |3 ++-
 3 files changed, 14 insertions(+), 17 deletions(-)
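
As a reminder of what ends up in the "reg" property: two 64-bit values
stored big-endian, base first, then size. The sketch below is standalone
and does not use QEMU's cpu_to_be64(); put_be64() and make_reg() are
hypothetical helpers that only show the resulting byte layout.

```c
#include <assert.h>
#include <stdint.h>

/* Store v most-significant byte first, independent of host endianness. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    assert(dst);
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* "reg" is <base size>, 16 bytes in total. */
static void make_reg(uint8_t reg[16], uint64_t base, uint64_t size)
{
    put_be64(reg, base);
    put_be64(reg + 8, size);
}
```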

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 88efa755e611..c532e98e752a 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -282,7 +282,9 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, void 
*fdt)
 {
 int i;
 
-pnv_dt_xscom(chip, fdt, 0);
+pnv_dt_xscom(chip, fdt, 0,
+ cpu_to_be64(PNV_XSCOM_BASE(chip)),
+ cpu_to_be64(PNV_XSCOM_SIZE));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
@@ -302,7 +304,9 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, void 
*fdt)
 {
 int i;
 
-pnv_dt_xscom(chip, fdt, 0);
+pnv_dt_xscom(chip, fdt, 0,
+ cpu_to_be64(PNV9_XSCOM_BASE(chip)),
+ cpu_to_be64(PNV9_XSCOM_SIZE));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
@@ -321,7 +325,9 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, 
void *fdt)
 {
 int i;
 
-pnv_dt_xscom(chip, fdt, 0);
+pnv_dt_xscom(chip, fdt, 0,
+ cpu_to_be64(PNV10_XSCOM_BASE(chip)),
+ cpu_to_be64(PNV10_XSCOM_SIZE));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
index df926003f2ba..8189767eb0bb 100644
--- a/hw/ppc/pnv_xscom.c
+++ b/hw/ppc/pnv_xscom.c
@@ -286,24 +286,14 @@ static const char compat_p8[] = 
"ibm,power8-xscom\0ibm,xscom";
 static const char compat_p9[] = "ibm,power9-xscom\0ibm,xscom";
 static const char compat_p10[] = "ibm,power10-xscom\0ibm,xscom";
 
-int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset)
+int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
+ uint64_t xscom_base, uint64_t xscom_size)
 {
-uint64_t reg[2];
+uint64_t reg[] = { xscom_base, xscom_size };
 int xscom_offset;
 ForeachPopulateArgs args;
 char *name;
 
-if (pnv_chip_is_power10(chip)) {
-reg[0] = cpu_to_be64(PNV10_XSCOM_BASE(chip));
-reg[1] = cpu_to_be64(PNV10_XSCOM_SIZE);
-} else if (pnv_chip_is_power9(chip)) {
-reg[0] = cpu_to_be64(PNV9_XSCOM_BASE(chip));
-reg[1] = cpu_to_be64(PNV9_XSCOM_SIZE);
-} else {
-reg[0] = cpu_to_be64(PNV_XSCOM_BASE(chip));
-reg[1] = cpu_to_be64(PNV_XSCOM_SIZE);
-}
-
 name = g_strdup_printf("xscom@%" PRIx64, be64_to_cpu(reg[0]));
 xscom_offset = fdt_add_subnode(fdt, root_offset, name);
 _FDT(xscom_offset);
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index 2bdb7ae84fd3..ad53f788b44c 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -114,7 +114,8 @@ typedef struct PnvXScomInterfaceClass {
 #define PNV10_XSCOM_PSIHB_SIZE 0x100
 
 void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
-int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
+int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
+ uint64_t xscom_base, uint64_t xscom_size);
 
 void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
  MemoryRegion *mr);




Re: [PATCH 4/4] hw/i386/pc: Extract the port92 device

2019-12-13 Thread Paolo Bonzini
On 13/12/19 11:51, Philippe Mathieu-Daudé wrote:
> This device is only used by the PC machines. The pc.c file is
> already big enough, with 2255 lines. By removing 113 lines of
> it, we reduced it by 5%. It is now a bit easier to navigate
> the file.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> checkpatch warning:
> 
>   WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
>   #142:
>   new file mode 100644
> 
> is harmless because MAINTAINERS PC entry matches the directory:
> 
>   PC
>   ...
>   F: hw/i386/
> ---
>  include/hw/i386/pc.h  |   3 +
>  hw/i386/pc.c  | 113 -
>  hw/i386/port92.c  | 126 ++
>  hw/i386/Makefile.objs |   1 +
>  hw/i386/trace-events  |   2 +
>  5 files changed, 132 insertions(+), 113 deletions(-)
>  create mode 100644 hw/i386/port92.c
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 1f86eba3f9..7e8d18d6fa 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -224,8 +224,11 @@ int cmos_get_fd_drive_type(FloppyDriveType fd0);
>  
>  #define FW_CFG_IO_BASE 0x510
>  
> +/* port92.c */
>  #define PORT92_A20_LINE "a20"
>  
> +#define TYPE_PORT92 "port92"
> +
>  /* hpet.c */
>  extern int no_hpet;
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 2e8992c7d0..15efcb29d5 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -733,119 +733,6 @@ void pc_cmos_init(PCMachineState *pcms,
>  qemu_register_reset(pc_cmos_init_late, &arg);
>  }
>  
> -#define TYPE_PORT92 "port92"
> -#define PORT92(obj) OBJECT_CHECK(Port92State, (obj), TYPE_PORT92)
> -
> -/* port 92 stuff: could be split off */
> -typedef struct Port92State {
> -ISADevice parent_obj;
> -
> -MemoryRegion io;
> -uint8_t outport;
> -qemu_irq a20_out;
> -} Port92State;
> -
> -static void port92_write(void *opaque, hwaddr addr, uint64_t val,
> - unsigned size)
> -{
> -Port92State *s = opaque;
> -int oldval = s->outport;
> -
> -trace_port92_write(val);
> -s->outport = val;
> -qemu_set_irq(s->a20_out, (val >> 1) & 1);
> -if ((val & 1) && !(oldval & 1)) {
> -qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
> -}
> -}
> -
> -static uint64_t port92_read(void *opaque, hwaddr addr,
> -unsigned size)
> -{
> -Port92State *s = opaque;
> -uint32_t ret;
> -
> -ret = s->outport;
> -trace_port92_read(ret);
> -return ret;
> -}
> -
> -static const VMStateDescription vmstate_port92_isa = {
> -.name = "port92",
> -.version_id = 1,
> -.minimum_version_id = 1,
> -.fields = (VMStateField[]) {
> -VMSTATE_UINT8(outport, Port92State),
> -VMSTATE_END_OF_LIST()
> -}
> -};
> -
> -static void port92_reset(DeviceState *d)
> -{
> -Port92State *s = PORT92(d);
> -
> -s->outport &= ~1;
> -}
> -
> -static const MemoryRegionOps port92_ops = {
> -.read = port92_read,
> -.write = port92_write,
> -.impl = {
> -.min_access_size = 1,
> -.max_access_size = 1,
> -},
> -.endianness = DEVICE_LITTLE_ENDIAN,
> -};
> -
> -static void port92_initfn(Object *obj)
> -{
> -Port92State *s = PORT92(obj);
> -
> -memory_region_init_io(&s->io, OBJECT(s), &port92_ops, s, "port92", 1);
> -
> -s->outport = 0;
> -
> -qdev_init_gpio_out_named(DEVICE(obj), &s->a20_out, PORT92_A20_LINE, 1);
> -}
> -
> -static void port92_realizefn(DeviceState *dev, Error **errp)
> -{
> -ISADevice *isadev = ISA_DEVICE(dev);
> -Port92State *s = PORT92(dev);
> -
> -isa_register_ioport(isadev, &s->io, 0x92);
> -}
> -
> -static void port92_class_initfn(ObjectClass *klass, void *data)
> -{
> -DeviceClass *dc = DEVICE_CLASS(klass);
> -
> -dc->realize = port92_realizefn;
> -dc->reset = port92_reset;
> -dc->vmsd = &vmstate_port92_isa;
> -/*
> - * Reason: unlike ordinary ISA devices, this one needs additional
> - * wiring: its A20 output line needs to be wired up with
> - * qdev_connect_gpio_out_named().
> - */
> -dc->user_creatable = false;
> -}
> -
> -static const TypeInfo port92_info = {
> -.name  = TYPE_PORT92,
> -.parent= TYPE_ISA_DEVICE,
> -.instance_size = sizeof(Port92State),
> -.instance_init = port92_initfn,
> -.class_init= port92_class_initfn,
> -};
> -
> -static void port92_register_types(void)
> -{
> -type_register_static(&port92_info);
> -}
> -
> -type_init(port92_register_types)
> -
>  static void handle_a20_line_change(void *opaque, int irq, int level)
>  {
>  X86CPU *cpu = opaque;
> diff --git a/hw/i386/port92.c b/hw/i386/port92.c
> new file mode 100644
> index 00..19866c44ef
> --- /dev/null
> +++ b/hw/i386/port92.c
> @@ -0,0 +1,126 @@
> +/*
> + * QEMU I/O port 0x92 (System Control Port A, to handle Fast Gate A20)
> + *
> + * Copyright (c) 2003-2004 Fabrice Bellard
> + *
> + * SPDX-License-Identifier: MIT
> + */
> +
> +#inc
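
For reference, the behaviour being moved is small enough to model outside
QEMU. The sketch below mirrors port92_write() and port92_reset() from the
quoted code, but with a plain struct and a counter instead of qemu_irq and
qemu_system_reset_request(); the Port92Model type and function names are
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct Port92Model {
    uint8_t outport;
    bool a20;            /* stand-in for the A20 output line */
    int reset_requests;  /* stand-in for guest reset requests */
} Port92Model;

static void port92_model_write(Port92Model *s, uint8_t val)
{
    uint8_t oldval;

    assert(s);
    oldval = s->outport;
    s->outport = val;
    s->a20 = (val >> 1) & 1;            /* bit 1 drives A20 */
    if ((val & 1) && !(oldval & 1)) {   /* only a 0 -> 1 edge on bit 0 resets */
        s->reset_requests++;
    }
}

static void port92_model_reset(Port92Model *s)
{
    assert(s);
    s->outport &= (uint8_t)~1u;         /* clear bit 0, keep the A20 bit */
}
```

Writing 0x03 twice in a row requests a single reset, since the second
write sees bit 0 already set.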

[PATCH 07/13] ppc/pnv: Introduce PnvChipClass::intc_print_info() method

2019-12-13 Thread Greg Kurz
The pnv_pic_print_info() callback checks the type of the chip in order
to forward the request to the appropriate interrupt controller. This can
be achieved with QOM. Introduce a method for this in the base chip class
and implement it in child classes.

This also prepares ground for the upcoming interrupt controller of POWER10
chips.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c |   30 +-
 include/hw/ppc/pnv.h |1 +
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index efc00f4cb67a..2a53e99bda2e 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -832,6 +832,12 @@ static void pnv_chip_power8_intc_destroy(PnvChip *chip, 
PowerPCCPU *cpu)
 pnv_cpu->intc = NULL;
 }
 
+static void pnv_chip_power8_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
+Monitor *mon)
+{
+icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
+}
+
 /*
  *0:48  Reserved - Read as zeroes
  *   49:52  Node ID
@@ -889,6 +895,12 @@ static void pnv_chip_power9_intc_destroy(PnvChip *chip, 
PowerPCCPU *cpu)
 pnv_cpu->intc = NULL;
 }
 
+static void pnv_chip_power9_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
+Monitor *mon)
+{
+xive_tctx_pic_print_info(XIVE_TCTX(pnv_cpu_state(cpu)->intc), mon);
+}
+
 static void pnv_chip_power10_intc_create(PnvChip *chip, PowerPCCPU *cpu,
 Error **errp)
 {
@@ -910,6 +922,11 @@ static void pnv_chip_power10_intc_destroy(PnvChip *chip, 
PowerPCCPU *cpu)
 pnv_cpu->intc = NULL;
 }
 
+static void pnv_chip_power10_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
+ Monitor *mon)
+{
+}
+
 /*
  * Allowed core identifiers on a POWER8 Processor Chip :
  *
@@ -1086,6 +1103,7 @@ static void pnv_chip_power8e_class_init(ObjectClass 
*klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->intc_reset = pnv_chip_power8_intc_reset;
 k->intc_destroy = pnv_chip_power8_intc_destroy;
+k->intc_print_info = pnv_chip_power8_intc_print_info;
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
@@ -1107,6 +1125,7 @@ static void pnv_chip_power8_class_init(ObjectClass 
*klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->intc_reset = pnv_chip_power8_intc_reset;
 k->intc_destroy = pnv_chip_power8_intc_destroy;
+k->intc_print_info = pnv_chip_power8_intc_print_info;
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
@@ -1128,6 +1147,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass 
*klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->intc_reset = pnv_chip_power8_intc_reset;
 k->intc_destroy = pnv_chip_power8_intc_destroy;
+k->intc_print_info = pnv_chip_power8_intc_print_info;
 k->isa_create = pnv_chip_power8nvl_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
 k->pic_print_info = pnv_chip_power8_pic_print_info;
@@ -1299,6 +1319,7 @@ static void pnv_chip_power9_class_init(ObjectClass 
*klass, void *data)
 k->intc_create = pnv_chip_power9_intc_create;
 k->intc_reset = pnv_chip_power9_intc_reset;
 k->intc_destroy = pnv_chip_power9_intc_destroy;
+k->intc_print_info = pnv_chip_power9_intc_print_info;
 k->isa_create = pnv_chip_power9_isa_create;
 k->dt_populate = pnv_chip_power9_dt_populate;
 k->pic_print_info = pnv_chip_power9_pic_print_info;
@@ -1379,6 +1400,7 @@ static void pnv_chip_power10_class_init(ObjectClass 
*klass, void *data)
 k->intc_create = pnv_chip_power10_intc_create;
 k->intc_reset = pnv_chip_power10_intc_reset;
 k->intc_destroy = pnv_chip_power10_intc_destroy;
+k->intc_print_info = pnv_chip_power10_intc_print_info;
 k->isa_create = pnv_chip_power10_isa_create;
 k->dt_populate = pnv_chip_power10_dt_populate;
 k->pic_print_info = pnv_chip_power10_pic_print_info;
@@ -1575,11 +1597,9 @@ static void pnv_pic_print_info(InterruptStatsProvider 
*obj,
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
-if (pnv_chip_is_power9(pnv->chips[0])) {
-xive_tctx_pic_print_info(XIVE_TCTX(pnv_cpu_state(cpu)->intc), mon);
-} else {
-icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
-}
+/* XXX: loop on each chip/core/thread instead of CPU_FOREACH() */
+PNV_CHIP_GET_CLASS(pnv->chips[0])->intc_print_info(pnv->chips[0], cpu,
+   mon);
 }
 
 for (i = 0; i < pnv->num_chips; i++) {
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index c213bdd5ecd3..7d2402784d4b 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -133,6 +133,7 @@ typede

[PATCH 10/13] ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()

2019-12-13 Thread Greg Kurz
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the
"compatible" property. Just pass the compat string and its size as
arguments.

Signed-off-by: Greg Kurz 
---
 hw/ppc/pnv.c   |   12 +---
 hw/ppc/pnv_xscom.c |   20 +++-
 include/hw/ppc/pnv_xscom.h |3 ++-
 3 files changed, 14 insertions(+), 21 deletions(-)
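
A note on why the size has to travel with the string: the "compatible"
value is a list of NUL-separated strings, so strlen() stops after the
first entry and only sizeof() on the array covers the whole list. A
standalone sketch, where stringlist_count() is a hypothetical helper and
not a libfdt call:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char compat_p8[] = "ibm,power8-xscom\0ibm,xscom";

/* Count the NUL-terminated strings packed into a property of size bytes. */
static int stringlist_count(const char *prop, size_t size)
{
    int count = 0;
    size_t off = 0;

    /* The last byte must be a terminator, or strlen() would overrun. */
    assert(size > 0 && prop[size - 1] == '\0');
    while (off < size) {
        off += strlen(prop + off) + 1;
        count++;
    }
    return count;
}
```

Passing strlen(compat) + 1 instead of sizeof(compat) would silently drop
the generic "ibm,xscom" entry from the property.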

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index c532e98e752a..0447b534b8c5 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -280,11 +280,13 @@ static void pnv_dt_icp(PnvChip *chip, void *fdt, uint32_t 
pir,
 
 static void pnv_chip_power8_dt_populate(PnvChip *chip, void *fdt)
 {
+static const char compat[] = "ibm,power8-xscom\0ibm,xscom";
 int i;
 
 pnv_dt_xscom(chip, fdt, 0,
  cpu_to_be64(PNV_XSCOM_BASE(chip)),
- cpu_to_be64(PNV_XSCOM_SIZE));
+ cpu_to_be64(PNV_XSCOM_SIZE),
+ compat, sizeof(compat));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
@@ -302,11 +304,13 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
void *fdt)
 
 static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
 {
+static const char compat[] = "ibm,power9-xscom\0ibm,xscom";
 int i;
 
 pnv_dt_xscom(chip, fdt, 0,
  cpu_to_be64(PNV9_XSCOM_BASE(chip)),
- cpu_to_be64(PNV9_XSCOM_SIZE));
+ cpu_to_be64(PNV9_XSCOM_SIZE),
+ compat, sizeof(compat));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
@@ -323,11 +327,13 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
void *fdt)
 
 static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
 {
+static const char compat[] = "ibm,power10-xscom\0ibm,xscom";
 int i;
 
 pnv_dt_xscom(chip, fdt, 0,
  cpu_to_be64(PNV10_XSCOM_BASE(chip)),
- cpu_to_be64(PNV10_XSCOM_SIZE));
+ cpu_to_be64(PNV10_XSCOM_SIZE),
+ compat, sizeof(compat));
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = chip->cores[i];
diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
index 8189767eb0bb..5ae9dfbb88ad 100644
--- a/hw/ppc/pnv_xscom.c
+++ b/hw/ppc/pnv_xscom.c
@@ -282,12 +282,9 @@ static int xscom_dt_child(Object *child, void *opaque)
 return 0;
 }
 
-static const char compat_p8[] = "ibm,power8-xscom\0ibm,xscom";
-static const char compat_p9[] = "ibm,power9-xscom\0ibm,xscom";
-static const char compat_p10[] = "ibm,power10-xscom\0ibm,xscom";
-
 int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
- uint64_t xscom_base, uint64_t xscom_size)
+ uint64_t xscom_base, uint64_t xscom_size,
+ const char *compat, int compat_size)
 {
 uint64_t reg[] = { xscom_base, xscom_size };
 int xscom_offset;
@@ -302,18 +299,7 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
 _FDT((fdt_setprop_cell(fdt, xscom_offset, "#address-cells", 1)));
 _FDT((fdt_setprop_cell(fdt, xscom_offset, "#size-cells", 1)));
 _FDT((fdt_setprop(fdt, xscom_offset, "reg", reg, sizeof(reg))));
-
-if (pnv_chip_is_power10(chip)) {
-_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p10,
-  sizeof(compat_p10))));
-} else if (pnv_chip_is_power9(chip)) {
-_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p9,
-  sizeof(compat_p9))));
-} else {
-_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p8,
-  sizeof(compat_p8))));
-}
-
+_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat, compat_size)));
 _FDT((fdt_setprop(fdt, xscom_offset, "scom-controller", NULL, 0)));
 
 args.fdt = fdt;
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index ad53f788b44c..f74c81a980f3 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -115,7 +115,8 @@ typedef struct PnvXScomInterfaceClass {
 
 void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
 int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
- uint64_t xscom_base, uint64_t xscom_size);
+ uint64_t xscom_base, uint64_t xscom_size,
+ const char *compat, int compat_size);
 
 void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
  MemoryRegion *mr);




Re: [PATCH v2 2/2] gdbstub: do not split gdb_monitor_write payload

2019-12-13 Thread Damien Hedde


On 12/12/19 11:52 AM, Alex Bennée wrote:
> 
> Damien Hedde  writes:
> 
>> On 12/11/19 7:59 PM, Alex Bennée wrote:
>>>
>>> Damien Hedde  writes:
>>>
>>>> Since we can now send packets of arbitrary length:
>>>> simplify gdb_monitor_write() and send the whole payload
>>>> in one packet.
>>>
>>> Do we know gdb won't barf on us? Does the negotiated max packet size
>>> only apply to data sent to the gdbserver?
>>
>> Yes, the negotiated packet size is only about packets we can receive.
>> Quoting the gdb doc:
>> | ‘PacketSize=bytes’
>> |
>> |The remote stub can accept packets up to at least bytes in length.
>> | GDB will send packets up to this size for bulk transfers, and will
>> | never send larger packets.
>>
>> The qSupported doc also says that "Any GDB which sends a ‘qSupported’
>> packet supports receiving packets of unlimited length".
>> I did some digging and qSupported appeared in gdb 6.6 (December 2006).
>> And gdb supported arbitrary-sized packets even before that (6.4.9, 2006
>> too).
> 
> I think that is worth a comment for the function gdb_monitor_write
> quoting the spec and the versions. With that comment:
> 
> Reviewed-by: Alex Bennée 
> 

Good idea! Is it OK if I add these comments in the 1st patch, along
with the gdbstate.last_packet field? It seems a more central place.
I can still add a short note for gdb_monitor_write().

Damien
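
To make the discussion concrete, here is a rough sketch of how a single
console-output packet is framed, as far as I understand the remote
protocol: '$', the payload, '#', then the sum of the payload bytes modulo
256 as two hex digits. Console output uses an 'O' packet carrying the
hex-encoded text. gdb_console_packet() is a hypothetical helper, not the
gdbstub.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Frame text as $O<hex of text>#<checksum>; returns length or -1. */
static int gdb_console_packet(char *out, size_t out_size, const char *text)
{
    static const char hex[] = "0123456789abcdef";
    size_t len, need, pos = 0;
    unsigned sum = 'O';

    assert(out && text);
    len = strlen(text);
    need = 2 + 2 * len + 3 + 1;     /* "$O", hex body, "#xx", NUL */
    if (need > out_size) {
        return -1;
    }
    out[pos++] = '$';
    out[pos++] = 'O';
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)text[i];
        char hi = hex[c >> 4], lo = hex[c & 0xf];

        out[pos++] = hi;
        out[pos++] = lo;
        sum += (unsigned)hi + (unsigned)lo;  /* checksum covers payload only */
    }
    out[pos++] = '#';
    out[pos++] = hex[(sum >> 4) & 0xf];
    out[pos++] = hex[sum & 0xf];
    out[pos] = '\0';
    return (int)pos;
}
```

Sending the whole monitor output this way yields one packet whose length
grows with the text, which is why the receive-side limit matters.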



Re: [PATCH v3 4/4] pc-bios/s390x: Fix reset psw mask

2019-12-13 Thread Cornelia Huck
On Thu, 5 Dec 2019 11:12:39 +0100
Cornelia Huck  wrote:

> On Tue,  3 Dec 2019 08:28:13 -0500
> Janosch Frank  wrote:
> 
> > We need to set the short psw indication bit in the reset psw, as it is
> > a short psw.
> > 
> > fixes: 9629823290 ("pc-bios/s390-ccw: do a subsystem reset before running 
> > the guest")
> > Signed-off-by: Janosch Frank 
> > ---
> >  pc-bios/s390-ccw/jump2ipl.c | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)  
> 
> Thanks, applied (together with a rebuild of the bios images.)

This unfortunately breaks 'make check-qtest-s390x':

  TEST    check-qtest-s390x: tests/boot-serial-test
  TEST    check-qtest-s390x: tests/pxe-test
ERROR - too few tests run (expected 1, got 0)

When I revert this, the rebuild, and "s390x: Properly fetch and test
the short psw on diag308 subc 0/1" (as it exposes the bug this commit
tried to fix), everything passes again. No idea what is wrong, though :(

For now, I've dropped the three patches mentioned above from the
s390-next branch (I plan to send a pull request later). Let's fix this
on top once we have figured out what went wrong, no need to rush here.




[PING] [PATCH] virtio: fix IO request length in virtio SCSI/block #PSBM-78839

2019-12-13 Thread Denis Plotnikov


On 05.12.2019 10:59, Denis Plotnikov wrote:
> Ping!
>
> On 25.11.2019 12:16, Denis Plotnikov wrote:
>>
>>
>> On 06.11.2019 15:03, Michael S. Tsirkin wrote:
>>> On Thu, Oct 24, 2019 at 11:34:34AM +, Denis Lunev wrote:
 On 10/24/19 12:28 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 18, 2019 at 02:55:47PM +0300, Denis Plotnikov wrote:
>> From: "Denis V. Lunev" 
>>
>> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg
>> field reported by SCSI controller. Thus typical sequential read with
>> 1 MB size results in the following pattern of the IO from the guest:
>>    8,16   1    15754 2.766095122  2071  D   R 2095104 + 1008 [dd]
>>    8,16   1    15755 2.766108785  2071  D   R 2096112 + 1008 [dd]
>>    8,16   1    15756 2.766113486  2071  D   R 2097120 + 32 [dd]
>>    8,16   1    15757 2.767668961 0  C   R 2095104 + 1008 [0]
>>    8,16   1    15758 2.768534315 0  C   R 2096112 + 1008 [0]
>>    8,16   1    15759 2.768539782 0  C   R 2097120 + 32 [0]
>> The IO was generated by
>>    dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
>>
>> This effectively means that on rotational disks we will observe 3 
>> IOPS
>> for each 2 MBs processed. This definitely negatively affects both
>> guest and host IO performance.
>>
>> The cure is relatively simple - we should report lengthy 
>> scatter-gather
>> ability of the SCSI controller. Fortunately the situation here is 
>> very
>> good. VirtIO transport layer can accomodate 1024 items in one 
>> request
>> while we are using only 128. This situation is present since almost
>> very beginning. 2 items are dedicated for request metadata thus we
>> should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
>>
>> The following pattern is observed after the patch:
>>    8,16   1 9921 2.662721340  2063  D   R 2095104 + 1024 
>> [dd]
>>    8,16   1 9922 2.662737585  2063  D   R 2096128 + 1024 
>> [dd]
>>    8,16   1 9923 2.665188167 0  C   R 2095104 + 1024 [0]
>>    8,16   1 9924 2.665198777 0  C   R 2096128 + 1024 [0]
>> which is much better.
>>
>> The dark side of this patch is that we are tweaking guest visible
>> parameter, though this should be relatively safe as above transport
>> layer support is present in QEMU/host Linux for a very long time.
>> The patch adds configurable property for VirtIO SCSI with a new 
>> default
>> and hardcode option for VirtBlock which does not provide good
>> configurable framework.
>>
>> Unfortunately the commit can not be applied as is. For the real 
>> cure we
>> need guest to be fixed to accomodate that queue length, which is 
>> done
>> only in the latest 4.14 kernel. Thus we are going to expose the 
>> property
>> and tweak it on machine type level.
>>
>> The problem with the old kernels is that they have
>> max_segments <= virtqueue_size restriction which cause the guest
>> crashing in the case of violation.
> This isn't just in the guests: virtio spec also seems to imply this,
> or at least be vague on this point.
>
> So I think it'll need a feature bit.
> Doing that in a safe way will also allow being compatible with old 
> guests.
>
> The only downside is it's a bit more work as we need to
> spec this out and add guest support.
>
>> To fix the case described above in the old kernels we can increase
>> virtqueue_size to 256 and max_segments to 254. The pitfall here is
>> that seabios allows the virtqueue_size-s < 128, however, the seabios
>> patch extending that value to 256 is pending.
> And the fix here is just to limit large vq size to virtio 1.0.
> In that mode it's fine I think:
>
>
>     /* check if the queue is available */
>     if (vp->use_modern) {
>     num = vp_read(&vp->common, virtio_pci_common_cfg, 
> queue_size);
>     if (num > MAX_QUEUE_NUM) {
>     vp_write(&vp->common, virtio_pci_common_cfg, queue_size,
>  MAX_QUEUE_NUM);
>     num = vp_read(&vp->common, virtio_pci_common_cfg, 
> queue_size);
>     }
>     } else {
>     num = vp_read(&vp->legacy, virtio_pci_legacy, queue_num);
>     }
>> The same seabios snippet,  but more detailed:
>>
>> vp_find_vq()
>> {
>>     ...
>>     /* check if the queue is available */
>>     if (vp->use_modern) {
>>         num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
>>         if (num > MAX_QUEUE_NUM) {
>>             vp_write(&vp->common, virtio_pci_common_cfg, queue_size,
>>                      MAX_QUEUE_NUM);
>>             num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
>>         }
>>     } else {
>>    num = vp_read(&vp->legacy, virtio_pci_legac

Re: [PATCH v3 04/20] gdbstub: move mem_buf to GDBState and use GByteArray

2019-12-13 Thread Damien Hedde



On 12/11/19 6:05 PM, Alex Bennée wrote:
> This is in preparation for further re-factoring of the register API
> with the rest of the code. Theoretically the read register function
> could overwrite the MAX_PACKET_LENGTH buffer although currently all
> registers are well within the size range.
> 
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> Reviewed-by: Damien Hedde 
> Tested-by: Damien Hedde 
> 
> ---
> v3
>   - fixed up email on Damien's tags
> ---
>  gdbstub.c | 56 ++-
>  1 file changed, 35 insertions(+), 21 deletions(-)
> 

> @@ -2092,11 +2105,12 @@ static void handle_query_rcmd(GdbCmdContext *gdb_ctx, 
> void *user_ctx)
>  }
>  
>  len = len / 2;
> -hextomem(gdb_ctx->mem_buf, gdb_ctx->params[0].data, len);
> -gdb_ctx->mem_buf[len++] = 0;
> -qemu_chr_be_write(gdbserver_state.mon_chr, gdb_ctx->mem_buf, len);
> +g_byte_array_set_size(gdbserver_state.mem_buf, len);

Hi Alex,

Just found out that the g_byte_array_set_size() above should be removed.
hextomem() will append data starting at offset [len] instead of [0] and
we end up with an uninitialized prefix in the array.
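
To make the effect concrete, here is a minimal sketch in plain C. It assumes, as stated above, that the patched hextomem() appends to the array rather than writing from offset 0. ByteArray, ba_set_size(), ba_append(), hex_append() and decode() are hypothetical stand-ins, not the GLib or gdbstub functions, and the 0xAA poison byte only exists to make the stale prefix visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GByteArray: data pointer plus current length. */
typedef struct {
    unsigned char *data;
    size_t len;
} ByteArray;

/* Resize the array; newly exposed bytes are poisoned with 0xAA so that
 * "uninitialized" content is easy to spot in this demo. */
static void ba_set_size(ByteArray *a, size_t len)
{
    unsigned char *p = realloc(a->data, len ? len : 1);
    assert(p != NULL);
    if (len > a->len) {
        memset(p + a->len, 0xAA, len - a->len);
    }
    a->data = p;
    a->len = len;
}

/* Append n bytes at the current end of the array. */
static void ba_append(ByteArray *a, const unsigned char *src, size_t n)
{
    size_t old = a->len;
    ba_set_size(a, old + n);
    memcpy(a->data + old, src, n);
}

static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return 0;
}

/* Decode len bytes of hex text and append them to the array. */
static void hex_append(ByteArray *a, const char *hex, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)((hex_nibble(hex[2 * i]) << 4) |
                                          hex_nibble(hex[2 * i + 1]));
        ba_append(a, &b, 1);
    }
}

/* Decode hex into an emptied array, optionally pre-sizing it first
 * (the pre-sizing models the extra set_size call under discussion). */
static size_t decode(ByteArray *a, const char *hex, int presize)
{
    size_t len = strlen(hex) / 2;
    ba_set_size(a, 0);
    if (presize) {
        ba_set_size(a, len);
    }
    hex_append(a, hex, len);
    return a->len;
}
```

With presize set, the decoded bytes land after a poisoned prefix and the array is twice as long as intended; without it, the array holds exactly the decoded bytes.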

> +hextomem(gdbserver_state.mem_buf, gdb_ctx->params[0].data, len);
> +g_byte_array_append(gdbserver_state.mem_buf, &zero, 1);
> +qemu_chr_be_write(gdbserver_state.mon_chr, gdbserver_state.mem_buf->data,
> +  gdbserver_state.mem_buf->len);
>  put_packet("OK");
> -
>  }
>  #endif
>  
> 

I double-checked the rest of the patch, and this is the only resize
that slipped through v2 review.

Regards,
Damien



Re: [PATCH v3 4/4] pc-bios/s390x: Fix reset psw mask

2019-12-13 Thread Janosch Frank
On 12/13/19 1:06 PM, Cornelia Huck wrote:
> On Thu, 5 Dec 2019 11:12:39 +0100
> Cornelia Huck  wrote:
> 
>> On Tue,  3 Dec 2019 08:28:13 -0500
>> Janosch Frank  wrote:
>>
>>> We need to set the short psw indication bit in the reset psw, as it is
>>> a short psw.
>>>
>>> fixes: 9629823290 ("pc-bios/s390-ccw: do a subsystem reset before running 
>>> the guest")
>>> Signed-off-by: Janosch Frank 
>>> ---
>>>  pc-bios/s390-ccw/jump2ipl.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)  
>>
>> Thanks, applied (together with a rebuild of the bios images.)
> 
> This unfortunately breaks 'make check-qtest-s390x':
> 
>   TEST    check-qtest-s390x: tests/boot-serial-test
>   TEST    check-qtest-s390x: tests/pxe-test
> ERROR - too few tests run (expected 1, got 0)
> 
> When I revert this, the rebuild, and "s390x: Properly fetch and test
> the short psw on diag308 subc 0/1" (as it exposes the bug this commit
> tried to fix), everything passes again. No idea what is wrong, though :(
> 
> For now, I've dropped the three patches mentioned above from the
> s390-next branch (I plan to send a pull request later). Let's fix this
> on top once we figured out whatever went wrong, no need to rush here.
> 
> 
Sounds good



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] virtio: fix IO request length in virtio SCSI/block #PSBM-78839

2019-12-13 Thread Michael S. Tsirkin
On Mon, Nov 25, 2019 at 09:16:10AM +0000, Denis Plotnikov wrote:
> 
> 
> On 06.11.2019 15:03, Michael S. Tsirkin wrote:
> > On Thu, Oct 24, 2019 at 11:34:34AM +0000, Denis Lunev wrote:
> >> On 10/24/19 12:28 AM, Michael S. Tsirkin wrote:
> >>> On Fri, Oct 18, 2019 at 02:55:47PM +0300, Denis Plotnikov wrote:
> >>>> From: "Denis V. Lunev"
> >>>>
> >>>> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg
> >>>> field reported by SCSI controller. Thus typical sequential read with
> >>>> 1 MB size results in the following pattern of the IO from the guest:
> >>>>   8,16   1    15754     2.766095122  2071  D   R 2095104 + 1008 [dd]
> >>>>   8,16   1    15755     2.766108785  2071  D   R 2096112 + 1008 [dd]
> >>>>   8,16   1    15756     2.766113486  2071  D   R 2097120 + 32 [dd]
> >>>>   8,16   1    15757     2.767668961     0  C   R 2095104 + 1008 [0]
> >>>>   8,16   1    15758     2.768534315     0  C   R 2096112 + 1008 [0]
> >>>>   8,16   1    15759     2.768539782     0  C   R 2097120 + 32 [0]
> >>>> The IO was generated by
> >>>>    dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
> >>>>
> >>>> This effectively means that on rotational disks we will observe 3 IOPS
> >>>> for each 2 MBs processed. This definitely negatively affects both
> >>>> guest and host IO performance.
> >>>>
> >>>> The cure is relatively simple - we should report lengthy scatter-gather
> >>>> ability of the SCSI controller. Fortunately the situation here is very
> >>>> good. VirtIO transport layer can accommodate 1024 items in one request
> >>>> while we are using only 128. This situation is present since almost
> >>>> very beginning. 2 items are dedicated for request metadata thus we
> >>>> should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
> >>>>
> >>>> The following pattern is observed after the patch:
> >>>>   8,16   1     9921     2.662721340  2063  D   R 2095104 + 1024 [dd]
> >>>>   8,16   1     9922     2.662737585  2063  D   R 2096128 + 1024 [dd]
> >>>>   8,16   1     9923     2.665188167     0  C   R 2095104 + 1024 [0]
> >>>>   8,16   1     9924     2.665198777     0  C   R 2096128 + 1024 [0]
> >>>> which is much better.
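
The request sizes in the two traces can be reproduced with a little arithmetic. The sketch below assumes 4 KiB pages and that the trace counts 512-byte sectors; neither is stated in the commit message. The function names are made up for illustration. The 1024-sector size seen after the patch is read off the trace, not derived here, since some other block-layer limit presumably applies once max_seg is no longer the bottleneck.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed units: 512-byte sectors in the trace, 4 KiB guest pages. */
enum { SECTOR_BYTES = 512, PAGE_BYTES = 4096 };

/* Largest request, in sectors, when every segment covers one page and
 * metadata_descs descriptors are reserved for request metadata. */
static unsigned max_request_sectors(unsigned queue_size, unsigned metadata_descs)
{
    unsigned max_seg = queue_size - metadata_descs;
    return max_seg * (PAGE_BYTES / SECTOR_BYTES);
}

/* Number of requests needed to move total_sectors with a per-request cap. */
static unsigned requests_needed(unsigned total_sectors, unsigned cap_sectors)
{
    return (total_sectors + cap_sectors - 1) / cap_sectors;
}
```

With 128 descriptors and 2 reserved, the cap is 126 * 8 = 1008 sectors, so the 2048 sectors in the first trace (1008 + 1008 + 32) need three requests; at 1024 sectors per request the same range needs two.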
> >>>>
> >>>> The dark side of this patch is that we are tweaking guest visible
> >>>> parameter, though this should be relatively safe as above transport
> >>>> layer support is present in QEMU/host Linux for a very long time.
> >>>> The patch adds configurable property for VirtIO SCSI with a new default
> >>>> and hardcode option for VirtBlock which does not provide good
> >>>> configurable framework.
> >>>>
> >>>> Unfortunately the commit can not be applied as is. For the real cure we
> >>>> need guest to be fixed to accommodate that queue length, which is done
> >>>> only in the latest 4.14 kernel. Thus we are going to expose the property
> >>>> and tweak it on machine type level.
> >>>>
> >>>> The problem with the old kernels is that they have
> >>>> max_segments <= virtqueue_size restriction which cause the guest
> >>>> crashing in the case of violation.
> >>> This isn't just in the guests: virtio spec also seems to imply this,
> >>> or at least be vague on this point.
> >>>
> >>> So I think it'll need a feature bit.
> >>> Doing that in a safe way will also allow being compatible with old guests.
> >>>
> >>> The only downside is it's a bit more work as we need to
> >>> spec this out and add guest support.
> >>>
> >>>> To fix the case described above in the old kernels we can increase
> >>>> virtqueue_size to 256 and max_segments to 254. The pitfall here is
> >>>> that seabios allows the virtqueue_size-s < 128, however, the seabios
> >>>> patch extending that value to 256 is pending.
> >>> And the fix here is just to limit large vq size to virtio 1.0.
> >>> In that mode it's fine I think:
> >>>
> >>>
> >>>     /* check if the queue is available */
> >>>     if (vp->use_modern) {
> >>>         num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
> >>>         if (num > MAX_QUEUE_NUM) {
> >>>             vp_write(&vp->common, virtio_pci_common_cfg, queue_size,
> >>>                      MAX_QUEUE_NUM);
> >>>             num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
> >>>         }
> >>>     } else {
> >>>         num = vp_read(&vp->legacy, virtio_pci_legacy, queue_num);
> >>>     }
> The same seabios snippet,  but more detailed:
> 
> vp_find_vq()
> {
>     ...
>     /* check if the queue is available */
>     if (vp->use_modern) {
>         num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);
>         if (num > MAX_QUEUE_NUM) {
>             vp_write(&vp->common, virtio_pci_common_cfg, queue_size,
>                      MAX_QUEUE_NUM);
>             num = vp_read(&vp->common, virtio_pci_common_cfg, queue_size);

So how about we drop this last line in bios?

Will fix things for existing hypervisors.
spec does not say guests need to re-read it.
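
A rough sketch of the difference, under stated assumptions: FakeDevice and its honours_write flag are hypothetical, standing in for a hypervisor that keeps reporting its original queue size after the guest writes a smaller one. The thread does not spell out how existing hypervisors actually behave, and the value 128 for MAX_QUEUE_NUM is taken from the discussion above, not from the seabios source.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QUEUE_NUM 128  /* name from the snippet; value assumed from the thread */

/* Hypothetical device model, not a seabios or QEMU type. */
typedef struct {
    uint16_t queue_size;  /* value the device reports on read */
    int honours_write;    /* does a write change what later reads return? */
} FakeDevice;

static uint16_t dev_read(const FakeDevice *d)
{
    return d->queue_size;
}

static void dev_write(FakeDevice *d, uint16_t v)
{
    if (d->honours_write) {
        d->queue_size = v;
    }
}

/* Logic as quoted: write the clamp, then trust whatever is read back. */
static uint16_t negotiate_with_reread(FakeDevice *d)
{
    uint16_t num = dev_read(d);
    if (num > MAX_QUEUE_NUM) {
        dev_write(d, MAX_QUEUE_NUM);
        num = dev_read(d);
    }
    return num;
}

/* Suggested variant: write the clamp and use it directly. */
static uint16_t negotiate_without_reread(FakeDevice *d)
{
    uint16_t num = dev_read(d);
    if (num > MAX_QUEUE_NUM) {
        dev_write(d, MAX_QUEUE_NUM);
        num = MAX_QUEUE_NUM;
    }
    return num;
}
```

In this model the two variants only differ for a device that ignores the write: the re-reading version ends up with 256 again, while the other settles on 128.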

>         }
>     } else {
>         num = vp_read(&vp->legacy, virtio_pci_legacy, queue_num);
>     }
>   

Re: [PATCH 03/13] ppc/pnv: Drop PnvPsiClass::chip_type

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 12:59, Greg Kurz wrote:
> It isn't used anymore.
> 
> Signed-off-by: Greg Kurz 


Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv_psi.c         |    3 ---
>  include/hw/ppc/pnv_psi.h |    1 -
>  2 files changed, 4 deletions(-)
> 
> diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
> index 98a82b25e01f..75e20d9da08b 100644
> --- a/hw/ppc/pnv_psi.c
> +++ b/hw/ppc/pnv_psi.c
> @@ -574,7 +574,6 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, 
> void *data)
>  dc->desc= "PowerNV PSI Controller POWER8";
>  dc->realize = pnv_psi_power8_realize;
>  
> -ppc->chip_type =  PNV_CHIP_POWER8;
>  ppc->xscom_pcba = PNV_XSCOM_PSIHB_BASE;
>  ppc->xscom_size = PNV_XSCOM_PSIHB_SIZE;
>  ppc->bar_mask   = PSIHB_BAR_MASK;
> @@ -884,7 +883,6 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, 
> void *data)
>  dc->desc= "PowerNV PSI Controller POWER9";
>  dc->realize = pnv_psi_power9_realize;
>  
> -ppc->chip_type  = PNV_CHIP_POWER9;
>  ppc->xscom_pcba = PNV9_XSCOM_PSIHB_BASE;
>  ppc->xscom_size = PNV9_XSCOM_PSIHB_SIZE;
>  ppc->bar_mask   = PSIHB9_BAR_MASK;
> @@ -915,7 +913,6 @@ static void pnv_psi_power10_class_init(ObjectClass 
> *klass, void *data)
>  
>  dc->desc= "PowerNV PSI Controller POWER10";
>  
> -ppc->chip_type  = PNV_CHIP_POWER10;
>  ppc->xscom_pcba = PNV10_XSCOM_PSIHB_BASE;
>  ppc->xscom_size = PNV10_XSCOM_PSIHB_SIZE;
>  ppc->compat = compat;
> diff --git a/include/hw/ppc/pnv_psi.h b/include/hw/ppc/pnv_psi.h
> index fc068c95e543..f0f5b5519767 100644
> --- a/include/hw/ppc/pnv_psi.h
> +++ b/include/hw/ppc/pnv_psi.h
> @@ -79,7 +79,6 @@ typedef struct Pnv9Psi {
>  typedef struct PnvPsiClass {
>  SysBusDeviceClass parent_class;
>  
> -int chip_type;
>  uint32_t xscom_pcba;
>  uint32_t xscom_size;
>  uint64_t bar_mask;
> 




Re: [PATCH 02/13] ppc/pnv: Introduce PnvPsiClass::compat

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 12:59, Greg Kurz wrote:
> The Processor Service Interface (PSI) model has a chip_type class level
> attribute, which is used to generate the content of the "compatible" DT
> property according to the CPU type.
> 
> Since the PSI model already has specialized classes for each supported
> CPU type, it seems cleaner to achieve this with QOM. Provide the content
> of the "compatible" property with a new class level attribute.
> 
> Signed-off-by: Greg Kurz 


Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv_psi.c         |   25 +++++++++++--------------
>  include/hw/ppc/pnv_psi.h |    2 ++
>  2 files changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
> index 572924388b3c..98a82b25e01f 100644
> --- a/hw/ppc/pnv_psi.c
> +++ b/hw/ppc/pnv_psi.c
> @@ -536,10 +536,6 @@ static void pnv_psi_power8_realize(DeviceState *dev, 
> Error **errp)
>  qemu_register_reset(pnv_psi_reset, dev);
>  }
>  
> -static const char compat_p8[] = "ibm,power8-psihb-x\0ibm,psihb-x";
> -static const char compat_p9[] = "ibm,power9-psihb-x\0ibm,psihb-x";
> -static const char compat_p10[] = "ibm,power10-psihb-x\0ibm,psihb-x";
> -
>  static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void *fdt, int 
> xscom_offset)
>  {
>  PnvPsiClass *ppc = PNV_PSI_GET_CLASS(dev);
> @@ -558,16 +554,8 @@ static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void 
> *fdt, int xscom_offset)
>  _FDT(fdt_setprop(fdt, offset, "reg", reg, sizeof(reg)));
>  _FDT(fdt_setprop_cell(fdt, offset, "#address-cells", 2));
>  _FDT(fdt_setprop_cell(fdt, offset, "#size-cells", 1));
> -if (ppc->chip_type == PNV_CHIP_POWER10) {
> -_FDT(fdt_setprop(fdt, offset, "compatible", compat_p10,
> - sizeof(compat_p10)));
> -} else if (ppc->chip_type == PNV_CHIP_POWER9) {
> -_FDT(fdt_setprop(fdt, offset, "compatible", compat_p9,
> - sizeof(compat_p9)));
> -} else {
> -_FDT(fdt_setprop(fdt, offset, "compatible", compat_p8,
> - sizeof(compat_p8)));
> -}
> +_FDT(fdt_setprop(fdt, offset, "compatible", ppc->compat,
> + ppc->compat_size));
>  return 0;
>  }
>  
> @@ -581,6 +569,7 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, 
> void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
> +static const char compat[] = "ibm,power8-psihb-x\0ibm,psihb-x";
>  
>  dc->desc= "PowerNV PSI Controller POWER8";
>  dc->realize = pnv_psi_power8_realize;
> @@ -590,6 +579,8 @@ static void pnv_psi_power8_class_init(ObjectClass *klass, 
> void *data)
>  ppc->xscom_size = PNV_XSCOM_PSIHB_SIZE;
>  ppc->bar_mask   = PSIHB_BAR_MASK;
>  ppc->irq_set= pnv_psi_power8_irq_set;
> +ppc->compat = compat;
> +ppc->compat_size = sizeof(compat);
>  }
>  
>  static const TypeInfo pnv_psi_power8_info = {
> @@ -888,6 +879,7 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, 
> void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
>  XiveNotifierClass *xfc = XIVE_NOTIFIER_CLASS(klass);
> +static const char compat[] = "ibm,power9-psihb-x\0ibm,psihb-x";
>  
>  dc->desc= "PowerNV PSI Controller POWER9";
>  dc->realize = pnv_psi_power9_realize;
> @@ -897,6 +889,8 @@ static void pnv_psi_power9_class_init(ObjectClass *klass, 
> void *data)
>  ppc->xscom_size = PNV9_XSCOM_PSIHB_SIZE;
>  ppc->bar_mask   = PSIHB9_BAR_MASK;
>  ppc->irq_set= pnv_psi_power9_irq_set;
> +ppc->compat = compat;
> +ppc->compat_size = sizeof(compat);
>  
>  xfc->notify  = pnv_psi_notify;
>  }
> @@ -917,12 +911,15 @@ static void pnv_psi_power10_class_init(ObjectClass 
> *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvPsiClass *ppc = PNV_PSI_CLASS(klass);
> +static const char compat[] = "ibm,power10-psihb-x\0ibm,psihb-x";
>  
>  dc->desc= "PowerNV PSI Controller POWER10";
>  
>  ppc->chip_type  = PNV_CHIP_POWER10;
>  ppc->xscom_pcba = PNV10_XSCOM_PSIHB_BASE;
>  ppc->xscom_size = PNV10_XSCOM_PSIHB_SIZE;
> +ppc->compat = compat;
> +ppc->compat_size = sizeof(compat);
>  }
>  
>  static const TypeInfo pnv_psi_power10_info = {
> diff --git a/include/hw/ppc/pnv_psi.h b/include/hw/ppc/pnv_psi.h
> index a044aab304ae..fc068c95e543 100644
> --- a/include/hw/ppc/pnv_psi.h
> +++ b/include/hw/ppc/pnv_psi.h
> @@ -83,6 +83,8 @@ typedef struct PnvPsiClass {
>  uint32_t xscom_pcba;
>  uint32_t xscom_size;
>  uint64_t bar_mask;
> +const char *compat;
> +int compat_size;
>  
>  void (*irq_set)(PnvPsi *psi, int, bool state);
>  } PnvPsiClass;
> 
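
A short sketch of why the patch stores a size next to the pointer: the "compatible" value packs several NUL-terminated strings into one buffer, so strlen() stops at the first entry and sizeof on a pointer only yields the pointer size. CompatInfo, make_p8() and count_strings() are hypothetical names for illustration; only the string literal is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two strings packed into one array, as in the patch. */
static const char compat_p8[] = "ibm,power8-psihb-x\0ibm,psihb-x";

/* Hypothetical stand-in for the two new class fields. */
typedef struct {
    const char *compat;
    int compat_size;
} CompatInfo;

/* sizeof must be taken where the array type is still visible. */
static CompatInfo make_p8(void)
{
    CompatInfo c = { compat_p8, (int)sizeof(compat_p8) };
    return c;
}

/* Count the NUL-terminated strings packed into the first size bytes. */
static int count_strings(const char *buf, int size)
{
    int n = 0;
    for (int i = 0; i < size; i++) {
        if (buf[i] == '\0') {
            n++;
        }
    }
    return n;
}
```

Here sizeof(compat_p8) covers both entries and both terminators, while strlen(compat_p8) only sees the first entry; once the array has decayed to a const char *, the size has to travel separately.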




Re: [PATCH 05/13] ppc/pnv: Introduce PnvMachineClass::dt_power_mgt()

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 12:59, Greg Kurz wrote:
> We add an extra node to advertise power management on some machines,
> namely powernv9 and powernv10. This is achieved by using the
> pnv_is_power9() and pnv_is_power10() helpers.
> 
> This can be achieved with QOM. Add a method to the base class for
> powernv machines and have it implemented by machine types that
> support power management instead.
> 
> Signed-off-by: Greg Kurz 


Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv.c         |   10 ++++++----
>  include/hw/ppc/pnv.h |    8 ++++++--
>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 5ac149b149d8..efc00f4cb67a 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -472,7 +472,7 @@ static void pnv_dt_isa(PnvMachineState *pnv, void *fdt)
> &args);
>  }
>  
> -static void pnv_dt_power_mgt(void *fdt)
> +static void pnv_dt_power_mgt(PnvMachineState *pnv, void *fdt)
>  {
>  int off;
>  
> @@ -540,9 +540,9 @@ static void *pnv_dt_create(MachineState *machine)
>  pnv_dt_bmc_sensors(pnv->bmc, fdt);
>  }
>  
> -/* Create an extra node for power management on Power9 and Power10 */
> -if (pnv_is_power9(pnv) || pnv_is_power10(pnv)) {
> -pnv_dt_power_mgt(fdt);
> +/* Create an extra node for power management on machines that support it 
> */
> +if (pmc->dt_power_mgt) {
> +pmc->dt_power_mgt(pnv, fdt);
>  }
>  
>  return fdt;
> @@ -1710,6 +1710,7 @@ static void pnv_machine_power9_class_init(ObjectClass 
> *oc, void *data)
>  
>  pmc->compat = compat;
>  pmc->compat_size = sizeof(compat);
> +pmc->dt_power_mgt = pnv_dt_power_mgt;
>  }
>  
>  static void pnv_machine_power10_class_init(ObjectClass *oc, void *data)
> @@ -1723,6 +1724,7 @@ static void pnv_machine_power10_class_init(ObjectClass 
> *oc, void *data)
>  
>  pmc->compat = compat;
>  pmc->compat_size = sizeof(compat);
> +pmc->dt_power_mgt = pnv_dt_power_mgt;
>  }
>  
>  static void pnv_machine_class_init(ObjectClass *oc, void *data)
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index d534746bd493..8a42c199b65c 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -190,6 +190,8 @@ PowerPCCPU *pnv_chip_find_cpu(PnvChip *chip, uint32_t 
> pir);
>  #define PNV_MACHINE_CLASS(klass) \
>  OBJECT_CLASS_CHECK(PnvMachineClass, klass, TYPE_PNV_MACHINE)
>  
> +typedef struct PnvMachineState PnvMachineState;
> +
>  typedef struct PnvMachineClass {
>  /*< private >*/
>  MachineClass parent_class;
> @@ -197,9 +199,11 @@ typedef struct PnvMachineClass {
>  /*< public >*/
>  const char *compat;
>  int compat_size;
> +
> +void (*dt_power_mgt)(PnvMachineState *pnv, void *fdt);
>  } PnvMachineClass;
>  
> -typedef struct PnvMachineState {
> +struct PnvMachineState {
>  /*< private >*/
>  MachineState parent_obj;
>  
> @@ -216,7 +220,7 @@ typedef struct PnvMachineState {
>  Notifier powerdown_notifier;
>  
>  PnvPnor  *pnor;
> -} PnvMachineState;
> +};
>  
>  static inline bool pnv_chip_is_power9(const PnvChip *chip)
>  {
> 
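
The pattern in this patch, an optional per-class method that callers test for NULL, can be sketched outside QOM as follows. All Demo* names are hypothetical and the node counter merely stands in for the device tree. The forward typedef mirrors the header change, which is needed because the class struct refers to the state type before it is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration so the class can mention the state type. */
typedef struct DemoMachine DemoMachine;

typedef struct {
    const char *name;
    /* Optional hook: NULL when the machine type has no power management. */
    void (*dt_power_mgt)(DemoMachine *m, int *node_count);
} DemoMachineClass;

struct DemoMachine {
    const DemoMachineClass *klass;
};

/* Stand-in for adding the extra power-management node. */
static void demo_dt_power_mgt(DemoMachine *m, int *node_count)
{
    (void)m;
    (*node_count)++;
}

static const DemoMachineClass demo_power8 = { "power8", NULL };
static const DemoMachineClass demo_power9 = { "power9", demo_dt_power_mgt };

/* Build the "tree": one root node, plus whatever the class hook adds. */
static int demo_dt_create(DemoMachine *m)
{
    int node_count = 1;
    if (m->klass->dt_power_mgt) {
        m->klass->dt_power_mgt(m, &node_count);
    }
    return node_count;
}
```

The caller no longer needs to know which CPU generations support the feature; a new machine type opts in by filling in the pointer.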




Re: [PATCH 04/13] ppc/pnv: Introduce PnvMachineClass and PnvMachineClass::compat

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 12:59, Greg Kurz wrote:
> The pnv_dt_create() function generates different contents for the
> "compatible" property of the root node in the DT, depending on the
> CPU type. This is open coded with multiple ifs using pnv_is_powerXX()
> helpers.
> 
> It seems cleaner to achieve with QOM. Introduce a base class for the
> powernv machine and a compat attribute that each child class can use
> to provide the value for the "compatible" property.
> 
> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 


> ---
>  hw/ppc/pnv.c         |   33 +++++++++++++++++++--------------
>  include/hw/ppc/pnv.h |   13 +++++++++++++
>  2 files changed, 32 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 0be0b6b411c3..5ac149b149d8 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -484,9 +484,7 @@ static void pnv_dt_power_mgt(void *fdt)
>  
>  static void *pnv_dt_create(MachineState *machine)
>  {
> -const char plat_compat8[] = "qemu,powernv8\0qemu,powernv\0ibm,powernv";
> -const char plat_compat9[] = "qemu,powernv9\0ibm,powernv";
> -const char plat_compat10[] = "qemu,powernv10\0ibm,powernv";
> +PnvMachineClass *pmc = PNV_MACHINE_GET_CLASS(machine);
>  PnvMachineState *pnv = PNV_MACHINE(machine);
>  void *fdt;
>  char *buf;
> @@ -504,17 +502,8 @@ static void *pnv_dt_create(MachineState *machine)
>  _FDT((fdt_setprop_cell(fdt, 0, "#size-cells", 0x2)));
>  _FDT((fdt_setprop_string(fdt, 0, "model",
>   "IBM PowerNV (emulated by qemu)")));
> -    if (pnv_is_power10(pnv)) {
> -        _FDT((fdt_setprop(fdt, 0, "compatible", plat_compat10,
> -                          sizeof(plat_compat10))));
> -    } else if (pnv_is_power9(pnv)) {
> -        _FDT((fdt_setprop(fdt, 0, "compatible", plat_compat9,
> -                          sizeof(plat_compat9))));
> -    } else {
> -        _FDT((fdt_setprop(fdt, 0, "compatible", plat_compat8,
> -                          sizeof(plat_compat8))));
> -    }
> -
> +    _FDT((fdt_setprop(fdt, 0, "compatible", pmc->compat,
> +                      sizeof(pmc->compat))));
>  
>  buf =  qemu_uuid_unparse_strdup(&qemu_uuid);
>  _FDT((fdt_setprop_string(fdt, 0, "vm,uuid", buf)));
> @@ -1692,6 +1681,8 @@ static void pnv_machine_power8_class_init(ObjectClass 
> *oc, void *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
>  XICSFabricClass *xic = XICS_FABRIC_CLASS(oc);
> +PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
> +static const char compat[] = "qemu,powernv8\0qemu,powernv\0ibm,powernv";
>  
>  mc->desc = "IBM PowerNV (Non-Virtualized) POWER8";
>  mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
> @@ -1699,26 +1690,39 @@ static void pnv_machine_power8_class_init(ObjectClass 
> *oc, void *data)
>  xic->icp_get = pnv_icp_get;
>  xic->ics_get = pnv_ics_get;
>  xic->ics_resend = pnv_ics_resend;
> +
> +pmc->compat = compat;
> +pmc->compat_size = sizeof(compat);
>  }
>  
>  static void pnv_machine_power9_class_init(ObjectClass *oc, void *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
>  XiveFabricClass *xfc = XIVE_FABRIC_CLASS(oc);
> +PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
> +static const char compat[] = "qemu,powernv9\0ibm,powernv";
>  
>  mc->desc = "IBM PowerNV (Non-Virtualized) POWER9";
>  mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
>  xfc->match_nvt = pnv_match_nvt;
>  
>  mc->alias = "powernv";
> +
> +pmc->compat = compat;
> +pmc->compat_size = sizeof(compat);
>  }
>  
>  static void pnv_machine_power10_class_init(ObjectClass *oc, void *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
> +PnvMachineClass *pmc = PNV_MACHINE_CLASS(oc);
> +static const char compat[] = "qemu,powernv10\0ibm,powernv";
>  
>  mc->desc = "IBM PowerNV (Non-Virtualized) POWER10";
>  mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power10_v1.0");
> +
> +pmc->compat = compat;
> +pmc->compat_size = sizeof(compat);
>  }
>  
>  static void pnv_machine_class_init(ObjectClass *oc, void *data)
> @@ -1796,6 +1800,7 @@ static const TypeInfo types[] = {
>  .instance_size = sizeof(PnvMachineState),
>  .instance_init = pnv_machine_instance_init,
>  .class_init= pnv_machine_class_init,
> +.class_size= sizeof(PnvMachineClass),
>  .interfaces = (InterfaceInfo[]) {
>  { TYPE_INTERRUPT_STATS_PROVIDER },
>  { },
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 92f80b1ccead..d534746bd493 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -185,6 +185,19 @@ PowerPCCPU *pnv_chip_find_cpu(PnvChip *chip, uint32_t 
> pir);
>  #define TYPE_PNV_MACHINE   MACHINE_TYPE_NAME("powernv")
>  #define PNV_MACHINE(obj) \
>  OBJECT_CHECK(PnvMachineState, (obj), TYPE_PNV_MACHINE)
> +#define PNV_MACHINE_GET_CLASS(obj) \
> +OBJECT_GET_CLASS(PnvMachineClass, obj, TYPE_PNV_MAC

Re: [PATCH v2 1/2] python/qemu: Add set_qmp_monitor() to QEMUMachine

2019-12-13 Thread Wainer dos Santos Moschetta



On 12/12/19 12:13 PM, Cleber Rosa wrote:

On Wed, Dec 11, 2019 at 01:55:35PM -0500, Wainer dos Santos Moschetta wrote:

The QEMUMachine VM has a monitor setup on which a QMP
connection is always attempted on _post_launch() (executed
by launch()). If the QEMU process exits immediately,
qmp.accept() (used to establish the connection) stalls
until it times out, and an exception is raised.

That behavior is undesirable when, for instance, the caller needs to
gather information from the QEMU binary ($ qemu -cpu list) or when a
test launches the VM expecting it to fail.

This patch adds the set_qmp_monitor() method to QEMUMachine, which
allows turning off the creation of the monitor machinery on VM launch.

Signed-off-by: Wainer dos Santos Moschetta 
Reviewed-by: Cleber Rosa 
---
  python/qemu/machine.py | 66 +++---
  1 file changed, 43 insertions(+), 23 deletions(-)

diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index a4631d6934..7d4d621a42 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -104,6 +104,7 @@ class QEMUMachine(object):
  self._events = []
  self._iolog = None
  self._socket_scm_helper = socket_scm_helper
+self._qmp_set = True   # Enable QMP monitor by default.
  self._qmp = None
  self._qemu_full_args = None
  self._test_dir = test_dir
@@ -228,15 +229,16 @@ class QEMUMachine(object):
  self._iolog = iolog.read()
  
  def _base_args(self):

-if isinstance(self._monitor_address, tuple):
-moncdev = "socket,id=mon,host=%s,port=%s" % (
+args = ['-display', 'none', '-vga', 'none']
+if self._qmp_set:
+if isinstance(self._monitor_address, tuple):
+moncdev = "socket,id=mon,host=%s,port=%s" % (
  self._monitor_address[0],
  self._monitor_address[1])

One thing I missed in my review on v1 was this now became badly
indented.  No worries, it's a minor issue that I can fix on my side
when queueing this patch.


Good catch. Thanks!

- Wainer



- Cleber.





Re: [PATCH v5 3/5] tpm_spapr: Support suspend and resume

2019-12-13 Thread Stefan Berger

On 12/13/19 12:39 AM, David Gibson wrote:

On Thu, Dec 12, 2019 at 03:24:28PM -0500, Stefan Berger wrote:

Extend the tpm_spapr frontend with VM suspend and resume support.

Signed-off-by: Stefan Berger 

diff --git a/hw/tpm/tpm_spapr.c b/hw/tpm/tpm_spapr.c
index c4a67e2403..8f5a142bd4 100644
--- a/hw/tpm/tpm_spapr.c
+++ b/hw/tpm/tpm_spapr.c
@@ -87,6 +87,8 @@ typedef struct {
  TPMVersion be_tpm_version;
  
  size_t be_buffer_size;

+
+bool deliver_response; /* whether to deliver response after VM resume */
  } SPAPRvTPMState;
  
  static void tpm_spapr_show_buffer(const unsigned char *buffer,

@@ -256,6 +258,12 @@ static void tpm_spapr_request_completed(TPMIf *ti, int ret)
  uint32_t len;
  int rc;
  
+if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {

I'm trying to figure out the circumstances in which
request_completed() would get called before post_load on the
destination.



This is on the source side where we must not deliver the response in 
case the devices are now suspending but defer the delivery to after the 
resume.
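
A stripped-down sketch of that flow, with hypothetical Demo* names: the delivered counter stands in for the DMA write into guest memory, and the migrating argument stands in for the runstate check in the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool deliver_response;  /* a response is waiting to be delivered */
    int delivered;          /* stand-in for the DMA write to the guest */
} DemoTpm;

/* Called when the backend has finished a request. While the VM is being
 * suspended, only remember that a response is pending. */
static void demo_request_completed(DemoTpm *s, bool migrating)
{
    if (migrating) {
        s->deliver_response = true;
        return;
    }
    s->delivered++;
}

/* Called after resume: flush a pending response exactly once. */
static void demo_post_load(DemoTpm *s)
{
    if (s->deliver_response) {
        demo_request_completed(s, false);
        s->deliver_response = false;
    }
}
```

A completion that arrives during suspend leaves guest memory untouched; the first post-load call delivers it and clears the flag, so a second call is a no-op.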






+/* defer delivery of response until .post_load */
+s->deliver_response |= true;

|= is a bitwise OR which is not what you want, although it will
*probably* work in practice.  Better to just use
s->deliver_response = true;
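
For reference, a small sketch of the forms under discussion; the helper names are made up. On a C99 bool both operands are 0 or 1, so the bitwise form gives the same result as plain assignment or a logical OR, which is why it probably works in practice; the explicit forms just state the intent.

```c
#include <assert.h>
#include <stdbool.h>

/* Form used in the patch: bitwise OR with a constant. */
static bool set_with_or(bool flag)
{
    flag |= true;
    return flag;
}

/* Suggested form: plain assignment. */
static bool set_with_assign(bool flag)
{
    flag = true;
    return flag;
}

/* Accumulating a new result into an existing flag, bitwise. */
static bool accumulate_bitwise(bool flag, bool result)
{
    flag |= result;
    return flag;
}

/* Same accumulation spelled as a logical OR. result is already evaluated
 * by the caller, so short-circuiting cannot skip a function call here. */
static bool accumulate_logical(bool flag, bool result)
{
    return result || flag;
}
```

Note that when a flag set earlier must survive a later false result, a plain assignment of the new result would not be equivalent; that case needs one of the accumulating forms.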


+return;
+}
+
  s->state = SPAPR_VTPM_STATE_COMPLETION;
  
  /* a max. of be_buffer_size bytes can be transported */

@@ -316,6 +324,7 @@ static void tpm_spapr_reset(SpaprVioDevice *dev)
  SPAPRvTPMState *s = VIO_SPAPR_VTPM(dev);
  
  s->state = SPAPR_VTPM_STATE_NONE;

+s->deliver_response = false;
  
  s->be_tpm_version = tpm_backend_get_tpm_version(s->be_driver);

  tpm_spapr_update_deviceclass(dev);
@@ -339,9 +348,53 @@ static enum TPMVersion tpm_spapr_get_version(TPMIf *ti)
  return tpm_backend_get_tpm_version(s->be_driver);
  }
  
+/* persistent state handling */

+
+static int tpm_spapr_pre_save(void *opaque)
+{
+SPAPRvTPMState *s = opaque;
+
+s->deliver_response |= tpm_backend_finish_sync(s->be_driver);

Same problem here.


+trace_tpm_spapr_pre_save(s->deliver_response);
+/*
+ * we cannot deliver the results to the VM since DMA would touch VM memory
+ */
+
+return 0;
+}
+
+static int tpm_spapr_post_load(void *opaque, int version_id)
+{
+SPAPRvTPMState *s = opaque;
+
+if (s->deliver_response) {
+trace_tpm_spapr_post_load();
+/* deliver the results to the VM via DMA */
+tpm_spapr_request_completed(TPM_IF(s), 0);
+s->deliver_response = false;
+}
+
+return 0;
+}
+
  static const VMStateDescription vmstate_spapr_vtpm = {
  .name = "tpm-spapr",
-.unmigratable = 1,
+.version_id = 1,
+.minimum_version_id = 0,
+.minimum_version_id_old = 0,
+.pre_save = tpm_spapr_pre_save,
+.post_load = tpm_spapr_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_SPAPR_VIO(vdev, SPAPRvTPMState),
+
+VMSTATE_UINT8(state, SPAPRvTPMState),
+VMSTATE_BUFFER(buffer, SPAPRvTPMState),

Transferring the whole 4kiB buffer unconditionally when it mostly
won't have anything useful in it doesn't seem like a great idea.



It's really only needed in case of a 'delayed response'. So, yeah, we 
could transfer data in only that case then.






+/* remember DMA address */
+VMSTATE_UINT32(crq.s.data, SPAPRvTPMState),
+VMSTATE_BOOL(deliver_response, SPAPRvTPMState),
+VMSTATE_END_OF_LIST(),
+}
  };
  
  static Property tpm_spapr_properties[] = {

diff --git a/hw/tpm/trace-events b/hw/tpm/trace-events
index 6278a39618..d109661b96 100644
--- a/hw/tpm/trace-events
+++ b/hw/tpm/trace-events
@@ -67,3 +67,5 @@ tpm_spapr_do_crq_get_version(uint32_t version) "response: version 
%u"
  tpm_spapr_do_crq_prepare_to_suspend(void) "response: preparing to suspend"
  tpm_spapr_do_crq_unknown_msg_type(uint8_t type) "Unknown message type 0x%02x"
  tpm_spapr_do_crq_unknown_crq(uint8_t raw1, uint8_t raw2) "unknown CRQ 0x%02x 
0x%02x ..."
+tpm_spapr_pre_save(bool v) "TPM response to deliver after resume: %d"
+tpm_spapr_post_load(void) "Delivering TPM response after resume"






RE: [PATCH 0/5] ARM virt: Add NVDIMM support

2019-12-13 Thread Shameerali Kolothum Thodi
Hi Igor,

> -Original Message-
> From: Igor Mammedov [mailto:imamm...@redhat.com]
> Sent: 11 December 2019 07:57
> To: Shameerali Kolothum Thodi 
> Cc: xiaoguangrong.e...@gmail.com; peter.mayd...@linaro.org;
> drjo...@redhat.com; shannon.zha...@gmail.com; qemu-devel@nongnu.org;
> Linuxarm ; Auger Eric ;
> qemu-...@nongnu.org; xuwei (O) ;
> ler...@redhat.com
> Subject: Re: [PATCH 0/5] ARM virt: Add NVDIMM support

[...]

> > I couldn't figure out yet why these extra 4 bytes are added by AML code on ARM64
> > when nvdimm_dsm_func_read_fit() returns NvdimmFuncReadFITOut without
> > any FIT data, i.e. when the FIT buffer len (read_len) is zero.
> >
> > But the below will fix this issue,
> >
> > diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> > index f91eea3802..cddf95f4c1 100644
> > --- a/hw/acpi/nvdimm.c
> > +++ b/hw/acpi/nvdimm.c
> > @@ -588,7 +588,7 @@ static void
> nvdimm_dsm_func_read_fit(NVDIMMState *state, NvdimmDsmIn *in,
> >  nvdimm_debug("Read FIT: offset %#x FIT size %#x Dirty %s.\n",
> >   read_fit->offset, fit->len, fit_buf->dirty ? "Yes" : 
> > "No");
> >
> > -if (read_fit->offset > fit->len) {
> > +if (read_fit->offset >= fit->len) {
> >  func_ret_status = NVDIMM_DSM_RET_STATUS_INVALID;
> >  goto exit;
> >  }
> >
> >
> > This will return an error code to AML in the second iteration when there is no further
> > FIT data to report. But I am not sure why this check was omitted in the first place.
> >
> > Please let me know if this is acceptable and then probably I can look into a v2 of this
> > series.
> Sorry, I don't have capacity to debug this right now,

No problem.

> but I'd prefer if 'why' question was answered first.

Right.

> Anyways, if something is unclear in how concrete AML code is built/works,
> feel free to ask and I'll try to explain and guide you.

Thanks for your help. I did spend some more time debugging this further.
I tried introducing a totally new Buffer field object with different
sizes and printing the size after creation.

--- SSDT.dsl2019-12-12 15:28:21.976986949 +
+++ SSDT-arm64-dbg.dsl  2019-12-13 12:17:11.026806186 +
@@ -18,7 +18,7 @@
  * Compiler ID  "BXPC"
  * Compiler Version 0x0001 (1)
  */
-DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0001)
+DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0002)
 {
 Scope (\_SB)
 {
@@ -48,6 +48,11 @@
 RLEN,   32, 
 ODAT,   32736
 }
+
+Field (NRAM, DWordAcc, NoLock, Preserve)
+{
+NBUF,   32768 
+}
 
 If ((Arg4 == Zero))
 {
@@ -87,6 +92,12 @@
 Local3 = DerefOf (Local2)
 FARG = Local3
 }
+   
+Local2 = 0x2 
+printf("AML:NVDIMM Creating TBUF with bytes %o", Local2)
+CreateField (NBUF, Zero, (Local2 << 3), TBUF)
+Concatenate (Buffer (Zero){}, TBUF, Local3)
+printf("AML:NVDIMM Size of TBUF(Local3) %o", SizeOf(Local3))
 
 NTFI = Local6
 Local1 = (RLEN - 0x04)

I then ran it with different values of Local2. On ARM64 it looks like this:

When Local2 < 8, the created buffer size is always 8 bytes:

"AML:NVDIMM Creating TBUF with bytes 0002"
"AML:NVDIMM Size of TBUF(Local3) 0008"

...
"AML:NVDIMM Creating TBUF with bytes 0005"
"AML:NVDIMM Size of TBUF(Local3) 0008"

And once Local2 >=8, it gets the correct size,

"AML:NVDIMM Creating TBUF with bytes 0009"
"AML:NVDIMM Size of TBUF(Local3) 0009"


On x86, however, the behavior is different:

When Local2 < 4, the created buffer size is always 4 bytes:

"AML:NVDIMM Creating TBUF with bytes 0002"
"AML:NVDIMM Size of TBUF(Local3) 0004"

"AML:NVDIMM Creating TBUF with bytes 0003"
"AML:NVDIMM Size of TBUF(Local3) 0004"

And once Local2 >= 4, it is ok

"AML:NVDIMM Creating TBUF with bytes 0005"
"AML:NVDIMM Size of TBUF(Local3) 0005"
...
"AML:NVDIMM Creating TBUF with bytes 0009"
"AML:NVDIMM Size of TBUF(Local3) 0009"

This is why it works on x86 and not on ARM64: as you may remember, on the
second iteration of the FIT buffer read the requested buffer size is 4.
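
The sizes reported in the logs above can be summarised in a few lines of C. This is only a model of the observation (the buffer never comes out smaller than 4 bytes on x86 or 8 bytes on ARM64); it makes no claim about why the AML interpreter behaves this way, and the names observed_buffer_size, OBSERVED_MIN_X86 and OBSERVED_MIN_ARM64 are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum sizes seen in the logs; observations, not ACPI rules. */
enum {
    OBSERVED_MIN_X86   = 4,
    OBSERVED_MIN_ARM64 = 8,
};

/*
 * Size of the buffer produced by the CreateField/Concatenate sequence
 * as observed in the debug output: requests below the per-platform
 * minimum are padded up to it, larger requests are honoured as-is.
 */
static size_t observed_buffer_size(size_t requested, size_t observed_min)
{
    assert(observed_min > 0);
    return requested < observed_min ? observed_min : requested;
}
```

For the 4-byte request made on the second FIT iteration, this model gives 4 on x86 and 8 on ARM64, which is consistent with the extra 4 bytes seen only on ARM64.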

I tried changing the AccessType of the below NBUF field from DWordAcc to
ByteAcc/BufferAcc, but no luck.

+Field (NRAM, DWordAcc, NoLock, Preserve)
+{
+NBUF,   32768 
+}

I am not sure what we need to change for ARM64 to create a buffer object of
size 4 here. Please let me know if you have any pointers to debug this further.

(I am attaching both x86 and ARM64 SSDT dsl used for reference)

Thanks,
Shameer


> > Thanks,
> > Shameer
> >
> >
> >




Re: [PATCH 06/13] ppc/pnv: Drop pnv_is_power9() and pnv_is_power10() helpers

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> They aren't used anymore.

Good ! 

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 


> ---
>  include/hw/ppc/pnv.h |   10 --
>  1 file changed, 10 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 8a42c199b65c..c213bdd5ecd3 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -227,11 +227,6 @@ static inline bool pnv_chip_is_power9(const PnvChip 
> *chip)
>  return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER9;
>  }
>  
> -static inline bool pnv_is_power9(PnvMachineState *pnv)
> -{
> -return pnv_chip_is_power9(pnv->chips[0]);
> -}
> -
>  PnvChip *pnv_get_chip(uint32_t chip_id);
>  
>  #define PNV_FDT_ADDR  0x0100
> @@ -242,11 +237,6 @@ static inline bool pnv_chip_is_power10(const PnvChip 
> *chip)
>  return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER10;
>  }
>  
> -static inline bool pnv_is_power10(PnvMachineState *pnv)
> -{
> -return pnv_chip_is_power10(pnv->chips[0]);
> -}
> -
>  /*
>   * BMC helpers
>   */
> 




Re: [PATCH 07/13] ppc/pnv: Introduce PnvChipClass::intc_print_info() method

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> The pnv_pic_print_info() callback checks the type of the chip in order
> to forward the request to the appropriate interrupt controller. This can
> be achieved with QOM. Introduce a method for this in the base chip class
> and implement it in child classes.
> 
> This also prepares ground for the upcoming interrupt controller of POWER10
> chips.
> 
> Signed-off-by: Greg Kurz 


Reviewed-by: Cédric Le Goater 

One comment below.

> ---
>  hw/ppc/pnv.c |   30 +-
>  include/hw/ppc/pnv.h |1 +
>  2 files changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index efc00f4cb67a..2a53e99bda2e 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -832,6 +832,12 @@ static void pnv_chip_power8_intc_destroy(PnvChip *chip, 
> PowerPCCPU *cpu)
>  pnv_cpu->intc = NULL;
>  }
>  
> +static void pnv_chip_power8_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
> +Monitor *mon)
> +{
> +icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
> +}
> +
>  /*
>   *0:48  Reserved - Read as zeroes
>   *   49:52  Node ID
> @@ -889,6 +895,12 @@ static void pnv_chip_power9_intc_destroy(PnvChip *chip, 
> PowerPCCPU *cpu)
>  pnv_cpu->intc = NULL;
>  }
>  
> +static void pnv_chip_power9_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
> +Monitor *mon)
> +{
> +xive_tctx_pic_print_info(XIVE_TCTX(pnv_cpu_state(cpu)->intc), mon);
> +}
> +
>  static void pnv_chip_power10_intc_create(PnvChip *chip, PowerPCCPU *cpu,
>  Error **errp)
>  {
> @@ -910,6 +922,11 @@ static void pnv_chip_power10_intc_destroy(PnvChip *chip, 
> PowerPCCPU *cpu)
>  pnv_cpu->intc = NULL;
>  }
>  
> +static void pnv_chip_power10_intc_print_info(PnvChip *chip, PowerPCCPU *cpu,
> + Monitor *mon)
> +{
> +}
> +
>  /*
>   * Allowed core identifiers on a POWER8 Processor Chip :
>   *
> @@ -1086,6 +1103,7 @@ static void pnv_chip_power8e_class_init(ObjectClass 
> *klass, void *data)
>  k->intc_create = pnv_chip_power8_intc_create;
>  k->intc_reset = pnv_chip_power8_intc_reset;
>  k->intc_destroy = pnv_chip_power8_intc_destroy;
> +k->intc_print_info = pnv_chip_power8_intc_print_info;
>  k->isa_create = pnv_chip_power8_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> @@ -1107,6 +1125,7 @@ static void pnv_chip_power8_class_init(ObjectClass 
> *klass, void *data)
>  k->intc_create = pnv_chip_power8_intc_create;
>  k->intc_reset = pnv_chip_power8_intc_reset;
>  k->intc_destroy = pnv_chip_power8_intc_destroy;
> +k->intc_print_info = pnv_chip_power8_intc_print_info;
>  k->isa_create = pnv_chip_power8_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> @@ -1128,6 +1147,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass 
> *klass, void *data)
>  k->intc_create = pnv_chip_power8_intc_create;
>  k->intc_reset = pnv_chip_power8_intc_reset;
>  k->intc_destroy = pnv_chip_power8_intc_destroy;
> +k->intc_print_info = pnv_chip_power8_intc_print_info;
>  k->isa_create = pnv_chip_power8nvl_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> @@ -1299,6 +1319,7 @@ static void pnv_chip_power9_class_init(ObjectClass 
> *klass, void *data)
>  k->intc_create = pnv_chip_power9_intc_create;
>  k->intc_reset = pnv_chip_power9_intc_reset;
>  k->intc_destroy = pnv_chip_power9_intc_destroy;
> +k->intc_print_info = pnv_chip_power9_intc_print_info;
>  k->isa_create = pnv_chip_power9_isa_create;
>  k->dt_populate = pnv_chip_power9_dt_populate;
>  k->pic_print_info = pnv_chip_power9_pic_print_info;
> @@ -1379,6 +1400,7 @@ static void pnv_chip_power10_class_init(ObjectClass 
> *klass, void *data)
>  k->intc_create = pnv_chip_power10_intc_create;
>  k->intc_reset = pnv_chip_power10_intc_reset;
>  k->intc_destroy = pnv_chip_power10_intc_destroy;
> +k->intc_print_info = pnv_chip_power10_intc_print_info;
>  k->isa_create = pnv_chip_power10_isa_create;
>  k->dt_populate = pnv_chip_power10_dt_populate;
>  k->pic_print_info = pnv_chip_power10_pic_print_info;
> @@ -1575,11 +1597,9 @@ static void pnv_pic_print_info(InterruptStatsProvider 
> *obj,
>  CPU_FOREACH(cs) {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
> -if (pnv_chip_is_power9(pnv->chips[0])) {
> -xive_tctx_pic_print_info(XIVE_TCTX(pnv_cpu_state(cpu)->intc), 
> mon);
> -} else {
> -icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
> -}
> +/* XXX: loop on each chip/core/thread instead of CPU_FOREACH() */

Maybe we should introduce a helper such as:

in

Re: [PATCH 08/13] ppc/pnv: Introduce PnvChipClass::xscom_core_base() method

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> The pnv_chip_core_realize() function configures the XSCOM MMIO subregion
> for each core of a single chip. The base address of the subregion depends
> on the CPU type. Its computation is currently open-coded using the
> pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce
> a method for this in the base chip class and implement it in child classes.

OK. We might need to introduce a PnvXscom model one day but this is fine
for now.

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv.c |   31 ---
>  include/hw/ppc/pnv.h |1 +
>  2 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 2a53e99bda2e..88efa755e611 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -616,6 +616,24 @@ static void pnv_chip_power9_pic_print_info(PnvChip 
> *chip, Monitor *mon)
>  pnv_psi_pic_print_info(&chip9->psi, mon);
>  }
>  
> +static uint64_t pnv_chip_power8_xscom_core_base(PnvChip *chip,
> +uint32_t core_id)
> +{
> +return PNV_XSCOM_EX_BASE(core_id);
> +}
> +
> +static uint64_t pnv_chip_power9_xscom_core_base(PnvChip *chip,
> +uint32_t core_id)
> +{
> +return PNV9_XSCOM_EC_BASE(core_id);
> +}
> +
> +static uint64_t pnv_chip_power10_xscom_core_base(PnvChip *chip,
> + uint32_t core_id)
> +{
> +return PNV10_XSCOM_EC_BASE(core_id);
> +}
> +
>  static bool pnv_match_cpu(const char *default_type, const char *cpu_type)
>  {
>  PowerPCCPUClass *ppc_default =
> @@ -1107,6 +1125,7 @@ static void pnv_chip_power8e_class_init(ObjectClass 
> *klass, void *data)
>  k->isa_create = pnv_chip_power8_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> +k->xscom_core_base = pnv_chip_power8_xscom_core_base;
>  dc->desc = "PowerNV Chip POWER8E";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1129,6 +1148,7 @@ static void pnv_chip_power8_class_init(ObjectClass 
> *klass, void *data)
>  k->isa_create = pnv_chip_power8_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> +k->xscom_core_base = pnv_chip_power8_xscom_core_base;
>  dc->desc = "PowerNV Chip POWER8";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1151,6 +1171,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass 
> *klass, void *data)
>  k->isa_create = pnv_chip_power8nvl_isa_create;
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
> +k->xscom_core_base = pnv_chip_power8_xscom_core_base;
>  dc->desc = "PowerNV Chip POWER8NVL";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1323,6 +1344,7 @@ static void pnv_chip_power9_class_init(ObjectClass 
> *klass, void *data)
>  k->isa_create = pnv_chip_power9_isa_create;
>  k->dt_populate = pnv_chip_power9_dt_populate;
>  k->pic_print_info = pnv_chip_power9_pic_print_info;
> +k->xscom_core_base = pnv_chip_power9_xscom_core_base;
>  dc->desc = "PowerNV Chip POWER9";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power9_realize,
> @@ -1404,6 +1426,7 @@ static void pnv_chip_power10_class_init(ObjectClass 
> *klass, void *data)
>  k->isa_create = pnv_chip_power10_isa_create;
>  k->dt_populate = pnv_chip_power10_dt_populate;
>  k->pic_print_info = pnv_chip_power10_pic_print_info;
> +k->xscom_core_base = pnv_chip_power10_xscom_core_base;
>  dc->desc = "PowerNV Chip POWER10";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power10_realize,
> @@ -1491,13 +1514,7 @@ static void pnv_chip_core_realize(PnvChip *chip, Error 
> **errp)
>   &error_fatal);
>  
>  /* Each core has an XSCOM MMIO region */
> -if (pnv_chip_is_power10(chip)) {
> -xscom_core_base = PNV10_XSCOM_EC_BASE(core_hwid);
> -} else if (pnv_chip_is_power9(chip)) {
> -xscom_core_base = PNV9_XSCOM_EC_BASE(core_hwid);
> -} else {
> -xscom_core_base = PNV_XSCOM_EX_BASE(core_hwid);
> -}
> +xscom_core_base = pcc->xscom_core_base(chip, core_hwid);
>  
>  pnv_xscom_add_subregion(chip, xscom_core_base,
>  &pnv_core->xscom_regs);
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 7d2402784d4b..17ca9a14ac8f 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -137,6 +137,7 @@ typedef struct PnvChipClass {
>  ISABus *(*isa_create)(PnvChip *chip, Error **errp);
>  void (*dt_populate)(PnvChip *chip, void *fdt);
>  void (*pic_print_info)(PnvChip *chip, Monitor *mon);
> +uint64_t (*xscom

Re: [PATCH 09/13] ppc/pnv: Pass XSCOM base address and address size to pnv_dt_xscom()

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
> it shouldn't have to guess the chip type in order to populate the "reg"
> property. Just pass the base address and address size as arguments.

Much better,
 
> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv.c   |   12 +---
>  hw/ppc/pnv_xscom.c |   16 +++-
>  include/hw/ppc/pnv_xscom.h |3 ++-
>  3 files changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 88efa755e611..c532e98e752a 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -282,7 +282,9 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
> void *fdt)
>  {
>  int i;
>  
> -pnv_dt_xscom(chip, fdt, 0);
> +pnv_dt_xscom(chip, fdt, 0,
> + cpu_to_be64(PNV_XSCOM_BASE(chip)),
> + cpu_to_be64(PNV_XSCOM_SIZE));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> @@ -302,7 +304,9 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
> void *fdt)
>  {
>  int i;
>  
> -pnv_dt_xscom(chip, fdt, 0);
> +pnv_dt_xscom(chip, fdt, 0,
> + cpu_to_be64(PNV9_XSCOM_BASE(chip)),
> + cpu_to_be64(PNV9_XSCOM_SIZE));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> @@ -321,7 +325,9 @@ static void pnv_chip_power10_dt_populate(PnvChip *chip, 
> void *fdt)
>  {
>  int i;
>  
> -pnv_dt_xscom(chip, fdt, 0);
> +pnv_dt_xscom(chip, fdt, 0,
> + cpu_to_be64(PNV10_XSCOM_BASE(chip)),
> + cpu_to_be64(PNV10_XSCOM_SIZE));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index df926003f2ba..8189767eb0bb 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -286,24 +286,14 @@ static const char compat_p8[] = 
> "ibm,power8-xscom\0ibm,xscom";
>  static const char compat_p9[] = "ibm,power9-xscom\0ibm,xscom";
>  static const char compat_p10[] = "ibm,power10-xscom\0ibm,xscom";
>  
> -int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset)
> +int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
> + uint64_t xscom_base, uint64_t xscom_size)
>  {
> -uint64_t reg[2];
> +uint64_t reg[] = { xscom_base, xscom_size };
>  int xscom_offset;
>  ForeachPopulateArgs args;
>  char *name;
>  
> -if (pnv_chip_is_power10(chip)) {
> -reg[0] = cpu_to_be64(PNV10_XSCOM_BASE(chip));
> -reg[1] = cpu_to_be64(PNV10_XSCOM_SIZE);
> -} else if (pnv_chip_is_power9(chip)) {
> -reg[0] = cpu_to_be64(PNV9_XSCOM_BASE(chip));
> -reg[1] = cpu_to_be64(PNV9_XSCOM_SIZE);
> -} else {
> -reg[0] = cpu_to_be64(PNV_XSCOM_BASE(chip));
> -reg[1] = cpu_to_be64(PNV_XSCOM_SIZE);
> -}
> -
>  name = g_strdup_printf("xscom@%" PRIx64, be64_to_cpu(reg[0]));
>  xscom_offset = fdt_add_subnode(fdt, root_offset, name);
>  _FDT(xscom_offset);
> diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
> index 2bdb7ae84fd3..ad53f788b44c 100644
> --- a/include/hw/ppc/pnv_xscom.h
> +++ b/include/hw/ppc/pnv_xscom.h
> @@ -114,7 +114,8 @@ typedef struct PnvXScomInterfaceClass {
>  #define PNV10_XSCOM_PSIHB_SIZE 0x100
>  
>  void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
> -int pnv_dt_xscom(PnvChip *chip, void *fdt, int offset);
> +int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
> + uint64_t xscom_base, uint64_t xscom_size);
>  
>  void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
>   MemoryRegion *mr);
> 




Re: [PATCH v5 1/5] tpm_spapr: Support TPM for ppc64 using CRQ based interface

2019-12-13 Thread Stefan Berger

On 12/13/19 12:34 AM, David Gibson wrote:

On Thu, Dec 12, 2019 at 03:24:26PM -0500, Stefan Berger wrote:

Implement support for TPM on ppc64 by implementing the vTPM CRQ interface
as a frontend. It can use the tpm_emulator driver backend with the external
swtpm.

The Linux vTPM driver for ppc64 works with this emulation.

This TPM emulator also handles the TPM 2 case.

Signed-off-by: Stefan Berger 
Reviewed-by: David Gibson 

diff --git a/hw/tpm/Kconfig b/hw/tpm/Kconfig
index 4c8ee87d67..66a570aac1 100644
--- a/hw/tpm/Kconfig
+++ b/hw/tpm/Kconfig
@@ -22,3 +22,9 @@ config TPM_EMULATOR
  bool
  default y
  depends on TPMDEV
+
+config TPM_SPAPR
+bool
+default n
+select TPMDEV
+depends on PSERIES
diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index de0b85d02a..85eb99ae05 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -4,3 +4,4 @@ common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
  common-obj-$(CONFIG_TPM_CRB) += tpm_crb.o
  common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o
  common-obj-$(CONFIG_TPM_EMULATOR) += tpm_emulator.o
+obj-$(CONFIG_TPM_SPAPR) += tpm_spapr.o
diff --git a/hw/tpm/tpm_spapr.c b/hw/tpm/tpm_spapr.c
new file mode 100644
index 00..c4a67e2403
--- /dev/null
+++ b/hw/tpm/tpm_spapr.c
@@ -0,0 +1,405 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Virtual TPM
+ *
+ * Copyright (c) 2015, 2017 IBM Corporation.
+ *
+ * Authors:
+ *Stefan Berger 
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/qdev-properties.h"
+#include "migration/vmstate.h"
+
+#include "sysemu/tpm_backend.h"
+#include "tpm_int.h"
+#include "tpm_util.h"
+
+#include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_vio.h"
+#include "trace.h"
+
+#define DEBUG_SPAPR 0
+
+#define VIO_SPAPR_VTPM(obj) \
+ OBJECT_CHECK(SPAPRvTPMState, (obj), TYPE_TPM_SPAPR)
+
+typedef struct VioCRQ {

How does this structure relate to the existing SpaprVioCrq?


The existing one looks like this:

typedef struct SpaprVioCrq {
    uint64_t qladdr;
    uint32_t qsize;
    uint32_t qnext;
    int(*SendFunc)(struct SpaprVioDevice *vdev, uint8_t *crq);
} SpaprVioCrq;

I don't seem to find the fields there that we need for vTPM support.



Also we're now avoiding exceptions to StudlyCaps, because it causes
more confusion even if it is to match other capitalization
conventions.  So, I'd suggest 'VioCrq', 'TpmSpaprCrq' etc.



Will adjust.





+uint8_t valid;  /* 0x80: cmd; 0xc0: init crq */
+/* 0x81-0x83: CRQ message response */
+uint8_t msg;/* see below */
+uint16_t len;   /* len of TPM request; len of TPM response */
+uint32_t data;  /* rtce_dma_handle when sending TPM request */
+uint64_t reserved;
+} VioCRQ;
+
+typedef union TPMSpaprCRQ {
+VioCRQ s;
+uint8_t raw[sizeof(VioCRQ)];
+} TPMSpaprCRQ;

A union just to get raw bytes seems a really weird thing to do (as
opposed to just casting to (char *))



Ok, I will change it.
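
As an illustration of the suggested alternative, the raw bytes can be reached through a pointer cast instead of a wrapper union. The sketch assumes the field layout quoted in the patch; SketchVioCrq and sketch_crq_bytes() are hypothetical names, and the 16-byte size check reflects the usual ABI layout rather than anything the patch states.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the VioCRQ in the patch, under a sketch name. */
typedef struct SketchVioCrq {
    uint8_t valid;
    uint8_t msg;
    uint16_t len;
    uint32_t data;
    uint64_t reserved;
} SketchVioCrq;

/* The fields pack without padding on common ABIs; fail the build otherwise. */
_Static_assert(sizeof(SketchVioCrq) == 16, "CRQ is expected to be 16 bytes");

/*
 * View the CRQ as raw bytes. Accessing an object through a pointer to
 * unsigned char (uint8_t here) is permitted by the C aliasing rules,
 * so no union is required.
 */
static const uint8_t *sketch_crq_bytes(const SketchVioCrq *crq)
{
    assert(crq != NULL);
    return (const uint8_t *)crq;
}
```

Note that multi-byte fields such as len and data appear in host byte order in the raw view, so any code consuming the bytes still has to handle endianness explicitly.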






+
+#define SPAPR_VTPM_VALID_INIT_CRQ_COMMAND  0xC0
+#define SPAPR_VTPM_VALID_COMMAND   0x80
+#define SPAPR_VTPM_MSG_RESULT  0x80
+
+/* msg types for valid = SPAPR_VTPM_VALID_INIT_CRQ */
+#define SPAPR_VTPM_INIT_CRQ_RESULT   0x1
+#define SPAPR_VTPM_INIT_CRQ_COMPLETE_RESULT  0x2
+
+/* msg types for valid = SPAPR_VTPM_VALID_CMD */
+#define SPAPR_VTPM_GET_VERSION   0x1
+#define SPAPR_VTPM_TPM_COMMAND   0x2
+#define SPAPR_VTPM_GET_RTCE_BUFFER_SIZE  0x3
+#define SPAPR_VTPM_PREPARE_TO_SUSPEND0x4
+
+/* response error messages */
+#define SPAPR_VTPM_VTPM_ERROR0xff
+
+/* error codes */
+#define SPAPR_VTPM_ERR_COPY_IN_FAILED0x3
+#define SPAPR_VTPM_ERR_COPY_OUT_FAILED   0x4
+
+#define MAX_BUFFER_SIZE TARGET_PAGE_SIZE
+
+typedef struct {
+SpaprVioDevice vdev;
+
+TPMSpaprCRQ crq; /* track single TPM command */
+
+uint8_t state;
+#define SPAPR_VTPM_STATE_NONE 0
+#define SPAPR_VTPM_STATE_EXECUTION1
+#define SPAPR_VTPM_STATE_COMPLETION   2

I see this field written, but never read.  What's up with that?



    if (s->state == SPAPR_VTPM_STATE_EXECUTION) {
    return H_BUSY;
    }

Is this what you mean?





+
+unsigned char buffer[MAX_BUFFER_SIZE];
+
+TPMBackendCmd cmd;
+
+TPMBackend *be_driver;
+TPMVersion be_tpm_version;
+
+size_t be_buffer_size;
+} SPAPRvTPMState;

SpaprVtpmState

Or just SpaprTpmState, since we use just "tpm spapr" rather than
"vtpm" in plenty of other places.



Will adjust.





+
+static void tpm_spapr_show_buffer(const unsigned char *buffer,
+  size_t buffer_size, const char *string)
+{
+size_t len, i;
+char *line_buffer, *p;
+
+len = MIN(tpm_cmd_get_size(buffer), buffer_size);
+
+/*

Re: [PATCH 10/13] ppc/pnv: Pass content of the "compatible" property to pnv_dt_xscom()

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
> it shouldn't have to guess the chip type in order to populate the
> "compatible" property. Just pass the compat string and its size as
> arguments.

Yeah. That is where I think a PnvXscom model and class could be a little
cleaner. This is minor.
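
A side note on why the patch passes sizeof(compat) rather than relying on the string alone: the "compatible" value is a list of NUL-separated strings, so strlen() stops at the first entry. The sketch below demonstrates this with the POWER8 string from the patch; sketch_compat and sketch_count_strings() are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two strings packed into one array, as in the patch. */
static const char sketch_compat[] = "ibm,power8-xscom\0ibm,xscom";

/*
 * Count the NUL-terminated strings packed into a property of 'size'
 * bytes. An unterminated tail is not counted.
 */
static size_t sketch_count_strings(const char *prop, size_t size)
{
    size_t count = 0;
    size_t off = 0;

    assert(prop != NULL);
    while (off < size) {
        const char *nul = memchr(prop + off, '\0', size - off);

        if (nul == NULL) {
            break;
        }
        count++;
        off = (size_t)(nul - prop) + 1;
    }
    return count;
}
```

Here sizeof(sketch_compat) is 27 (both strings plus both terminators) while strlen(sketch_compat) is 16, so passing the array size is what keeps the second entry in the property.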

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv.c   |   12 +---
>  hw/ppc/pnv_xscom.c |   20 +++-
>  include/hw/ppc/pnv_xscom.h |3 ++-
>  3 files changed, 14 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index c532e98e752a..0447b534b8c5 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -280,11 +280,13 @@ static void pnv_dt_icp(PnvChip *chip, void *fdt, 
> uint32_t pir,
>  
>  static void pnv_chip_power8_dt_populate(PnvChip *chip, void *fdt)
>  {
> +static const char compat[] = "ibm,power8-xscom\0ibm,xscom";
>  int i;
>  
>  pnv_dt_xscom(chip, fdt, 0,
>   cpu_to_be64(PNV_XSCOM_BASE(chip)),
> - cpu_to_be64(PNV_XSCOM_SIZE));
> + cpu_to_be64(PNV_XSCOM_SIZE),
> + compat, sizeof(compat));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> @@ -302,11 +304,13 @@ static void pnv_chip_power8_dt_populate(PnvChip *chip, 
> void *fdt)
>  
>  static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
>  {
> +static const char compat[] = "ibm,power9-xscom\0ibm,xscom";
>  int i;
>  
>  pnv_dt_xscom(chip, fdt, 0,
>   cpu_to_be64(PNV9_XSCOM_BASE(chip)),
> - cpu_to_be64(PNV9_XSCOM_SIZE));
> + cpu_to_be64(PNV9_XSCOM_SIZE),
> + compat, sizeof(compat));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> @@ -323,11 +327,13 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, 
> void *fdt)
>  
>  static void pnv_chip_power10_dt_populate(PnvChip *chip, void *fdt)
>  {
> +static const char compat[] = "ibm,power10-xscom\0ibm,xscom";
>  int i;
>  
>  pnv_dt_xscom(chip, fdt, 0,
>   cpu_to_be64(PNV10_XSCOM_BASE(chip)),
> - cpu_to_be64(PNV10_XSCOM_SIZE));
> + cpu_to_be64(PNV10_XSCOM_SIZE),
> + compat, sizeof(compat));
>  
>  for (i = 0; i < chip->nr_cores; i++) {
>  PnvCore *pnv_core = chip->cores[i];
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index 8189767eb0bb..5ae9dfbb88ad 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -282,12 +282,9 @@ static int xscom_dt_child(Object *child, void *opaque)
>  return 0;
>  }
>  
> -static const char compat_p8[] = "ibm,power8-xscom\0ibm,xscom";
> -static const char compat_p9[] = "ibm,power9-xscom\0ibm,xscom";
> -static const char compat_p10[] = "ibm,power10-xscom\0ibm,xscom";
> -
>  int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
> - uint64_t xscom_base, uint64_t xscom_size)
> + uint64_t xscom_base, uint64_t xscom_size,
> + const char *compat, int compat_size)
>  {
>  uint64_t reg[] = { xscom_base, xscom_size };
>  int xscom_offset;
> @@ -302,18 +299,7 @@ int pnv_dt_xscom(PnvChip *chip, void *fdt, int 
> root_offset,
>  _FDT((fdt_setprop_cell(fdt, xscom_offset, "#address-cells", 1)));
>  _FDT((fdt_setprop_cell(fdt, xscom_offset, "#size-cells", 1)));
>  _FDT((fdt_setprop(fdt, xscom_offset, "reg", reg, sizeof(reg))));
> -
> -if (pnv_chip_is_power10(chip)) {
> -_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p10,
> -  sizeof(compat_p10))));
> -} else if (pnv_chip_is_power9(chip)) {
> -_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p9,
> -  sizeof(compat_p9))));
> -} else {
> -_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat_p8,
> -  sizeof(compat_p8))));
> -}
> -
> +_FDT((fdt_setprop(fdt, xscom_offset, "compatible", compat, 
> compat_size)));
>  _FDT((fdt_setprop(fdt, xscom_offset, "scom-controller", NULL, 0)));
>  
>  args.fdt = fdt;
> diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
> index ad53f788b44c..f74c81a980f3 100644
> --- a/include/hw/ppc/pnv_xscom.h
> +++ b/include/hw/ppc/pnv_xscom.h
> @@ -115,7 +115,8 @@ typedef struct PnvXScomInterfaceClass {
>  
>  void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
>  int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
> - uint64_t xscom_base, uint64_t xscom_size);
> + uint64_t xscom_base, uint64_t xscom_size,
> + const char *compat, int compat_size);
>  
>  void pnv_xscom_add_subregion(PnvChip *chip, hwaddr offset,
>   MemoryRegion *mr);
> 




Re: [PATCH 12/13] ppc/pnv: Introduce PnvChipClass::xscom_pcba() method

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> The XSCOM bus is implemented with a QOM interface, which is mostly
> generic from a CPU type standpoint, except for the computation of
> addresses on the Pervasize Connect Bus (PCB) network. This is handled

Pervasive

> by the pnv_xscom_pcba() function with a switch statement based on
> the chip_type class level attribute of the CPU chip.
> 
> This can be achieved using QOM. Also the address argument is masked with
> PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different
> sizes with other CPU types. Have each CPU chip type handle the appropriate
> computation with a QOM xscom_pcba() method.

PnvXscom model ? :)
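
The two address translations in the patch can be tried in isolation. The sketch below copies the bit manipulation from the POWER8 and POWER9 methods quoted in the diff; the window sizes are assumptions made for the example (the PNV_XSCOM_SIZE and PNV9_XSCOM_SIZE values are not shown in this patch), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed window sizes for the sketch only. */
#define SKETCH_P8_XSCOM_SIZE 0x800000000ull
#define SKETCH_P9_XSCOM_SIZE 0x400000000ull

/* POWER8 style: the low nibble comes from addr[6:3], the rest from addr >> 4. */
static uint32_t sketch_p8_pcba(uint64_t addr)
{
    addr &= (SKETCH_P8_XSCOM_SIZE - 1);
    return (uint32_t)(((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf));
}

/* POWER9 style: a plain shift, one PCB address per 8-byte word. */
static uint32_t sketch_p9_pcba(uint64_t addr)
{
    addr &= (SKETCH_P9_XSCOM_SIZE - 1);
    return (uint32_t)(addr >> 3);
}
```

For example, offset 0x100 maps to PCB address 0x10 with the POWER8 formula and to 0x20 with the POWER9 one, which shows why a single shared helper with a POWER8-only mask was not adequate.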

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

> ---
>  hw/ppc/pnv.c |   23 +++
>  hw/ppc/pnv_xscom.c   |   14 +-
>  include/hw/ppc/pnv.h |1 +
>  3 files changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 0447b534b8c5..cc40b90e9cd2 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1121,6 +1121,12 @@ static void pnv_chip_power8_realize(DeviceState *dev, 
> Error **errp)
>  &chip8->homer.regs);
>  }
>  
> +static uint32_t pnv_chip_power8_xscom_pcba(PnvChip *chip, uint64_t addr)
> +{
> +addr &= (PNV_XSCOM_SIZE - 1);
> +return ((addr >> 4) & ~0xfull) | ((addr >> 3) & 0xf);
> +}
> +
>  static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -1138,6 +1144,7 @@ static void pnv_chip_power8e_class_init(ObjectClass 
> *klass, void *data)
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
>  k->xscom_core_base = pnv_chip_power8_xscom_core_base;
> +k->xscom_pcba = pnv_chip_power8_xscom_pcba;
>  dc->desc = "PowerNV Chip POWER8E";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1161,6 +1168,7 @@ static void pnv_chip_power8_class_init(ObjectClass 
> *klass, void *data)
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
>  k->xscom_core_base = pnv_chip_power8_xscom_core_base;
> +k->xscom_pcba = pnv_chip_power8_xscom_pcba;
>  dc->desc = "PowerNV Chip POWER8";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1184,6 +1192,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass 
> *klass, void *data)
>  k->dt_populate = pnv_chip_power8_dt_populate;
>  k->pic_print_info = pnv_chip_power8_pic_print_info;
>  k->xscom_core_base = pnv_chip_power8_xscom_core_base;
> +k->xscom_pcba = pnv_chip_power8_xscom_pcba;
>  dc->desc = "PowerNV Chip POWER8NVL";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power8_realize,
> @@ -1340,6 +1349,12 @@ static void pnv_chip_power9_realize(DeviceState *dev, 
> Error **errp)
>  &chip9->homer.regs);
>  }
>  
> +static uint32_t pnv_chip_power9_xscom_pcba(PnvChip *chip, uint64_t addr)
> +{
> +addr &= (PNV9_XSCOM_SIZE - 1);
> +return addr >> 3;
> +}
> +
>  static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -1357,6 +1372,7 @@ static void pnv_chip_power9_class_init(ObjectClass 
> *klass, void *data)
>  k->dt_populate = pnv_chip_power9_dt_populate;
>  k->pic_print_info = pnv_chip_power9_pic_print_info;
>  k->xscom_core_base = pnv_chip_power9_xscom_core_base;
> +k->xscom_pcba = pnv_chip_power9_xscom_pcba;
>  dc->desc = "PowerNV Chip POWER9";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power9_realize,
> @@ -1422,6 +1438,12 @@ static void pnv_chip_power10_realize(DeviceState *dev, 
> Error **errp)
>  (uint64_t) 
> PNV10_LPCM_BASE(chip));
>  }
>  
> +static uint32_t pnv_chip_power10_xscom_pcba(PnvChip *chip, uint64_t addr)
> +{
> +addr &= (PNV10_XSCOM_SIZE - 1);
> +return addr >> 3;
> +}
> +
>  static void pnv_chip_power10_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -1439,6 +1461,7 @@ static void pnv_chip_power10_class_init(ObjectClass 
> *klass, void *data)
>  k->dt_populate = pnv_chip_power10_dt_populate;
>  k->pic_print_info = pnv_chip_power10_pic_print_info;
>  k->xscom_core_base = pnv_chip_power10_xscom_core_base;
> +k->xscom_pcba = pnv_chip_power10_xscom_pcba;
>  dc->desc = "PowerNV Chip POWER10";
>  
>  device_class_set_parent_realize(dc, pnv_chip_power10_realize,
> diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
> index 5ae9dfbb88ad..b681c72575b2 100644
> --- a/hw/ppc/pnv_xscom.c
> +++ b/hw/ppc/pnv_xscom.c
> @@ -57,19 +57,7 @@ static void xscom_complete(CPUState *cs, uint64_t hmer_bits)
>  
>  static uint32_t pnv_xscom_pcba(PnvChip *chip, uint64_t addr)
>  {
> -addr &= (PNV_XSCOM_SIZE - 1);
> -
> -   

Re: [PATCH 11/13] ppc/pnv: Drop pnv_chip_is_power9() and pnv_chip_is_power10() helpers

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> They aren't used anymore.

Good !

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

> ---
>  include/hw/ppc/pnv.h |   10 --
>  1 file changed, 10 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 17ca9a14ac8f..7a134a15d3b5 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -224,21 +224,11 @@ struct PnvMachineState {
>  PnvPnor  *pnor;
>  };
>  
> -static inline bool pnv_chip_is_power9(const PnvChip *chip)
> -{
> -return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER9;
> -}
> -
>  PnvChip *pnv_get_chip(uint32_t chip_id);
>  
>  #define PNV_FDT_ADDR  0x0100
>  #define PNV_TIMEBASE_FREQ 51200ULL
>  
> -static inline bool pnv_chip_is_power10(const PnvChip *chip)
> -{
> -return PNV_CHIP_GET_CLASS(chip)->chip_type == PNV_CHIP_POWER10;
> -}
> -
>  /*
>   * BMC helpers
>   */
> 




Re: [PATCH 13/13] ppc/pnv: Drop PnvChipClass::type

2019-12-13 Thread Cédric Le Goater
On 13/12/2019 13:00, Greg Kurz wrote:
> It isn't used anymore.

Fantastic ! 

> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

Thanks, 

C.

> ---
>  hw/ppc/pnv.c |5 -
>  include/hw/ppc/pnv.h |9 -
>  2 files changed, 14 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index cc40b90e9cd2..232b4a25603c 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1132,7 +1132,6 @@ static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvChipClass *k = PNV_CHIP_CLASS(klass);
>  
> -k->chip_type = PNV_CHIP_POWER8E;
>  k->chip_cfam_id = 0x221ef0498000ull;  /* P8 Murano DD2.1 */
>  k->cores_mask = POWER8E_CORE_MASK;
>  k->core_pir = pnv_chip_core_pir_p8;
> @@ -1156,7 +1155,6 @@ static void pnv_chip_power8_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvChipClass *k = PNV_CHIP_CLASS(klass);
>  
> -k->chip_type = PNV_CHIP_POWER8;
>  k->chip_cfam_id = 0x220ea0498000ull; /* P8 Venice DD2.0 */
>  k->cores_mask = POWER8_CORE_MASK;
>  k->core_pir = pnv_chip_core_pir_p8;
> @@ -1180,7 +1178,6 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvChipClass *k = PNV_CHIP_CLASS(klass);
>  
> -k->chip_type = PNV_CHIP_POWER8NVL;
>  k->chip_cfam_id = 0x120d30498000ull;  /* P8 Naples DD1.0 */
>  k->cores_mask = POWER8_CORE_MASK;
>  k->core_pir = pnv_chip_core_pir_p8;
> @@ -1360,7 +1357,6 @@ static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvChipClass *k = PNV_CHIP_CLASS(klass);
>  
> -k->chip_type = PNV_CHIP_POWER9;
>  k->chip_cfam_id = 0x220d10498000ull; /* P9 Nimbus DD2.0 */
>  k->cores_mask = POWER9_CORE_MASK;
>  k->core_pir = pnv_chip_core_pir_p9;
> @@ -1449,7 +1445,6 @@ static void pnv_chip_power10_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  PnvChipClass *k = PNV_CHIP_CLASS(klass);
>  
> -k->chip_type = PNV_CHIP_POWER10;
>  k->chip_cfam_id = 0x120da0498000ull; /* P10 DD1.0 (with NX) */
>  k->cores_mask = POWER10_CORE_MASK;
>  k->core_pir = pnv_chip_core_pir_p10;
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 4972e93c2619..f78fd0dd967c 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -38,14 +38,6 @@
>  #define PNV_CHIP_GET_CLASS(obj) \
>   OBJECT_GET_CLASS(PnvChipClass, (obj), TYPE_PNV_CHIP)
>  
> -typedef enum PnvChipType {
> -PNV_CHIP_POWER8E, /* AKA Murano (default) */
> -PNV_CHIP_POWER8,  /* AKA Venice */
> -PNV_CHIP_POWER8NVL,   /* AKA Naples */
> -PNV_CHIP_POWER9,  /* AKA Nimbus */
> -PNV_CHIP_POWER10, /* AKA TBD */
> -} PnvChipType;
> -
>  typedef struct PnvChip {
>  /*< private >*/
>  SysBusDevice parent_obj;
> @@ -123,7 +115,6 @@ typedef struct PnvChipClass {
>  SysBusDeviceClass parent_class;
>  
>  /*< public >*/
> -PnvChipType  chip_type;
>  uint64_t chip_cfam_id;
>  uint64_t cores_mask;
>  
> 




Re: [PATCH 0/2] RFC: add -mem-shared option

2019-12-13 Thread Igor Mammedov
On Fri, 13 Dec 2019 11:39:57 +
Stefan Hajnoczi  wrote:

> On Fri, Nov 29, 2019 at 10:23:25AM +0100, Igor Mammedov wrote:
> > On Thu, 28 Nov 2019 16:59:33 +
> > "Dr. David Alan Gilbert"  wrote:
> >   
> > > * Marc-André Lureau (marcandre.lur...@redhat.com) wrote:  
> > > > Hi,
> > > > 
> > > > Setting up shared memory for vhost-user is a bit complicated from the
> > > > command line, as it requires a NUMA setup such as: -m 4G -object
> > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > node,memdev=mem.
> > > > 
> > > > Instead, I suggest adding a -mem-shared option for non-NUMA setups
> > > > that will make the -mem-path or anonymous memory shareable.
> > > > 
> > > > Comments welcome,
> > > 
> > > It's worth checking with Igor (cc'd) - he said he was going to work on
> > > something similar.
> > > 
> > > One other thing this fixes is that it lets you potentially do vhost-user
> > > on s390, since it currently has no NUMA.  
> > Switching to memdev will let vhost-user on s390 work as well.
> > This is a convenience option that works around the inability to set main
> > RAM properties in the current implementation.
> 
> Gong Su asked about virtio-fs (vhost-user) on s390.  This patch series
> might be the first step to enabling it.

I'm preparing (resplitting/cleaning up) a series that will switch main RAM
to the memdev backend.

(I'd prefer to post a complete series that does the conversion across all
boards, but if it's pressing, I can certainly post several patches to enable
it for s390 and get some early feedback on the approach.)


> 
> Stefan




[PATCH 0/2] rcu_read auto macro use

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Hi,
  A couple more uses of the rcu_read macros; in qsp and
hyperv (neither of which list maintainers, so I guess
best through RCU).

The hyperv case saves a temporary.
The qsp case uses an rcu_read_lock around the lifetime
of a snapshot and carefully comments that; but now
it's automatic.

[Hyperv not tested]

Dave

Dr. David Alan Gilbert (2):
  hyperv: Use auto rcu_read macros
  qsp: Use WITH_RCU_READ_LOCK_GUARD

 hw/hyperv/hyperv.c | 22 +-
 util/qsp.c | 22 ++
 2 files changed, 19 insertions(+), 25 deletions(-)

-- 
2.23.0




[PATCH 2/2] qsp: Use WITH_RCU_READ_LOCK_GUARD

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

The automatic rcu read lock maintenance works quite
nicely in this case where it previously relied on a comment to
delimit the lifetime and now has a block.

Signed-off-by: Dr. David Alan Gilbert 
---
 util/qsp.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/util/qsp.c b/util/qsp.c
index 62265417fd..7d5147f1b2 100644
--- a/util/qsp.c
+++ b/util/qsp.c
@@ -598,7 +598,6 @@ static void qsp_ht_delete(void *p, uint32_t h, void *htp)
 
 static void qsp_mktree(GTree *tree, bool callsite_coalesce)
 {
-QSPSnapshot *snap;
 struct qht ht, coalesce_ht;
 struct qht *htp;
 
@@ -610,20 +609,19 @@ static void qsp_mktree(GTree *tree, bool callsite_coalesce)
  * We must remain in an RCU read-side critical section until we're done
  * with the snapshot.
  */
-rcu_read_lock();
-snap = atomic_rcu_read(&qsp_snapshot);
+WITH_RCU_READ_LOCK_GUARD() {
+QSPSnapshot *snap = atomic_rcu_read(&qsp_snapshot);
 
-/* Aggregate all results from the global hash table into a local one */
-qht_init(&ht, qsp_entry_no_thread_cmp, QSP_INITIAL_SIZE,
- QHT_MODE_AUTO_RESIZE | QHT_MODE_RAW_MUTEXES);
-qht_iter(&qsp_ht, qsp_aggregate, &ht);
+/* Aggregate all results from the global hash table into a local one */
+qht_init(&ht, qsp_entry_no_thread_cmp, QSP_INITIAL_SIZE,
+ QHT_MODE_AUTO_RESIZE | QHT_MODE_RAW_MUTEXES);
+qht_iter(&qsp_ht, qsp_aggregate, &ht);
 
-/* compute the difference wrt the snapshot, if any */
-if (snap) {
-qsp_diff(&snap->ht, &ht);
+/* compute the difference wrt the snapshot, if any */
+if (snap) {
+qsp_diff(&snap->ht, &ht);
+}
 }
-/* done with the snapshot; RCU can reclaim it */
-rcu_read_unlock();
 
 htp = &ht;
 if (callsite_coalesce) {
-- 
2.23.0




[PATCH 1/2] hyperv: Use auto rcu_read macros

2019-12-13 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Use RCU_READ_LOCK_GUARD and WITH_RCU_READ_LOCK_GUARD
to replace the manual rcu_read_(un)lock calls.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/hyperv/hyperv.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
index 6ebf31c310..da8ce82725 100644
--- a/hw/hyperv/hyperv.c
+++ b/hw/hyperv/hyperv.c
@@ -546,14 +546,14 @@ uint16_t hyperv_hcall_post_message(uint64_t param, bool fast)
 }
 
 ret = HV_STATUS_INVALID_CONNECTION_ID;
-rcu_read_lock();
-QLIST_FOREACH_RCU(mh, &msg_handlers, link) {
-if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
-ret = mh->handler(msg, mh->data);
-break;
+WITH_RCU_READ_LOCK_GUARD() {
+QLIST_FOREACH_RCU(mh, &msg_handlers, link) {
+if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
+ret = mh->handler(msg, mh->data);
+break;
+}
 }
 }
-rcu_read_unlock();
 
 unmap:
 cpu_physical_memory_unmap(msg, len, 0, 0);
@@ -619,7 +619,6 @@ int hyperv_set_event_flag_handler(uint32_t conn_id, EventNotifier *notifier)
 
 uint16_t hyperv_hcall_signal_event(uint64_t param, bool fast)
 {
-uint16_t ret;
 EventFlagHandler *handler;
 
 if (unlikely(!fast)) {
@@ -645,15 +644,12 @@ uint16_t hyperv_hcall_signal_event(uint64_t param, bool fast)
 return HV_STATUS_INVALID_HYPERCALL_INPUT;
 }
 
-ret = HV_STATUS_INVALID_CONNECTION_ID;
-rcu_read_lock();
+RCU_READ_LOCK_GUARD();
 QLIST_FOREACH_RCU(handler, &event_flag_handlers, link) {
 if (handler->conn_id == param) {
 event_notifier_set(handler->notifier);
-ret = 0;
-break;
+return 0;
 }
 }
-rcu_read_unlock();
-return ret;
+return HV_STATUS_INVALID_CONNECTION_ID;
 }
-- 
2.23.0




Re: [PATCH] hw/isa/isa-bus: Use ISA_NUM_IRQS instead of magic number

2019-12-13 Thread Philippe Mathieu-Daudé

On 12/13/19 11:59 AM, Philippe Mathieu-Daudé wrote:

We have a definition for the number of ISA IRQs, use it.

Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/isa/isa-bus.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index 388800603b..1d79ed133c 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -85,7 +85,7 @@ void isa_bus_irqs(ISABus *bus, qemu_irq *irqs)
  qemu_irq isa_get_irq(ISADevice *dev, int isairq)
  {
  assert(!dev || ISA_BUS(qdev_get_parent_bus(DEVICE(dev))) == isabus);
-if (isairq < 0 || isairq > 15) {
+if (isairq < 0 || isairq > >= ISA_NUM_IRQS) {


Oops, Pablo was quicker than patchew to notice this buggy patch =)

I messed when copy/pasting from a bigger series :/

So no need to review further.


  hw_error("isa irq %d invalid", isairq);
  }
  return isabus->irqs[isairq];
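
For the archive, the range check the patch was presumably aiming for can be
sketched as below. This is only an illustration: DEMO_ISA_NUM_IRQS and
demo_isa_irq_valid() are made-up names, and the value 16 is assumed to match
ISA_NUM_IRQS (the old code rejected anything above 15).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors ISA_NUM_IRQS; the replaced code hard-coded "> 15". */
#define DEMO_ISA_NUM_IRQS 16

/* Hypothetical helper: true when isairq is a usable index into irqs[]. */
static bool demo_isa_irq_valid(int isairq)
{
    /* Equivalent to !(isairq < 0 || isairq >= DEMO_ISA_NUM_IRQS). */
    return isairq >= 0 && isairq < DEMO_ISA_NUM_IRQS;
}
```

With ">=" the last valid index stays 15, the same as the magic-number version.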






Re: [PATCH 1/2] hyperv: Use auto rcu_read macros

2019-12-13 Thread Roman Kagan
On Fri, Dec 13, 2019 at 01:19:30PM +, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Use RCU_READ_LOCK_GUARD and WITH_RCU_READ_LOCK_GUARD
> to replace the manual rcu_read_(un)lock calls.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hw/hyperv/hyperv.c | 22 +-
>  1 file changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/hyperv/hyperv.c b/hw/hyperv/hyperv.c
> index 6ebf31c310..da8ce82725 100644
> --- a/hw/hyperv/hyperv.c
> +++ b/hw/hyperv/hyperv.c
> @@ -546,14 +546,14 @@ uint16_t hyperv_hcall_post_message(uint64_t param, bool fast)
>  }
>  
>  ret = HV_STATUS_INVALID_CONNECTION_ID;
> -rcu_read_lock();
> -QLIST_FOREACH_RCU(mh, &msg_handlers, link) {
> -if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
> -ret = mh->handler(msg, mh->data);
> -break;
> +WITH_RCU_READ_LOCK_GUARD() {
> +QLIST_FOREACH_RCU(mh, &msg_handlers, link) {
> +if (mh->conn_id == (msg->connection_id & HV_CONNECTION_ID_MASK)) {
> +ret = mh->handler(msg, mh->data);
> +break;
> +}
>  }
>  }
> -rcu_read_unlock();
>  
>  unmap:
>  cpu_physical_memory_unmap(msg, len, 0, 0);
> @@ -619,7 +619,6 @@ int hyperv_set_event_flag_handler(uint32_t conn_id, EventNotifier *notifier)
>  
>  uint16_t hyperv_hcall_signal_event(uint64_t param, bool fast)
>  {
> -uint16_t ret;
>  EventFlagHandler *handler;
>  
>  if (unlikely(!fast)) {
> @@ -645,15 +644,12 @@ uint16_t hyperv_hcall_signal_event(uint64_t param, bool fast)
>  return HV_STATUS_INVALID_HYPERCALL_INPUT;
>  }
>  
> -ret = HV_STATUS_INVALID_CONNECTION_ID;
> -rcu_read_lock();
> +RCU_READ_LOCK_GUARD();
>  QLIST_FOREACH_RCU(handler, &event_flag_handlers, link) {
>  if (handler->conn_id == param) {
>  event_notifier_set(handler->notifier);
> -ret = 0;
> -break;
> +return 0;
>  }
>  }
> -rcu_read_unlock();
> -return ret;
> +return HV_STATUS_INVALID_CONNECTION_ID;
>  }

I have a slight preference towards using WITH_RCU_READ_LOCK_GUARD
instead of sticking RCU_READ_LOCK_GUARD in the middle of the function
and implicitly relying on there being none but trivial statements past
the rcu-protected section.

Nothing that I would insist on, though, so

Reviewed-by: Roman Kagan 
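
To make the difference between the two guard styles concrete, here is a
minimal sketch. It does not use QEMU's real macros: the DEMO_* macros and
demo_* functions are hypothetical stand-ins built on the GCC/Clang cleanup
attribute, and the "lock" is just a nesting counter.

```c
#include <assert.h>

/* Hypothetical stand-ins for rcu_read_lock()/rcu_read_unlock():
 * they only track the nesting depth so the scope can be observed. */
static int demo_rcu_depth;

static void demo_rcu_read_lock(void)   { demo_rcu_depth++; }
static void demo_rcu_read_unlock(void) { demo_rcu_depth--; }

/* Cleanup handler: called when the guard variable leaves scope. */
static void demo_guard_release(int *unused)
{
    (void)unused;
    demo_rcu_read_unlock();
}

/* Function-scope style: held until the end of the enclosing block. */
#define DEMO_RCU_READ_LOCK_GUARD()                                     \
    int demo_guard_ __attribute__((cleanup(demo_guard_release), unused)) \
        = (demo_rcu_read_lock(), 0)

/* Block style: a for loop that runs its body exactly once. */
#define DEMO_WITH_RCU_READ_LOCK_GUARD()                                \
    for (int demo_once_ __attribute__((cleanup(demo_guard_release)))   \
             = (demo_rcu_read_lock(), 1);                              \
         demo_once_; demo_once_ = 0)

/* Returns the lock depth seen by code placed after the protected loop. */
static int demo_depth_after_loop_function_scope(void)
{
    DEMO_RCU_READ_LOCK_GUARD();
    for (int i = 0; i < 3; i++) {
        /* protected list walk would go here */
    }
    /* Still inside the read-side section: the guard is released on return. */
    return demo_rcu_depth;
}

static int demo_depth_after_loop_block_scope(void)
{
    DEMO_WITH_RCU_READ_LOCK_GUARD() {
        for (int i = 0; i < 3; i++) {
            /* protected list walk would go here */
        }
    }
    /* The read-side section ended with the block above. */
    return demo_rcu_depth;
}
```

In this sketch the function-scope variant reports a depth of 1 for trailing
statements and the block variant reports 0, which is the implicit dependency
the review comment refers to.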



Re: [PATCH 0/2] rcu_read auto macro use

2019-12-13 Thread Paolo Bonzini
On 13/12/19 14:19, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Hi,
>   A couple more uses of the rcu_read macros; in qsp and
> hyperv (neither of which list maintainers, so I guess
> best through RCU).
> 
> The hyperv case saves a temporary.
> The qsp case uses an rcu_read_lock around the lifetime
> of a snapshot and carefully comments that; but now
> it's automatic.
> 
> [Hyperv not tested]

Queued, thanks.

Paolo

> Dave
> 
> Dr. David Alan Gilbert (2):
>   hyperv: Use auto rcu_read macros
>   qsp: Use WITH_RCU_READ_LOCK_GUARD
> 
>  hw/hyperv/hyperv.c | 22 +-
>  util/qsp.c | 22 ++
>  2 files changed, 19 insertions(+), 25 deletions(-)
> 




[PATCH] memory: use RCU_READ_LOCK_GUARD

2019-12-13 Thread Paolo Bonzini
Cc: Dr. David Alan Gilbert 
Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index e499dc2..e42a9d7 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2165,7 +2165,7 @@ MemTxResult address_space_read(AddressSpace *as, hwaddr addr,
 
 if (__builtin_constant_p(len)) {
 if (len) {
-rcu_read_lock();
+RCU_READ_LOCK_GUARD();
 fv = address_space_to_flatview(as);
 l = len;
 mr = flatview_translate(fv, addr, &addr1, &l, false, attrs);
@@ -2176,7 +2176,6 @@ MemTxResult address_space_read(AddressSpace *as, hwaddr addr,
 result = flatview_read_continue(fv, addr, attrs, buf, len,
 addr1, l, mr);
 }
-rcu_read_unlock();
 }
 } else {
 result = address_space_read_full(as, addr, attrs, buf, len);
-- 
1.8.3.1




[PATCH] colo: fix return without releasing RCU

2019-12-13 Thread Paolo Bonzini
Use WITH_RCU_READ_LOCK_GUARD to avoid exiting colo_init_ram_cache
without releasing RCU.

Cc: Dr. David Alan Gilbert 
Signed-off-by: Paolo Bonzini 
---
 migration/ram.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7dd7f81..8d7c015 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3891,26 +3891,27 @@ int colo_init_ram_cache(void)
 {
 RAMBlock *block;
 
-rcu_read_lock();
-RAMBLOCK_FOREACH_NOT_IGNORED(block) {
-block->colo_cache = qemu_anon_ram_alloc(block->used_length,
-NULL,
-false);
-if (!block->colo_cache) {
-error_report("%s: Can't alloc memory for COLO cache of block %s,"
- "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
- block->used_length);
-RAMBLOCK_FOREACH_NOT_IGNORED(block) {
-if (block->colo_cache) {
-qemu_anon_ram_free(block->colo_cache, block->used_length);
-block->colo_cache = NULL;
+WITH_RCU_READ_LOCK_GUARD() {
+RAMBLOCK_FOREACH_NOT_IGNORED(block) {
+block->colo_cache = qemu_anon_ram_alloc(block->used_length,
+NULL,
+false);
+if (!block->colo_cache) {
+error_report("%s: Can't alloc memory for COLO cache of block %s,"
+ "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+ block->used_length);
+RAMBLOCK_FOREACH_NOT_IGNORED(block) {
+if (block->colo_cache) {
+qemu_anon_ram_free(block->colo_cache, block->used_length);
+block->colo_cache = NULL;
+}
 }
+return -errno;
 }
-return -errno;
+memcpy(block->colo_cache, block->host, block->used_length);
 }
-memcpy(block->colo_cache, block->host, block->used_length);
 }
-rcu_read_unlock();
+
 /*
 * Record the dirty pages that sent by PVM, we use this dirty bitmap together
 * with to decide which page in cache should be flushed into SVM's RAM. Here
-- 
1.8.3.1
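
The class of bug fixed here can be reproduced in a few lines. The sketch
below is not the COLO code: the demo_* names are hypothetical, the "lock" is
a plain nesting counter, and a zero size stands in for a failed allocation.
The guarded variant relies on the GCC/Clang cleanup attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins that only count read-side nesting. */
static int demo_rcu_depth;

static void demo_rcu_read_lock(void)   { demo_rcu_depth++; }
static void demo_rcu_read_unlock(void) { demo_rcu_depth--; }

/* Manual pairing: the error path returns with the lock still held. */
static int demo_init_manual(const int *sizes, size_t n)
{
    demo_rcu_read_lock();
    for (size_t i = 0; i < n; i++) {
        if (sizes[i] == 0) {    /* stands in for a failed allocation */
            return -1;          /* BUG: missing demo_rcu_read_unlock() */
        }
    }
    demo_rcu_read_unlock();
    return 0;
}

/* Cleanup handler run whenever the guard variable leaves scope. */
static void demo_guard_release(int *unused)
{
    (void)unused;
    demo_rcu_read_unlock();
}

/* Scoped pairing: every exit from the block, including return, unlocks. */
static int demo_init_guarded(const int *sizes, size_t n)
{
    for (int once __attribute__((cleanup(demo_guard_release)))
             = (demo_rcu_read_lock(), 1);
         once; once = 0) {
        for (size_t i = 0; i < n; i++) {
            if (sizes[i] == 0) {
                return -1;      /* guard releases the lock on the way out */
            }
        }
    }
    return 0;
}
```

After a failing call, demo_init_manual() leaves demo_rcu_depth at 1 while
demo_init_guarded() leaves it at 0.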




[PATCH RFC] qapi: Allow getting flat output from 'query-named-block-nodes'

2019-12-13 Thread Peter Krempa
When a management application manages node names there's no reason to
recurse into backing images in the output of query-named-block-nodes.

Add a parameter to the command which will return just the top level
structs.
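
As a rough illustration of what "flat" means here (a simplified model, not
the block layer code; DemoNode and demo_count_reported() are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node: only the backing link matters here. */
typedef struct DemoNode {
    const char *name;
    struct DemoNode *backing;
} DemoNode;

/* Counts how many image entries would be gathered for 'top'. */
static size_t demo_count_reported(const DemoNode *top, bool flat)
{
    size_t n = 0;

    for (const DemoNode *bs = top; bs != NULL; bs = bs->backing) {
        n++;
        if (flat) {
            break;  /* stop gathering data for flat output */
        }
    }
    return n;
}
```

For a three-node chain the recursive form gathers three entries per top-level
node and the flat form gathers one.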

Signed-off-by: Peter Krempa 
---
 block.c   |  5 +++--
 block/qapi.c  | 10 --
 blockdev.c| 12 ++--
 include/block/block.h |  2 +-
 include/block/qapi.h  |  4 +++-
 monitor/hmp-cmds.c|  2 +-
 qapi/block-core.json  |  6 +-
 7 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 473eb6eeaa..b30bdfa0d3 100644
--- a/block.c
+++ b/block.c
@@ -4766,14 +4766,15 @@ BlockDriverState *bdrv_find_node(const char *node_name)
 }

 /* Put this QMP function here so it can access the static graph_bdrv_states. */
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp)
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat,
+   Error **errp)
 {
 BlockDeviceInfoList *list, *entry;
 BlockDriverState *bs;

 list = NULL;
 QTAILQ_FOREACH(bs, &graph_bdrv_states, node_list) {
-BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, errp);
+BlockDeviceInfo *info = bdrv_block_device_info(NULL, bs, flat, errp);
 if (!info) {
 qapi_free_BlockDeviceInfoList(list);
 return NULL;
diff --git a/block/qapi.c b/block/qapi.c
index 9a5d0c9b27..84048e1a57 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -42,7 +42,9 @@
 #include "qemu/cutils.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-BlockDriverState *bs, Error **errp)
+BlockDriverState *bs,
+bool flat,
+Error **errp)
 {
 ImageInfo **p_image_info;
 BlockDriverState *bs0;
@@ -156,6 +158,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 return NULL;
 }

+/* stop gathering data for flat output */
+if (flat)
+break;
+
 if (bs0->drv && bs0->backing) {
 info->backing_file_depth++;
 bs0 = bs0->backing->bs;
@@ -389,7 +395,7 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,

 if (bs && bs->drv) {
 info->has_inserted = true;
-info->inserted = bdrv_block_device_info(blk, bs, errp);
+info->inserted = bdrv_block_device_info(blk, bs, false, errp);
 if (info->inserted == NULL) {
 goto err;
 }
diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..5f9c5e258f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3707,9 +3707,17 @@ void qmp_drive_backup(DriveBackup *arg, Error **errp)
 }
 }

-BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
+BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
+ bool flat,
+ Error **errp)
 {
-return bdrv_named_nodes_list(errp);
+bool return_flat = false;
+
+if (has_flat) {
+return_flat = flat;
+}
+
+return bdrv_named_nodes_list(return_flat, errp);
 }

 XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
diff --git a/include/block/block.h b/include/block/block.h
index 1df9848e74..177ba09e3f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -468,7 +468,7 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked);
 void bdrv_eject(BlockDriverState *bs, bool eject_flag);
 const char *bdrv_get_format_name(BlockDriverState *bs);
 BlockDriverState *bdrv_find_node(const char *node_name);
-BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp);
+BlockDeviceInfoList *bdrv_named_nodes_list(bool flat, Error **errp);
 XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp);
 BlockDriverState *bdrv_lookup_bs(const char *device,
  const char *node_name,
diff --git a/include/block/qapi.h b/include/block/qapi.h
index cd9410dee3..22c7807c89 100644
--- a/include/block/qapi.h
+++ b/include/block/qapi.h
@@ -29,7 +29,9 @@
 #include "block/snapshot.h"

 BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
-BlockDriverState *bs, Error **errp);
+BlockDriverState *bs,
+bool flat,
+Error **errp);
 int bdrv_query_snapshot_info_list(BlockDriverState *bs,
   SnapshotInfoList **p_list,
   Error **errp);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index b2551c16d1..651969819b 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -620,7 +620,7 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
 }

 /* Print node information */
-blockdev_list = qmp_query_named_block_nodes(NULL);
+blockdev_lis

[PATCH] virtio: update queue size on guest write

2019-12-13 Thread Michael S. Tsirkin
Some guests read back the queue size after writing it.
Update the size immediately upon write; otherwise
they get confused.

Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c6b47a9c73..e5c759e19e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1256,6 +1256,8 @@ static void virtio_pci_common_write(void *opaque, hwaddr addr,
 break;
 case VIRTIO_PCI_COMMON_Q_SIZE:
 proxy->vqs[vdev->queue_sel].num = val;
+virtio_queue_set_num(vdev, vdev->queue_sel,
+ proxy->vqs[vdev->queue_sel].num);
 break;
 case VIRTIO_PCI_COMMON_Q_MSIX:
 msix_vector_unuse(&proxy->pci_dev,
-- 
MST
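
A toy model of the read-back problem, under the assumption that the read
path reports the device-side value rather than the staged one. DemoQueue and
the demo_* helpers are hypothetical and are not the virtio-pci code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one queue's size as seen from two places. */
typedef struct DemoQueue {
    uint16_t proxy_num;   /* value staged by the transport on guest write */
    uint16_t device_num;  /* value reported back on guest read (assumed) */
} DemoQueue;

/* Old behaviour: the write is only staged. */
static void demo_write_size_deferred(DemoQueue *q, uint16_t val)
{
    q->proxy_num = val;
}

/* New behaviour: the write is staged and propagated at once. */
static void demo_write_size_immediate(DemoQueue *q, uint16_t val)
{
    q->proxy_num = val;
    q->device_num = q->proxy_num;
}

static uint16_t demo_read_size(const DemoQueue *q)
{
    return q->device_num;
}
```

With the deferred write a guest that wrote 128 over 256 reads back 256; with
the immediate write it reads back 128.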




[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2019-12-13 Thread dann frazier
** Changed in: kunpeng920
   Status: New => Confirmed

** Changed in: qemu (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: qemu (Ubuntu Disco)
   Status: New => Confirmed

** Changed in: qemu (Ubuntu Focal)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Confirmed
Status in QEMU:
  In Progress
Status in qemu package in Ubuntu:
  Confirmed
Status in qemu source package in Bionic:
  Confirmed
Status in qemu source package in Disco:
  Confirmed
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  Confirmed

Bug description:
  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  [ Original Description ]

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
  __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns (fds=, nfds=,
  timeout=timeout@entry=-1) at util/qemu-timer.c:322
  #3  0xbbefbf80 in os_host_main_loop_wait (timeout=-1)
  at util/main-loop.c:233
  #4  main_loop_wait (nonblocking=) at util/main-loop.c:497
  #5  0xbbe2aa30 in convert_do_copy (s=0xc123bb58) at qemu-img.c:1980
  #6  img_convert (argc=, argv=) at qemu-img.c:2456
  #7  0xbbe2333c in main (argc=7, argv=) at qemu-img.c:4975

  Reproduced w/ latest QEMU git (@ 53744e0a182)

To manage notifications about this bug go to:
https://bugs.launchpad.net/kunpeng920/+bug/1805256/+subscriptions



[PULL 0/2] Block patches

2019-12-13 Thread Stefan Hajnoczi
The following changes since commit b0ca999a43a22b38158a33d3f5881648bb4f:

  Update version for v4.2.0 release (2019-12-12 16:45:57 +)

are available in the Git repository at:

  https://github.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 86d2a49b41832355ab50cf60cec0cd50680fc0e5:

  iothread: document -object iothread on man page (2019-12-13 11:24:07 +)


Pull request



Evgeny Yakovlev (1):
  virtio-blk: advertise F_WCE (F_FLUSH) if F_CONFIG_WCE is advertised

Stefan Hajnoczi (1):
  iothread: document -object iothread on man page

 hw/arm/virt.c  |  1 +
 hw/block/virtio-blk.c  |  6 +-
 hw/core/machine.c  |  5 +
 hw/i386/pc_piix.c  |  1 +
 hw/i386/pc_q35.c   |  1 +
 hw/ppc/spapr.c |  2 +-
 hw/s390x/s390-virtio-ccw.c |  1 +
 include/hw/boards.h|  3 +++
 include/hw/virtio/virtio-blk.h |  1 +
 qemu-options.hx| 38 ++
 10 files changed, 57 insertions(+), 2 deletions(-)

-- 
2.23.0




[PULL 2/2] iothread: document -object iothread on man page

2019-12-13 Thread Stefan Hajnoczi
Add -object iothread documentation to the man page, including references
to the query-iothread QMP command and qom-set syntax for adjusting
adaptive polling parameters at run-time.

Reported-by: Zhenyu Ye 
Signed-off-by: Stefan Hajnoczi 
Message-id: 20191025122236.29815-1-stefa...@redhat.com
Message-Id: <20191025122236.29815-1-stefa...@redhat.com>
Signed-off-by: Stefan Hajnoczi 
---
 qemu-options.hx | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/qemu-options.hx b/qemu-options.hx
index 65c9473b73..68d1592ccc 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4926,6 +4926,44 @@ access
 CN=laptop.example.com,O=Example Home,L=London,ST=London,C=GB
 @end example
 
+@item -object iothread,id=@var{id},poll-max-ns=@var{poll-max-ns},poll-grow=@var{poll-grow},poll-shrink=@var{poll-shrink}
+
+Creates a dedicated event loop thread that devices can be assigned to.  This is
+known as an IOThread.  By default device emulation happens in vCPU threads or
+the main event loop thread.  This can become a scalability bottleneck.
+IOThreads allow device emulation and I/O to run on other host CPUs.
+
+The @option{id} parameter is a unique ID that will be used to reference this
+IOThread from @option{-device ...,iothread=@var{id}}.  Multiple devices can be
+assigned to an IOThread.  Note that not all devices support an
+@option{iothread} parameter.
+
+The @code{query-iothreads} QMP command lists IOThreads and reports their thread
+IDs so that the user can configure host CPU pinning/affinity.
+
+IOThreads use an adaptive polling algorithm to reduce event loop latency.
+Instead of entering a blocking system call to monitor file descriptors and then
+pay the cost of being woken up when an event occurs, the polling algorithm
+spins waiting for events for a short time.  The algorithm's default parameters
+are suitable for many cases but can be adjusted based on knowledge of the
+workload and/or host device latency.
+
+The @option{poll-max-ns} parameter is the maximum number of nanoseconds to busy
+wait for events.  Polling can be disabled by setting this value to 0.
+
+The @option{poll-grow} parameter is the multiplier used to increase the polling
+time when the algorithm detects it is missing events due to not polling long
+enough.
+
+The @option{poll-shrink} parameter is the divisor used to decrease the polling
+time when the algorithm detects it is spending too long polling without
+encountering events.
+
+The polling parameters can be modified at run-time using the @code{qom-set} command (where @code{iothread1} is the IOThread's @code{id}):
+
+@example
+(qemu) qom-set /objects/iothread1 poll-max-ns 10
+@end example
 
 @end table
 
-- 
2.23.0
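
The grow/shrink behaviour described in the new documentation can be sketched
roughly as follows. This is an illustration under stated assumptions, not the
event loop's actual code: DemoPollState, demo_poll_adjust(), the 4000 ns
starting window and the fallback multiplier of 2 are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread polling state, all times in nanoseconds. */
typedef struct DemoPollState {
    int64_t poll_ns;      /* current busy-wait window */
    int64_t poll_max_ns;  /* upper bound; 0 disables polling */
    int64_t poll_grow;    /* multiplier used when growing */
    int64_t poll_shrink;  /* divisor used when shrinking */
} DemoPollState;

#define DEMO_POLL_START_NS 4000  /* assumed initial window */

/* idle_ns: how long the loop waited before the next event arrived. */
static void demo_poll_adjust(DemoPollState *s, int64_t idle_ns)
{
    if (s->poll_max_ns == 0) {
        s->poll_ns = 0;              /* polling disabled */
        return;
    }
    if (idle_ns <= s->poll_ns) {
        return;                      /* event arrived within the window */
    }
    if (idle_ns > s->poll_max_ns) {
        /* Even the longest window would have spun without an event. */
        s->poll_ns = s->poll_shrink > 0 ? s->poll_ns / s->poll_shrink : 0;
        return;
    }
    /* The event was missed only because the window was too short. */
    if (s->poll_ns == 0) {
        s->poll_ns = DEMO_POLL_START_NS;
    } else {
        s->poll_ns *= s->poll_grow > 0 ? s->poll_grow : 2;
    }
    if (s->poll_ns > s->poll_max_ns) {
        s->poll_ns = s->poll_max_ns;
    }
}
```

Starting from a zero window with poll_max_ns=32000, poll_grow=2 and
poll_shrink=2, a short wait opens the window, a slightly longer wait doubles
it, and a wait beyond poll_max_ns halves it again.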




[PULL 1/2] virtio-blk: advertise F_WCE (F_FLUSH) if F_CONFIG_WCE is advertised

2019-12-13 Thread Stefan Hajnoczi
From: Evgeny Yakovlev 

Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:

"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"

Currently F_CONFIG_WCE and F_WCE are not connected to each other.
QEMU will advertise F_CONFIG_WCE if the config-wce argument is
set for the virtio-blk device, and F_WCE is advertised only if the
underlying block backend actually has its caching enabled.

Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.

To preserve backwards compatibility with existing machine types, make this
behaviour governed by the "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with the new property being off by
default for all machine types <= 4.2 (but don't introduce the 4.3
machine type itself yet).
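
The resulting decision can be summarised with a small sketch. The names are
hypothetical (DEMO_F_*, demo_get_features()); the bit numbers are assumed to
be those of VIRTIO_BLK_F_FLUSH (9) and VIRTIO_BLK_F_CONFIG_WCE (11).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed feature bit positions, named after the flags discussed above. */
#define DEMO_F_WCE         9   /* a.k.a. F_FLUSH */
#define DEMO_F_CONFIG_WCE 11

/*
 * backend_wce:    the block backend has its write cache enabled
 * link_to_config: models the x-enable-wce-if-config-wce property
 */
static uint64_t demo_get_features(uint64_t features, bool backend_wce,
                                  bool link_to_config)
{
    bool config_wce = features & (1ULL << DEMO_F_CONFIG_WCE);

    if (backend_wce || (link_to_config && config_wce)) {
        features |= 1ULL << DEMO_F_WCE;
    }
    return features;
}
```

With the property off (as for machine types <= 4.2) the result depends only
on the backend's cache setting, as before.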

Signed-off-by: Evgeny Yakovlev 
Message-Id: <1572978137-189218-1-git-send-email-wr...@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi 
---
 hw/arm/virt.c  | 1 +
 hw/block/virtio-blk.c  | 6 +-
 hw/core/machine.c  | 5 +
 hw/i386/pc_piix.c  | 1 +
 hw/i386/pc_q35.c   | 1 +
 hw/ppc/spapr.c | 2 +-
 hw/s390x/s390-virtio-ccw.c | 1 +
 include/hw/boards.h| 3 +++
 include/hw/virtio/virtio-blk.h | 1 +
 9 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index d4bedc2607..bf4b1cbfb8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2149,6 +2149,7 @@ type_init(machvirt_machine_init);
 
 static void virt_machine_4_2_options(MachineClass *mc)
 {
+compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 DEFINE_VIRT_MACHINE_AS_LATEST(4, 2)
 
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 4c357d2928..d62e6377c2 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -991,7 +991,9 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
 virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
 }
 
-if (blk_enable_write_cache(s->blk)) {
+if (blk_enable_write_cache(s->blk) ||
+(s->conf.x_enable_wce_if_config_wce &&
+ virtio_has_feature(features, VIRTIO_BLK_F_CONFIG_WCE))) {
 virtio_add_feature(&features, VIRTIO_BLK_F_WCE);
 }
 if (blk_is_read_only(s->blk)) {
@@ -1270,6 +1272,8 @@ static Property virtio_blk_properties[] = {
conf.max_discard_sectors, BDRV_REQUEST_MAX_SECTORS),
 DEFINE_PROP_UINT32("max-write-zeroes-sectors", VirtIOBlock,
conf.max_write_zeroes_sectors, BDRV_REQUEST_MAX_SECTORS),
+DEFINE_PROP_BOOL("x-enable-wce-if-config-wce", VirtIOBlock,
+ conf.x_enable_wce_if_config_wce, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1689ad3bf8..023548b4f3 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -27,6 +27,11 @@
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
 
+GlobalProperty hw_compat_4_2[] = {
+{ "virtio-blk-device", "x-enable-wce-if-config-wce", "off" },
+};
+const size_t hw_compat_4_2_len = G_N_ELEMENTS(hw_compat_4_2);
+
 GlobalProperty hw_compat_4_1[] = {
 { "virtio-pci", "x-pcie-flr-init", "off" },
 };
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1bd70d1abb..87aced0742 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -431,6 +431,7 @@ static void pc_i440fx_4_2_machine_options(MachineClass *m)
 m->alias = "pc";
 m->is_default = 1;
 pcmc->default_cpu_version = 1;
+compat_props_add(m->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 
 DEFINE_I440FX_MACHINE(v4_2, "pc-i440fx-4.2", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 385e5cffb1..2608cd0062 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -354,6 +354,7 @@ static void pc_q35_4_2_machine_options(MachineClass *m)
 pc_q35_machine_options(m);
 m->alias = "q35";
 pcmc->default_cpu_version = 1;
+compat_props_add(m->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 
 DEFINE_Q35_MACHINE(v4_2, "pc-q35-4.2", NULL,
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e076f6023c..2ca92f2148 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4496,7 +4496,7 @@ static const TypeInfo spapr_machine_info = {
  */
 static void spapr_machine_4_2_class_options(MachineClass *mc)
 {
-/* Defaults for the latest behaviour inherited from the base class */
+compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 
 DEFINE_SPAPR_MACHINE(4_2, "4.2", true);
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index d3edeef0ad..cb5fe4c84d 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -645,6 +645,7 @@ static void ccw_machine_4_2_instance_options(MachineState *machine)
 
 static void ccw_machine_4_2_class_options(MachineClass *mc)
 {
+compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 DEFINE_CCW_MACHIN
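
Illustration (not part of the patch): a minimal standalone sketch of the
feature rule that the virtio-blk hunk above introduces. The names blk_model,
has_feature and get_features are hypothetical stand-ins for QEMU's VirtIOBlock
state and virtio helpers; only the bit numbers (9 for F_FLUSH/F_WCE, 11 for
F_CONFIG_WCE) are taken from the VIRTIO block device specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers from the VIRTIO block device specification. */
#define VIRTIO_BLK_F_FLUSH      9   /* called F_WCE in the patch */
#define VIRTIO_BLK_F_CONFIG_WCE 11

/* Hypothetical model of the state consulted by the patched function. */
struct blk_model {
    bool write_cache_enabled;       /* models blk_enable_write_cache() */
    bool enable_wce_if_config_wce;  /* models x-enable-wce-if-config-wce */
};

/* Return true if the given feature bit is set in the feature word. */
static bool has_feature(uint64_t features, unsigned bit)
{
    return (features & (1ULL << bit)) != 0;
}

/*
 * Offer F_FLUSH if the backend cache is enabled, or if the compat
 * property is on and F_CONFIG_WCE is already being offered.
 */
static uint64_t get_features(const struct blk_model *s, uint64_t features)
{
    if (s->write_cache_enabled ||
        (s->enable_wce_if_config_wce &&
         has_feature(features, VIRTIO_BLK_F_CONFIG_WCE))) {
        features |= 1ULL << VIRTIO_BLK_F_FLUSH;
    }
    return features;
}
```

With only F_CONFIG_WCE offered and the backend cache disabled, get_features()
adds F_FLUSH when enable_wce_if_config_wce is true (the new default) and
leaves it out when the flag is false, which is what hw_compat_4_2 selects for
machine types <= 4.2.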

Re: [PATCH 092/104] virtiofsd: add man page

2019-12-13 Thread Liam Merwick

On 12/12/2019 16:38, Dr. David Alan Gilbert (git) wrote:

From: Stefan Hajnoczi 

Signed-off-by: Stefan Hajnoczi 
---
  Makefile   |  7 +++
  tools/virtiofsd/virtiofsd.texi | 85 ++
  2 files changed, 92 insertions(+)
  create mode 100644 tools/virtiofsd/virtiofsd.texi



... deleted ...


+@c man begin EXAMPLES
+Export @code{/var/lib/fs/vm001/} on vhost-user UNIX domain socket @code{/var/run/vm001-vhost-fs.sock}:
+
+@example
+host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
+host# qemu-system-x86_64 \
+-chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
+-device vhost-user-fs-pci,chardev=char0,tag=myfs \
+-object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
+-numa node,memdev=mem \
+...
+guest# mount -t virtio_fs \
+-o default_permissions,allow_other,user_id=0,group_id=0,rootmode=04,dax \
+myfs /mnt




Should this be 'mount -t virtiofs myfs /mnt', as on
https://virtio-fs.gitlab.io/howto-qemu.html?


otherwise

Reviewed-by: Liam Merwick 




[PATCH] virtio-blk: deprecate SCSI passthrough

2019-12-13 Thread Stefan Hajnoczi
The Linux virtio_blk.ko guest driver is removing legacy SCSI passthrough
support.  Deprecate this feature in QEMU too.

Signed-off-by: Stefan Hajnoczi 
---
 qemu-deprecated.texi | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 4b4b7425ac..ef94d497da 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -285,6 +285,17 @@ spec you can use the ``-cpu rv64gcsu,priv_spec=v1.9.1`` command line argument.
 
 @section Device options
 
+@subsection Emulated device options
+
+@subsubsection -device virtio-blk,scsi=on|off (since 5.0.0)
+
+The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature.  VIRTIO 1.0
+and later do not support it because the virtio-scsi device was introduced for
+full SCSI support.  Use virtio-scsi instead when SCSI passthrough is required.
+
+Note this also applies to ``-device virtio-blk-pci,scsi=on|off'', which is an
+alias.
+
 @subsection Block device options
 
 @subsubsection "backing": "" (since 2.12.0)
-- 
2.23.0




Re: [PATCH v2 5/8] x86: move SMM property to X86MachineState

2019-12-13 Thread Philippe Mathieu-Daudé

On 12/12/19 6:29 PM, Paolo Bonzini wrote:

Add it to microvm as well, it is a generic property of the x86
architecture.

Suggested-by: Sergio Lopez 
Signed-off-by: Paolo Bonzini 


Reviewed-by: Philippe Mathieu-Daudé 


---
  hw/i386/pc.c  | 49 -
  hw/i386/pc_piix.c |  6 +++---
  hw/i386/pc_q35.c  |  2 +-
  hw/i386/x86.c | 50 +-
  include/hw/i386/pc.h  |  3 ---
  include/hw/i386/x86.h |  5 +
  target/i386/kvm.c |  3 +--
  7 files changed, 59 insertions(+), 59 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index fdbd2bf..6a3212e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2028,48 +2028,6 @@ static void pc_machine_set_vmport(Object *obj, Visitor *v, const char *name,
  visit_type_OnOffAuto(v, name, &pcms->vmport, errp);
  }
  
-bool pc_machine_is_smm_enabled(PCMachineState *pcms)
-{
-bool smm_available = false;
-
-if (pcms->smm == ON_OFF_AUTO_OFF) {
-return false;
-}
-
-if (tcg_enabled() || qtest_enabled()) {
-smm_available = true;
-} else if (kvm_enabled()) {
-smm_available = kvm_has_smm();
-}
-
-if (smm_available) {
-return true;
-}
-
-if (pcms->smm == ON_OFF_AUTO_ON) {
-error_report("System Management Mode not supported by this hypervisor.");
-exit(1);
-}
-return false;
-}
-
-static void pc_machine_get_smm(Object *obj, Visitor *v, const char *name,
-   void *opaque, Error **errp)
-{
-PCMachineState *pcms = PC_MACHINE(obj);
-OnOffAuto smm = pcms->smm;
-
-visit_type_OnOffAuto(v, name, &smm, errp);
-}
-
-static void pc_machine_set_smm(Object *obj, Visitor *v, const char *name,
-   void *opaque, Error **errp)
-{
-PCMachineState *pcms = PC_MACHINE(obj);
-
-visit_type_OnOffAuto(v, name, &pcms->smm, errp);
-}
-
  static bool pc_machine_get_smbus(Object *obj, Error **errp)
  {
  PCMachineState *pcms = PC_MACHINE(obj);
@@ -2116,7 +2074,6 @@ static void pc_machine_initfn(Object *obj)
  {
  PCMachineState *pcms = PC_MACHINE(obj);
  
-pcms->smm = ON_OFF_AUTO_AUTO;
  #ifdef CONFIG_VMPORT
  pcms->vmport = ON_OFF_AUTO_AUTO;
  #else
@@ -2223,12 +2180,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
  pc_machine_get_device_memory_region_size, NULL,
  NULL, NULL, &error_abort);
  
-object_class_property_add(oc, PC_MACHINE_SMM, "OnOffAuto",
-pc_machine_get_smm, pc_machine_set_smm,
-NULL, NULL, &error_abort);
-object_class_property_set_description(oc, PC_MACHINE_SMM,
-"Enable SMM (pc & q35)", &error_abort);
-
  object_class_property_add(oc, PC_MACHINE_VMPORT, "OnOffAuto",
  pc_machine_get_vmport, pc_machine_set_vmport,
  NULL, NULL, &error_abort);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1bd70d1..dd0f00e 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -281,7 +281,7 @@ else {
  /* TODO: Populate SPD eeprom data.  */
  pcms->smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
  x86ms->gsi[9], smi_irq,
-pc_machine_is_smm_enabled(pcms),
+x86_machine_is_smm_enabled(x86ms),
  &piix4_pm);
  smbus_eeprom_init(pcms->smbus, 8, NULL, 0);
  
@@ -309,9 +309,9 @@ else {
  
  static void pc_compat_2_3_fn(MachineState *machine)
  {
-PCMachineState *pcms = PC_MACHINE(machine);
+X86MachineState *x86ms = X86_MACHINE(machine);
  if (kvm_enabled()) {
-pcms->smm = ON_OFF_AUTO_OFF;
+x86ms->smm = ON_OFF_AUTO_OFF;
  }
  }
  
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 385e5cf..bccaaee 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -276,7 +276,7 @@ static void pc_q35_init(MachineState *machine)
   0xff0104);
  
  /* connect pm stuff to lpc */
-ich9_lpc_pm_init(lpc, pc_machine_is_smm_enabled(pcms));
+ich9_lpc_pm_init(lpc, x86_machine_is_smm_enabled(x86ms));
  
  if (pcms->sata_enabled) {
  /* ahci and SATA device, for q35 1 ahci controller is built-in */
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 3e4aee5..6fb01e4 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -746,10 +746,53 @@ static void x86_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
  x86ms->max_ram_below_4g = value;
  }
  
+bool x86_machine_is_smm_enabled(X86MachineState *x86ms)
+{
+bool smm_available = false;
+
+if (x86ms->smm == ON_OFF_AUTO_OFF) {
+return false;
+}
+
+if (tcg_enabled() || qtest_enabled()) {
+smm_available = true;
+} else if (kvm_enabled()) {
+smm_available = kvm_has_smm();
+}
+
+if (smm_available) {
+return true;
+}
+
+if (x86ms->smm == ON_OFF_AUTO_ON) {
+error_report
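
Illustration (not part of the patch): the on/off/auto resolution done by the
removed pc_machine_is_smm_enabled() shown earlier in this diff, reduced to a
pure function. SmmResult, resolve_smm and the accel_has_smm parameter are
hypothetical names; in QEMU the availability comes from tcg_enabled(),
qtest_enabled() and kvm_has_smm(), and the error case calls error_report()
and exits instead of returning a value.

```c
#include <stdbool.h>

/* Tri-state setting, mirroring the three values used in the diff. */
typedef enum {
    ON_OFF_AUTO_AUTO,
    ON_OFF_AUTO_ON,
    ON_OFF_AUTO_OFF
} OnOffAuto;

/* Hypothetical result type; QEMU returns bool and exits on error. */
typedef enum {
    SMM_DISABLED,
    SMM_ENABLED,
    SMM_ERROR
} SmmResult;

/*
 * Resolve the requested setting against what the accelerator supports:
 *   off  -> always disabled
 *   auto -> enabled only if the accelerator supports SMM
 *   on   -> enabled if supported, otherwise an error
 */
static SmmResult resolve_smm(OnOffAuto requested, bool accel_has_smm)
{
    if (requested == ON_OFF_AUTO_OFF) {
        return SMM_DISABLED;
    }
    if (accel_has_smm) {
        return SMM_ENABLED;
    }
    if (requested == ON_OFF_AUTO_ON) {
        return SMM_ERROR;
    }
    return SMM_DISABLED;
}
```

Only the explicit "on" request can fail; "auto" silently falls back to
disabled when the accelerator lacks SMM support.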
